SignalQL v0.3 specification (draft proposal)
Status: Draft proposal with partial reference implementation. This document specifies additive extensions to v0.2. The reference compiler, playground, MCP server, and CLI implement v0.3-core surfaces; extended and capability tiers parse and compile to plan SQL or return structured
CapabilityError. See versions for the support matrix.
SignalQL v0.3 extends the v0.2 evidence-retrieval layer with the aggregation, grouping, and shaping primitives that analytics and experimentation require, plus three protocol-level surfaces — a metrics binding, catalog introspection, and a typed result envelope — that let an AI agent drive SignalQL against your own storage with no human in the loop.
Motivation
v0.1 was analytics-first (COUNT, GROUP BY DAY, FUNNEL, time windows). v0.2 generalized to domain-agnostic evidence retrieval but dropped aggregation: its only aggregate is bare count, and TREND returns a direction, not a series.
That is insufficient for the target use cases, and the reason is token economics rather than philosophy: an AI agent cannot consume a million raw rows. Evidence must arrive pre-aggregated. A retrieval-only language that cannot aggregate forces the agent to either dump rows it cannot fit, or emit raw SQL and bypass SignalQL entirely. v0.3 brings aggregation back as a domain-agnostic shaping layer — counting, summing, and bucketing are not domain opinions — while preserving the v0.2 core contract.
Preserved core principles (unchanged from v0.2)
- SignalQL retrieves and shapes structured evidence; it does not interpret, decide, or mutate state.
- No domain-specific nouns or interpretation labels in core grammar.
- No mutation/action language in core (see Write-back).
- Interpretation — significance, lift, prioritization, scoring — belongs to the AI/reasoning layer above SignalQL.
Aggregation is in scope because sum, count, avg, and time bucketing carry no business opinion. Statistical inference (p-values, confidence intervals, power) stays out of scope; v0.3 instead guarantees the sufficient statistics the AI layer needs to compute it (see Sufficient statistics).
Grammar (EBNF, additive over v0.2)
Keywords are case-insensitive in the reference implementation. Whitespace is insignificant outside string literals.
Reserved words. The reference tokenizer reserves all grammar keywords, including the short time units h d w m and the field-function names signal probability semantic_match. A bare identifier equal to a reserved word is not a valid field name. To reference a field whose name collides with a reserved word, qualify it as a dotted path (e.g. properties.signal, properties.d) — only the first segment of a path is keyword-checked, and a segment containing a . is never treated as a keyword.
Canonical retrieve_query clause order (also used by the canonical formatter):
FROM → WHERE → TRAVERSE → DURING [AS OF] → BUCKET BY → GROUP BY → HAVING → RETURN → ORDER BY → LIMIT → TREND
query ::= retrieve_query | sequence_query | funnel_query | watch_query | describe_query
retrieve_query ::= "FROM" source
where_clause? traverse_clause?
window_clause? bucket_clause?
group_clause? having_clause?
select_clause
order_clause? limit_clause?
trend_clause?
source ::= entity_source | event_source | metric_source
entity_source ::= "entity" "(" qualified_identifier ")"
event_source ::= "events"
metric_source ::= "metric" "(" qualified_identifier ")" (* governed metric, see Metrics binding *)
(* --- filtering: richer predicates, OR, grouping --- *)
where_clause ::= "WHERE" predicate_expr
predicate_expr ::= predicate_term ( "OR" predicate_term )*
predicate_term ::= predicate_factor ( "AND" predicate_factor )*
predicate_factor ::= predicate | "(" predicate_expr ")" | "NOT" predicate_factor
predicate ::= field_path comparator value
| field_path "IN" "(" value_list ")"
| field_path "NOT" "IN" "(" value_list ")"
| field_path "BETWEEN" value "AND" value
| field_path "CONTAINS" string_literal
| field_path "IS" "NULL"
| field_path "IS" "NOT" "NULL"
comparator ::= "=" | "!=" | ">" | "<" | ">=" | "<="
value_list ::= value ( "," value )*
field_path ::= identifier ( "." identifier )*
| "signal" "(" identifier ")"
| "probability" "(" identifier ")"
| "semantic_match" "(" string_literal ")"
(* --- time windowing + reproducibility anchor --- *)
window_clause ::= "DURING" time_window as_of_clause?
time_window ::= "LAST" integer time_unit
| "BETWEEN" timestamp_literal "AND" timestamp_literal
as_of_clause ::= "AS" "OF" timestamp_literal (* anchors relative windows *)
bucket_clause ::= "BUCKET" "BY" time_unit (* emits a time series *)
(* --- grouping + post-aggregation filtering --- *)
group_clause ::= "GROUP" "BY" group_dim ( "," group_dim )*
group_dim ::= field_path
having_clause ::= "HAVING" having_expr
having_expr ::= having_term ( "OR" having_term )*
having_term ::= having_factor ( "AND" having_factor )*
having_factor ::= having_predicate | "(" having_expr ")" | "NOT" having_factor
having_predicate ::= aggregate comparator value
| count_shorthand comparator value
| field_path comparator value
| field_path "IN" "(" value_list ")"
| field_path "NOT" "IN" "(" value_list ")"
| field_path "BETWEEN" value "AND" value
| field_path "IS" "NULL"
| field_path "IS" "NOT" "NULL"
(* --- ordering + limiting --- *)
order_clause ::= "ORDER" "BY" order_term ( "," order_term )*
order_term ::= ( field_path | aggregate ) order_dir?
order_dir ::= "ASC" | "DESC"
limit_clause ::= "LIMIT" integer
traverse_clause ::= "TRAVERSE" identifier "DEPTH" "<=" integer
trend_clause ::= "TREND" trend_dir "OVER" integer time_unit
trend_dir ::= "UP" | "DOWN" | "FLAT"
time_unit ::= time_unit_short | time_unit_long
time_unit_short ::= "h" | "d" | "w" | "m"
time_unit_long ::= "HOUR" | "HOURS" | "DAY" | "DAYS" | "WEEK" | "WEEKS" | "MONTH" | "MONTHS"
(* --- aggregation in RETURN --- *)
select_clause ::= "RETURN" return_field ( "," return_field )*
return_field ::= field_path | "entity_id" | "timestamp" | count_return | aggregate
count_return ::= "count" (* bare count keyword *)
aggregate ::= standard_aggregate | percentile_aggregate | stats_aggregate
standard_aggregate ::= agg_fn "(" agg_arg ")"
agg_fn ::= "count" | "count_distinct" | "sum" | "avg" | "min" | "max"
| "variance" | "stddev"
agg_arg ::= field_path
percentile_aggregate ::= "percentile" "(" field_path "," decimal ")"
stats_aggregate ::= "stats" "(" field_path ")"
count_shorthand ::= "count" (* HAVING shorthand for count() *)
(* --- funnels (restored, first-class conversion) --- *)
funnel_query ::= "FUNNEL" funnel_step ( "->" funnel_step )+
"FROM" source where_clause? window_clause?
"WITHIN" integer time_unit group_clause?
funnel_step ::= string_literal | identifier
(* --- sequence (unchanged from v0.2) --- *)
sequence_query ::= "FIND" "sequence" "WHERE" sequence_steps "WITHIN" integer time_unit
sequence_steps ::= sequence_step ( "->" sequence_step )+
sequence_step ::= identifier | string_literal
(* --- continuous / threshold subscription (optional capability) --- *)
watch_query ::= "WATCH" "(" retrieve_query ")" "WHEN" having_expr
(* --- catalog introspection --- *)
describe_query ::= "DESCRIBE" describe_target
describe_target ::= "entities" | "events" | "signals" | "metrics" | "relationships"
| "entity" "(" qualified_identifier ")"
qualified_identifier ::= identifier ( "." identifier )*
identifier ::= [a-zA-Z_] [a-zA-Z0-9_]*
integer ::= [0-9]+
decimal ::= [0-9]+ "." [0-9]+
value ::= string_literal | integer | decimal | boolean | timestamp_literal
string_literal ::= '"' { character } '"'
boolean ::= "true" | "false"
timestamp_literal ::= string_literal (* RFC 3339 instant, e.g. "2026-06-13T00:00:00Z" *)RETURN count vs count()
RETURN count— bare keyword allowed only inRETURN. Equivalent toRETURN count()for ungrouped or grouped queries. Produces a logical column namedcount.RETURN count()— explicit aggregate call form. Required when authors want uniform aggregate syntax alongside other aggregates (for exampleRETURN properties.path, count()).HAVING count > n— shorthand forHAVING count() > n. Only the zero-argumentcountaggregate may appear bare inHAVING; all other aggregates MUST use call syntax (for exampleHAVING count_distinct(user_id) > 100).
Implementation tiers (for examples below)
Examples marked v0.3-core are in the first reference implementation slice. Examples marked v0.3-extended or v0.3-capability are normative grammar but MAY be deferred by adapters (which MUST return an explicit capability error).
| Tier | Constructs |
|---|---|
| v0.3-core | Event aggregates incl. variance, stddev, percentile, stats, GROUP BY, BUCKET BY, HAVING, ORDER BY, LIMIT, AS OF, funnels |
| v0.3-extended | entity/metric sources, signal()/probability()/semantic_match() field refs, TRAVERSE, TREND |
| v0.3-capability | DESCRIBE, WATCH, governed metric(...) execution, catalog providers |
What v0.3 adds, by capability
| Capability | Construct | Use case it unlocks |
|---|---|---|
| Aggregations | sum/avg/min/max/count_distinct/percentile/variance/stddev | Metrics, distributions, SLOs |
| Dimensional breakdown | GROUP BY a, b | Segmentation |
| Time series | BUCKET BY d | Trend charts, run charts |
| Rich filtering | IN, !=, BETWEEN, CONTAINS, IS NULL, OR, NOT, parentheses | Real segments |
| Top-N / ranking | ORDER BY … LIMIT n | "Top 10 pages" |
| Post-aggregation filter | HAVING | "Cohorts with >100 users" |
| Funnels with conversion | FUNNEL a -> b -> c … WITHIN | Activation/checkout analysis |
| Reproducibility anchor | AS OF <ts> | Auditable, re-runnable A/B reads |
| Sufficient statistics | stats(metric) | AI-side significance testing |
| Governed metrics | metric(name) / FROM metric(...) | Consistent definitions |
| Catalog discovery | DESCRIBE … | Headless query authoring |
| Threshold watch | WATCH (...) WHEN … | Alerting / sequential monitoring |
Sufficient statistics for experimentation
To compute lift, confidence intervals, p-values, and power, the AI layer needs, per arm (and optionally per segment), the sufficient statistics of the metric. stats(field) returns a fixed-shape struct so SignalQL never performs inference itself:
stats(field) -> { n, sum, sum_sq, mean, variance, stddev, min, max }A headless A/B read is then a single grouped query (v0.3-core — stats over FROM events):
FROM events
WHERE event_name = "checkout"
DURING LAST 14 d
AS OF "2026-06-13T00:00:00Z"
GROUP BY signal(experiment_variant)
RETURN signal(experiment_variant), count_distinct(user_id), stats(order_value)SignalQL returns (variant, n_users, {n, mean, variance, …}) per arm. The AI layer computes the two-sample test, the lift, and the interval. SignalQL renders no verdict — consistent with the v0.2 non-interpretation contract.
Metrics binding (semantic layer)
metric(name) resolves a governed, versioned metric definition supplied by a domain pack or a metrics catalog. This is the construct that actually replaces a dashboard: a metric like analytics.weekly_active_users carries one definition, so every AI invocation computes it identically. Without it, "interpretation belongs to the AI" silently means every agent can define active_user differently — discarding the one durable asset a BI tool provides.
FROM metric(analytics.weekly_active_users)
DURING LAST 12 w
BUCKET BY w
RETURN timestamp, metric_value(v0.3-capability — requires metrics catalog binding.)
Metric definitions live in domain packs (see domain-packs.md), are versioned independently of the language, and MUST expose their resolved expression and version in the result envelope's metadata.metrics for audit. The core grammar gains the metric(...) reference only; it does not embed any metric formula.
Catalog introspection
(v0.3-capability — optional adapter surface; see Implementation tiers).)
Headless authoring requires the agent to discover the schema. DESCRIBE returns a machine-readable catalog (shape defined by the runtime, but stable per adapter):
DESCRIBE entities
DESCRIBE entity(analytics.session)
DESCRIBE metrics
DESCRIBE signalsEach DESCRIBE result SHOULD include: identifier, kind, fields with types and units, nullability, relationship edges with target types, available signals and probabilities, and — for metrics — the definition version. This is the surface that lets an agent write valid SignalQL with no human who already knows the schema. Adapters MAY also expose the catalog out-of-band (e.g. an HTTP endpoint) using the same shape.
Typed result envelope
v0.2 said execution metadata SHOULD include access filters. v0.3 specifies a result envelope an agent can program against deterministically:
{
"columns": [
{ "name": "variant", "type": "string", "unit": null },
{ "name": "n_users", "type": "integer", "unit": "users" },
{ "name": "order_value", "type": "struct", "unit": "USD" }
],
"rows": [ /* … */ ],
"metadata": {
"row_count": 3,
"truncated": false,
"limit_applied": 1000,
"as_of": "2026-06-13T00:00:00Z",
"access_filtered": true,
"metrics": [{ "name": "analytics.weekly_active_users", "version": "1.4.0" }],
"cost_estimate": { "scanned_rows": 184523, "engine": "postgres" },
"compiled_sql_digest": "sha256:…"
}
}columns (with type and unit), row_count, truncated, as_of, access_filtered, and resolved metrics are REQUIRED when applicable. cost_estimate and compiled_sql_digest are RECOMMENDED. Returning units and types lets the AI layer reason without guessing.
Execution model (additive)
The v0.2 pipeline is unchanged through step 5; v0.3 inserts shaping and a richer return contract:
- Parse to AST; reject unknown constructs.
- Bind references against the catalog (entities/events/relationships/signals/ metrics).
metric(...)resolves to its versioned definition here. - Build deterministic plan: filters → window/
AS OFanchor → traversal → bucketing → grouping → aggregation →HAVING→ ordering → limit. - Enforce limits: max window span, max group cardinality, max result rows, max funnel steps (all adapter-defined, surfaced in metadata).
- Apply caller permissions before evidence leaves the runtime boundary.
- Return the typed result envelope with execution metadata.
AS OF makes relative windows (LAST 14 d) reproducible: the runtime anchors "now" to the supplied timestamp. Absent AS OF, the runtime clock is used and the resolved anchor MUST still be reported in metadata.as_of.
Watch / subscription (optional capability)
(v0.3-capability — requires streaming runtime support.)
A dashboard also watches and alerts; WATCH expresses a continuous or threshold query so an agent need not poll. It is an optional capability: runtimes without streaming support MUST return an explicit capability error (consistent with how v0.2 treats semantic_match).
WATCH (
FROM events
WHERE event_name = "error"
DURING LAST 1 h
RETURN count
) WHEN count > 100Write-back (out of core)
The "Linear/Jira become headless storage" use case requires creating and moving records, which the v0.2 retrieval-only contract forbids (ASSIGN, EXECUTE, UPDATE, DELETE remain non-goals). v0.3 does not add mutation to the core. The deliberate decision recorded here:
- Writes live in a sibling protocol ("SignalQL-Command") that reuses the same catalog, provenance, and permissions discipline, OR
- Writes remain entirely outside SignalQL, owned by the application layer.
Either is defensible; what is not defensible is implying a retrieval-only spec can deliver a read+write headless tool. The core stays read-only by design.
Example queries
Daily active series with breakdown (v0.3-core)
FROM events
WHERE event_name = "session_start"
DURING LAST 30 d
BUCKET BY d
GROUP BY properties.platform
RETURN timestamp, properties.platform, count_distinct(user_id)
ORDER BY timestamp ASCTop-N with post-aggregation filter (v0.3-core)
FROM events
WHERE event_name = "page_view"
DURING LAST 7 d
GROUP BY properties.path
HAVING count > 1000
RETURN properties.path, count
ORDER BY count DESC
LIMIT 10Funnel with per-step conversion, segmented (v0.3-core)
FUNNEL "visit" -> "signup" -> "activated"
FROM events
DURING LAST 28 d
WITHIN 7 d
GROUP BY properties.planA/B sufficient statistics (headless experiment read) (v0.3-core)
FROM events
WHERE event_name = "purchase" AND properties.amount IS NOT NULL
DURING LAST 14 d
AS OF "2026-06-13T00:00:00Z"
GROUP BY signal(experiment_variant)
RETURN signal(experiment_variant), count_distinct(user_id), stats(properties.amount)Governed metric over time (v0.3-capability)
FROM metric(analytics.weekly_active_users)
DURING LAST 12 w
BUCKET BY w
RETURN timestamp, metric_valueCatalog discovery before authoring (v0.3-capability)
DESCRIBE entity(analytics.session)Result shapes
| Query shape | Logical columns |
|---|---|
RETURN count or RETURN count() | count |
… GROUP BY a | a, aggregates |
… BUCKET BY d | timestamp, aggregates |
RETURN percentile(x, p) | single numeric column (adapter names the percentile) |
RETURN stats(x) | n, sum, sum_sq, mean, variance, stddev, min, max |
FUNNEL … | step, users, conversion_rate (per group when grouped) |
DESCRIBE … | catalog records (see Catalog introspection) |
Exact column names may vary by dialect; adapters document mappings, and the typed result envelope is authoritative.
Determinism contract (extends v0.2)
- Canonical formatting produces stable clause ordering across all new clauses.
- Equivalent AST yields equivalent plan output.
- Grouping and ordering: when
ORDER BYis absent, adapters MUST NOT rely on engine-default row order; the envelope MUST report whether ordering was applied. AS OFmakes relative-window reads reproducible; the resolved anchor is always reported.- Resolved metric versions are reported, so a re-run can detect definition drift.
Reference compiler status
The monorepo ships the v0.3-core reference implementation:
- Parser / AST / canonical formatter — all v0.3 query shapes including
DESCRIBEandWATCH; AST validated againstschemas/signalql-ast-v0.3.schema.json. - Postgres — direct parameterized SQL for
FROM eventsretrieve (WHERE/GROUP BY/BUCKET BY/HAVING/ORDER BY/LIMIT/AS OF), all aggregates (count,count_distinct,sum,avg,min,max,variance,stddev,percentile,stats), and first-class funnels (chained-CTE conversion). Entity sources,signal()/probability()/semantic_match()field refs,TRAVERSE, andTRENDcompile to JSON execution plans. - Metric binding —
FROM metric(...)binds against a supplied metric registry and compiles to direct SQL; resolved name/version flow into envelope metadata. Without a registry the metric source stays a guarded plan. - Catalog / DESCRIBE —
DESCRIBEexecutes against a supplied catalog provider and returns catalog records in the typed result envelope; with no provider it returns an explicitCapabilityError. - Result envelope —
buildResultEnvelope/buildEnvelopeFromCompileused by MCP execute, CLI--envelope, and the CSV demo adapter; aggregate inputs are numerically cast. - Demo evaluator — playground and CSV adapter run v0.3 retrieve (incl.
count_distinct, distribution aggregates, andstats) and funnels against bundled fixtures.
Residual limitations. WATCH is parse + capability-gated (no streaming delivery). Entity-source retrieve and signal()/probability()/semantic_match()/TRAVERSE/TREND compile to plan SQL rather than direct Postgres. Grouped (segmented) funnels return an explicit capability error. The demo evaluator coerces JSON values to numbers best-effort and is intended for fixtures, not production warehouses.
Migration from v0.1 and v0.2
| v0.1 / v0.2 habit | v0.3 equivalent |
|---|---|
COUNT events FROM events WHERE … | FROM events WHERE … RETURN count() |
COUNT users FROM events WHERE … | FROM events WHERE … RETURN count_distinct(user_id) |
GROUP BY DAY | BUCKET BY d (or w / h / m) |
FUNNEL "a" THEN "b" FROM events … | FUNNEL "a" -> "b" FROM events … WITHIN … |
v0.2 bare count retrieve | RETURN count() with explicit aggregates |
v0.1 syntax remains accepted for backward compatibility. New authoring should prefer v0.3 retrieve/funnel shapes and consume the typed result envelope in agents.
When compile hits an unsupported tier, tools return CapabilityError with feature and tier fields instead of opaque SQL failures.
AI-side significance workflow
SignalQL does not emit p-values or “significant/not significant” labels. For experimentation, agents should:
- Query sufficient statistics with
RETURN stats(field)or grouped counts (when the adapter tier supports it). - Read
metadata.signalql_version,as_of, and metric version fields from the result envelope. - Run inference (z-test, Bayesian, sequential) in the reasoning layer using the returned
n,sum,sum_sq, and group keys.
The reference compiler expands RETURN stats(field) into eight numeric columns (n, sum, sum_sq, mean, variance, stddev, min, max) for direct Postgres execution on FROM events queries.
Non-goals (unchanged intent)
- No statistical inference verdicts (significance, lift, power) in core — only sufficient statistics.
- No domain nouns or interpretation labels in core grammar.
- No mutation/action language in core.
- No embedded metric formulas in core grammar (only
metric(...)references).
Normative references
- Version index: spec versions
- Baseline this extends: v0.2 spec, v0.2 scope
- Domain packs / metrics definitions: domain-packs.md
- Analytics heritage (funnels, grouping, windows): v0.1 spec