SQL Query Reference
On this page
- SELECT
- Syntax
- Basic projection
- Scalar-function projections (v0.5.1)
- WHERE
- IN / NOT IN
- BETWEEN / NOT BETWEEN
- LIKE / NOT LIKE / ILIKE / NOT ILIKE
- IS NULL / IS NOT NULL
- Parameterized
- Scalar functions in WHERE
- CASE WHEN
- ORDER BY / LIMIT / OFFSET
- Pagination with parameters
- Aggregates
- HAVING
- Filtered aggregates (FILTER (WHERE ...))
- JOIN
- Subqueries
- IN (SELECT ...), EXISTS, NOT EXISTS
- Common Table Expressions (CTEs)
- Recursive CTEs
- Set operations
- Time-travel
- AT OFFSET (supported)
- AS OF TIMESTAMP (not yet implemented)
- What is not supported (v0.4)
- Resource limits
- Related
Everything SELECT-shaped: joins, aggregates, CTEs, CASE, LIKE, and
time-travel.
SELECT
Syntax
SELECT [DISTINCT] select_list
FROM from_list
[WHERE condition]
[GROUP BY expression [, ...]]
[HAVING condition]
[ORDER BY expression [ASC | DESC] [, ...]]
[LIMIT n [OFFSET m]]
[AT OFFSET n];
AS OF TIMESTAMP '...'is planned for v0.6.0 but not currently implemented — see Time-travel below.
Basic projection
SELECT * FROM patients;
SELECT id, name FROM patients;
SELECT id, name AS patient_name FROM patients;
SELECT DISTINCT specialty FROM providers;
Scalar-function projections (v0.5.1)
The same scalar functions that work in WHERE work in SELECT:
UPPER, LOWER, LENGTH, TRIM, CONCAT, ||, ABS, ROUND,
CEIL/CEILING, FLOOR, COALESCE, NULLIF, CAST. Aliases are
preserved end-to-end; un-aliased projections synthesise a
PostgreSQL-style output name from the outermost function
(upper, coalesce, cast, concat).
SELECT id, UPPER(name) AS name_uc FROM patients;
SELECT COALESCE(middle_name, '') AS middle FROM patients;
SELECT CAST(account_id AS BIGINT) AS account FROM billing;
SELECT first_name || ' ' || last_name AS full_name FROM patients;
WHERE
Operators: =, !=, <, <=, >, >=, plus AND, OR, NOT.
SELECT * FROM patients WHERE id = 1;
SELECT * FROM patients WHERE active = true AND dob < '1980-01-01 00:00:00';
SELECT * FROM patients WHERE NOT active;
IN / NOT IN
NOT IN landed in v0.5.1; v0.4.x and earlier rejected it at parse time.
Both variants return UNKNOWN for NULL cells, which surfaces as
false — matching the bare-column =/!= semantics.
SELECT * FROM patients WHERE id IN (1, 2, 3);
SELECT * FROM patients WHERE id NOT IN (999, 1000);
BETWEEN / NOT BETWEEN
BETWEEN low AND high desugars internally to >= AND <=. NOT BETWEEN ships as a first-class predicate so NULL cells correctly
evaluate to false (rather than surprising behaviour from the naive
< OR > rewrite).
SELECT *
FROM encounters
WHERE encounter_date BETWEEN '2024-01-01 00:00:00' AND '2024-12-31 23:59:59';
-- v0.5.1: exclude the range.
SELECT *
FROM encounters
WHERE encounter_date NOT BETWEEN '2024-01-01 00:00:00' AND '2024-12-31 23:59:59';
LIKE / NOT LIKE / ILIKE / NOT ILIKE
Pattern matching with % (zero or more characters) and _ (single
character). Implemented iteratively so there is no exponential-backtracking
vulnerability.
LIKE is case-sensitive; ILIKE folds both the value and the pattern to
Unicode simple lowercase before matching. NOT LIKE / NOT ILIKE invert
the result and follow the same three-valued logic (a non-text cell matches
neither, consistent with LIKE).
SELECT * FROM patients WHERE name LIKE 'J%';
SELECT * FROM patients WHERE name LIKE '_ane%';
SELECT * FROM patients WHERE name NOT LIKE 'Test%';
SELECT * FROM patients WHERE name ILIKE 'jane%'; -- matches Jane, JANE, jAnE
SELECT * FROM patients WHERE email NOT ILIKE '%@test.com';
All four variants are available as of v0.5.0; v0.4.x and earlier rejected
NOT LIKE and did not parse ILIKE.
IS NULL / IS NOT NULL
SELECT * FROM patients WHERE email IS NULL;
SELECT * FROM patients WHERE email IS NOT NULL;
Parameterized
Use $1, $2, ...:
SELECT * FROM patients WHERE id = $1 AND active = $2;
Scalar functions in WHERE
v0.5.1 wired the scalar-expression evaluator into predicate position
so UPPER(col) = 'ALICE', COALESCE(col, 0) > 10, CAST(text_col AS BIGINT) = $1, and col || '!' = $2 all parse and execute. Supported
functions today: UPPER, LOWER, LENGTH, TRIM, CONCAT, ||,
ABS, ROUND, CEIL/CEILING, FLOOR, COALESCE, NULLIF, and
CAST(... AS <type>).
NULL on either side of the comparison surfaces as false —
matching SQL three-valued logic for bare-column predicates.
SELECT * FROM patients WHERE UPPER(name) = 'ALICE';
SELECT * FROM patients WHERE COALESCE(age, 0) > 30;
SELECT * FROM patients WHERE CAST(account_id AS BIGINT) = $1;
SELECT * FROM patients WHERE LENGTH(email) > 12;
SELECT * FROM patients WHERE scalar_name != 'unknown'; -- `!=` routes through scalar path
CASE WHEN
Searched form (conditions per branch):
SELECT
id,
name,
CASE
WHEN dob < '1960-01-01 00:00:00' THEN 'senior'
WHEN dob < '1990-01-01 00:00:00' THEN 'adult'
ELSE 'young'
END AS age_bucket
FROM patients;
Simple CASE (single expression with WHEN value THEN ...) is not supported
— rewrite as the searched form above.
ORDER BY / LIMIT / OFFSET
SELECT * FROM patients
ORDER BY name ASC
LIMIT 20 OFFSET 40;
SELECT id, name FROM patients ORDER BY dob DESC, name ASC;
Pagination with parameters
LIMIT and OFFSET accept either an integer literal or a $N parameter
placeholder. Bind a BIGINT (Value::BigInt from the SDK) at execution
time. Negative or non-integer bound values are rejected with a clear error.
-- Server-side pagination from any SDK
SELECT id, patient_id, status, updated_at
FROM clinical_note
WHERE patient_id = $1
ORDER BY updated_at DESC, id DESC
LIMIT $2;
-- Cursor-style pagination
SELECT id, updated_at
FROM clinical_note
WHERE patient_id = $1
AND (updated_at < $2 OR (updated_at = $2 AND id < $3))
ORDER BY updated_at DESC, id DESC
LIMIT $4;
Aggregates
Supported: COUNT(*), COUNT(col), SUM, AVG, MIN, MAX.
SUMuses checked addition (overflow → error, not silent wrap).AVGguards division-by-zero.
SELECT
provider_id,
COUNT(*) AS visits,
AVG(duration_minutes) AS avg_duration
FROM encounters
GROUP BY provider_id;
HAVING
Filters after GROUP BY:
SELECT provider_id, COUNT(*) AS visits
FROM encounters
GROUP BY provider_id
HAVING COUNT(*) > 100;
Filtered aggregates (FILTER (WHERE ...))
Per-aggregate row filters let one SELECT compute multiple counts/sums/averages over different slices of the same group:
-- Clinical dashboard: total encounters and abnormal-result count per provider
SELECT
provider_id,
COUNT(*) AS total_encounters,
COUNT(*) FILTER (WHERE result = 'abnormal') AS abnormal_count,
AVG(duration_minutes) FILTER (WHERE encounter_type = 'urgent') AS avg_urgent_duration
FROM encounters
GROUP BY provider_id;
Each aggregate’s FILTER is independent. FILTER is supported on
COUNT(*), COUNT(col), SUM, AVG, MIN, MAX.
JOIN
Supported: INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL OUTER JOIN, and
CROSS JOIN. Multi-table joins are applied via the plan_join_query +
Materialize wrapper, so WHERE, ORDER BY, LIMIT, and CASE WHEN
over the joined result all work.
The USING(col1, col2, ...) shorthand expands to
ON left.colN = right.colN AND .... CROSS JOIN produces a Cartesian
product and is capped at 1,000,000 output rows to prevent runaway queries.
SELECT p.name, e.encounter_date, pr.name AS provider
FROM patients p
JOIN encounters e ON e.patient_id = p.id
JOIN providers pr ON pr.id = e.provider_id
WHERE p.active = true
ORDER BY e.encounter_date DESC
LIMIT 50;
LEFT JOIN keeps rows from the left side even if there is no match:
SELECT p.name, COUNT(e.id) AS encounter_count
FROM patients p
LEFT JOIN encounters e ON e.patient_id = p.id
GROUP BY p.name
ORDER BY encounter_count DESC;
RIGHT JOIN and FULL OUTER JOIN are not supported; swap the sides and
use LEFT JOIN.
Subqueries
Allowed in the FROM clause, inside JOIN, or in scalar positions:
-- Scalar subquery
SELECT name,
(SELECT COUNT(*) FROM encounters e WHERE e.patient_id = p.id) AS visits
FROM patients p;
-- Derived table
SELECT dept, AVG(salary) AS avg_salary
FROM (
SELECT dept, salary FROM employees WHERE active = true
) sub
GROUP BY dept;
IN (SELECT ...), EXISTS, NOT EXISTS
Uncorrelated subquery predicates are pre-executed at query entry and substituted into the outer query, so the subquery runs exactly once regardless of outer cardinality:
-- Active patients with at least one encounter
SELECT id, name FROM patients
WHERE id IN (SELECT patient_id FROM encounters);
-- Patients with no recorded encounters
SELECT id, name FROM patients
WHERE NOT EXISTS (
SELECT 1 FROM encounters WHERE encounters.patient_id = patients.id
);
Correlated subqueries (where the inner query references outer columns) are not yet supported — rewrite them as a CTE join.
Common Table Expressions (CTEs)
Non-recursive WITH clauses work today:
WITH recent_encounters AS (
SELECT patient_id, COUNT(*) AS cnt
FROM encounters
WHERE encounter_date > '2024-01-01 00:00:00'
GROUP BY patient_id
)
SELECT p.name, r.cnt
FROM patients p
JOIN recent_encounters r ON r.patient_id = p.id
WHERE r.cnt > 2
ORDER BY r.cnt DESC;
Recursive CTEs
WITH RECURSIVE is supported via iterative fixed-point evaluation. The
anchor SELECT seeds the working set; the recursive arm runs against the
accumulating result until no new rows are produced or the iteration cap
(default 1,000) is hit. This pattern honours the workspace “no recursion”
lint and prevents runaway queries.
WITH RECURSIVE descendants AS (
SELECT id, manager_id FROM employees WHERE id = 1 -- anchor
UNION ALL
SELECT e.id, e.manager_id -- recursive
FROM employees e
WHERE e.manager_id IN (SELECT id FROM descendants)
)
SELECT * FROM descendants;
Hitting the iteration cap returns a clear error rather than silently truncating.
Set operations
-- UNION (de-dupes) and UNION ALL (keeps duplicates)
SELECT id, name FROM patients WHERE active = true
UNION
SELECT id, name FROM archived_patients;
SELECT id FROM stream_a
UNION ALL
SELECT id FROM stream_b;
-- INTERSECT and EXCEPT also work with both bare and ALL forms
SELECT id FROM patients_2024 INTERSECT SELECT id FROM patients_2025;
SELECT id FROM all_users EXCEPT SELECT id FROM banned_users;
SELECT id FROM ledger_a INTERSECT ALL SELECT id FROM ledger_b;
SELECT id FROM ledger_a EXCEPT ALL SELECT id FROM ledger_b;
UNION / INTERSECT / EXCEPT deduplicate by row content. The ALL
variants preserve multiset semantics — useful for compliance reconciliation
where row multiplicities matter.
Time-travel
Every write is an immutable log entry, so historical state is first-class.
AT OFFSET (supported)
When you captured a log offset programmatically:
SELECT * FROM patients AT OFFSET 4200 WHERE id = 123;
The SDK-side equivalent is client.queryAt(sql, params, offset) (TypeScript,
Rust, Python).
AS OF TIMESTAMP (not yet implemented)
Audit reference: AUDIT-2026-04 L-4.
Timestamp-indexed time-travel is planned for v0.6.0 but not currently
implemented. The query parser recognises the AS OF TIMESTAMP grammar in
the syntax summary at the top of this page — callers that pass it today
will receive an UnsupportedFeature error, not a silently-ignored
clause. There is no timestamp-to-offset resolver in kimberlite-query::executor.
Until the v0.6.0 implementation lands, capture a log offset at the
point in time you need and issue AT OFFSET n (or the
PreparedQuery::execute_at(offset) API on the SDK). Common patterns:
- Snapshot the
log_offsetreturned by every write response, persist it alongside the business record, and use it later as an audit anchor. - For periodic checkpoints, run
SELECT current_offset()on a cron and store(wall_clock_timestamp, log_offset)in a side table — this is the same shape the v0.6.0 resolver will expose.
The v1.0 compliance-query surface ("what was the patient record on 2026-01-15?") will front-end this via the SDK once the resolver lands.
Until then, compliance queries must supply the offset explicitly.
What is not supported (v0.4)
| Feature | Alternative |
|---|---|
Window functions (OVER, PARTITION BY, ROW_NUMBER, RANK, LAG, …) | Rewrite with GROUP BY + self-join, or compute application-side. Planned v0.5.0. |
WITH RECURSIVE | Bounded iteration application-side. Deliberate omission. |
Multi-statement transactions (BEGIN / COMMIT / ROLLBACK) | Single statements are atomic. v1.0. |
RIGHT JOIN, FULL OUTER JOIN | Swap sides, use LEFT JOIN. |
Simple CASE (CASE x WHEN 1 THEN ...) | Use searched CASE WHEN x = 1 THEN .... |
CAST(... AS ...), EXTRACT(... FROM ...) | Compute application-side. |
ARRAY, JSON, JSONB types | Store via the append-only stream API. |
Unsupported syntax fails cleanly at parse time.
Resource limits
| Limit | Default |
|---|---|
| Max JOIN output rows | 1,000,000 |
| Max GROUP count | 100,000 |
Related
- DDL Reference — table creation
- DML Reference —
INSERT,UPDATE,DELETE - SQL Overview — architecture, supported types