Kimberlite SQL reference

SQL DDL Reference

On this page

Data Definition Language for creating projections and indexes.

Status: Planned for v0.6.0

CREATE PROJECTION

Create a materialized view of the append-only log.

Syntax

CREATE PROJECTION projection_name AS
  SELECT
    expression [AS alias],
    ...
  FROM events
  WHERE condition
  [ORDER BY column]
  [PARTITION BY tenant_id];

Examples

Basic projection:

CREATE PROJECTION patients AS
  SELECT
    data->>'id' AS id,
    data->>'name' AS name,
    data->>'date_of_birth' AS dob,
    position,
    timestamp
  FROM events
  WHERE tenant_id = 1
    AND stream_id = 100;

Multi-tenant projection:

CREATE PROJECTION all_patients AS
  SELECT
    tenant_id,
    data->>'id' AS id,
    data->>'name' AS name
  FROM events
  WHERE stream_id = 100
  PARTITION BY tenant_id;

Filtered projection:

CREATE PROJECTION active_patients AS
  SELECT
    data->>'id' AS id,
    data->>'name' AS name,
    data->>'status' AS status
  FROM events
  WHERE tenant_id = 1
    AND stream_id = 100
    AND data->>'status' = 'active';

Denormalized projection:

CREATE PROJECTION patient_appointments AS
  SELECT
    p.data->>'id' AS patient_id,
    p.data->>'name' AS patient_name,
    a.data->>'date' AS appointment_date,
    a.data->>'doctor' AS doctor
  FROM events p
  JOIN events a ON p.data->>'id' = a.data->>'patient_id'
  WHERE p.stream_id = 100
    AND a.stream_id = 200;

Options

OptionDescriptionDefault
PARTITION BYPartition projection by column (typically tenant_id)None
ORDER BYPhysical sort order for range queriesNone
WITH (option=value)Storage optionsSee below

Storage options:

CREATE PROJECTION patients AS
  SELECT ...
  FROM events
  WITH (
    compression = 'zstd',        -- Compression: none, zstd, lz4
    cache_size = '1GB',          -- Projection cache size
    checkpoint_interval = 10000  -- Entries between checkpoints
  );

Permissions

  • Requires CREATE PROJECTION permission
  • Projection inherits tenant isolation from WHERE tenant_id = ?

DROP PROJECTION

Delete a projection (does not affect underlying log).

Syntax

DROP PROJECTION [IF EXISTS] projection_name;

Examples

-- Drop projection
DROP PROJECTION patients;

-- Drop if exists (no error if missing)
DROP PROJECTION IF EXISTS patients;

Behavior

  • Deletes projection metadata and materialized data
  • Does not delete events from the log
  • Projection can be recreated with CREATE PROJECTION

CREATE INDEX

Create an index on a projection for faster queries.

Syntax

CREATE INDEX index_name
  ON projection_name (column [ASC|DESC], ...)
  [WHERE condition];

Examples

Simple index:

CREATE INDEX patients_name_idx
  ON patients (name);

Composite index:

CREATE INDEX patients_status_dob_idx
  ON patients (status, date_of_birth);

Partial index:

CREATE INDEX active_patients_idx
  ON patients (name)
  WHERE status = 'active';

Unique index:

CREATE UNIQUE INDEX patients_id_idx
  ON patients (id);

Index Types

TypeSyntaxUse Case
B-tree (default)CREATE INDEXRange queries, equality
HashCREATE INDEX ... USING HASHEquality only (faster)
GINCREATE INDEX ... USING GINJSON, arrays

Examples:

-- Hash index (equality only)
CREATE INDEX patients_id_hash_idx
  ON patients USING HASH (id);

-- GIN index for JSON queries
CREATE INDEX patients_metadata_idx
  ON patients USING GIN (metadata);

Performance

  • Index build time: ~1M rows/sec
  • Index size: ~50% of data size (B-tree)
  • Query speedup: 10-1000x for indexed columns

Best practices:

  • Index frequently queried columns
  • Use composite indexes for multi-column queries
  • Partial indexes for subset queries
  • Avoid over-indexing (slows writes)

DROP INDEX

Delete an index.

Syntax

DROP INDEX [IF EXISTS] index_name;

Examples

-- Drop index
DROP INDEX patients_name_idx;

-- Drop if exists
DROP INDEX IF EXISTS patients_name_idx;

ALTER PROJECTION

Modify an existing projection (v0.7.0+).

Syntax

-- Add column (rebuilds projection)
ALTER PROJECTION projection_name
  ADD COLUMN column_name AS expression;

-- Drop column
ALTER PROJECTION projection_name
  DROP COLUMN column_name;

-- Rename projection
ALTER PROJECTION old_name
  RENAME TO new_name;

Examples

-- Add computed column
ALTER PROJECTION patients
  ADD COLUMN age AS EXTRACT(YEAR FROM CURRENT_DATE) - EXTRACT(YEAR FROM dob);

-- Drop column
ALTER PROJECTION patients
  DROP COLUMN middle_name;

-- Rename
ALTER PROJECTION patients
  RENAME TO all_patients;

Note: Adding/dropping columns rebuilds the projection from the log.

REFRESH PROJECTION

Manually rebuild a projection from the log.

Syntax

REFRESH PROJECTION projection_name
  [FROM POSITION position]
  [CONCURRENTLY];

Examples

-- Full rebuild
REFRESH PROJECTION patients;

-- Rebuild from position
REFRESH PROJECTION patients
  FROM POSITION 1000;

-- Rebuild without blocking queries
REFRESH PROJECTION patients
  CONCURRENTLY;

When to Use

Projections update automatically, but manual refresh is useful for:

  • Recovery after corruption
  • Forcing re-evaluation of projection logic
  • Performance testing

SHOW PROJECTIONS

List all projections for current tenant.

Syntax

SHOW PROJECTIONS;

Output

 name            | tenant_id | position  | lag | size_mb
-----------------+-----------+-----------+-----+---------
 patients        | 1         | 12345     | 0   | 128
 appointments    | 1         | 12340     | 5   | 64
 active_patients | 1         | 12345     | 0   | 32

Columns

  • name - Projection name
  • tenant_id - Tenant owning projection
  • position - Current projection position
  • lag - Log position - projection position
  • size_mb - Projection size on disk

DESCRIBE PROJECTION

Show projection schema and metadata.

Syntax

DESCRIBE PROJECTION projection_name;

Output

 column          | type      | nullable
-----------------+-----------+----------
 id              | BIGINT    | NO
 name            | TEXT      | YES
 date_of_birth   | TIMESTAMP | YES
 position        | BIGINT    | NO
 timestamp       | TIMESTAMP | NO

System Catalogs

Query projection metadata:

-- All projections
SELECT * FROM __projections;

-- Projection columns
SELECT * FROM __projection_columns WHERE projection_name = 'patients';

-- Projection indexes
SELECT * FROM __projection_indexes WHERE projection_name = 'patients';

-- Projection statistics
SELECT * FROM __projection_stats WHERE projection_name = 'patients';

Best Practices

1. Partition by Tenant

-- ✅ Good: Explicit tenant filtering
CREATE PROJECTION patients AS
  SELECT ...
  FROM events
  WHERE tenant_id = 1
  PARTITION BY tenant_id;

-- ❌ Bad: No tenant filtering
CREATE PROJECTION patients AS
  SELECT ...
  FROM events;

2. Index Selectively

-- ✅ Good: Index frequently queried columns
CREATE INDEX patients_name_idx ON patients (name);
CREATE INDEX patients_status_dob_idx ON patients (status, date_of_birth);

-- ❌ Bad: Over-indexing
CREATE INDEX patients_idx_1 ON patients (name);
CREATE INDEX patients_idx_2 ON patients (date_of_birth);
CREATE INDEX patients_idx_3 ON patients (status);
CREATE INDEX patients_idx_4 ON patients (created_at);
-- Too many indexes slow writes

3. Use Partial Indexes

-- ✅ Good: Index only active records
CREATE INDEX active_patients_idx
  ON patients (name)
  WHERE status = 'active';

-- Smaller index, faster queries for active patients

4. Rebuild from Log When Changing Schema

-- Schema change requires rebuild
DROP PROJECTION patients;
CREATE PROJECTION patients AS
  SELECT
    data->>'id' AS id,
    data->>'name' AS name,
    data->>'new_field' AS new_field  -- Added field
  FROM events
  WHERE tenant_id = 1;

-- Projection rebuilds from log automatically

Key Takeaway: CREATE PROJECTION materializes views of the log. Projections can be rebuilt at any time. Index frequently queried columns for performance.