Data Model
On this page
- Three Levels of Abstraction
- Level 1: Tables (Most Developers Start Here)
- Level 2: Events (Audit/History)
- Level 3: Custom Projections (Advanced)
- Event Structure
- The Log
- Segment Structure
- Record Format
- Hash Chaining
- Projections (Derived Views)
- How Projections Work
- Example: Table Projection
- Time-Travel Queries
- Stream Model
- Idempotency
- Compaction (Future)
- Key Takeaways
- Related Documentation
Kimberlite’s data model is based on a single principle: all data is an immutable, ordered log; all state is a derived view.
Three Levels of Abstraction
Kimberlite presents three levels to developers, allowing you to work at the appropriate level of detail.
Level 1: Tables (Most Developers Start Here)
Most developers interact with Kimberlite through a familiar table abstraction:
-- DDL defines the projection
(
id BIGINT PRIMARY KEY,
name TEXT NOT NULL,
created_at TIMESTAMP,
region TEXT
);
-- DML appends to the log and updates projection
INSERT INTO records (id, name, created_at, region)
VALUES (1, 'Record Alpha', '2024-01-15 10:30:00', 'us-east');
-- Queries read from projection
SELECT * FROM records WHERE id = 1;
Under the hood:
INSERTbecomes an event appended to the tenant’s log- The projection engine applies the event to update the
recordsprojection SELECTreads from the materialized projection (B+tree)
Level 2: Events (Audit/History)
When you need the audit trail, query the event stream directly:
-- See all events in a stream
SELECT * FROM __events
WHERE stream = 'records'
ORDER BY position DESC
LIMIT 10;
-- Point-in-time reconstruction
SELECT * FROM records AS OF POSITION 12345
WHERE id = 1;
Events are the source of truth. Tables are just a convenient view.
Level 3: Custom Projections (Advanced)
For complex read models, define custom projections:
-- Projection with JOIN (computed at write time, not query time)
CREATE PROJECTION entity_summary AS
SELECT
e.id,
e.name,
COUNT(a.id) as activity_count,
MAX(a.timestamp) as last_activity
FROM entities e
LEFT JOIN activities a ON a.entity_id = e.id
GROUP BY e.id, e.name;
-- Query the pre-computed view
SELECT * FROM entity_summary WHERE id = 1;
Projections are maintained incrementally as events arrive, not computed at query time.
Event Structure
Every event in the log has this structure:
Key fields:
- position - Global sequence number in the log
- prev_hash - Links to previous event (tamper-evident chain)
- hash - Hash of this event’s contents
- tenant_id - Which tenant owns this event
- stream - Logical grouping within tenant
- timestamp - When the event was committed
- caused_by - Optional correlation to parent event
- event_type - What kind of event (INSERT, UPDATE, DELETE, etc.)
- data - Application payload (serialized bytes)
- checksum - CRC32 for integrity
The Log
The append-only log is the system of record. All state derives from it.
Segment Structure
The log is divided into segments for manageability:
data/
├── log/
│ ├── 00000000.segment # Segment 0: positions 0-999999
│ ├── 00000001.segment # Segment 1: positions 1000000-1999999
│ ├── 00000002.segment # Segment 2: current active segment
│ └── index.meta # Segment index metadata
Each segment is a sequential file of records:
┌─────────────────────────────────────────────────────────────────┐
│ Segment File │
│ │
│ ┌──────────┬──────────┬──────────┬──────────┬─────────────┐ │
│ │ Record 0 │ Record 1 │ Record 2 │ ... │ Record N │ │
│ └──────────┴──────────┴──────────┴──────────┴─────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
Record Format
Each record is self-describing and integrity-checked:
┌──────────────────────────────────────────────────────────────────┐
│ Record (on disk) │
│ │
│ ┌─────────┬──────────┬───────────┬──────────┬────────────────┐ │
│ │ Length │ Checksum │ Prev Hash │ Metadata │ Data │ │
│ │ (4 B) │ (4 B) │ (32 B) │ (var) │ (var) │ │
│ └─────────┴──────────┴───────────┴──────────┴────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────┘
- Length: Total record size in bytes (u32)
- Checksum: CRC32 of the entire record (excluding length field)
- Prev Hash: SHA-256 hash of the previous record (32 bytes)
- Metadata: Position, tenant, stream, timestamp, event type
- Data: Application payload (serialized bytes)
Hash Chaining
Every record includes the hash of the previous record, creating a tamper-evident chain:
Record N-1 Record N Record N+1
┌─────────┐ ┌─────────┐ ┌─────────┐
│ ... │ │ ... │ │ ... │
│ hash ───┼────────►│ prev │ │ prev │
│ │ │ hash ───┼────────►│ hash │
└─────────┘ └─────────┘ └─────────┘
If any record is modified, all subsequent hashes become invalid.
Projections (Derived Views)
Projections are materialized views derived from the log.
How Projections Work
┌─────────────────────────────────────────────────────────────────┐
│ Append-Only Log │
│ [Event 1] → [Event 2] → [Event 3] → ... → [Event N] │
└────────────────────┬───────────────────────────────────────────┘
│ Apply
▼
┌─────────────────────────────────────────────────────────────────┐
│ Projection │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ B+Tree Index │ │
│ │ ┌────┐ ┌────┐ ┌────┐ │ │
│ │ │ K1 │ │ K2 │ │ K3 │ ... │ │
│ │ └──┬─┘ └──┬─┘ └──┬─┘ │ │
│ │ │ │ │ │ │
│ │ ▼ ▼ ▼ │ │
│ │ [Row 1] [Row 2] [Row 3] ... │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Key properties:
- Incrementally maintained: New events update projections immediately
- Rebuildable: Delete a projection and rebuild it from the log
- Multiple projections: Different views for different query patterns
- MVCC: Multiple versions for point-in-time queries
Example: Table Projection
(
id BIGINT PRIMARY KEY,
name TEXT NOT NULL,
dob DATE,
region TEXT
);
Behind the scenes:
- Creates a B+tree projection keyed by
id - On
INSERT: Add row to B+tree - On
UPDATE: Modify row (keeping old version for MVCC) - On
DELETE: Mark row as deleted (keeping tombstone)
Time-Travel Queries
Query data as it existed at any point:
-- As of specific timestamp
SELECT * FROM patients
AS OF TIMESTAMP '2024-01-15 10:30:00'
WHERE id = 123;
-- As of specific log position
SELECT * FROM patients
AS OF POSITION 1000
WHERE region = 'us-east';
How it works:
- Each row version stores
created_atposition - Query scans B+tree, filters versions by position
- Returns rows visible at that position
Stream Model
Events are organized into streams:
Tenant 1
├── Stream A (e.g., "patients")
│ ├── Event 1
│ ├── Event 2
│ └── Event 3
├── Stream B (e.g., "appointments")
│ ├── Event 1
│ └── Event 2
└── Stream C (e.g., "billing")
└── Event 1
Tenant 2
├── Stream A
│ └── Event 1
└── Stream B
└── Event 1
Stream properties:
- Logical grouping within a tenant
- Enables per-stream queries
- Supports event correlation
- Maps to tables in SQL layer
Idempotency
Every command includes an idempotency ID to prevent duplicates:
// Client generates ID before first attempt
let idempotency_id = generate;
// First attempt
let result = client.execute_with_id.await;
// If network fails, retry with SAME ID
let result = client.execute_with_id.await;
// Returns same result without re-executing
The log stores idempotency IDs, ensuring exactly-once semantics.
Compaction (Future)
While the log is append-only, old segments can be compacted to reclaim space:
- Snapshot projection at position P
- Delete log segments before P
- Store snapshot as new “base” for replays
This preserves audit trails while managing storage.
Status: Planned for v0.6.0. See [ROADMAP.md].
Key Takeaways
- Log is truth, state is cache: Projections can always be rebuilt from the log
- Immutability: Nothing is ever modified or deleted in the log
- Time-travel: Query any point in history
- Tamper-evident: Hash chains detect modifications
- Idempotent: Commands can be safely retried
Related Documentation
- Architecture - System overview
- Consensus - How VSR replicates the log
- Compliance - Audit trails and tamper evidence
- Storage Internals - Low-level storage format
Remember: In Kimberlite, there is no “current state.” There is only “what the log says.” Everything else is just a convenient view.