Frequently Asked Questions
General
What is Kimberlite?
Kimberlite is a compliance-first database built on a single principle: All data is an immutable, ordered log. All state is a derived view.
It combines:
- Append-only storage (like event sourcing)
- SQL interface (like PostgreSQL)
- MVCC time-travel queries (like Datomic)
- Deterministic consensus (VSR protocol)
- Hash-chained tamper-evidence (like blockchain, without the blockchain)
Target users: Healthcare systems, financial services, legal tech, and any domain requiring audit trails and compliance.
Who should use Kimberlite?
Ideal for:
- 🏥 Healthcare apps needing HIPAA compliance and audit trails
- 🏛️ Legal tech requiring tamper-evident record keeping
- 💰 Financial services with regulatory requirements (SOC 2, PCI DSS)
- 🔬 Database researchers exploring immutable log architectures
- 💻 Systems programmers learning deterministic design patterns
Not ideal for:
- High-frequency trading (latency-sensitive)
- Social media feeds (append-only is overkill)
- Ephemeral data (caches, sessions)
- Workloads requiring in-place updates
Is Kimberlite production-ready?
Current status (v0.4.0): Developer preview focused on learning and exploration.
- Core is solid: 1,300+ tests, deterministic simulation testing, production-grade crypto
- Architecture is stable: No major breaking changes planned
- APIs are evolving: v0.x means breaking changes are possible (the project follows SemVer)
- Limited deployment experience: Not yet battle-tested at scale
Recommended for:
- Internal tools and prototypes
- Learning database internals
- Compliance research and evaluation
Wait for v1.0 if you need:
- Guaranteed API stability
- Large-scale production deployments
- 24/7 commercial support
Is Kimberlite a blockchain?
No. Kimberlite uses hash chains (like a blockchain) but is fundamentally different:
| Feature | Blockchain | Kimberlite |
|---|---|---|
| Consensus | Proof-of-Work / PoS | VSR (Viewstamped Replication) |
| Trust model | Trustless (anyone can join) | Trusted replicas (permissioned) |
| Performance | Slow (global consensus) | Fast (cluster consensus) |
| Purpose | Decentralized ledger | Centralized audit-first database |
| Energy use | High (mining) | Low (standard servers) |
TL;DR: Kimberlite borrows hash chains for tamper-evidence, but runs on standard servers with trusted replicas. No mining, no tokens, no decentralization.
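The hash-chain idea can be illustrated with a minimal sketch. This is not Kimberlite's implementation: the names `Entry`, `append`, and `first_broken` are hypothetical, and the standard library's `DefaultHasher` stands in for a cryptographic hash purely to keep the example dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One log entry: the payload plus the hash of the entry before it.
#[derive(Debug, Clone)]
pub struct Entry {
    pub payload: String,
    pub prev_hash: u64,
    pub hash: u64,
}

/// Stand-in digest (assumption). NOT cryptographic: a real system
/// would use a cryptographic hash function here.
fn digest(prev_hash: u64, payload: &str) -> u64 {
    let mut h = DefaultHasher::new();
    prev_hash.hash(&mut h);
    payload.hash(&mut h);
    h.finish()
}

/// Appends a payload, linking it to the current tail of the chain.
pub fn append(log: &mut Vec<Entry>, payload: &str) {
    let prev_hash = log.last().map_or(0, |e| e.hash);
    let hash = digest(prev_hash, payload);
    log.push(Entry { payload: payload.to_string(), prev_hash, hash });
}

/// Returns the index of the first entry that fails verification, if any.
pub fn first_broken(log: &[Entry]) -> Option<usize> {
    let mut expected_prev = 0u64;
    for (i, e) in log.iter().enumerate() {
        let linked = e.prev_hash == expected_prev;
        let intact = e.hash == digest(e.prev_hash, &e.payload);
        if !(linked && intact) {
            return Some(i);
        }
        expected_prev = e.hash;
    }
    None
}
```

Because every hash covers the previous hash, editing any stored payload makes verification fail at that entry, with no mining or global consensus involved.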
Architecture
Why append-only? Isn’t that wasteful?
Benefits outweigh costs for compliance workloads:
- Audit trail is free - History is built-in, no separate audit table
- Time-travel queries - View state at any past timestamp
- Simpler concurrency - No locks, no in-place updates
- Tamper-evident - Hash chain detects modification
- Deterministic replay - Reproduce exact state from log
Storage trade-off:
- 1M patient records ≈ 500 MB (with 5 updates each)
- Use retention policies to prune old versions
- Compression reduces storage by 60-70%
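As a rough worked example of the figures above, the sketch below does the arithmetic under two assumptions that the text does not state: "5 updates each" means 6 stored versions per record (the initial insert plus 5 updates), and 1 MB is 10^6 bytes. The function names are hypothetical.

```rust
/// Average bytes per stored row version.
pub fn bytes_per_version(total_bytes: u64, records: u64, versions_per_record: u64) -> f64 {
    total_bytes as f64 / (records * versions_per_record) as f64
}

/// Size after compression that removes `reduction` (0.0 to 1.0) of the bytes.
pub fn compressed_bytes(total_bytes: u64, reduction: f64) -> f64 {
    total_bytes as f64 * (1.0 - reduction)
}
```

With these assumptions, `bytes_per_version(500_000_000, 1_000_000, 6)` comes to roughly 83 bytes per version, and a 60-70% reduction brings 500 MB down to roughly 150-200 MB.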
Real-world comparison:
- PostgreSQL with audit triggers: Similar storage, worse query performance
- Event sourcing: Same append-only model, but no SQL interface
How does MVCC work?
Each row version has two timestamps:
CREATE TABLE patients (
  id INTEGER,
  name TEXT,
  _created_at TIMESTAMP, -- implicit
  _deleted_at TIMESTAMP -- implicit, NULL if current
);
Query at time T:
SELECT * FROM patients AS OF TIMESTAMP '2026-02-03 10:00:00';
Returns rows where:
_created_at <= '2026-02-03 10:00:00'
AND (_deleted_at > '2026-02-03 10:00:00' OR _deleted_at IS NULL)
Under the hood:
- Log contains all versions: [v1, v2, v3, ...]
- Query specifies point-in-time
- Kernel filters versions by timestamp
- Returns consistent snapshot
No read locks needed - queries never block writes.
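The visibility rule above can be sketched in a few lines. This is an illustration rather than the kernel's actual code: `RowVersion` and `as_of` are hypothetical names, and timestamps are plain integer ticks instead of SQL TIMESTAMP values.

```rust
/// One row version carrying the two implicit timestamps.
#[derive(Debug, Clone, PartialEq)]
pub struct RowVersion {
    pub id: i64,
    pub name: String,
    pub created_at: u64,
    pub deleted_at: Option<u64>, // None means this is the current version
}

/// Snapshot as of time `t`: versions created at or before `t`
/// that were not yet deleted at `t`.
pub fn as_of(log: &[RowVersion], t: u64) -> Vec<&RowVersion> {
    log.iter()
        .filter(|v| v.created_at <= t && v.deleted_at.map_or(true, |d| d > t))
        .collect()
}
```

The filter only reads the log, which is why a snapshot query of this shape needs no read locks.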
How fast is it compared to PostgreSQL?
Current benchmarks (v0.4.0, M1 Mac):
| Operation | PostgreSQL | Kimberlite | Notes |
|---|---|---|---|
| Single insert | 150 μs | 200 μs | 33% slower (hash chain overhead) |
| Batch insert (1K) | 80 ms | 90 ms | 12% slower (CRC32 + hashing) |
| Point query | 50 μs | 60 μs | 20% slower (MVCC filtering) |
| Time-travel query | N/A | 80 μs | Free (no audit table) |
| Full table scan (1M rows) | 300 ms | 450 ms | 50% slower (version filtering) |
Key insight: Kimberlite trades 10-50% performance for audit trail + time-travel queries.
When Kimberlite is faster:
- Audit queries (no separate audit table)
- Historical analysis (MVCC built-in)
- Compliance reports (tamper-evidence free)
When PostgreSQL is faster:
- Pure transactional workloads (OLTP)
- Workloads not needing history
- In-place updates (Kimberlite always appends)
What’s the “Functional Core / Imperative Shell” pattern?
The kernel is a pure function:
No IO inside the kernel:
- No file operations
- No network calls
- No system clock (Instant::now())
- No randomness (except via explicit RNG parameter)
Why?
- Deterministic - Same inputs → same outputs (enables simulation testing)
- Testable - No mocks needed, pure unit tests
- Replayable - Rebuild state from log perfectly
- Simple - Easy to reason about, no hidden state
The shell executes effects:
for effect in effects
See docs/ARCHITECTURE.md for details.
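To make the pattern concrete, here is a minimal sketch of a pure kernel plus an effect-executing shell. All names (`Command`, `Effect`, `State`, `apply`, `run_effects`) are hypothetical and chosen for illustration; Kimberlite's real types and signatures may differ.

```rust
/// Commands are the kernel's only input.
#[derive(Debug, Clone, PartialEq)]
pub enum Command {
    Put { key: String, value: String },
}

/// Effects are the kernel's only way to request IO.
#[derive(Debug, Clone, PartialEq)]
pub enum Effect {
    AppendToLog(String),
    Reply(String),
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct State {
    pub offset: u64,
}

/// Functional core: no IO, no clock, no randomness.
/// The same state and command always yield the same result.
pub fn apply(state: &State, cmd: &Command) -> (State, Vec<Effect>) {
    match cmd {
        Command::Put { key, value } => {
            let next = State { offset: state.offset + 1 };
            let effects = vec![
                Effect::AppendToLog(format!("{}={}", key, value)),
                Effect::Reply(format!("ok@{}", next.offset)),
            ];
            (next, effects)
        }
    }
}

/// Imperative shell: the only place side effects happen.
/// A Vec stands in for the disk and stdout for the network.
pub fn run_effects(effects: &[Effect], disk: &mut Vec<String>) {
    for effect in effects {
        match effect {
            Effect::AppendToLog(line) => disk.push(line.clone()),
            Effect::Reply(msg) => println!("{}", msg),
        }
    }
}
```

Since `apply` returns a description of what to do instead of doing it, a test can assert on the returned effects directly, without mocks.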
Compliance
Does Kimberlite meet HIPAA requirements?
Kimberlite provides technical controls for HIPAA compliance:
- Audit trails (§164.312(b)) - Immutable log records all data access
- Integrity controls (§164.312(c)(1)) - Hash chains detect tampering
- Access controls (§164.312(a)(1)) - Multi-tenant isolation
- Encryption (§164.312(a)(2)(iv)) - AES-256-GCM for data at rest
You still need:
- Business Associate Agreement (BAA)
- Administrative safeguards (policies, training)
- Physical security (server room access)
- Breach notification procedures
Kimberlite ≠ automatic HIPAA compliance. It’s a compliance-friendly tool, not a compliance solution.
See docs/COMPLIANCE.md for detailed guidance.
What about GDPR “right to be forgotten”?
GDPR Article 17 (right to erasure) seems incompatible with immutable logs. Here’s how Kimberlite handles it:
1. Logical deletion (preferred):
UPDATE patients SET name = '[REDACTED]', email = '[REDACTED]' WHERE id = 123;
Creates new version with redacted data. Old versions stay in log, but:
- Marked as “superseded by redaction”
- Not returned in queries
- Can be physically purged after retention period
2. Tombstone marker:
DELETE FROM patients WHERE id = 123;
Inserts tombstone: {id: 123, _deleted_at: now(), reason: 'GDPR erasure'}
3. Physical purging (advanced):
Kimberlite can rewrite the log, removing specific entries:
Creates new hash chain starting from purge point. Old chain archived for audit.
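One way to picture physical purging is the sketch below, which drops every entry for one data subject and re-chains the survivors. It is an assumption about the general technique, not Kimberlite's purge procedure: `Entry`, `build_chain`, and `purge` are hypothetical names, and `DefaultHasher` is a non-cryptographic stand-in for the real hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub subject_id: u64,
    pub payload: String,
    pub hash: u64, // covers the previous entry's hash plus this entry's content
}

/// Stand-in digest (assumption); a real system would use a cryptographic hash.
fn digest(prev: u64, subject_id: u64, payload: &str) -> u64 {
    let mut h = DefaultHasher::new();
    (prev, subject_id, payload).hash(&mut h);
    h.finish()
}

/// Builds a hash chain over (subject_id, payload) pairs.
pub fn build_chain(items: &[(u64, &str)]) -> Vec<Entry> {
    let mut prev = 0u64;
    items
        .iter()
        .map(|&(subject_id, payload)| {
            let hash = digest(prev, subject_id, payload);
            prev = hash;
            Entry { subject_id, payload: payload.to_string(), hash }
        })
        .collect()
}

/// Drops every entry for `subject_id` and re-chains the rest.
/// Returns (new_log, old_log); the caller archives the old chain.
pub fn purge(log: Vec<Entry>, subject_id: u64) -> (Vec<Entry>, Vec<Entry>) {
    let new_log = {
        let kept: Vec<(u64, &str)> = log
            .iter()
            .filter(|e| e.subject_id != subject_id)
            .map(|e| (e.subject_id, e.payload.as_str()))
            .collect();
        build_chain(&kept)
    };
    (new_log, log)
}
```

Entries before the first purged entry keep their original hashes, so only the chain from the purge point onward changes.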
Legal note: Consult a lawyer. GDPR has exceptions for legal obligations and public interest.
Can I use it for SOC 2 compliance?
Yes! Kimberlite helps with SOC 2 Trust Service Criteria:
| Criterion | How Kimberlite Helps |
|---|---|
| CC6.1 (Logical access controls) | Multi-tenant isolation, role-based access |
| CC7.1 (Detect threats) | Integrity monitoring via hash chains |
| CC7.2 (Monitor system) | Event log records all operations |
| CC8.1 (Change management) | Schema migrations tracked in log |
See docs/COMPLIANCE.md for SOC 2 evidence templates.
Operations
How do I back up a Kimberlite database?
Simple approach (development):
# Stop the server
# Copy the entire data directory
# Restart server
Production approach:
- Streaming replication - Run 3+ replicas, one is always a backup
- Point-in-time recovery - Export log segments to S3/GCS:
- Checkpoint + incremental - Export checkpoint + log delta
See docs/BACKUP.md for detailed procedures.
How do I monitor Kimberlite in production?
Kimberlite exposes Prometheus metrics at /metrics:
# Key metrics
kimberlite_log_offset # Current log position
kimberlite_hash_chain_valid # Hash chain integrity (0/1)
kimberlite_crc_errors_total # Corruption detections
kimberlite_vsr_view # Current consensus view
kimberlite_query_duration_seconds # Query latency (histogram)
Alerting rules:
- alert: HashChainBroken
  expr: kimberlite_hash_chain_valid == 0
  labels:
    severity: critical
- alert: HighCRCErrors
  expr: rate(kimberlite_crc_errors_total[5m]) > 0
  labels:
    severity: critical
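For a quick health check outside Prometheus, the HashChainBroken condition can be evaluated directly against the text returned by /metrics. The sketch below assumes the metrics are exposed without labels in the standard `name value` text format; `metric_value` and `hash_chain_broken` are hypothetical helper names, and fetching the scrape over HTTP is left out.

```rust
/// Reads the value of an unlabeled metric from Prometheus text output.
pub fn metric_value(scrape: &str, name: &str) -> Option<f64> {
    scrape
        .lines()
        .map(str::trim)
        // Skip blank lines and "# HELP" / "# TYPE" comment lines.
        .filter(|l| !l.is_empty() && !l.starts_with('#'))
        .find_map(|l| {
            let mut parts = l.split_whitespace();
            match (parts.next(), parts.next()) {
                (Some(n), Some(v)) if n == name => v.parse::<f64>().ok(),
                _ => None,
            }
        })
}

/// Mirrors the HashChainBroken rule: true when the gauge reads 0.
pub fn hash_chain_broken(scrape: &str) -> bool {
    metric_value(scrape, "kimberlite_hash_chain_valid") == Some(0.0)
}
```

A missing metric yields `None` and is not reported as broken here; a production check would likely treat an absent gauge as its own alert.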
See docs/MONITORING.md for Grafana dashboards.
What’s the disaster recovery strategy?
RPO (Recovery Point Objective): Zero data loss if you have 3+ replicas
RTO (Recovery Time Objective): Depends on data size
- 1 GB: ~30 seconds (hash chain verification)
- 100 GB: ~5 minutes
- 1 TB: ~30 minutes
Recovery scenarios:
- Single node failure → Automatic failover to replica (30s)
- Data corruption → Restore from last valid checkpoint (5 min)
- Complete cluster loss → Restore from offsite backup (30 min - 2 hours)
- Logical error (bad DELETE) → Time-travel query to recover data (instant)
See docs/DISASTER_RECOVERY.md for runbooks.
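The "3+ replicas" figure follows from majority-quorum arithmetic: a cluster of 2f + 1 replicas can lose f of them and still hold every committed operation on at least one survivor. A small sketch, assuming Kimberlite uses standard VSR majority quorums (the function names are hypothetical):

```rust
/// Replica failures a cluster of `n` replicas can tolerate (n = 2f + 1).
pub fn tolerated_failures(n: u32) -> u32 {
    n.saturating_sub(1) / 2
}

/// Smallest majority of `n` replicas.
pub fn quorum(n: u32) -> u32 {
    n / 2 + 1
}
```

With 3 replicas the quorum is 2 and one failure is tolerated; with 5 replicas the quorum is 3 and two failures are tolerated. A single node tolerates none, which is why the RPO claim starts at three replicas.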
Development
How do I contribute?
Read the docs:
- PRESSURECRAFT.md - Code quality standards
- CONTRIBUTING.md - Contribution guidelines
- CODE_OF_CONDUCT.md - Community standards
Pick an issue:
- Good first issues: https://github.com/kimberlitedb/kimberlite/labels/good-first-issue
- Join Discord to discuss before starting
Follow the checklist:
- No unsafe code (workspace lint enforces)
- No recursion (use bounded loops)
- 2+ assertions per function
- Tests pass (just test)
- Clippy clean (just clippy)
Submit PR using the template (includes PRESSURECRAFT checklist)
When will APIs be stable?
Target timeline:
- v0.5.0 (Q2 2026) - SQL dialect stabilization
- v0.8.0 (Q4 2026) - Client API freeze
- v1.0.0 (Q2 2027) - Full API stability guarantee
What changes between v0.x versions:
- SQL syntax (new keywords, functions)
- Storage format (will provide migration tools)
- Client protocol (breaking changes documented)
What WON’T change:
- Core invariants (immutable log, hash chains, FCIS)
- VOPR testing framework
- Compliance guarantees
Subscribe to releases: https://github.com/kimberlitedb/kimberlite/releases
Where’s the roadmap?
See ROADMAP.md for planned features:
v0.5.0 (Q2 2026):
- Streaming replication
- Retention policies
- Full-text search
v0.8.0 (Q4 2026):
- GraphQL interface
- Change data capture (CDC)
- Multi-region clustering
v1.0.0 (Q2 2027):
- API stability guarantee
- Commercial support options
- Certified compliance templates
Comparisons
Kimberlite vs PostgreSQL?
| Feature | PostgreSQL | Kimberlite |
|---|---|---|
| Model | Relational (mutable) | Relational (immutable) |
| History | Audit triggers (manual) | Built-in (MVCC) |
| Time-travel | Via extensions | Native SQL |
| Integrity | Checksums | Hash chains + CRC32 |
| Consensus | Streaming replication | VSR (deterministic) |
| Best for | General OLTP | Compliance-heavy workloads |
When to use PostgreSQL: General purpose, mature ecosystem, need 100+ extensions
When to use Kimberlite: Audit trails, compliance, tamper-evidence, time-travel queries
Kimberlite vs EventStoreDB?
| Feature | EventStoreDB | Kimberlite |
|---|---|---|
| Model | Event sourcing | Event sourcing + SQL views |
| Query | Stream reads, projections | SQL (familiar) |
| Schema | Schemaless (JSON events) | Typed (SQL DDL) |
| Time-travel | Replay events | SQL queries |
| Best for | Event-driven architectures | Compliance + traditional apps |
Key difference: EventStoreDB is event streams with projections. Kimberlite is a SQL database backed by an event log.
Kimberlite vs Datomic?
| Feature | Datomic | Kimberlite |
|---|---|---|
| Model | Datalog (facts) | SQL (tables) |
| Time-travel | As-of queries (native) | AS OF TIMESTAMP (SQL) |
| Storage | Pluggable (DynamoDB, etc.) | Append-only log (native) |
| License | Closed source (free binaries under Apache 2.0) | Apache 2.0 (open source) |
| Best for | Clojure apps, graph queries | Compliance, SQL familiarity |
Key difference: Datomic is Datalog-native. Kimberlite targets SQL users needing immutability.
Didn’t find your question? Ask on Discord: https://discord.gg/QPChWYjD