Deploying on AWS
Deploy Kimberlite on Amazon Web Services.
Architecture
┌──────────────────────────────────────────────────────────┐
│                           VPC                            │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐  │
│  │  us-east-1a  │   │  us-east-1b  │   │  us-east-1c  │  │
│  │              │   │              │   │              │  │
│  │ ┌──────────┐ │   │ ┌──────────┐ │   │ ┌──────────┐ │  │
│  │ │  Node 1  │ │   │ │  Node 2  │ │   │ │  Node 3  │ │  │
│  │ │ (Leader) │ │   │ │          │ │   │ │          │ │  │
│  │ └──────────┘ │   │ └──────────┘ │   │ └──────────┘ │  │
│  └──────────────┘   └──────────────┘   └──────────────┘  │
└──────────────────────────────────────────────────────────┘
Prerequisites
- AWS account with EC2 and EBS permissions
- AWS CLI installed and configured
- Terraform or CloudFormation (optional)
Instance Selection
Recommended Instance Types:
| Workload | Instance Type | vCPUs | Memory | Network | EBS |
|---|---|---|---|---|---|
| Small | t3.medium | 2 | 4 GB | Up to 5 Gbps | GP3 100 GB |
| Medium | c6i.xlarge | 4 | 8 GB | Up to 12.5 Gbps | GP3 500 GB |
| Large | c6i.2xlarge | 8 | 16 GB | Up to 12.5 Gbps | GP3 1 TB |
| Production | c6i.4xlarge | 16 | 32 GB | Up to 12.5 Gbps | GP3 2 TB (3000 IOPS) |
Instance Selection Tips:
- Use c6i (compute-optimized) for high-throughput workloads
- Use m6i (general-purpose) for balanced workloads
- Use r6i (memory-optimized) for large projection caches
Storage Configuration
EBS Volume Types
| Volume Type | IOPS | Throughput | Use Case | Cost |
|---|---|---|---|---|
| GP3 | 3,000-16,000 | 125-1,000 MB/s | Recommended for most workloads | $0.08/GB-month |
| IO2 | 64,000+ | 4,000 MB/s | Ultra-high performance | $0.125/GB-month |
| ST1 | 500 (burst) | 500 MB/s | Cold data (NOT for log) | $0.045/GB-month |
Recommendations:
- Log volume: GP3 with 3000 IOPS minimum
- Projection volume: GP3 with 1000 IOPS minimum
- Separate volumes for log and projections
Volume Configuration
# Create EBS volumes
# Attach volumes
# Format and mount
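One way to carry out these three steps with the AWS CLI is sketched below. It is a minimal sketch, not the project's official procedure: the function names, the ext4 filesystem, and the example IDs, device names, and mount point in the comments are all assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: names, filesystem, and paths are assumptions.

# Create a gp3 volume and print its volume ID.
# usage: create_gp3_volume <az> <size-gib> <iops> <throughput-mibps> <name-tag>
create_gp3_volume() {
  aws ec2 create-volume \
    --availability-zone "$1" \
    --volume-type gp3 \
    --size "$2" \
    --iops "$3" \
    --throughput "$4" \
    --tag-specifications "ResourceType=volume,Tags=[{Key=Name,Value=$5}]" \
    --query VolumeId --output text
}

# Wait until the volume is available, then attach it to an instance.
# usage: attach_volume <volume-id> <instance-id> <device>
attach_volume() {
  aws ec2 wait volume-available --volume-ids "$1"
  aws ec2 attach-volume --volume-id "$1" --instance-id "$2" --device "$3"
}

# Format a block device as ext4 and mount it (run on the instance, as root).
# usage: format_and_mount <device> <mount-point>
format_and_mount() {
  mkfs.ext4 "$1"
  mkdir -p "$2"
  mount "$1" "$2"
}

# Example (hypothetical IDs and paths):
#   vol=$(create_gp3_volume us-east-1a 500 3000 250 kimberlite-log-1)
#   attach_volume "$vol" i-0123456789abcdef0 /dev/sdf
#   format_and_mount /dev/nvme1n1 /var/lib/kimberlite/log
```

On Nitro-based instances such as c6i, a volume attached as /dev/sdf typically appears inside the instance under an NVMe name such as /dev/nvme1n1, so check `lsblk` before formatting.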
/etc/fstab Configuration
# Add to /etc/fstab for auto-mount
UUID=xxx
UUID=yyy
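The mount points and options for the two entries above are not shown in this copy. As a hedged illustration of what a complete entry can look like, the helper below prints one fstab line from a UUID and a mount point. The `fstab_entry` name, the ext4 default, the `noatime` option, and the example paths are assumptions; `nofail` keeps the instance bootable if a volume is missing.

```shell
# Print an /etc/fstab entry for a data volume.
# usage: fstab_entry <uuid> <mount-point> [fstype]
fstab_entry() {
  printf 'UUID=%s %s %s defaults,noatime,nofail 0 2\n' "$1" "$2" "${3:-ext4}"
}

# On the instance (as root), look up the UUID and append the entry:
#   uuid=$(blkid -s UUID -o value /dev/nvme1n1)
#   fstab_entry "$uuid" /var/lib/kimberlite/log >> /etc/fstab
#   mount -a   # check that the entry mounts cleanly before rebooting
```

Mounting by UUID rather than by device name matters on EC2 because NVMe device names are not guaranteed to stay the same across reboots.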
Networking
Security Group Configuration
# Create security group
# Client traffic (from application VPC)
# Cluster traffic (between nodes)
# Metrics (from monitoring VPC)
# SSH (from bastion only)
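The four rules listed above could be created with the AWS CLI roughly as follows. This is a sketch under assumptions: the helper names, group name, CIDR ranges, and the `CLIENT_PORT` and `METRICS_PORT` variables are hypothetical, since this page does not state the client or metrics ports. Port 7001 for cluster traffic is taken from the peer addresses in the Terraform example.

```shell
# Create the security group and print its ID.
# usage: create_kimberlite_sg <vpc-id>
create_kimberlite_sg() {
  aws ec2 create-security-group \
    --group-name kimberlite \
    --description "Kimberlite cluster nodes" \
    --vpc-id "$1" \
    --query GroupId --output text
}

# Allow inbound TCP on one port from a CIDR range.
# usage: allow_tcp_from_cidr <group-id> <port> <cidr>
allow_tcp_from_cidr() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" --protocol tcp --port "$2" --cidr "$3"
}

# Allow inbound TCP on one port from members of another security group.
# usage: allow_tcp_from_group <group-id> <port> <source-group-id>
allow_tcp_from_group() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" --protocol tcp --port "$2" --source-group "$3"
}

# Example (hypothetical IDs, ports, and CIDRs):
#   sg=$(create_kimberlite_sg vpc-0123456789abcdef0)
#   allow_tcp_from_cidr  "$sg" "$CLIENT_PORT"  10.0.0.0/16   # client traffic
#   allow_tcp_from_group "$sg" 7001            "$sg"         # cluster traffic
#   allow_tcp_from_cidr  "$sg" "$METRICS_PORT" 10.1.0.0/16   # metrics
#   allow_tcp_from_group "$sg" 22              "$BASTION_SG" # SSH from bastion
```

Referencing the group itself as the source for cluster traffic means new nodes are covered automatically, without listing node IP addresses.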
Placement Groups
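Use cluster placement groups for lowest latency. A cluster placement group is confined to a single Availability Zone, so it cannot be combined with the three-AZ layout shown in the architecture diagram; nodes spread across AZs need a spread or partition placement group, or no placement group at all.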
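A placement group can be created with the AWS CLI before launching the instances. The wrapper below is a minimal sketch; the function name and the group name in the example are assumptions, and the three strategies are the ones EC2 supports.

```shell
# Create a placement group after validating the strategy name.
# usage: create_placement_group <name> <strategy>
#   strategy: cluster | spread | partition
create_placement_group() {
  case "$2" in
    cluster|spread|partition) ;;
    *) echo "unknown strategy: $2" >&2; return 1 ;;
  esac
  aws ec2 create-placement-group --group-name "$1" --strategy "$2"
}

# Example:
#   create_placement_group kimberlite cluster   # single-AZ, lowest latency
#   create_placement_group kimberlite spread    # may span AZs
```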
Encryption
At-Rest Encryption with AWS KMS
# /etc/kimberlite/config.toml
[encryption]
enabled = true
kms_provider = "aws-kms"
kms_key_id = "arn:aws:kms:us-east-1:123456789012:key/abc123"
Create KMS key:
kms-policy.json:
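The key creation command and policy body are not shown in this copy. The sketch below illustrates one plausible shape: a key policy that keeps administration with the account and lets a node role use the key, plus the `aws kms create-key` call that applies it. The function names, the role name, and the exact list of KMS actions Kimberlite needs are assumptions.

```shell
# Print a key policy for the given account and node role.
# usage: kms_policy <account-id> <role-name>
kms_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAccountAdministration",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::$1:root" },
      "Action": "kms:*",
      "Resource": "*"
    },
    {
      "Sid": "AllowKimberliteNodes",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::$1:role/$2" },
      "Action": ["kms:Encrypt", "kms:Decrypt", "kms:GenerateDataKey", "kms:DescribeKey"],
      "Resource": "*"
    }
  ]
}
EOF
}

# Create the key from a policy file and print its ARN.
# usage: create_kms_key <policy-file>
create_kms_key() {
  aws kms create-key \
    --description "Kimberlite at-rest encryption" \
    --policy "file://$1" \
    --query KeyMetadata.Arn --output text
}

# Example (hypothetical account ID and role name):
#   kms_policy 123456789012 kimberlite-node > kms-policy.json
#   create_kms_key kms-policy.json   # use the printed ARN as kms_key_id
```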
In-Transit Encryption with TLS
Issue certificates with AWS Certificate Manager (ACM) Private CA, or generate self-signed certificates:
# Using ACM Private CA
# Or use Let's Encrypt with DNS challenge
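For the self-signed option, a certificate can be generated with OpenSSL. This is a sketch for test environments, assuming OpenSSL 1.1.1 or later (for `-addext`); the function name, file names, one-year validity, and example hostname are assumptions, and how Kimberlite is pointed at the files is covered by its own TLS configuration, not here.

```shell
# Generate a self-signed certificate and unencrypted private key.
# usage: self_signed_cert <common-name> <out-dir>
self_signed_cert() {
  mkdir -p "$2"
  openssl req -x509 -newkey rsa:4096 -sha256 -nodes -days 365 \
    -subj "/CN=$1" \
    -addext "subjectAltName=DNS:$1" \
    -keyout "$2/server.key" -out "$2/server.crt"
}

# Example (hypothetical hostname and directory):
#   self_signed_cert node1.kimberlite.internal /etc/kimberlite/tls
```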
IAM Configuration
EC2 Instance Role:
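The role policy itself is missing from this copy. As an illustration only, the helper below prints a least-privilege policy covering the two AWS integrations this page describes: using the KMS key and publishing CloudWatch metrics. The function name, role name, and exact action list are assumptions rather than Kimberlite's documented requirements.

```shell
# Print an inline policy for the node instance role.
# usage: instance_role_policy <kms-key-arn>
instance_role_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "UseKimberliteKey",
      "Effect": "Allow",
      "Action": ["kms:Encrypt", "kms:Decrypt", "kms:GenerateDataKey", "kms:DescribeKey"],
      "Resource": "$1"
    },
    {
      "Sid": "PublishMetrics",
      "Effect": "Allow",
      "Action": "cloudwatch:PutMetricData",
      "Resource": "*"
    }
  ]
}
EOF
}

# Example (hypothetical role name):
#   instance_role_policy "$KEY_ARN" > role-policy.json
#   aws iam put-role-policy --role-name kimberlite-node \
#     --policy-name kimberlite-node --policy-document file://role-policy.json
```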
Deployment with Terraform
# main.tf
resource "aws_instance" "kimberlite_node" {
  count                  = 3
  ami                    = "ami-xxx" # Ubuntu 22.04
  instance_type          = "c6i.xlarge"
  vpc_security_group_ids = [aws_security_group.kimberlite.id]
  subnet_id              = element(var.subnet_ids, count.index)
  placement_group        = aws_placement_group.kimberlite.id
  iam_instance_profile   = aws_iam_instance_profile.kimberlite.name

  user_data = templatefile("user-data.sh", {
    node_id       = count.index + 1
    cluster_peers = join(",", [for i in range(3) : "node${i + 1}:7001" if i != count.index])
  })

  tags = {
    Name = "kimberlite-node-${count.index + 1}"
  }
}

resource "aws_ebs_volume" "log" {
  count             = 3
  availability_zone = element(var.availability_zones, count.index)
  size              = 500
  type              = "gp3"
  iops              = 3000
  throughput        = 250

  tags = {
    Name = "kimberlite-log-${count.index + 1}"
  }
}

resource "aws_volume_attachment" "log" {
  count       = 3
  device_name = "/dev/sdf"
  volume_id   = aws_ebs_volume.log[count.index].id
  instance_id = aws_instance.kimberlite_node[count.index].id
}
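The `cluster_peers` expression in the Terraform above gives each node a comma-separated list of every other node on port 7001. The shell function below reproduces that logic so the resulting values are easy to see; it is an illustration only, and the `cluster_peers` function name is an assumption.

```shell
# Print the peer list for one node, mirroring the Terraform for-expression.
# usage: cluster_peers <zero-based-node-index> <node-count>
cluster_peers() {
  idx=$1
  n=$2
  peers=""
  i=0
  while [ "$i" -lt "$n" ]; do
    # Skip the node itself, as "if i != count.index" does.
    if [ "$i" -ne "$idx" ]; then
      peers="${peers:+$peers,}node$((i + 1)):7001"
    fi
    i=$((i + 1))
  done
  printf '%s\n' "$peers"
}

cluster_peers 0 3   # prints node2:7001,node3:7001
cluster_peers 1 3   # prints node1:7001,node3:7001
```

The hostnames node1 through node3 must resolve inside the VPC for these peer addresses to work, for example through private DNS records.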
Monitoring Integration
CloudWatch Metrics
Export Kimberlite metrics to CloudWatch:
# Install CloudWatch agent
# Configure agent
Backup Strategy
EBS Snapshots
# Create snapshot schedule
snapshot-policy.json:
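The schedule command and policy body are not included in this copy. One way to schedule snapshots is Amazon Data Lifecycle Manager; the sketch below assumes that service, a daily schedule at 03:00 UTC, and a hypothetical `Backup=kimberlite` tag on the volumes to be snapshotted. The function names and the execution role are also assumptions.

```shell
# Print a Data Lifecycle Manager policy for tagged EBS volumes.
# usage: snapshot_policy <tag-key> <tag-value> <retain-count>
snapshot_policy() {
  cat <<EOF
{
  "ResourceTypes": ["VOLUME"],
  "TargetTags": [{ "Key": "$1", "Value": "$2" }],
  "Schedules": [
    {
      "Name": "daily",
      "CreateRule": { "Interval": 24, "IntervalUnit": "HOURS", "Times": ["03:00"] },
      "RetainRule": { "Count": $3 },
      "CopyTags": true
    }
  ]
}
EOF
}

# Register the policy with Data Lifecycle Manager.
# usage: create_snapshot_schedule <execution-role-arn> <policy-file>
create_snapshot_schedule() {
  aws dlm create-lifecycle-policy \
    --description "Kimberlite daily EBS snapshots" \
    --state ENABLED \
    --execution-role-arn "$1" \
    --policy-details "file://$2"
}

# Example (hypothetical tag and role):
#   snapshot_policy Backup kimberlite 7 > snapshot-policy.json
#   create_snapshot_schedule "$DLM_ROLE_ARN" snapshot-policy.json
```

The retain count also handles cleanup: once more than that many snapshots exist for a volume, the oldest are deleted automatically.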
Cost Optimization
Estimated Monthly Costs (3-node cluster):
| Component | Configuration | Cost |
|---|---|---|
| 3x c6i.xlarge | 24/7 | $370 |
| 3x GP3 500 GB (log) | 3000 IOPS | $120 |
| 3x GP3 200 GB (proj) | 1000 IOPS | $48 |
| Data transfer | 1 TB/month | $90 |
| Total | | ~$630/month |
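The arithmetic behind the table can be reproduced with a short calculation. The rates used here are assumptions to check against current AWS pricing: about $0.17 per hour for c6i.xlarge on demand in us-east-1, 730 hours per month, $0.08 per GB-month for GP3 (matching the volume table above), and $0.09 per GB for data transfer out.

```shell
# Estimate the monthly cost of a cluster in USD.
# usage: monthly_cost <nodes> <hourly-rate> <log-gb> <proj-gb> <transfer-gb>
monthly_cost() {
  awk -v nodes="$1" -v rate="$2" -v log_gb="$3" -v proj_gb="$4" -v xfer_gb="$5" 'BEGIN {
    hours    = 730     # average hours per month
    gp3      = 0.08    # USD per GB-month (assumed)
    egress   = 0.09    # USD per GB transferred out (assumed)
    compute  = nodes * rate * hours
    storage  = nodes * (log_gb + proj_gb) * gp3
    transfer = xfer_gb * egress
    printf "compute=%.0f storage=%.0f transfer=%.0f total=%.0f\n",
      compute, storage, transfer, compute + storage + transfer
  }'
}

monthly_cost 3 0.17 500 200 1000
# prints: compute=372 storage=168 transfer=90 total=630
```

The storage figure of $168 is the sum of the table's $120 for log volumes and $48 for projection volumes, and the table rounds compute down to $370.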
Cost Reduction Tips:
- Use Savings Plans for roughly 40% off compute (the actual discount depends on term length and payment option)
- Use Reserved Instances for steady-state workloads
- Use a lifecycle policy to delete old snapshots
- Use S3 Glacier for long-term archival
Related Documentation
- Deployment Guide - General deployment patterns
- Configuration Guide - Configuration options
- Security Guide - TLS setup
- Monitoring Guide - Observability
Key Takeaway: Use GP3 volumes with 3000 IOPS for log, spread nodes across AZs, enable KMS encryption, and use placement groups for low latency.