1. Database workers run on AWS Spot Instances for cheap, scalable clusters.
2. Pseudo-masterless architecture where clients read/write directly to the workers for true horizontal scaling + low latency (implemented in Golang).
In order to pull this off, we have also developed:
- automatic worker replacement with backfill through Kafka
- spot instance price and stability prediction for optimal instance selection and pre-emptive replacement
- connection pooling w/ pgbouncer
- zero downtime cluster scaling (adding/removing workers, rebalancing table-shards - things Citus only includes in their Enterprise fork).
We would love to develop an Open Source service that others can use and contribute to if there is interest. Would love to answer any questions!
It's a very interesting problem from a technical standpoint nonetheless.