Would you want a distributed Postgres that runs on Spot Instances?

Question

At my startup Tamber (https://tamber.com) we have had to deliver very high database I/O without breaking the bank. Our solution provides two main features:

1. Database workers run on AWS Spot Instances for cheap, scalable clusters.
2. Pseudo-masterless architecture where clients read/write directly to the workers for true horizontal scaling + low latency (implemented in Golang).

In order to pull this off, we have also developed:

- automatic worker replacement with backfill through Kafka
- spot instance price and stability prediction for optimal instance selection and pre-emptive replacement
- connection pooling w/ pgbouncer
- zero downtime cluster scaling (adding/removing workers, rebalancing table-shards - things Citus only includes in their Enterprise fork).

We would love to develop an Open Source service that others can use and contribute to if there is interest. Would love to answer any questions!
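To make the "clients read/write directly to the workers" idea concrete, here is a minimal sketch of client-side shard routing. It is not Tamber's implementation (which the post says is written in Golang); the names `shard_for`, `ShardMap`, `replace_worker`, the fixed shard count, and the round-robin initial placement are all assumptions made for illustration. The idea shown is that each client hashes a row key to a shard and looks up the worker that currently owns that shard, so replacing a reclaimed spot worker only changes the ownership of that worker's shards.

```python
import hashlib

NUM_SHARDS = 32  # assumed fixed shard count per table (hypothetical value)


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a row key to a shard using a stable hash.

    SHA-256 is used instead of Python's built-in hash(), which is salted
    per process and would route the same key differently on each client.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardMap:
    """Hypothetical shard-to-worker table that every client keeps a copy of."""

    def __init__(self, workers, num_shards=NUM_SHARDS):
        if not workers:
            raise ValueError("at least one worker is required")
        self.num_shards = num_shards
        # Assumed initial placement: spread shards round-robin over workers.
        self.owner = {s: workers[s % len(workers)] for s in range(num_shards)}

    def worker_for(self, key: str) -> str:
        """Return the worker a client should contact directly for this key."""
        return self.owner[shard_for(key, self.num_shards)]

    def replace_worker(self, old: str, new: str):
        """Hand every shard owned by `old` to `new` and return the moved shards.

        In a real system the new worker would first need the shard data
        backfilled (the post mentions Kafka for this); that step is omitted.
        """
        moved = [s for s, w in self.owner.items() if w == old]
        for s in moved:
            self.owner[s] = new
        return moved
```

With something like this, a client would call `ShardMap(["w1", "w2", "w3"]).worker_for("user:42")` and open a connection to the returned worker. Keys whose shards were not on the replaced worker keep routing to the same place, which is what keeps a replacement cheap.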

blackpanda · Accepted Answer

My guess is that 99% of companies do not need such a solution, but 1% do. And if you target them, you have a business. Unfortunately, that 1% does not frequent this site (I guess).

Chyzwar · Answer

Why is this better than Cassandra/ScyllaDB/CouchDB/Dynamo? Is the data partitioned, and are joins not possible?

ddorian43 · Answer

Maybe you can build something on top of YugabyteDB since it adds sharding/replication on top of PostgreSQL.

amirathi · Answer

(for regular workloads) cost of maintaining such a DB >> cost of RDS. It's a very interesting problem from a technical standpoint nonetheless.

tcbasche · Answer

Wouldn't it be cheaper and easier to just use a serverless option like AWS Aurora?

arthurcolle · Answer

Sounds pricy! So probably not
