I’m curious if anybody has taken the approach of provisioning one database per account? This means you’d have to run migrations per account and keep track of all the migration versions and statuses somewhere. Additionally, if an application has custom fields or columns, the differences would have to be tracked somehow and namespace collisions managed.
Has anybody done this? Particularly with Rails? What kinds of tools or processes did you learn when you did it? Would you do it again? What are some interesting trade-offs between the two approaches?
This architecture decision is one of my biggest regrets, and we are currently in the process of rebuilding into a single-database model.
FWIW, this process has worked well for what it was originally intended to do. Data security gets a nice DB-level stopgap and we can keep customer data nicely isolated. It's nice for extracting all data from a single customer if we have extended debugging work or unique data modeling work. It saves a lot of application-layer logic and code. I'm sure for the most part it makes the system slightly faster.
However as we have grown this has become a huge headache. It is blocking major feature refactors and improvements. It restricts our data flexibility a lot. Operationally there are some killers. Data migrations take a long time, and if they fail you are left with multiple databases in different states and no clear sense of where the break occurred.
Lastly, if you use the Apartment gem, you are at the mercy of a poorly supported library that has deep ties into ActiveRecord. The company behind it abandoned this approach as described here: https://influitive.io/our-multi-tenancy-journey-with-postgre...
Happy to expand on this if anybody is interested. It's currently a cause of major frustration in my life.
tl;dr: Wait until you need it, but there are good reasons for it!
Since I didn’t see anyone mention it, the term I’ve seen a lot of people use for this pattern is “multi single tenant”.
Part of the reason we have Tenant Projects [1] is precisely so you can do a 1:1 mapping of “Customer A can have different controls, settings, and blast radii from Customer B”.
Many of our first-party services do this, but then again, many of the largest-scale ones do “true multitenancy” instead. There’s no great way to offer a scale-to-zero database like Datastore or BigQuery without “internalizing” the multitenancy. But then it’s on you as the implementor to ensure isolation (both security and performance).
In your scenario, if you could make each database small enough (or use a database that handles the multitenancy for you) you gain a lot in terms of “capability” for enterprises. Customer X wants a feature that you know how to do but aren’t sure it should be for everyone? Their instance is separate. Customer Y has a drastically different read pattern and should have a high-IO instance? Great.
The easiest advice though is: wait until after you need it. A single replicated pgsql or MySQL can go insanely far. Let your business needs push you to the more complicated architecture, later. Prefer whatever gets you to a success disaster. Just don’t forget backups, PITR, and actually testing that :). (Another thing that’s harder with a shared database!).
Edit: Adding a newline and a tl;dr.
[1] https://cloud.google.com/service-infrastructure/docs/manage-...
The biggest advantage is flexibility to handle customers' requirements (e.g. change management might have restrictions on versioning updates) and reduced impact of any failures during upgrade processes. It's easier to roll out upgrades progressively with proven conventional tools (git branches instead of shoddy feature flags). Increased isolation is also great from a security standpoint: you're not a WHERE clause away from leaking customer data to other customers.
I would go as far as saying this should be the default architecture for enterprise applications. Cloud infrastructure has eliminated most of the advantages of conventional multitenancy.
If an account is a single user then no.
PS: I have quite a lot of experience with this, so if you would like more details just ask.
MySQL connections can be reused with database-per-tenant. Rack middleware (the Apartment gem) helps with applying migrations across all databases, and with the mechanics of configuring connections to use a tenant based on the Host header as requests come in.
For async jobs in Java, allocating a connection can idiomatically only be done via a callback - e.g. `executeInTenantContext(Tenant tenant, Runnable block)` - which ensures that all connections handed out have the right tenant selected, and everything is reset to the default tenant (an empty database) when exiting. Per-tenant jobs either iterate through all tenants, or have a tenant parameter under which they get executed, and the rest of the code can be more or less tenant unaware.
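A rough Ruby translation of that Java idiom, as a hedged sketch (the helper, `switch_connection!`, and the job names are hypothetical, not from any particular gem): all tenant-scoped work goes through one entry point, so the connection is always pointed at the right database and reset on the way out.

```ruby
module TenantContext
  # Run a block with connections bound to the given tenant's database,
  # always resetting to the empty default tenant afterwards.
  def self.execute(tenant)
    Thread.current[:tenant] = tenant
    tenant.switch_connection!          # hypothetical: select tenant's DB
    yield
  ensure
    Thread.current[:tenant] = nil
    Tenant.default.switch_connection!  # reset to the empty default DB
  end
end

# Per-tenant jobs iterate explicitly; job bodies stay tenant-unaware.
Tenant.find_each do |tenant|
  TenantContext.execute(tenant) { NightlyRollupJob.perform_now }
end
```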
It gives you the "freedom" to move tenants to separate servers, or consolidate them into single servers, if tenants are much larger or much smaller than expected. In reality this is sufficiently painful it's mostly illusory. We're looking at Vitess to help scale out, and move away from a mix of multi-tenant servers and single-tenant servers.
In a traditional webapp backend, you have a pool of connections to the database. User01 hits your service, and grabs a connection off the pool. User02 does the same, and so on. These connections get put back in the pool for reuse once a user is done with them.
In your design, every time a user hits your service, a new connection, specific to that user, will have to be made. This will incur network traffic and the overhead of logging in to the DBMS.
If you're thinking about using something like SQLite, you will hit a hard wall when the OS isn't able to open any more file descriptors, as well.
Like you said, DB administration will be a huge pain in the ass. Rather than having Flyway or Liquibase or whatever run a migration on one database, you'll have to run it on thousands of databases. There will be significant downtime when your databases are not in a consistent state with one another. There will also be bloat from the migration tool's record keeping, which will be duplicated for every user, rather than every database.
A lot of the tools a database gives you for free will also need to be implemented in application logic, instead. For example, you might want to run a query that says "show me every user using more than 1GB of storage," but under your schema, you'll have to log into every user's database individually, determine storage used, and add it to an in-memory list.
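A sketch of that cross-tenant query problem: what would be one GROUP BY in a shared database becomes a login plus a query against every tenant database. `tenant_databases` and its config shape are hypothetical here.

```ruby
require "pg"

GIGABYTE = 1 << 30
heavy_tenants = []
tenant_databases.each do |cfg|
  conn = PG.connect(cfg.fetch(:conninfo))  # one connection per tenant DB
  begin
    bytes = conn.exec("SELECT pg_database_size(current_database())")
                .getvalue(0, 0).to_i
    heavy_tenants << cfg.fetch(:tenant) if bytes > GIGABYTE
  ensure
    conn.close
  end
end
```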
If you ever want to allow users to collaborate, you will end up replicating the owner_id-style metadata anyway, and the entire benefit of this schema will evaporate.
Most frameworks are not set up to handle this style of database access, either. I don't use Rails, but Spring Boot would fight you every step of the way if you tried to do this.
I currently work at a company where customers have similar concerns around data privacy, but we've been able to continue using a single multitenant DB instance by using PostgreSQL's row-level security capabilities, where rows in a table are only accessible by a given client's database user:
https://www.postgresql.org/docs/9.5/ddl-rowsecurity.html
We customized both ActiveRecord and Hibernate to accommodate this requirement.
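For anyone unfamiliar with the feature, here's a minimal sketch (not the poster's actual code) of what PostgreSQL row-level security looks like, wrapped in a Rails migration for convenience. It assumes each table carries a tenant_id and each client connects as its own database user whose name is recorded in a tenants table.

```ruby
class EnableRowLevelSecurity < ActiveRecord::Migration[6.0]
  def up
    execute <<~SQL
      ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;
      -- Each client's DB user only sees rows whose tenant_id matches the
      -- tenant bound to the connecting role. (Table owners bypass RLS
      -- unless FORCE ROW LEVEL SECURITY is also set.)
      CREATE POLICY tenant_isolation ON invoices
        USING (tenant_id = (SELECT id FROM tenants WHERE db_role = current_user));
    SQL
  end

  def down
    execute <<~SQL
      DROP POLICY tenant_isolation ON invoices;
      ALTER TABLE invoices DISABLE ROW LEVEL SECURITY;
    SQL
  end
end
```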
I also tend to think that the security benefit is more theatre than reality. If an adversary compromises an employee laptop or gets RCE on the web tier (etc, etc), they'll get all the databases regardless of whose account (if any) they started with.
(The way I generally deal with this in a cross-tenant application is to ban, in Rails parlance, Model.find(...) unless the model is whitelisted (non-customer-specific). All access to customer-specific data is through @current_account.models.find(...) or Model.dangerously_find_across_accounts(...) for e.g. internal admin dashboards. One can audit new uses of dangerously_ methods, restrict them to particular parts of the codebase via testing or metaprogramming magic, etc.)
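A minimal sketch of that convention, assuming an `ApplicationRecord` base class; the whitelist and names are illustrative, not from the poster's codebase.

```ruby
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true

  # Reference data that is safe to look up without an account scope.
  UNSCOPED_ALLOWED = %w[Plan Country].freeze

  def self.find(*args)
    return super if UNSCOPED_ALLOWED.include?(name)

    raise ArgumentError,
          "Use current_account.#{name.underscore.pluralize}.find(...) or " \
          "#{name}.dangerously_find_across_accounts(...)"
  end

  # The loud name makes new call sites easy to grep and audit. Association
  # reads like current_account.models.find(...) go through the relation,
  # not this class method, so they stay scoped — which is the point.
  def self.dangerously_find_across_accounts(*args)
    all.find(*args)
  end
end
```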
That said, there are some interesting challenges with this model, like schema migration and DB backups etc., some of which can be easily overcome by smartly using workers and queuing. We run migrations per schema using a queue to track progress and handle failures. We also avoid migrations by using Postgres JSON fields as much as possible, for instance creating two placeholder fields in every table like metadata and data. To validate data in JSON fields we use JSON Schema extensively and it works really well.
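A sketch of that placeholder-JSON-fields idea: generic jsonb columns on every table, plus JSON Schema validation at write time. The json_schemer gem is one way to do the validation; treat the exact API shown here as an assumption to verify.

```ruby
require "json_schemer"

# In a migration: the two placeholder columns described above.
#   add_column :contacts, :metadata, :jsonb, null: false, default: {}
#   add_column :contacts, :data,     :jsonb, null: false, default: {}

CONTACT_DATA_SCHEMA = JSONSchemer.schema({
  "type" => "object",
  "properties" => { "crm_id" => { "type" => "string" } },
  "additionalProperties" => false
})

contact.data = { "crm_id" => "abc-123" }
raise "invalid custom data" unless CONTACT_DATA_SCHEMA.valid?(contact.data)
contact.save!
```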
You probably also need to consider application caching scenarios. Even if you manage to do one database per customer, running a Redis instance per customer will be a challenge. You could perhaps run Redis as a Docker container for each customer.
If I'm understanding this right, every team gets its own database, which is active-active replicated across two MySQL servers. There's also a separate mapping from teams to MySQL servers (which is itself a separate pair of MySQL servers), and of those two servers, one is primary for each team, distributed roughly evenly. Each MySQL server instance in the fleet is hosting many Slack workspaces, and they can scale that up/down as needed (i.e., presumably the tiny Slack I created years ago and forgot about is hosted on the same pair of MySQL servers as lots of other Slack instances in the same state, and my employer's active Slack instance with thousands of people is on a less-crowded pair of MySQL servers).
One user-visible effect is that it's possible to have outages that affect some fraction of workspaces. I also suspect this plays into things like rolling out changes over the course of many months - they don't need to do a database migration for everyone at once.
I don't think they use this for security - while yes, a buggy query can only affect one workspace at once, it doesn't sound to me like they do sharding at the PHP layer, it sounds like they're running a fleet of generic PHP instances that have the ability (and authorization) to talk to any workspace's database, not that they're running separate web/PHP instances per customer. But it definitely sounds like they rely on this for scalability/performance.
It worked out pretty well. The only downside was that analytics/cross customer stats were kind of a pain.
The customers all seemed to like that their data was separate from everyone else's. This never happened, but if one database was compromised, everyone else's would have been fine.
If I were starting a B2B SaaS today where no customers shared data (each customer = a whole other company) I would use this approach.
As one example, New Relic had a table per (hour, customer) pair for a long time. From http://highscalability.com/blog/2011/7/18/new-relic-architec... (2011):
> Within each server we have individual tables per customer to keep the customer data close together on disk and to keep the total number of rows per table down.
In a situation like that, all queries and table operations are customer- and time-specific anyway. At the time, dropping an entire table took less I/O than deleting specific rows in a multi-customer table (for MyISAM tables, this may still be true: https://mariadb.com/kb/en/big-deletes/). Also, there was no risk of locking the table.
https://www.slideshare.net/newrelic/how-to-build-a-saas-app-... has a bit more. I think Lew Cirne gave that presentation in 2011 but I can’t find a video of it.
In the example you gave, if the goal is to support customer-defined fields, I don't think most people would map the customer-defined fields directly to SQL columns. Consider something like Postgres hstore (with indices as needed) or the many similar implementations in other data stores.
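A sketch of that approach with a single indexed jsonb column (hstore works similarly); table and field names are illustrative.

```ruby
class AddCustomFieldsToRecords < ActiveRecord::Migration[6.0]
  def change
    add_column :records, :custom_fields, :jsonb, null: false, default: {}
    # A GIN index supports containment queries across arbitrary keys.
    add_index :records, :custom_fields, using: :gin
  end
end

# Querying by a customer-defined field:
Record.where("custom_fields @> ?", { region: "EMEA" }.to_json)
```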
1. Each tenant typically only has <10 users, never >20. And load is irregular, maybe only ever dealing with 2-3 connections simultaneously. Maybe <1000 queries per hour max. No concerns with connection bloat/inefficiency.
2. Tenants create and archive a large number of rows on some tables. Mutable, but in practice it generally doesn’t change much. A >100M row count is not unusual after a couple of years on the service. Not big data by any means, limited fields with smallish data types, but...
I didn’t want to deal with sharding a single database. Also, given the row count would be billions or trillions at some point, the indexing and performance tuning was beyond what I wanted to manage. Also, this was at a time before most cloud services/CDNs, and I could easily deploy close to my clients’ offices if needed. It worked well and I didn’t really have to hire a DBA or try to become one.
Should be noted, this was a >$1000/month service so I had some decent infrastructure budget to work with.
Start by optimizing your indexes. Ensure customer_id is in every index and is the first item. This allows the query to immediately filter to only rows for this customer. Ensure all queries include a customer_id (should be doing this anyway in multi-tenant environment). Even single row lookups by primary key can be really sped up this way; once the index becomes bigger than memory it has to be paged in/out. However with this approach the entire sub-tree of the index for “hot” users can remain in memory without paging, increasing cache hit rate, speeding up queries, and reducing IO.
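A sketch of that "customer_id leads every index" rule as a Rails migration (table names illustrative). With the tenant as the leftmost column, all of one hot customer's index entries sit in one contiguous subtree.

```ruby
class TenantFirstIndexes < ActiveRecord::Migration[6.0]
  def change
    add_index :orders, [:customer_id, :created_at]
    add_index :orders, [:customer_id, :status]
    # Even PK-style lookups can go through the composite index:
    #   Order.where(customer_id: cid, id: oid).first
    add_index :orders, [:customer_id, :id], unique: true
  end
end
```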
The above is generally enough. However, some consumer apps have a ton of data per user and relatively low revenue per user. In this case there’s one more big trick to keep DB costs down: cluster the whole table by customer_id. Like the index problem, the row data can be inefficient. If your disk layout randomly places rows into pages, chances are there’s only one row for a given client on a given page. If you need to evaluate 1000 rows to do a query, you’ll have to read close to 1000 pages, and IO is slow/expensive. You’ll use a ton of memory caching pages where very few of the rows are for hot/active users. Note: this problem only really matters if your rows are small and you can fit many rows per page. To fix, cluster the table by customer_id. On MySQL+InnoDB this is easy (https://dev.mysql.com/doc/refman/5.7/en/innodb-index-types.h...). On Postgres there’s a CLUSTER command, but it’s one-time, locking, and it will take forever; MySQL is the easier solution (and I really prefer Postgres). Lots of the NoSQL DBs allow you to control the physical layout. Once set up, your cache hit rates will go way up, total IO will go way down, and you'll see a pretty good speedup.
As always, benchmark a lot for your use case before implementing (but don't forget to benchmark the hot/cold use cases).
PS — I remember a post a few days ago about a company running a distributed SQLite (many DBs). Looked really interesting but can’t find the link. For smaller scales than millions of users, look into partitions.
Snapshotting / restoring entire accounts to a previous state was easy, and debugging data issues was also much easier when you could spin up an entire account's DB from a certain point in time locally.
We also could run multiple versions of the product on different schema versions. Useful when certain customers only wanted their "software" updated once every 6 months.
Each client's db is a little different due to some more questionable decisions (e.g. different features can result in different databases turning on, and most tables will have variable names). But it's not really any harder to write code that says "`select identifier from features_enabled` -> derive table name -> `alter (derived table name) add column baz`" (actually, nowadays we have an in-house database migrator that does that for us; we just say "the table abstractly known as `foo` should have columns bar, baz, quux" and the migrator says "for this client, they have the feature uses_foo enabled with the identifiers A and B, so A_foo and B_foo should both have the column baz, but do not; I will add it after `bar`").
Perhaps it has discouraged us from seeking to use persistent connections. But that is not the biggest inefficiency in our system at this stage.
If I was designing the application from scratch, I think I would want a static schema, but database-per-tenant seems fine.
I can't remember all of the reasons that we ditched the idea, but it quickly became clear that we would be writing a lot of the core tooling ourselves, and we kept running into problems with AWS around dynamically provisioning the load balancers and certs for N boxes.
I wouldn't dream of trying to manage different schemas across all the databases either, that sounds like a serious headache that could only be mitigated by having separate codebases and separate development teams managing them.
If a customer needs custom fields, I would make them a first-class citizen: add the required `field_definitions` / `field_values` tables to your database and let customers manage those fields themselves.
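A sketch of those two tables as a Rails migration; the names and the generic `record` reference are illustrative.

```ruby
class CreateCustomFieldTables < ActiveRecord::Migration[6.0]
  def change
    create_table :field_definitions do |t|
      t.references :customer, null: false
      t.string :name,       null: false   # e.g. "PO Number"
      t.string :field_type, null: false   # e.g. "string", "date"
      t.timestamps
    end
    add_index :field_definitions, [:customer_id, :name], unique: true

    create_table :field_values do |t|
      t.references :field_definition, null: false
      t.references :record,           null: false  # the row being extended
      t.string :value
      t.timestamps
    end
    add_index :field_values, [:record_id, :field_definition_id], unique: true
  end
end
```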
I'm glad we ended up defaulting to a multi-tenant architecture even though we lost some of the proposed benefits (isolation, independent scaling etc).
At some point in the life of my company's product, we hit a real bottleneck in the DB.
The solution was to "split" the data across different DBs, each holding several clients. So you could say we have the worst of both worlds (as the code still has to deal with the "security" part).
It is even more complicated, as some tables exist only in the "main" DB, and we fetch data from there constantly.
So far (probably 10+ years since we implemented this), we haven't had any real issues because of it, and we have developed small in-house tools to keep things tidy (like schema updates or data inspection).
Server is MySQL.
If you're going to go this route you might also want to consider creating a role-per-user and taking advantage of the role-based security features[1].
That said, this is not how people usually handle multi-tenancy, for good reason: the complexity often outweighs the security benefit. There are good articles on it; here's one by Citus Data[2] (pre-acquisition).
[0]: https://www.postgresql.org/docs/current/ddl-schemas.html
[1]: https://www.postgresql.org/docs/current/ddl-rowsecurity.html
[2]: https://www.citusdata.com/blog/2016/10/03/designing-your-saa...
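Riffing on the role-per-user suggestion above, a hedged sketch of provisioning one PostgreSQL login role per tenant, confined to its own schema (names illustrative, wrapped in a Rails migration for convenience):

```ruby
class ProvisionTenantRole < ActiveRecord::Migration[6.0]
  def up
    execute <<~SQL
      CREATE ROLE tenant_acme LOGIN PASSWORD 'change-me';
      CREATE SCHEMA acme AUTHORIZATION tenant_acme;
      -- No cross-tenant reach; note that grants to PUBLIC on the public
      -- schema may also need revoking on older Postgres versions.
      REVOKE ALL ON SCHEMA public FROM tenant_acme;
      ALTER ROLE tenant_acme SET search_path = acme;
    SQL
  end
end
```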
Currently we deploy one DB + unique uWSGI instances for each customer. This has some drawbacks which have made us look a bit into multi-tenancy as well. Everything is served on dedicated hardware, using a common codebase, and each customer is served on a unique sub-domain.
The two primary drawbacks of running unique instances for each customer are ease of deployment and utilization of resources.
When a new customer is deployed we need to set up the database, run migrations, set up DNS, deploy the application, deploy the task runner, and configure the HTTP vhost. Most of this is painfully manual right now, but we're looking into automating at least parts of the deployment.
In the future, we aim to offer an online solution for signup and onboarding, where (potential) customers can trigger the provisioning of a new instance, even for a limited demo. If we were doing multi-tenancy that would just require a new row in the database + some seed data, which would make the deployment process exceptionally simpler.
The other issue is the utilization of resources. Running a few instances of the application with a big worker pool would be much easier to scale than running 50+ instances with their own isolated worker pool.
We're considering maybe going for a hybrid multi-tenant architecture, where each customer has their own isolated DB, but with a DB router in the application. That would give us a compromise between security (isolated databases - SQL queries don't cross into another customer's data) and utilization (shared workers across customers). But this would add another level of complexity and new challenges for deployment.
Does anyone have a similar case to this?
What I ended up doing is going for a multi-tenant architecture, but I built it in such a way that the data is portable from one db to another. All account data lives in a separate 'master db', and the rest of the data lives elsewhere. All IDs are UUIDs. I built database routing into the application so each "account" can have its data on a different database, and if needed I can move one account's data from a loaded db to a more available one.
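If you were building that router today, Rails 6.1's built-in horizontal sharding gives you the primitives; a rough sketch under assumed names (the `AccountDirectory` model, shard keys, and database.yml entries are all illustrative):

```ruby
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
  # Shard keys must match configurations in database.yml.
  connects_to shards: {
    default: { writing: :primary },
    shard_b: { writing: :primary_b }
  }
end

# The 'master db' holds the account -> shard mapping; look it up first,
# then run all work for the request inside the chosen shard.
shard = AccountDirectory.find_by!(account_uuid: uuid).shard.to_sym
ActiveRecord::Base.connected_to(role: :writing, shard: shard) do
  Invoice.where(account_uuid: uuid).to_a
end
```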
It is great for data isolation, scaling data across servers, and easily deleting customers when they leave.
The only tricky part is schema migrations. Just make sure you apply migration scripts to databases in an automated way. We use a tool called DbUp. Do not try to use something like a schema-compare tool for releases.
I have managed more than 1500 databases and it is very simple.
https://www.postgresql.org/docs/current/ddl-schemas.html
Given how popular Postgres is, I’m sure there are lots of teams using this architecture.
https://www.vldb.org/pvldb/vol11/p540-shraer.pdf
> within each container CloudKit uniquely divides the data-space into many private databases that manage user-scoped data
> Each private database belongs to exactly one container and one user
https://docs.microsoft.com/en-us/azure/sql-database/sql-data...
If you want to do this for security, why not layer the DB behind some system that requires and verifies the users access tokens for each request?
The only situation where such a setup might make sense is when you actually need per-user migrations to cater to specific customers' needs, but then you'll make it very hard to work with all customers' data through a generic interface.
Schema migrations are kind of a pain; we roll out changes gradually, so on auth there is this blue/green decision.
Custom fields in EAV data-tables or jsonb data.
Backups are great, small(er) and easier to work with/restore.
Easier to move client data between PG nodes. Each DB is faster than one large one. E.g. the inventory table is only your 1M records, not everyone's 600M records, so even sequential-scan queries are pretty fast.
The only real reason you mention is security, but to me this sounds like the worst tool for the job. Badly written queries that accidentally return other users' data and make it into production aren't usually a common problem. If for some reason you have unique reasons to think they might be, then traditional testing + production checks at a separate level (e.g. when data is sent to a view, double-check only permitted user IDs) would probably be your answer.
If you're running any kind of "traditional" webapp (millions of users, relatively comparable amounts of data per user) then separate databases per user sounds like crazytown.
If you have massive individual users who you think will be using storage/CPU that is a significant percentage of a commodity database server's capacity (e.g. 1 to 20 users per server), who need the performance of having all their data on the same server, but also whose storage/CPU requirements may vary widely and unpredictably (and possibly require performance guarantees), then yes this seems like it could be an option to "shard". Also, if there are very special configurations per-user that require this flexibility, e.g. stored on a server in a particular country, with an overall different encryption level, a different version of client software, etc.
But unless you're dealing with a very unique situation like that, it's hard to imagine why you'd go with it instead of just traditional sharding techniques.
Our original idea was to allow customers access to their form db directly so they could even run custom SQL queries on it and so on. In the end we actually never used that part (users just view and filter their records in the UI), so unless we still need this in the future, one could argue it's... just slightly overengineered. It works well otherwise though, and migrations aren't really a problem for us, because they're done dynamically by the app (when people make changes to their form). Scaling is also no issue, as one db will always be small enough to fit on one server, so we can "shard" easily by file. So I think the only obvious lesson here is: make sure you really, really need it (and if migrations/scaling would have meant a lot of overhead, I don't think we would have even considered it).
We already had experience writing DB migrations that were reliable, and we had a pretty solid test suite of weird edge cases that caught most failures before we deployed them. Still, some problems would inevitably fall through the cracks. We had in-house tools that would take a DB snapshot before upgrading each customer, and our platform provided the functionality to leave a customer on an old version of our app while we investigated. We also had tools to do progressive rollouts if we suspected a change was risky.
Even with the best tooling in the world I would strongly advise against this approach. Cost is one huge factor: the cheapest RDS instance is about $12/month, so you have to charge more than that to break even (if you're using AWS; we weren't at the time). But the biggest problems come from keeping track of scaling for hundreds or thousands of small databases, and paying performance overhead costs thousands of times.
What a lot of enterprise SaaS vendors do is have one single database for all customer data. They then use features like Virtual Private Database to hide customer A's data from customer B, so that if customer A did a "select *" they'd only see their own data and not all of the other customers' data. This creates faux multi-tenancy, all done using a single db account.
However, we don't use a tenant id in all tables to differentiate customers; we use a (confusingly named) DB name prefix + tenant id for programmatically making the connection.
Have a single server + db for data shared across tenants, like product-wide statistics, user/tenant data, mappings, and such things. In the tenant table, just have a column for the name of the DB server for that tenant, and that's pretty much it. Migrations are handled by an internal tool that executes the migrations on each tenant DB, and 99% of the time everything works just fine if you are careful about what kind of migration you do and how you write your code.
Some pitfalls concern column type changes + the read replicas going out of sync, but that was a single incident that only hurt the replica.
It’s slightly clunky in that the public, shared-schema tables, say, the one that holds the list of tenants, exist in every schema — they’re just empty.
I rolled my own based on apartment that has one shared public schema, and a schema for each tenant. Works well.
If all your customers are individuals or small accounts then I'd put them all in a single DB, but I'd still build in the option to redirect to another DB if the software has applicability to enterprise level customers.
Note that it is perfectly OK to upsell the capability to isolate your customers' data from each other; segregation options could be offered at multiple levels (all in one bucket, different DB, different cluster), each with their own price point. Some customers will simply demand this and move elsewhere if you don't offer it (and they are typically quite insensitive to price as long as the right boxes get checked for their auditors).
I also wanted one-filesystem-per-customer support and full isolation as in Docker, so I ended up writing a new OS to support all of this natively. It runs in production on boomla.com, a hosted website builder that is also a programmable platform.
WARNING: it gets even weirder: databases can be nested, and a database entry is also a file and a directory at the same time, or a nested database if you want. There is no way you understood this sentence. :) For example, you can store files in files. As in, style.css/background.png, etc. Every change also creates a new filesystem snapshot, which may also be nested. All right, I'll stop here.
This is not as much an advice for OP (to write a new OS) but more like a plug for the intellectually curious to explore a parallel universe. :)
For a site I run, I have one large shared read-only database everyone can access, and then one database per user.
The per-user DB isn't the most performant way of doing things, but it made it easier to:
+ Encrypt an entire user's data at rest using a key I can't reverse engineer. (The user's DB can only be accessed by the user whilst they're logged in.)
+ Securely delete a user's data once they delete their account. (A backup of their account is maintained for sixty days... But I can't decrypt it during that time. I can restore the account by request, but they still have to login to access it).
There are other, better, ways of doing the above.
How about dozens per account? :) I didn’t ship this, but I work for Automattic and WordPress.com is basically highly modified WordPress MU. This means every time you spin up a site (free or otherwise) a bunch of tables are generated just for that site. There’s at least hundreds of millions of tables. Migrating schema changes isn’t something I personally deal with, but it’s all meticulously maintained. It’s nothing special on the surface.
You can look up how WordPress MU maintains schema versions and migrations and get an idea of how it works if you’re really curious. If you don’t have homogeneous migrations, it might get pretty dicey, so I’d recommend not doing that.
Creating dedicated database app users with the proper permissions pretty much guarantees that you'll never serve data to the wrong customer.
It also probably makes bug and performance troubleshooting much easier.
The biggest downside is probably the maintenance overhead, but I suppose automation can mitigate it quite well.
It maybe makes internal reporting a bit harder, as you can't get the whole data from a single SQL query. You'd have to handle this in the application layer.
If you can bill the extra computing and devops work to your customers, I'd go with separate environments altogether. You can do this easily with AWS.
On the plus side you can roll out changes gradually, upgrade the environments one user at a time.
Also if Customer X pays you to make a custom feature for them, you can sell the same to all other users if it's generic enough.
So to upgrade the application we'd upgrade the shared "everyone" application and then do the same with each enterprise client. We only had a few.
I'd recommend trying to avoid it if you can, and if you can't, charging the appropriate amount of money, which is an arm and a leg.
At the time we weren't that experienced, so we also implemented migrations ourselves (whereas today I would choose Flyway or something similar).
Customer data was text-based so we didn't have that many custom fields, but at the same time we had many migrations over the years.
So, would we still do it? Yes and no. It was amazing because it let us isolate each user. With filesystem backups we could restore each user's data to a point in time without having to reset the whole database, and with little to no system knowledge.
At the same time it restricted our choices when improving the product. If we had to do a whole refactor of the project we always thought "How do I migrate all the users together?" We had to think about versioning, upgrades at different times, and so on.
From a tech standpoint it wasn't much of a problem, but it burdened the project a little bit. Bigger refactorings are harder and you need much more depth when you think about big new features.
I still think it was a nice solution, but this solution also came about because we didn't know a thing about partitioning in databases and we weren't experts in clustering. All in all it allowed us to create a product sustaining a decent customer base with little knowledge for that time.
With today's technology I'd test whether we could do it other ways.
I think their primary reasoning is that Meteor doesn't scale super easy, so it was easier to just "shard" the whole stack per customer.
Personally, it's a lot of work. It depends on what you're doing to make the tradeoffs worthwhile.
I see this as an optimization. Build everything so you can deploy once for all your customers. If you need to shard by customer later, it's just an infrastructure problem.
Shared DB instances would have been completely off the table. Thankfully, most boarding schools have massive IT budgets, so cost minimization was not as important as adding additional features that justified more spend. Also the market was quite green when I was there. Strangely, the software seemed to market itself; the number of out-of-the-blue demo requests was very high, so first impressions and conversion to paying clients was the primary focus.
I think the biggest problems had to do with migrations and backups. We maintained multiple distinct versions of the application, and each had a unique DB schema, so there was frequent drift in the actual schemas across accounts. This was painful both from a maintenance POV, and for doing things like change data capture or ETLs into the data warehouse for data science/analysis.
Another big problem was dealing with backup/restore situations.
I suspect this decision was made early in the company's history because it was easier than figuring out how to scale an application originally designed to be an on-prem solution to become something that could be sold as a SaaS product.
Anyway, I think choosing a solution that nets your business fewer, larger database nodes will probably avoid a lot of maintenance hurdles. If you can think ahead and design your application to support things like feature flags to allow customers to gradually opt in to new versions without breaking backwards compatibility in your codebase, I think this is probably the better choice, but consider the safety and security requirements in your product, because there may be reasons you still want to isolate each tenant's data in its own logical database.
We had a long tail of customers with negligible usage and would run several thousand MySQL databases on a single server. As customers scaled we could migrate the database to balance capacity. We could also optionally offer "premium" and "enterprise" services that guaranteed isolation and higher durability.
Scaling was never a real issue, but the nature of our clients was steady incremental growth. I don't think we ever had a case of real "overnight success" where a shared host customer suddenly melted the infrastructure for everyone.
However, managing and migrating the databases could be a real issue. We had a few ways of handling it, but often would need to handle it in the code, `if schemaVersion == 1 else`. Over time this added up and required discipline to ensure migration, deprecation, and cleanup. As a startup, we mostly didn't have that discipline and we did have a fair bit of drift in versions and old code lying around.
We do it, but everyone gets the same schema and same app. .net/IIS/sql server and when we update the schema, we apply to all dbs via in-house scripts. It provides security warm fuzzies to our clients that their data is separated from other clients. If you want to try and version the app/schema for different accounts, that’s where your headaches are, regardless of db model
What this means is each customer has her own schema. Now, large customers want to be single tenant, so they have a single schema on the entire DB. Smaller (SMB) customers are a bit more price conscious so they can choose to be multitenant i.e multiple schemas on same DB.
Managing this is pushed out to a separate metadata manager component which is just a DB that maps customer to the DB/schema they reside on. Connection pooling is at the DB level (so if you are multitenant then you may have lower perf because some other customer in the DB is hogging the connections)... But this has not happened to us yet.
Large customers are more conscious in terms of data, so they want things like disk-level encryption with their own keys etc., which we can provide since we are encrypting the whole DB for them (KMS is the fave here).
We are not really large scale yet, so dunno what the major gotchas will be once we scale, but this approach has served us well so far.
He has 564,295 tables in one SQL Server. Apparently this is for "Stack Overflow For Teams"
The acts_as_tenant gem might be what you’re looking for:
> This gem was born out of our own need for a fail-safe and out-of-the-way manner to add multi-tenancy to our Rails app through a shared database strategy, that integrates (near) seamless with Rails.
My recommended configuration to achieve this is to simply add a `tenant_id` column (or `customer_id` column, etc) on every object that belongs to a tenant, and backfilling your existing data to have this column set correctly. When a new account signs up, not a lot happens under-the-hood; you can create a row in the main table with the new account, do some initial provisioning for billing and such, and not much else. Being a multi-tenant platform you want to keep the cost really low of signing up new accounts. The easiest way to run a typical SQL query in a distributed system without restrictions is to always access data scoped by the tenant. You can specify both the tenant_id and an object’s own ID for queries in your controller, so the coordinator can locate your data quickly. The tenant_id should always be included, even when you can locate an object using its own object_id.
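A minimal sketch of that scoping rule in a Rails controller (names illustrative). Always pairing the tenant_id with the object's own id is what lets a distributed store route the query to a single shard instead of scattering it.

```ruby
class InvoicesController < ApplicationController
  def show
    @invoice = Invoice.find_by!(
      tenant_id: current_tenant.id,  # routes the query to one shard
      id: params[:id]                # then locates the row within it
    )
  end
end
```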
Anyway, if you have a web app you already have a DB-per-user, FWIW.
[1] https://en.wikipedia.org/wiki/Web_SQL_Database
Remember that a database is supposed to be shared. It's designed that way for performance reasons. If your big issue is that you're constantly running Flyway migrations and it's unmanageable, go schemaless, or refactor where/when/how you place your data. Rethink the architecture/design, not the tool. If it's a microservice-based application, remember that you are supposed to have independent state per service, so there shouldn't be one giant database anyway.
But for like 12 customers, sure, go for it. It's very common for "whale" accounts to get dedicated infra/databases while the rest get a common sharded layer.
Additionally, like someone else pointed out, trying to run any reporting data across multiple customers will become difficult code wise and less performant.
Realistically, if you are handling the sort of scale that would require more interesting scaling solutions for typical db software, you are most certainly making enough money to implement better approaches.
FWIW, I worked for a company that was handling a few hundred thousand customers with millions of order records on a relatively small AWS RDS server. Set up a database cluster and you're rolling for a while.
The reasoning was that customers data couldn't ever leak into each other, and moving a customer to a different server was easier. I vaguely recall Joel Spolsky speaking or writing about it.
I documented how they compare in a blogpost: https://blog.arkency.com/comparison-of-approaches-to-multite... - funnily it was waiting unpublished for some time, stumbling on your post made me finally publish it.
Looking forward to going through the comments on this question and seeing what others have experienced.
I have another draft in the making - about some of the pitfalls of PostgreSQL schemas approach, should be released soon.
The motivation was that we were on RDS’s highest instance and growing, with jobs mutating the data taking a less and less excusable amount of time.
The initial setup was using just the Octopus gem and a bunch of Ruby magic. That got real complicated really fast (Ruby is not meant to do systems programming stuff, and Octopus turned out to be very poorly maintained), and the project turned into a crazy rabbit hole with tons of debt we never could quite fix later. Over time, we replaced as many Ruby bits as we could with lower-level stuff, leveraging proxySQL where we could; the architecture should have been as low-level as possible from the get-go... I think Rails 6’s multi-DB mode was going to eventually help out too.
One fun piece of debt: after we had migrated all our major clients to their own shards, we started to work in parallel on making sure new clients would get their own shard too. We meant to just create the new shard on signup, but that’s when we found out, when you modify Octopus’s in-memory config of DBs, it replaces that config with a bulldozer, and interrupts all DB connections in flight. So, if you were doing stuff right when someone else signs up, your stuff would fail. We solved this by pre-allocating shards manually every month or so, triggering a manual blue-green deploy at the end of the process to gracefully refresh the config. It was tedious but worked great.
And of course, since it was a bunch of Active Record hacks, there’s a number of data-related features we couldn’t do because of the challenging architecture, and it was a constant effort to just keep it going through the constant bottlenecks we were meeting. Ha, scale.
Did we regret doing it? No, we needed to solve that scale problem one way or another. But it was definitely not solved the best way. It’s not an easy problem to solve.
In summary it was working in the following way:
There was a table client(id, name).
And then dozens of other tables. I don't remember the exact structure, so I will just use some sample names:
- order_X
- order_item_X
- customer_X
- newsletter_X
"X" being the ID from the client table mentioned earlier.
Now imagine dozens of "template" tables becoming hundreds once you start adding new clients. And then, in the code, that beautiful logic to fetch data for a given client :D
And to make things worse, the sets of tables didn't have the same DB schema. So imagine those conditions building selects depending on the client ID :D
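Roughly the shape of that "beautiful logic", written in Ruby here purely for illustration (and very much an anti-pattern): table names derived per client, with schema drift forcing branches on top. `legacy_schema?` is a hypothetical drift check.

```ruby
def orders_for(client_id)
  table = "order_#{Integer(client_id)}"  # Integer() guards the interpolation
  sql =
    if legacy_schema?(client_id)
      "SELECT id, total FROM #{table}"
    else
      "SELECT id, total_cents FROM #{table}"
    end
  ActiveRecord::Base.connection.select_all(sql)
end
```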
Generally any large social network will need to follow the 1 database per account strategy to some extent because of the tricky many-to-many relationship that groups/users have with other groups/users- this creates a cartesian product that is too large to traverse in reasonable time with one large database.
This of course leads to a world of pain, and shouldn't be attempted unless there is really no other way to make it work.
That's precisely what this system was designed for.
Of course this requires a bunch of extra tooling, like upgrade scripts that don't ALTER tables directly but rather lock-copy-delete-rename, etc.
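A sketch of that copy-and-swap upgrade style (MySQL-flavored, names illustrative; it assumes writes to the table are paused or double-written during the backfill):

```ruby
conn = ActiveRecord::Base.connection
[
  "CREATE TABLE orders_new LIKE orders",
  "ALTER TABLE orders_new ADD COLUMN discount_cents INT NOT NULL DEFAULT 0",
  # Backfill from the live table.
  "INSERT INTO orders_new (id, customer_id, total_cents)
     SELECT id, customer_id, total_cents FROM orders",
  # Atomic swap; the old copy can be dropped at leisure.
  "RENAME TABLE orders TO orders_old, orders_new TO orders",
].each { |sql| conn.execute(sql) }
```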
There are many tools out there which help out with this, and whatever we couldn't find we built ourselves. Tools like JOOQ can update code entities based on the database, so a database-first approach is what we used, but you can go either way.
The benefit of this approach is ultimately security and fewer multi-tenant catastrophes leaking data between customers, etc.
In some industries you'll also have to fight with lawyers about being allowed to use a database shared between customers because their standard terms will start with this separation. This approach is helpful when you have to keep data inside the EU for customers based there. If you want to get creative, you can also use the approach to game SLAs by using it as the basis to split customers into "pods" and even if some of these are down you may not have a 100% outage and have to pay customers back.
This design imposes challenges with speed of development and maintenance. If you don't know your requirements (think: almost any SaaS startup in the consumer or enterprise space trying to find a niche), then following this approach is likely to add overhead, which is inadvisable. The companies that can use this approach are going after an area they already know, and are prepared to go much more slowly than what most startup developers are used to.
Using row-level security or schemas are recommended for most SaaS/startup scenarios since you don't have N databases to update and keep in sync with every change. If you want to do any kind of split then you might consider a US/EU split, if your customers need to keep data in the EU, but it's best to consider this at the app-level since caches and other data stores start to become as important as your database when you have customers that need this.
Consideration should be given to URL design. When you put everything under yourapp.com/customername it can become hard to split later. Using URLs like yourapp.com/invoice/kfsdj28jj42, where "kfsdj28jj42" encodes an index for the database (or set of web servers, databases, and caches), becomes easier to route. Using customer.yourapp.com is a more natural design since it uses DNS, but the former feels more popular, possibly because it can be handled more easily in frameworks and doesn't need DNS setup in developer environments.
You might want to check out the CouchDB docs, and maybe take a look at their mailing lists for both users and developers. Their dev team can provide the answers you're looking for as far as CouchDB goes.
It's my understanding that scaling up or down is a key feature of CouchDB. It's designed to make it easy to create clusters of them working together. But I really do not know much about that myself.
I suspect there was negative ROI, and the DBAs avoided me.
https://rubygarage.org/blog/three-database-architectures-for...
One thing we do strictly enforce is that the schema of the entire database must be consistent, so no one-off changes or migrations per customer. Database partitioning is completely opaque from the application's perspective.
Scaling is super easy since you can move a db to another host.
We've never lost a single piece of data since we started using it.
Depending on that answer, you may be interested in using row-level security: https://www.postgresql.org/docs/current/ddl-rowsecurity.html
i think my thinking at the time was that it would be easier from a security perspective, since everyone had their own data, and also speed-wise, since multiple smaller databases could fit into memory better.
security wise, especially with an orm, it's not a huge concern as long as you use the top level model as an entry point, it will isolate everyone 95% of the time.
as for database sizes... we live in a world now where you can pretty much scale in an instant by throwing more virtual resources at anything, so i wouldn't worry.
Each account gets its own set of database tables (with a per-account table prefix) which are located in the same database. Upgrades can then take place on an account-by-account basis. They run many, many separate MySQL databases.
A fun read for sure
[0]: https://www.colinsteele.org/post/27929539434/60000-growth-in...
For example, I imagine one database per account would make it easier to provide your customers with a flexible reporting solution.
Most of the reporting solutions available do not have good support for row level security.
I believe it will be helpful if it's necessary to separate customers into "pods" as we grow.
The main advantage I feel we get however is that it was quite easy to write a wrapper around mysqldump to retrieve data for development purposes.
I worked at a company that stored all customer data in a single database. The performance was comparable but the agility was poor. Firstly, you had to download all customer data to get a copy to debug a problem. This was a security concern I had, and eventually we had to build a serialisation format to retrieve the "slice" of data we needed. This tool needed frequent updating as new tables were added.
You might argue that we should just try to imagine the bug and recreate it but we have some pretty complicated data structures which can make investigations very hard.
We did it about 3 years in, when the DB became a scale challenge. Eventually you'll also get to the point where you want to be able to rebalance and migrate data between each shard.
All of this is nothing you should be trying to solve too early; i struggle to think of any real benefits of single DB per user, unless you are separating out all architecture- including app servers - and that might only be relevant for large enterprise customers? Selling in that market is hard.
We've done this from day one, so I can't really speak to the downsides of not doing it. The peace of mind that comes from some very hard walls preventing customer data from leaking is worth a few headaches.
A few takeaways:
- Older MySQL versions struggled to quickly create 100+ tables when a new trial was provisioned (on the order of a minute to create the DB + tables). We wanted this to happen in seconds, so we took to preprovisioning empty databases. This hasn't been necessary in newer versions of MySQL.
- Thousands of DBs x 100s of tables x `innodb_file_per_table` does cause a bit of FS overhead and takes some tuning, especially around `table_open_cache`. It's not insurmountable, but does require attention.
- We use discrete MySQL credentials per-customer to reduce the blast radius of a potential SQL injection. Others in this thread mentioned problems with connection pooling. We've never experienced trouble here. We do 10-20k requests / minute.
- This setup doesn't seem to play well with AWS RDS. We did some real-world testing on Aurora, and saw lousy performance when we got into the hundreds / thousands of DBs. We'd observe slow memory leaks and eventual restarts. We run our own MySQL servers on EC2.
- We don't split ALBs / ASGs / application servers per customer. It's only the MySQL / Solr layer which is multi-tenant. Memcache and worker queues are shared.
- We do a DB migration every few weeks. Like a single-tenant app would, we execute the migration under application code that can handle either version of the schema. Each database has a table like ActiveRecord's migrations, to track all deltas. We have tooling to roll out a delta across all customer instances and monitor results (see the sketch after this list).
- A fun bug to periodically track down is when one customer has an odd collection of data which changes cardinality in such a way that different indexes are used in a difficult query. In this case, we're comparing `EXPLAIN` output from a known-good database against a poorly-performing database.
- This is managed by a pretty lightweight homegrown coordination application ("Drops"), which tracks customers / usernames, and maps them to resources like database & Solr.
- All of this makes it really easy to backup, archive, or snapshot a single customer's data for local development.
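A sketch of what such a rollout loop can look like, under assumed names rather than their actual "Drops" tooling: each tenant DB carries its own delta-tracking table, and the driver iterates, applies, records, and reports failures.

```ruby
delta = "20240101_add_sku_to_items"
CustomerDatabase.find_each do |db|   # hypothetical registry of tenant DBs
  begin
    db.with_connection do |conn|
      # Skip databases that already have this delta recorded.
      next if conn.select_values("SELECT name FROM schema_deltas").include?(delta)

      conn.execute(File.read("deltas/#{delta}.sql"))
      conn.execute(
        "INSERT INTO schema_deltas (name, applied_at) " \
        "VALUES (#{conn.quote(delta)}, NOW())"
      )
    end
  rescue StandardError => e
    Rails.logger.error("delta #{delta} failed on #{db.name}: #{e.message}")
  end
end
```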
We do not generally want to fork the product for different versions or schemas or special features -- the goal instead is to roll upgrades through different groups so we have more time to react to issues. We still want one single software version and data storage layout lineage. This matches the Salesforce.com model, so we won't need to deal with lots of different data migration histories, custom fields, etc. (I'm curious to see how long we stick with that). (I realize SFDC is all about custom objects, fields, UIs, etc. ... but their underlying software is same for all tenants. We also have some measure of customization, but within the same underlying DB layout that's the same across all tenants.)
The backend tenants use is written largely in Java / Spring with managed-RDBMS and other data-storage technologies from one of the big cloud vendors. Orchestration is into a Kubernetes/ISTIO environment provisioned from raw cloud-compute, not a managed service. The coordinator between the managed storage-and-other services, Kubernetes/ISTIO, the Docker backend-software registries, the secrets-managers, etc., is a custom Django REST Framework (DRF) server app that lets DevOps provision "groups", attached to them fixed data resources (that don't usually change from deployment-to-deployment) as well as periodically revised/upgraded software resources (i.e., Docker containers with backend software).
The DRF server app's main job is to let DevOps define the next-desired-state aka "deployment" for a "group" (upgrading one or more of the backend servers ... changing the provisioning parameters for a fixed resource ... etc.), and then the kick off a transition to that desired state. Each such "deployment" reviews and validates once again all resource availability, credentials, secrets, etc. ... stopping along the way as appropriate for human verifications. Each step is done within a Django transaction, leading from "old deployment" to "new deployment". Any failure in any step (after an appropriate number of retries) leads to a rollback to the previous deployment state. There's only one low-level step whose failure would lead to an undetermined "emergency" state getting stuck "between deployments", and that's very unlikely to fail since by that point all elements needed for the crucial "switch" in upgraded software have been touched multiple times such that failure at that point is real unlikely. There's a fairly straightforward recovery from that state as well, after human intervention.
We chose this custom method because there are so many elements in so many different infrastructures to verify and tie together that wrapping all the operations in transaction-mediated Python made sense; plus the Python APIs for all infrastructure elements are very good, and mostly involve sending/receiving/inspecting JSON or JSON-like data. There's plenty of logging, and plenty of side-data stored as JSON blobs in DB records for proper diagnosis and accounting when things go wrong. Groups can have their software upgraded without impact to other groups in the system. Another advantage is that as the "architecture" or "shape" of data and software resources attached to a "group" changes (changes to how configuration is done; introduction of a new backend service; introduction of a new datastore), the DRF server app can seamlessly transition the group from the old to the new shape (after software revision to make the DRF server app aware of what those changes are).
The DRF server app itself is easy to upgrade, and breaking changes can be resolved by an entire parallel deployment of the DRF server app and all the "groups", using the same per-group backend datastores. The new deployment listens on a "future" form of all tenant URLs. At switchover time the pre-existing DRF server app's tenant URLs get switched to a "past" form, and the new DRF server app's group tenant URLs get switched in.
In any case, these are some of the advantages of the approach. The main takeaways so far have been:
- there was major commitment to building this infrastructure, it hasn't been easy
- controlled definition of "groups" and upgrades to "groups" are very important, we want to avoid downtime
- Kubernetes and ISTIO are great platforms for hosting these apps -- the topology of what a "group" and its tenants look like is a bit complicated but the infrastructure works well
- giving things a unique-enough name is crucial ... as a result we're able to deploy multiple such constellations of fake-groups-of-tenants in development/test environments, each constellation managed by a DRF server
- the DRF will host an ever-growing set of services related to monitoring and servicing the "groups" -- mostly it can be a single source of data with links to appropriate consoles in Kibana, Grafana, cloud-provider infrastructure, etc.
We're still early in the use but so far so good.
We ended up in the "thousands to tens of thousands" of clients range, with thousands of tables per client and a pretty hairy schema.
Each customer had their own schema on one of a few dozen giant database servers. The company pushed this idea out to other parts of their infrastructure: separate webserver/message broker/cache tiers existed for each underlying database server, so outages or brownouts in one component couldn't affect other customers' data that much.
Schema migrations, interestingly, weren't much of a problem. The practice was nailed down early of "everyone gets migrated during a release, no snowflakes". That, plus some pretty paranoid tooling and an acceptable-downtime (well, on paper it wasn't acceptable, but everyone kinda understood that it was) in seconds-to-minutes during a release made migrations roughly as traumatic as migrations anywhere I've worked (which is to say, "somewhat"), but not too much more. It did take a lot of work to get the tooling right across multiple schemas in the same database server though. Investment in tooling--up to and including dedicated teams working on a single tool without a break-up date or firm mandate other than "make this not suck and keep the lights on"--is critical here, as in most areas.
Things that were hard:
- Connection management. Others on this thread have pointed that out. Connection pooling and long-lived queue workers were essential, and the web/request tier couldn't "scale out" too far without hitting connection limits. Scheduled jobs (and this company loved cron jobs, thousands of distinct invocations per tenant) were a problem in the connection-management department. Carefully written tooling around webserver connection reuse, cron-job execution harnesses (they didn't really run as cron jobs, they got shipped to a worker already running the code with warm/spare database connections and run there--all highly custom), and asynchronous jobs was needed. That occupied a team or three for awhile.
- The "whale" problem. When an individual tenant got big enough to start crowding out others on the same database server, it caused performance problems. We eventually worked on a migration tool that moved a client's entire footprint (and remember, this isn't just databases, but webs/caches/queue worker hosts/etc.) onto another shard. Building this tool was a lot of work, but when it was done it worked surprisingly well. My advice in this area: build a good manually-initiated/semi-supervised migration system. Leverage underlying database technology (binlog based replication). Don't hesitate to get very dirty and custom with e.g. replication logfile formats, and don't assume that $off_the_shelf_data_replicator isn't gonna collapse when you want to do online per-schema replication in massive parallel from the same database (not even if that tool cost you millions of dollars). Do NOT succumb to the allure of "we can automate the bin-packing and it'll constantly rearrange clients' datasets for optimal resource usage!" Manual is just fine for data migrations that big. Worst case, part of someone's job is to initiate/supervise them.
- SPOFs sucked. Some datasets weren't per-tenant at all; sometimes client companies merged together or split up; some data arrived intended for a tenant but wasn't tagged with that tenant's ID, so it would have to go into some separate database before it found a home. These systems were, bar none, the biggest liabilities, causes of production issues, and hardest things to code around in the entire company. You'd think that having to write application code for thousands of logical databases across all the per-tenant schemas would suck, but in reality it wasn't too hard. It was making sure your code didn't accidentally talk to a SPOF that was the problem. My advice here: microservices do not help with this problem. HTTP, gRPC, or raw database wire protocol: if you have dependencies on a "tragedy of the commons"-type used-by-everyone server sneaking into your nicely sliced up per-tenant architecture, those callsites are going to be the cause of your sleepless nights. Get good visibility into where they occur. Favor "push" into per-tenant models over per-tenant code doing a blocking "pull". Even if the push approach causes massive additional complexity and work. The costs of pull are too great.
- Some database-specific shit (even on polished hosted offerings from AWS, or big-budget Oracle installs) will start acting really squirrely when you're talking to thousands of identical schemas on the same database server (and thus tens or hundreds of thousands of identical tables with different data). If you double down on this route, be prepared to have a few really, really good database folks on staff. I don't mean "help me fix my slow giant reporting query" people, I mean "familiar with the internals" folks. Example: query plans can be cached based on query text, globally, across an entire database server. Different schemas have super different clients, and thus super different data distributions among 100s of GBs of data. The plan that gets cached for query X against client A is the product of running heuristics/histograms/etc. across client A's data. That plan might perform pathologically when query X runs against client B's data (on the same database server), and finding out how/why is really annoying. Solution: bust the cache by a) happening to know that SQL comments aren't stripped from query texts before the text is used as a plan-cache key, and b) prepending each tenant's identifier to each query at the database-driver level to prevent cache pollution (sketched after this list). Result: you have traded a spooky query performance issue for a query-plan-cache-size issue; now your queries are all predictably slow because all your tenants' different queries are thrashing the plan cache. Tradeoffs abound.
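A sketch of that plan-cache-busting trick from the last point, in Python at the driver-wrapper level (with_tenant_tag is a hypothetical helper; whether it works depends on your database not stripping comments before hashing the query text):

    def with_tenant_tag(sql, tenant_id):
        # The comment becomes part of the plan-cache key, so each
        # tenant gets its own cached plan for the same statement text.
        return "/* tenant:%s */ %s" % (tenant_id, sql)

    # Applied once, at the driver level, e.g.:
    #   cursor.execute(with_tenant_tag(sql, current_tenant))

As the parent notes, this trades spooky per-tenant plan mismatches for plan-cache pressure, since every tenant now caches its own copy of every plan.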
You will need a way to detect schema versions and bulk apply (and optionally roll back) schema updates. A ‘Schema’ table in each Site database, with rows inserted/deleted after each update/rollback, is sufficient.
A separate ‘Hosting’ database keeps track of all the sites and knows about each schema package: a version number, a function that can detect whether the change was applied, and the SQL code to apply the schema change. Don’t ever store any site-specific information other than the name/ID of the site in the Hosting database, because it could get out of sync when you restore a site backup, or if you have to restore a Hosting backup.
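A rough sketch of what that per-site runner can look like, again in Python with a sqlite3-style connection; the Package shape is the hypothetical "schema package" described above:

    from collections import namedtuple

    # A version number, an is_applied(conn) check, and the SQL
    # statements to apply -- the schema-package shape from above.
    Package = namedtuple("Package", "version is_applied statements")

    def migrate_site(conn, packages):
        applied = {v for (v,) in conn.execute('SELECT version FROM "Schema"')}
        for pkg in sorted(packages, key=lambda p: p.version):
            if pkg.version in applied or pkg.is_applied(conn):
                continue
            with conn:  # each package applies (or fails) atomically
                for stmt in pkg.statements:
                    conn.execute(stmt)
                conn.execute('INSERT INTO "Schema" (version) VALUES (?)',
                             (pkg.version,))

The Hosting database then just supplies the list of site names/IDs to loop over; everything version-related lives in each site's own ‘Schema’ table.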
Ideally you would want to make schema changes always backward compatible, meaning an old version of the code can always run fine against a newer schema. So, e.g., new columns are always nullable, as are new parameters to stored procedures. This has been a very useful property a number of times during deployments, when you can switch the app binaries around without worrying about schema rollbacks.
You’ll of course need to script the database setup/creation process, so you can click a button to bring up a new site/customer/tenant. As much as possible don’t ever touch the database by hand, and if you follow this rule from the start you will stay in a sane happy place without much overhead at all.
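A sketch of that setup script, building on migrate_site above (open_site_db and the sites table are hypothetical placeholders for however you create and track databases):

    def create_site(hosting_conn, site_name, open_site_db, packages):
        # Register name/ID only in the Hosting DB -- nothing
        # site-specific, per the rule above.
        with hosting_conn:
            hosting_conn.execute(
                "INSERT INTO sites (name) VALUES (?)", (site_name,))
        conn = open_site_db(site_name)  # creates the empty site database
        conn.execute('CREATE TABLE "Schema" (version INTEGER PRIMARY KEY)')
        migrate_site(conn, packages)    # bring it to the current version
        return conn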
I’ve done this with up to a 4-figure number of databases and it’s served me just fine. There were many times when certain customers would get ahead in the schema and then later everyone would catch up as new code rolled out.
I think it would be a heck of a lot scarier doing DB operations if it was all a single database. For example, you’ll have a new customer who is using a new feature which you are beta testing with them. Easy to have just their database ahead of the mainline, and make any fixes there as you go, and then deploy the final GA schema worldwide.
The only cardinal rule I always followed was that a single binary had to work for all customers. I would not want to cross the line into customer-specific code branches at practically any cost. There were certainly feature flags that were only enabled for single customers, but ultimately every site could run on the same binaries and indeed the same app farm.
It’s particularly useful to be able to back up/restore easily on a per-customer basis, and to be able to pull just one customer DB into dev to reproduce an issue, without needing to pull everything over.
Not with Rails but with SQL Server and C#/ASP.NET. In this case it’s easy to set up so that the domain name maps to a database connection string at a very low level of the code. Everything above that had no concept of what site it was operating on. You never had to worry about writing any kind of code to isolate sites except for one thing: mixing the domain name into the session token so that a malicious user couldn’t try to reuse a session from another domain. Because of course it’s all the same set of app servers on the front-end.
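One way to do that domain mixing, sketched in Python for brevity (the secret and token format are illustrative, not the original C# implementation):

    import hashlib
    import hmac

    SECRET = b"app-wide-secret"  # hypothetical; load from config in practice

    def session_token(session_id, domain):
        # MAC over the session ID *and* the domain, so a token minted
        # for one tenant's domain fails validation on every other one.
        msg = ("%s|%s" % (session_id, domain)).encode()
        return "%s.%s" % (
            session_id, hmac.new(SECRET, msg, hashlib.sha256).hexdigest())

    def valid_token(token, domain):
        session_id, _, _ = token.partition(".")
        return hmac.compare_digest(token, session_token(session_id, domain))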