How do you decide what is a many-to-many vs. enums vs. array

Question

Whenever I'm designing a schema I always find myself wanting to just make everything a many-to-many relation because it makes things flexible and safe (referential integrity).But then there is always this feeling that these joins are going to make things complicated, verbose, and inefficient, when a simple enum or an array column would be simpler.My main gripe is that I hate having to impose constraints on my data model early on when I have no idea about how it will evolve over time.How do others deal with these decisions?Also, is there any product that would allow me to model the domain exactly as it is (i.e. like a semantic graph), and then have efficient and simple querying?

quickthrower2 · Accepted Answer

Just to note: many to many isn't "safer" than many to one. It is more dangerous in the sense you can have invalid data represented.
For example if you have an invoice line, it belongs to an invoice. The DB will ensure that for you with a foreign key. If you stick a join table in the middle to make it many to many, you can now have a invoice line that belongs to many invoices.
That is probably not what you want, and if it isn't the DB is asking your application enforce the constraint that it could have easily enforced itself.
And that means you are not getting the advantages of an enforced schema!
To answer your actual questions, I'd go for the most constraining option first (which is usually the easiest to implement) and then open it up to many to many later if required. That might need an 'upgrade' script, but that is par for the course in DB design.

oftenwrong · Answer

My advice for making these types of technical decisions would be to follow the spirit of YAGNI. That is, just do whatever is good enough for now. Do not overthink the decision. Do not over-engineer your implementation. Still, do a good job of it, and make it easy to change it later.
It sounds like you are most comfortable with the many-to-many approach, so I recommend using that option. Don't worry about hypothetical future problems that this approach may or may not bring. You can always change it later if it becomes a problem.
YAGNI: https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it

stocktech · Answer

I'd like to see an argument for using enums. I don't think enum constraints belong in the database - it's too business-y and belongs with the business logic. I also agree that join tables are simpler to use/query and will be more flexible. Join tables also make admin features a breeze.

dyeje · Answer

Many to many is almost always the right call. I've used an array on occasion when I knew we wouldn't be querying by it and didn't want the extra overhead. You can always extract it to a many to many later if needed.

cpach · Answer

What is the context here? An SQL database?

scott31 · Answer

You seem to be lacking basics, just get yourself a proper book and follow it