HACKER Q&A
📣 curiousgal

How to choose a database for a project?


Hi, I am working on an experimental org project that involves running multiple Monte Carlo simulations. The results have to be stored to facilitate backtesting and to comply with audit requirements.

What database choice seems appropriate in your opinion?

Thanks.


  👤 user_agent Accepted Answer ✓
I'm a junior dev, so I try to stick with things that have been proven to work without problems. This is why I think you can almost never go wrong with SQL. Some say that PostgreSQL is one of the most advanced SQL databases, and it's free. Others say, and I can confirm, that if your project isn't huge you should give SQLite a try. On the other hand, there's this "new" NoSQL trend, but the only proven thing I know about it is that there are people out there who have made a career out of rescuing companies from trouble caused by young devs / CTOs / architects who very much wanted, let's say, MongoDB, which often turned out to be a terrible choice. That makes me cautious. Don't ask me for details about that, I don't know them.

I have personal experience with MariaDB / MySQL and SQLite. A pleasant thing about both is that they are reliable and that you can find a lot of resources (and help) online, which has saved me some headaches a couple of times. For more complicated projects I'll be using Postgres. I'll keep Mongo-ish stuff away until I have a very specific reason to use its advantages, which I don't fully understand at this point.

Again, I like things I can rely on. SQL is one of them.
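To make the SQLite suggestion concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name mc_runs, its columns, and the toy run_simulation function are all made up for illustration; a real simulation and schema would look different. The idea is just that storing the seed next to each result lets you re-run and check any row later, which may help with the audit side.

```python
import random
import sqlite3


def run_simulation(seed, n_draws=1000):
    """Toy stand-in for a real Monte Carlo run (hypothetical).

    Returns the sample mean of n_draws standard normal draws,
    fully determined by the seed.
    """
    rng = random.Random(seed)
    return sum(rng.gauss(0.0, 1.0) for _ in range(n_draws)) / n_draws


def store_runs(conn, seeds):
    """Create the (assumed) mc_runs table if needed and insert one row per seed."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS mc_runs (
               run_id     INTEGER PRIMARY KEY,
               seed       INTEGER NOT NULL,
               result     REAL NOT NULL,
               created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    # "with conn" wraps the inserts in a transaction: commit on success,
    # rollback if an exception is raised.
    with conn:
        conn.executemany(
            "INSERT INTO mc_runs (seed, result) VALUES (?, ?)",
            [(seed, run_simulation(seed)) for seed in seeds],
        )


# In-memory database for the example; pass a file path to persist results.
conn = sqlite3.connect(":memory:")
store_runs(conn, range(5))
rows = conn.execute("SELECT seed, result FROM mc_runs ORDER BY seed").fetchall()
```

Because each row keeps its seed, a stored result can be compared against a fresh run_simulation(seed) call to confirm it is reproducible.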

PS: I'd love to know more about what you are going to simulate, @curiousgal. MC simulators are cool!


👤 maps7
To answer the title question, I always use Postgres [1] because it's free and open source.

To answer the question in your post (which is different), I have no idea.

[1] https://www.postgresql.org/


👤 Jugurtha
Pick the one database you and your team/org have been using so far, so you'd spend close to zero time on database issues.

I think it would be better to remove as many degrees of freedom as possible, simplify aggressively, and delay any decisions you don't absolutely have to make at this stage, so that you can focus on proving what it is you're trying to prove.


👤 brudgers
The simplest thing is to use the file system. Basically the file system is a key-value store where the value can be anything (a BLOB).

There are a few space, time, and reliability tradeoffs, but they don't matter for a lot of workloads. The positive side of the tradeoffs comes in the form of simplicity and tooling.
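A rough sketch of the file-system-as-key-value-store idea, in Python. The FileStore class, the key "run-0001", and the payload bytes are hypothetical, not an existing library or real data. Keys are hashed so that any string is a safe file name, and writes go through a temporary file followed by a rename, which is one simple way to soften the reliability tradeoff.

```python
import hashlib
import tempfile
from pathlib import Path


class FileStore:
    """Hypothetical key-value store: one file per key, value is raw bytes."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        # Hash the key so arbitrary strings map to safe, fixed-length file names.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.root / digest

    def put(self, key, value):
        path = self._path(key)
        tmp = path.with_suffix(".tmp")
        tmp.write_bytes(value)
        # Rename over the target so readers never see a half-written file
        # (the rename is atomic when both paths are on the same filesystem).
        tmp.replace(path)

    def get(self, key):
        return self._path(key).read_bytes()


# Usage: a temporary directory as the store root, placeholder bytes as the value.
store = FileStore(tempfile.mkdtemp())
store.put("run-0001", b"placeholder simulation output")
value = store.get("run-0001")
```

The value can be anything serializable to bytes (CSV, JSON, a pickled array), and ordinary tools like ls, cp, and rsync work on the store for free.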