HACKER Q&A
📣 pkkm

How do you handle observability and alerting at tiny companies?


What's your usual solution when you want better visibility into problems than you get from going through unstructured text logs with grep and vim, but your dev team of 1-10 people is too small to operate a bunch of complicated cloud software like Kubernetes and the ELK stack?


  👤 nmoadev Accepted Answer ✓
Depending on the scale of your actual systems, you might be surprised how far you can get with open source tools like Prometheus / Grafana and just not configuring them for huge scale.

You could for example just run one VM for all of your observability stuff, and stick these tools on it and store data to disk.

Alternatively, if you've got some money and your systems are OK with outbound internet connections, SaaS monitoring solutions like New Relic, Dynatrace, etc. are much more plug-and-play.


👤 uaas
Depends mostly on what kind of problems you have. If you need metrics, Prometheus and its ecosystem are about as simple as it gets, on or off Kubernetes. There are good-quality “packages” for most “infrastructure as code” tools, such as Ansible, too.

For logs, there’s Loki, which is a much saner choice than ELK in 2025.

To have proper troubleshooting abilities, you will need a bit more than tooling. You will also need to spend some time instrumenting your apps with metrics (Prometheus exporters can only take you so far, e.g. node_exporter for host-level stats, or other technology-specific exporters), and at the very least ensure that your apps log in a structured way.
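As a rough illustration of "logging in a structured way", here is a minimal sketch using only Python's standard library. It emits one JSON object per log line, which a log shipper can then parse into fields. The names `JsonFormatter` and `make_logger`, and the `extra={"fields": ...}` convention, are assumptions made up for this sketch, not part of any tool mentioned above.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        # Convention assumed for this sketch: callers attach structured
        # fields with extra={"fields": {...}} and they are merged in here.
        payload.update(getattr(record, "fields", {}))
        if record.exc_info:
            payload["exc"] = self.formatException(record.exc_info)
        # default=str keeps non-JSON types (dates, paths) from raising.
        return json.dumps(payload, default=str)


def make_logger(name, stream):
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep records out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    import sys

    log = make_logger("checkout", sys.stdout)
    log.info("order placed", extra={"fields": {"order_id": 1234, "duration_ms": 87}})
```

With output in this shape you can filter on `level` or `order_id` instead of grepping free text, and the same lines stay readable if you tail the file by hand.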


👤 nikolay_sivko
Give https://github.com/coroot/coroot a try for full visibility in minutes with eBPF

disclaimer: I'm a co-founder


👤 delduca
I like http://papertrail.com; it’s super easy to integrate with any backend, offers alerts, and much more.

👤 nicbou
I face this problem running a single website with an API and a few side projects. Think a dozen Docker containers on three servers in total. It's really hard to get a simple "wake me up if something breaks" system running.

I do like BetterStack heartbeats, but there is nothing similar for "pushing" a failure alert. Instead, you push all your logs and then configure filters in BetterStack logs.
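To make the heartbeat vs. push distinction concrete, here is a minimal sketch of a job wrapper that does both: it pings a heartbeat URL when the job succeeds and posts the traceback to an alert URL when it fails. Both URLs are hypothetical placeholders, and `run_monitored` and `http_post` are names invented for this sketch. It assumes only that your monitoring service accepts a plain HTTP POST and does not reflect any specific vendor's API.

```python
import traceback
import urllib.request

# Hypothetical placeholders: substitute whatever URLs your service gives you.
HEARTBEAT_URL = "https://example.invalid/heartbeat/TOKEN"
ALERT_URL = "https://example.invalid/alert/TOKEN"


def http_post(url, body=""):
    """Send a small POST. The timeout stops a monitoring outage from hanging the job."""
    req = urllib.request.Request(url, data=body.encode("utf-8"), method="POST")
    with urllib.request.urlopen(req, timeout=10):
        pass


def run_monitored(job, send=http_post):
    """Run job(); ping the heartbeat on success, push the traceback on failure."""
    try:
        job()
    except Exception:
        # Push path: tell the monitoring side right away what went wrong.
        send(ALERT_URL, traceback.format_exc())
        raise
    # Heartbeat path: only reached when the job finished without raising.
    send(HEARTBEAT_URL)
```

The two signals cover different failures: the push fires when the job runs and crashes, while a missing heartbeat catches the case where the job never ran at all (dead cron, dead server). Passing `send` as a parameter is just a convenience so the wrapper can be tried out without making real network calls.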