HACKER Q&A
📣 aw123

How to make a globally fast search engine without using AWS?


Forgive me, as I'm something of a novice. After seeing how sluggish some of the small search engines like Kagi are compared to Google or DDG, I've been wondering what the most cost-effective way is for a single person or small team to build an app that is low-latency across the world, without breaking the bank and without relying heavily on the big cloud providers.

If we only have one database in one location, that wouldn't work, right? We'd need a distributed solution, so what could we use? For serverless functions, maybe we start with Cloudflare Workers? Maybe we'd then have a strategy for moving off the cloud and towards buying or renting datacenter space as we scale?

I'm interested in hearing thoughts about this. If you were to build a fast search engine that could scale to millions of users today, how would you do it?


👤 hugh_kagi Accepted Answer ✓
The question is the same as it ever was: can you locate your data and your compute close to your users?

The issue with search is that there's a lot of data and a lot of compute to run over it, so you end up replicating petabytes (PBs) just to reduce latency.
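
To put rough numbers on "close to your users", here is a minimal Python sketch, not anything from Kagi's actual stack. It picks the nearest of a few regions by great-circle distance and computes a lower bound on round-trip time, assuming light in fiber covers about 200 km per millisecond. The region names and coordinates are hypothetical placeholders, and real routes are longer than great circles, so real latency will be higher than this bound.

```python
import math

# Hypothetical regions as (latitude, longitude); placeholders, not a real provider's list.
REGIONS = {
    "us-east": (39.0, -77.5),
    "eu-west": (53.3, -6.3),
    "ap-southeast": (1.35, 103.8),
}

EARTH_RADIUS_KM = 6371.0
FIBER_KM_PER_MS = 200.0  # assumption: light in fiber travels at roughly 2/3 of c


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def min_rtt_ms(a, b):
    """Lower bound on round-trip time: there and back at fiber speed, no routing overhead."""
    return 2 * haversine_km(a, b) / FIBER_KM_PER_MS


def nearest_region(user, regions=REGIONS):
    """Name of the region with the smallest great-circle distance to the user."""
    return min(regions, key=lambda name: haversine_km(user, regions[name]))


if __name__ == "__main__":
    sydney = (-33.87, 151.21)
    print(nearest_region(sydney))
    print(round(min_rtt_ms(sydney, REGIONS["us-east"])), "ms minimum to us-east")
```

With these placeholder coordinates, a user in Sydney talking to a single database in the eastern US pays more than 100 ms per round trip before any query work happens, which is why a single location doesn't cut it.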


👤 benoau
As a starting point I'd go all-in on serverless. Even though it doesn't necessarily give you the fastest response times or the lowest cost, it lets you completely ignore several sets of scaling challenges until the price matters. So I'd go serverless for spidering, indexing and user-facing APIs. For UIs, I'd go with static HTML on a CDN.

For the database, I'd first lean on plain-file, S3-like storage as much as possible for spidering and indexing data, and try to keep the user-facing database under a couple hundred terabytes. I'd probably favor some bare metal at that point, but that will still fit on some "db as a service" providers.
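
A rough Python sketch of the "plain files for indexing data" idea, under some loud assumptions: a local directory stands in for an S3-like bucket, the index is a toy inverted index (term to list of document IDs), and the shard count and file naming are hypothetical. Each shard is one JSON file, so an indexing job only writes files and a query only reads the one shard that can contain the term.

```python
import hashlib
import json
import tempfile
from collections import defaultdict
from pathlib import Path

NUM_SHARDS = 4  # hypothetical; a real index would use far more shards


def shard_of(term):
    """Stable hash so every writer and reader agrees on which shard holds a term."""
    digest = hashlib.sha1(term.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def build_index(docs, root):
    """Write one JSON file per shard mapping each term to the doc IDs containing it."""
    shards = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            shards[shard_of(term)][term].append(doc_id)
    root.mkdir(parents=True, exist_ok=True)
    for shard_id, postings in shards.items():
        (root / f"shard-{shard_id}.json").write_text(json.dumps(postings))


def lookup(term, root):
    """Read only the shard file that could contain the term."""
    term = term.lower()
    path = root / f"shard-{shard_of(term)}.json"
    if not path.exists():
        return []
    return json.loads(path.read_text()).get(term, [])


if __name__ == "__main__":
    docs = {"a": "fast search engine", "b": "search at the edge"}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "index"
        build_index(docs, root)
        print(lookup("search", root))
```

Swapping the directory for object storage would mean replacing the `write_text` and `read_text` calls with the storage client's put and get operations. The sharding logic stays the same.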


👤 al2o3cr
Fly.io has been working in a similar space (globally-distributed apps) for a while; their blog posts are a great place to start reading.