HACKER Q&A
📣 notepad0x90

Thoughts on /etc/hosts instead of DNS for production applications?


Hi HN,

"It's always DNS" is a theme we're all familiar with when it comes to outages. I understand why DNS is critical for most users. But for applications that are managed/deployed using an "Infrastructure as Code" system, where changes can, and should always be pushed in a way that treats the changes the same way code changes are treated (Devops and all that), is there any harm with using /etc/hosts files everywhere?

That way, name-to-IP association changes benefit from IaC, and DNS-related instabilities are minimized. Of course, I am assuming the name-to-IP association is under the control of the system's engineers to begin with; for every other use case, DNS can and should still be used.
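For example, the IaC tooling could render something like the fragment below onto every node (the hostnames and addresses are made up, just to illustrate the idea):

    # Managed by IaC; do not edit by hand
    10.0.1.7    api.internal
    10.0.1.12   db-primary.internal
    10.0.1.13   db-replica.internal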

Why aren't cloud providers and FAANGs doing this already, given that eliminating things like DNS request traffic and the CPU cycles spent on resolution is exactly the kind of cost saving they encourage?


  👤 JohnFen Accepted Answer ✓
That's how it was done on the internet before DNS was developed. It's also how I still do it for a lot of the machines on my home network. As you note, it's faster and reduces network traffic.

You do give up some good stuff, though. Load balancing can be trickier, for instance. And if any of the machines change their IP addresses, or you add new machines to the network, you have to distribute a new hosts file to every machine that isn't using DNS.
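To make that last point concrete, here's a rough sketch of the "regenerate and push" step, assuming a hypothetical inventory mapping; copy_to_host() is a stand-in for whatever transport (scp, Ansible, Salt, etc.) you actually use:

    # Sketch only: render a hosts file from an inventory and fan it out.
    inventory = {
        "api.internal": "10.0.1.7",
        "db-primary.internal": "10.0.1.12",
    }

    def render_hosts(entries):
        lines = ["127.0.0.1 localhost"]
        lines += [f"{ip} {name}" for name, ip in entries.items()]
        return "\n".join(lines) + "\n"

    def copy_to_host(host, path, content):
        # Placeholder: replace with your real file-push mechanism.
        print(f"would push {len(content)} bytes to {host}:{path}")

    hosts_file = render_hosts(inventory)
    for node in inventory:
        # Every node needs the new file, even for a single-IP change.
        copy_to_host(node, "/etc/hosts", hosts_file)

The push itself is the fragile part: any node the push misses keeps resolving to the old address until someone notices.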


👤 Bender
In my opinion, if there are no overlapping networks, and the Infrastructure as Code system understands pods, k8s and such, then using /etc/hosts inside the data center (while leaving things outside the data center to use DNS) can speed up resolution and make sense. But it requires some critical thinking about how all the inter-dependencies in the data center play together and how fail-overs are handled.

> Why aren't cloud providers and FAANGs doing this already

This probably requires that everyone touching the Infrastructure as Code is a critical thinker and fully understands the implications of mapping applications to hosts, including but not limited to applications having their own load-balancing mechanisms, fail-over IP addresses, application state and ARP timeouts, and broadcast and multicast discovery. It can be done, but I would expect large companies to avoid this potential complexity trap. It might work fine in smaller companies that have only senior/principal engineers. Using /etc/hosts for bootstrapping the critical infrastructure nodes required for dynamic DNS updates could still make sense in some cases. Point being, this gets really complex, and whatever is managing the Infrastructure as Code would have to be fully aware of every level of abstraction: NATs, SNATs, hairpin routes, load-balanced virtual servers and origin nodes. Some companies are so big and complex that no one human can know the whole thing, so everyone's siloed knowledge has to be merged into this Infrastructure-as-Code beast.

My preference would instead be to make DNS more resilient. Running Unbound [1] on every node with customized settings to retry and keep state on the fastest upstream resolving DNS nodes, caching infrastructure addresses and their state, and setting realistic min/max DNS TTLs is a small step in the right direction. Dev/QA environments should also enable query logging to a tmpfs mount to help debug application misconfigurations and spot less-than-optimal uses of DNS in infrastructure and application settings before anything gets to staging or production. Grab statistical data from Unbound on every node and ingest it into some form of big-data/AI web interface so questions about resolution, timing and errors can be analyzed.
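As a rough sketch of the kind of per-node Unbound tuning I mean (values are illustrative, not a recommendation, and the forward addresses are placeholders):

    # /etc/unbound/unbound.conf (sketch)
    server:
        prefetch: yes              # refresh popular names before they expire
        serve-expired: yes         # answer from stale cache if upstreams are down
        cache-min-ttl: 60          # clamp TTLs to realistic bounds
        cache-max-ttl: 86400
        extended-statistics: yes   # expose counters for collection and analysis
        # dev/QA only: query logging to a tmpfs-backed path
        # log-queries: yes
        # logfile: /run/unbound/queries.log

    forward-zone:
        name: "."
        forward-addr: 10.0.0.2     # placeholder upstream resolvers
        forward-addr: 10.0.0.3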

This is just my two cents based on my experience.

[1] - https://unbound.docs.nlnetlabs.nl/en/latest/manpages/unbound...


👤 SurceBeats
/etc/hosts works until you need to change an IP across 10,000 servers in under a minute. Then you understand why DNS exists.

DNS isn't just name resolution; I'd say it's load balancing, service discovery, caching, and dynamic configuration "all in one".
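For instance, a low-TTL record set like the sketch below (name and addresses invented) gives you crude round-robin load balancing, and changing it propagates to every client within the TTL, with no file pushes at all:

    ; zone-file sketch: 30-second TTL, three A records for one name
    api.internal.   30  IN  A  10.0.1.7
    api.internal.   30  IN  A  10.0.1.8
    api.internal.   30  IN  A  10.0.1.9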

The FAANGs do minimize external DNS calls, but they run massive internal DNS infrastructures because the alternatives (config management pushing files) are actually slower and more fragile at scale.


👤 ActorNightly
The thing is, when it comes to AWS, it's not like everyone is going to suddenly migrate off because of a DNS issue. If a company that runs on AWS is not making money, it's very likely that its competitors aren't either. So more optimized solutions are not really worth it.

In a perfect world, everything would have a unique IPv6 address, and we wouldn't need DNS. Instead of NAT, you would just have any computer/VM that wants to be connected to the internet tacked onto the existing address space, and then instead of DNS records being synced, you would just use the address directly, and routing would take care of everything.