HACKER Q&A
📣 ev1

Who owns archive.is, and why are they trustworthy?


I understand the need for anonymity when you're doing this due to the sheer amount of abuse reports, fake and real DMCAs, etc.

But why do people trust it? How do you know the pages you're archiving haven't been tampered with selectively to change history? This is just out of sheer curiosity, and I am not saying they do this.

This is made further interesting because of the following:

- Analytics from various Russian providers, instead of self-hosted (FYI: I consider GA to be equally privacy-violating as Metrika or Mail.ru)

- Large amounts of reverse proxies off questionable or bulletproof hosting providers

- Doing this indefinitely at scale can't be cheap either. Who is paying for this?

- Demanding tracking or else blocking your access to the site, blocking any resolver that doesn't send the first 3 octets of your IP to them (edns-client-subnet)

- Explicitly tracking you in odd ways: they repeatedly load pixels/do DNS preconnect/preload from wildcard subdomains containing a cookied number, IP, country, tracking IDs. View any archived page and ^F "pixel.archive.is"
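
To make the "first 3 octets" point concrete, here is a minimal sketch of what a resolver that supports edns-client-subnet would forward about you. The function name `ecs_prefix` is made up for illustration, and the default prefix lengths (/24 for IPv4, /56 for IPv6) are an assumption based on the truncation RFC 7871 recommends, not something confirmed about any particular resolver or about archive.is:

```python
import ipaddress


def ecs_prefix(client_ip: str, v4_bits: int = 24, v6_bits: int = 56) -> str:
    """Return the truncated network an ECS-enabled resolver might forward.

    The prefix lengths are assumed defaults (hypothetical for any given
    resolver); a real resolver may send fewer bits or none at all.
    """
    addr = ipaddress.ip_address(client_ip)
    bits = v4_bits if addr.version == 4 else v6_bits
    # strict=False zeroes the host bits instead of raising an error.
    network = ipaddress.ip_network(f"{addr}/{bits}", strict=False)
    return str(network)


print(ecs_prefix("203.0.113.77"))  # prints 203.0.113.0/24
```

With a /24, the authoritative nameserver sees the first 3 octets and not the last one, which is enough to narrow you down to a block of 256 addresses.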


  👤 adventured Accepted Answer ✓
Archive.is isn't very important yet. I don't believe they warrant much concern re such questions. It doesn't yet matter very much if they're super trustworthy or not.

At their present scale, going through and manually changing (tampering with) saved content for propaganda (or similar) purposes would have very little impact. More realistically, it probably has close to zero potential consequential impact. It'd be quite the chore for very little return.

If they become important some day, with dramatically greater scale of usage, then getting answers to these questions might be important.

If they eventually betray trust, they're trivial to replace. Other competing variations of archive.is exist now. It's a relatively easy service to create. Someone should probably challenge them just on the basis of how bad their UI and UX are.

At scale, if they begin abusing their position, it would become well known, they would get a reputation, and it'd kill their service. The barrier to competition is extremely low.


👤 colejohnson66
Also, why do people use archive.is over web.archive.org? One (web.archive.org) is an actual library and gets all the legal protections that entails, while the other doesn’t.

👤 ulucs
Why would you trust them? If you are trying to archive something, make sure to use multiple (separately owned) services so that you don't need to trust them.
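
A rough sketch of what cross-checking could look like, assuming you have already downloaded the same page's HTML from each service yourself (fetching is left out on purpose). The names `text_fingerprint` and `snapshots_agree` are hypothetical, and real archives inject their own banners and rewrite links, so in practice you would have to strip those before a comparison like this is meaningful:

```python
import hashlib
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping script and style contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def text_fingerprint(html: str) -> str:
    """SHA-256 of the page's visible text, whitespace- and case-normalized."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = re.sub(r"\s+", " ", " ".join(parser.chunks)).strip().lower()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def snapshots_agree(snapshots: dict) -> bool:
    """True if every snapshot (service name -> HTML) has the same fingerprint."""
    return len({text_fingerprint(html) for html in snapshots.values()}) <= 1


copies = {
    "service_a": "<html><body><p>Hello   world</p></body></html>",
    "service_b": "<html><body><script>x=1</script><p>Hello world</p></body></html>",
}
print(snapshots_agree(copies))
```

A mismatch doesn't prove tampering (the page itself may have changed between captures), but agreement across separately owned services is at least some evidence that no single one edited its copy.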

👤 bscphil
> This is made further interesting because of the following:

An additional concern: they've shown signs in the past of being capricious, or at least easily annoyed by (subjectively) insignificant slights. They continue to block Cloudflare DNS users, last I checked. The "reason" is that Cloudflare, to protect its users' privacy, doesn't send along the EDNS Client Subnet. [1]

I would argue this means archive.today / archive.is can't be trusted to have the best interests of the community at heart. It's not a public service in the way that archive.org is.

[1] This bad behavior is actually mentioned in their Wikipedia article, along with the additional uncited claim that they throttle users to 20 MB of data per day, after which they apparently ban your IP address. I haven't verified the latter claim. https://en.wikipedia.org/wiki/Archive.today#Worldwide


👤 TechBro8615
I share most of these concerns, especially since it’s primarily used in the “alt-right” sphere and could easily become a vector to sow discord, either by tampering with content or simply by mapping the communities of users visiting it.

Worth noting, it’s probably not that expensive to run. Most of the hosting services they use would be offering “unmetered” bandwidth, so the cost is probably fixed per month, likely under $1000.