Are all analytics violating user privacy?

Question

I am building an analytics platform myself, and I have often seen comments from other users on HN saying that all client-side analytics are evil, or even that all analytics are bad and violate user privacy.
Are analytics evil, even if their sole purpose is to improve user experience? Let's say that we are building the infrastructure in a city, if we had stats on what streets have daily traffic jams, that would tell us a lot about the behavior of the people, but would also allow us to better direct traffic and reduce those issues.
Where do we draw the line?
I do agree that analytics used for selling or mining data, or for persuading users to buy more stuff they don't need or to spend more time on Instagram, are not really ethical. But there are still legitimate use cases for analytics whose sole purpose is to make the user experience better, even at the expense of some privacy (in a public space/website).

badrabbit · Accepted Answer

Without explicit consent? Yes. When you visit a website or a brick and mortar store, outside of security monitoring most reasonable people have no accepted expectation of their activity being monitored and analyzed for purposes unrelated to the transaction they are attempting to complete.
Most privacy issues boil down to consent.
1) Explicit consent must be granted by users for all groups of data collection or mining for which any significant portion of people have not granted implicit consent.
2) Lack of consent should not be used as a reason to deny service, except if the service directly depends on the collected data to function.
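As a rough illustration of the two rules above, here is a minimal Python sketch of a consent gate. Everything in it is an assumption made for illustration: the `ConsentGate` class, the category names, and the `IMPLICIT_OK` set are hypothetical, not part of any real library. Events are stored only for categories the user has explicitly granted (or that the service needs to function), and a missing grant only drops the event; it never refuses the request.

```python
# Hypothetical category that the service directly depends on in order to function.
IMPLICIT_OK = {"service_health"}


class ConsentGate:
    """Stores an event only when the user has consented to its category."""

    def __init__(self):
        self.granted = {}  # user_id -> set of categories explicitly granted
        self.stored = []   # (category, payload) pairs; no user_id is kept

    def grant(self, user_id, category):
        self.granted.setdefault(user_id, set()).add(category)

    def revoke(self, user_id, category):
        self.granted.get(user_id, set()).discard(category)

    def record(self, user_id, category, payload):
        """Return True if the event was stored, False if it was dropped.

        A dropped event is not an error: the caller keeps serving the user.
        """
        if category in IMPLICIT_OK or category in self.granted.get(user_id, set()):
            self.stored.append((category, payload))
            return True
        return False
```

The point of the design is that consent is checked per category at the moment of collection, so revoking one category does not affect the others or the service itself.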
> Let's say that we are building the infrastructure in a city, if we had stats on what streets have daily traffic jams, that would tell us a lot about the behavior of the people, but would also allow us to better direct traffic and reduce those issues.
Drivers who want to help reduce traffic jams should opt in with a sticker or some other solution. But to be honest, a simple count + location is something I implicitly give consent for. If a person sits by a roadside counting cars, I have no problem with it. The problem is when they record video or images, or collect identifying information such as color, make/model, etc. Then I no longer give implicit consent.
To take an extreme example, blowing up the road also solves traffic jams; the solution should come with requirements such as keeping the road intact and not stalking people.

dkersten · Answer

I don't think analytics are evil in and of themselves. If they are used to improve user experience (e.g. by analysing what content or features people use and how, so that the workflow can be improved or new high-value features or content added), then I think analytics are great. Without the data, you are blindly throwing stuff at the wall hoping it sticks.
But that's rarely what analytics are actually used for, and no matter how much you want your platform to be used for that, people will misuse it. Instead, analytics are used to wring as much value out of a user as possible, typically by finding out how best to suck up users' attention and get them to buy more stuff or interact with more adverts. I don't think people would mind their privacy being eroded quite so much if it wasn't being used against them to trick them into spending more money (either directly or indirectly via advertisement). That's ultimately what it comes down to: why do people buy personal data? To find out ways to trick users into spending more money.
Understanding user behaviour so that you can build better products or produce better content (but why are you doing this? In most cases it's because better content = more users to click on adverts...) is fine, or even necessary. But most analytics data isn't used for that, or at least not solely for that.
In a product I'm currently working on, I've taken a stand that the client will have NO third party scripts or analytics. Zero. And on the backend, I only use third party services necessary for running and maintaining a quality service, and I disclose all providers I use and what data is processed or stored by them.
I'm not completely analytics-blind, as I do track metrics and logs, but I try to keep them focused on what's needed to monitor service health and debug issues, plus only high level data on what people are using (anonymously) and how frequently.
As others have mentioned, I think consent is an important aspect. And not this "your privacy is important to us, so uncheck these thousand checkboxes if you want privacy" bullshit. I think if you are open and honest with users about what data you collect and what you do with it, and ask them if it's ok with them first, then I have no real problem with it, especially if it really is only for improving the user experience.

ekimekim · Answer

I'm going to copy-paste from an older comment of mine (https://news.ycombinator.com/item?id=22332136) that I think captures my opinions on this question:
Client-side tracking, if you need any at all, should be a) high value, b) aligned with my goals as a user, and c) as respectful as possible of my privacy (eg. anonymising values, only taking what info you need).
The b) condition there is most nebulous - I'm mainly thinking of things like reporting client-side javascript errors. This is aligned with my goal of your site being bug-free so I can use it better. Another example would be an (opt-in!) recommendation system that I find valuable. What would NOT be an example of this would be tracking of my actions on the page in order to optimize the chances that I'll engage with the content. Engagement is your priority, not mine.
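To make the error-reporting case concrete, here is a small Python sketch of what "opt-in" plus "only taking what info you need" could look like on the receiving side. The function name `scrub_error_report`, the field names, and the allow-list are all assumptions for illustration; a real report format would differ.

```python
import re

# Assumed allow-list: the fields needed to locate a bug, and nothing else.
ALLOWED_FIELDS = ("message", "source", "line", "column")


def scrub_error_report(raw, opted_in):
    """Return a minimal error report, or None if the user has not opted in."""
    if not opted_in:
        return None
    # Keep only allow-listed fields; cookies, user agent, etc. are dropped.
    report = {key: raw[key] for key in ALLOWED_FIELDS if key in raw}
    # Strip the query string and fragment from the script URL,
    # since they may carry session tokens or user identifiers.
    if "source" in report:
        report["source"] = re.split(r"[?#]", str(report["source"]), maxsplit=1)[0]
    return report
```

An allow-list is used rather than a block-list so that any new field added to the raw report is dropped by default. Note that the error message itself can still contain personal data, so this sketch is a starting point rather than a guarantee.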

zzo38computer · Answer

I think that:
- Client side analytics are not helpful (and may make invalid assumptions).
- Client side analytics waste energy, bandwidth, RAM, etc.
- You should avoid other wastes too, such as including too many pictures, CSS, scripts, animations, etc.
- Client side analytics wrongly violate privacy.
- In the case of client-side JavaScript errors, yes, reporting may be helpful (sometimes), but it should ideally ask first. Errors should also be displayed in the console window, so that users can diagnose the errors themselves.
- A web page should be designed to work without JavaScript and without CSS as much as possible, although sometimes they are helpful (although a better "user oriented" design is needed rather than the "author oriented" design; I have some ideas about how to do this).
- Don't always use web pages! There are also such things as Telnet, SSH, Gopher, NNTP, plain text files (over whatever protocol), etc. (I use many plain text files myself, actually.)
- Let the user write a comment (by email, perhaps). Otherwise, you will just have to guess, and might not be able to. Even if you use client side analytics, it cannot guess something that isn't there.
So, I use server side analytics instead, which is much better.
(In the case of traffic jams: You can see how many cars there are; you do not need to add a device on each car to count them, nor to read license numbers, etc. A simple light sensor that detects when the light is blocked by cars would be good to have.)
(In the case of stores, well, they already need to count how many products have been purchased, as they very well should. They need not track who purchased each item, just keep track of how many of each item have been sold. That will allow them to restock and to bring in enough for everyone, and whatever else they need to do. They can keep track of returns too.)

itronitron · Answer

A good faith start would be to call it data collection instead of the industry-accepted misnomer 'analytics'. If your platform is collecting data and you aren't comfortable calling it data collection, then maybe you should reconsider what data you would be comfortable collecting.

dylz · Answer

One of the nastiest things I've seen from the "analytics startups" espousing how they're GDPR friendly and compliant and privacy friendly is that they advertise stuff like CNAME cloaking, using random URLs or hostnames for data collection, etc.
This is incredibly disgusting behaviour: the end-user has EXPLICITLY signaled intent to opt out, gone out of their way to try and protect themselves while blatantly signaling that intent that they do not want it, and the "privacy respecting and caring new not-like-the-other-guys" data collection service is attempting to repeatedly force itself on the end-user, sometimes trying multiple times pretending to be different hostnames, lying about what it is, etc.
I have seen bullshit "analytics" SaaS brute-force its way through dozens of generically-named CloudFront or Akamai hostnames until it finds one that works.
No matter what you are collecting, this type of behaviour is evil and abhorrent. Someone saying no a thousand times until you find a disguise that works on them is not consent, or a yes.
> Are analytics evil, even if their sole purpose is to improve user experience? Let's say that we are building the infrastructure in a city, if we had stats on what streets have daily traffic jams, that would tell us a lot about the behavior of the people, but would also allow us to better direct traffic and reduce those issues.
Server-side analytics can do this fairly well.
But it is always a slippery slope - I have never, ever seen this proven otherwise. You count (raw number only) on average travel time or # of cars at an intersection. Okay. Nothing else - just when a car trips your wire you add one to an integer associated with that day or something. Few people would have an issue with this.
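The "add one to an integer associated with that day" version is small enough to sketch in Python. The class and method names here are hypothetical; the only point is that the stored state is a date and a number, with nothing that identifies a car or a driver.

```python
from collections import Counter
from datetime import date


class DailyTripCounter:
    """Counts wire trips per day; stores nothing about the vehicle itself."""

    def __init__(self):
        self.counts = Counter()  # date -> number of cars that tripped the wire

    def trip(self, day=None):
        # Called once each time a car trips the wire.
        self.counts[day or date.today()] += 1

    def total(self, day):
        # Days with no recorded trips report zero.
        return self.counts[day]
```

Because no timestamp finer than the day and no vehicle attribute is recorded, there is nothing in this structure that could later be correlated back to an individual.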
But now you want more. Now you set up licence plate scanners at every intersection. Now you store them in a database so you can tell if the same car is driving the same way and at what time each day. But wait, someone offers you money for this data, possibly even more money if you also run recognition and guess vehicle make/models in real time. And now you have a large database of people correlated to where they are at what time open to retrieval by others.