I would expect with all the effort that has gone into these, and all the progress in machine learning, these systems would be fantastic and provide recommendations that I really enjoy. But they don’t.
YouTube seems to have a massive recency bias and music, film, and TV recommendations rarely end up being things I enjoy.
Yeah, you'd expect more from ML at this point. I wonder how much of ML research actually gets utilized in industry.
So if I've liked and disliked 10 movies, don't just show me movies from users who liked the same ones I did. First, filter out or downgrade all users who liked what I disliked, and then create my recommendations.
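To make that concrete, here is a rough sketch of the idea, not how any real service does it. The data layout (a dict of user id to liked items), the function names `neighbor_weight` and `recommend`, and the `penalty` value are all made up for illustration.

```python
from collections import defaultdict

def neighbor_weight(me_likes, me_dislikes, other_likes, penalty=2.0):
    """Score another user: shared likes count for them, and every item
    they liked that I disliked counts against them (penalty is arbitrary)."""
    agree = len(me_likes & other_likes)
    clash = len(me_dislikes & other_likes)
    return agree - penalty * clash

def recommend(me_likes, me_dislikes, others, top_n=5):
    """others: hypothetical dict of user id -> set of liked item ids."""
    scores = defaultdict(float)
    seen = me_likes | me_dislikes
    for likes in others.values():
        weight = neighbor_weight(me_likes, me_dislikes, likes)
        if weight <= 0:
            continue  # drop users who liked what I disliked
        for item in likes - seen:
            scores[item] += weight
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]

others = {
    "a": {"Alien", "Heat", "Ronin"},
    "b": {"Alien", "Grease", "Cats"},   # liked something I disliked
    "c": {"Heat", "Alien", "Thief"},
}
picks = recommend({"Alien", "Heat"}, {"Cats"}, others)
print(picks)
```

User "b" shares a like with me but also liked something I disliked, so nothing from "b" reaches the list.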
Results these days seem worse than the old days of AltaVista and Lycos.
Example: rap music and hip-hop. For the most part, I don't enjoy it that much. There are a few things though that will make a track palatable to me (or instantly turn me off despite anything else positive about it):
- Sentimentality or romance in the lyrics
- Backing track or samples that are harmonically interesting
- No egregious sexism, misogyny, glorifying of violence, thug/gangbanger culture, etc
- Beats featuring stereotypical trap hi hats kind of annoy me
I've enjoyed tracks like Deja Vu by Post Malone, or Lucid Dreams by Juice WRLD. Browsing the rest of their discography consistently disappoints me though, because tracks like these are few and far between.

The way I assume recommendation systems are traditionally designed does not account for this. It sees me listen to these tracks, and thinks I'll probably like something by similar artists or the same artists. As far as I'm aware, Spotify's recommendation system is not aware of things like tempo, meter, tonality, themes of the lyrics, harmony, etc. and so there's no way it can pick tracks like this out from the crowd.
And why would they bother? Those are all much more technically difficult things to implement than forming correlations between IDs in a database.
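For what it's worth, the matching step itself is the easy part once you have the features. A minimal sketch, assuming each track already comes with a feature vector: the feature names, the numbers, and the track labels below are invented, and extracting them from audio and lyrics is the hard part this skips entirely.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical features, each scaled 0..1:
# (sentimental lyrics, harmonic interest, aggressive themes, trap hi-hats)
tracks = {
    "liked_track":  (0.9, 0.8, 0.1, 0.2),
    "same_artist":  (0.2, 0.3, 0.8, 0.9),
    "other_artist": (0.8, 0.9, 0.0, 0.1),
}

def rank_by_content(seed, catalog):
    """Rank every other track by feature similarity to the seed track."""
    seed_vec = catalog[seed]
    candidates = [name for name in catalog if name != seed]
    return sorted(candidates, key=lambda n: cosine(seed_vec, catalog[n]), reverse=True)

ranking = rank_by_content("liked_track", tracks)
print(ranking)
```

With these made-up numbers, the track by a different artist ranks above the one by the same artist, which is exactly what an ID-correlation approach would miss.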
This is likely intentional to encourage more content creation. Competing with two decades of content is almost impossible, so they make them compete with just 2 weeks of content.
It is extremely hard to predict human behavior beyond simple schemes such as most popular items (or most similar items to those you’ve seen before).
(bio: six years of xp in a leading recommendation company)
Let's say for example that you've enjoyed the recent hit game Baldur's Gate 3 and you'd like to play something similar. You check out the Steam page, see that the game is tagged as an "RPG", so you click the tag and expect to get something similar. What you get instead are games that are not only very different but also so far removed from the genre that no one would ever list them in a forum thread about RPGs. Examples include titles such as Dota 2, Warframe, Palworld and Horizon Zero Dawn. There are genuine RPGs as well, but the fact that there are so many titles you need to ignore is pretty bad.
Tags aren't the only way Steam recommends new games. Going back to the Baldur's Gate 3 Steam page, there's a section called "more like this". I'd expect it to match more closely to BG3 and in many cases it does. But when it doesn't, it shows ridiculous recommendations like The Sims 3 or Tom Clancy's The Division - games that have nothing to do with what Baldur's Gate 3 is.
And all of this is for an extremely popular game that at the same time doesn't do anything revolutionary. Trying the same approach with a more unconventional title that you've liked is a quick recipe for failure. I've just checked the recommendation page for Undertale and it's full of random games that have nothing to do with the title.
There were a lot of little things that added up:
1. Everyone interprets the 1.0 - 10.0 rating scale differently.
2. Most users just rate the same, universally known games.
3. For the other users, the games they've played are usually really different. It's a sparse matrix.
Every attempt at game-to-game analysis flopped. User-to-user analysis seemed to work better.

I managed to find a few dozen similar users. Found some hidden gems by going through their pages manually. Fewer than I would have hoped though.
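One common way to deal with the first and third points above is to centre each user's ratings on their own mean and only compare users on games they have both rated. This is a sketch of that general technique, not what I actually ran; the ratings, the function names and the `min_overlap` threshold are made up.

```python
import math

def center(ratings):
    """Subtract the user's own mean, so a harsh rater's 6 and a
    generous rater's 10 can land on the same relative score."""
    mean = sum(ratings.values()) / len(ratings)
    return {game: score - mean for game, score in ratings.items()}

def similarity(u, v, min_overlap=2):
    """Cosine similarity of mean-centred ratings over co-rated games.
    Returns 0.0 when the users share too few games to say anything."""
    cu, cv = center(u), center(v)
    shared = set(cu) & set(cv)
    if len(shared) < min_overlap:
        return 0.0
    dot = sum(cu[g] * cv[g] for g in shared)
    norm_u = math.sqrt(sum(cu[g] ** 2 for g in shared))
    norm_v = math.sqrt(sum(cv[g] ** 2 for g in shared))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical users on a 1.0 - 10.0 scale.
harsh    = {"A": 6, "B": 3, "C": 5}
generous = {"A": 10, "B": 7, "C": 9}   # same taste, higher baseline
opposite = {"A": 3, "B": 9, "D": 8}    # disagrees on A and B
stranger = {"A": 7, "X": 8, "Y": 2}    # only one game in common

print(similarity(harsh, generous))
print(similarity(harsh, opposite))
print(similarity(harsh, stranger))
```

The harsh and generous raters come out as near-identical despite never giving the same number to any game, and the user with only one game in common gets no score at all, which is where the sparse matrix bites.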
For example, if you have not seen The Shawshank Redemption, chances are you will like it. It's #1 on the IMDb Top 250 list. But a recommender does not know if you've seen it. If you've seen it already, it's a bad recommendation.
So the same recommendation for the same person can be good today and bad tomorrow, depending on something the recommender engine does not see. That makes it very difficult to tune and measure performance.
YouTube: Is that recommendation for good content or for the highest value content you will consume?
Netflix: The more you use it, the less they make. There is a perverse incentive to put just enough good content in front of you to stay subscribed but not use it more.
Amazon: They don't give a fuck what you buy, the sellers are now in a race to the bottom and that business pays for itself. AWS makes all the money.
Find the perverse incentive and optimize for that.
When it comes to relaxing and watching something, I don’t like _my own recommendations_.
Maybe recommendation is like that friend who invites you to watch a movie—you know it’s a gamble. Haha.
All machine learning algorithms struggle at the edges — they’re very good at predicting aggregate behavior.
If you have eclectic tastes there’s probably not enough data on your demographic.
...and I'll never stop saying this: they have some sort of monetized recommendation system in place. Can't prove it but I can see it working almost every week. A video from a big company, "celebrity" or TV channel that I'd never watch finds its way into my feed.
A passable analogy: you buy a car and get hassled, often hard-sold for a pre-paid maintenance package, tire insurance, financing insurance, undercoating, bla bla. You don't want any of it but it's what they push the hardest.