HACKER Q&A
📣 ev1

Captcha Alternatives?


TLDR: I help with a gaming community-related site that is being targeted by a script kiddie. They are registering hundreds of thousands of accounts on our forums to 'protest' a cheating (aimbot) ban, then posting large ASCII art spam, giant shock images (the first of which appeared after we blocked new accounts from posting [img]), the usual.

Currently we use a simple question/answer addon at registration time - it works against all untargeted bots and is just a little "what is 4 plus six" or "what is the abbreviation for this website" type of question. It's worked fine for years and we don't really get general untargeted spam.

I am somewhat ethically disinclined to use reCAPTCHA, and there are some older members who can't reasonably solve hCaptcha. The same goes for heavy fingerprinting or other privacy-invading methods. The site is also donation-run, so enterprise services that would block something like this (such as Distil) are both out of budget and out of ethics.

Is there a way I can possibly solve this? Negotiation is not really on the table; the last time one of the other volunteers responded at all, we got a ~150Gbps volumetric attack.

I've tried some basic things, like requiring cookie and JS support via middleware; they moved from a Java HTTP-library script to some kind of Selenium equivalent afterward. They also use a massive amount of proxies, largely compromised machines being sold for abuse.


  👤 huhtenberg Accepted Answer ✓
* Allow new accounts, but hide messages from them until their posts are verified manually and the accounts are either approved or shadow-banned.

* Don't delete banned accounts and don't notify them in any way, but tag their IPs and cookies to auto shadow-ban any sock puppets, so that these don't even make it into the approval queue.

* Use heuristics to automate the approval process, e.g. if they looked around prior to registering, or if they took time to fill in the form, etc.

* Add a content filter for messages, including heuristics for an ASCII art as a first post, for example, and shadow-ban based on that.

* Hook it up to StopForumSpam to auto shadow-ban known spammers by email address / IP.

* Optionally, check for people coming from Tor and VPN IP, and act on that.

Basically, make it so that if they spam once, they will need both to change the IP and to clear the cookies to NOT be auto shadow-banned. You'd be surprised how effective this trivial tactic is.

All in all, the point is not to block trolls and tell them about it, but to block them quietly - to discourage and to frustrate.
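The IP-and-cookie tagging step above can be sketched roughly as follows. This is a minimal illustration, not a production design; the class and method names are invented, and a real forum would persist these sets in a database rather than in memory:

```python
# Sketch of "tag their IPs and cookies to auto shadow-ban sock puppets".
# A returning spammer must change BOTH identifiers to evade the ban.

class SpamTracker:
    def __init__(self):
        self.banned_ips = set()
        self.banned_cookies = set()

    def shadow_ban(self, ip, cookie_token):
        """Record both identifiers so either one re-triggers the ban."""
        self.banned_ips.add(ip)
        self.banned_cookies.add(cookie_token)

    def is_sock_puppet(self, ip, cookie_token):
        """Match on either identifier; silently shadow-ban on a hit."""
        return ip in self.banned_ips or cookie_token in self.banned_cookies


tracker = SpamTracker()
tracker.shadow_ban("203.0.113.7", "c0ffee")

# Same cookie from a fresh IP is still caught:
assert tracker.is_sock_puppet("198.51.100.2", "c0ffee") is True
# Both changed: slips through, and lands in the manual approval queue instead.
assert tracker.is_sock_puppet("198.51.100.2", "deadbeef") is False
```

The point of matching on either identifier, rather than the pair, is exactly what the comment describes: a spammer who rotates proxies but keeps cookies (or vice versa) is still silently filtered.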


👤 laurieg
Other posters have given good advice on technical aspects. I'd like to add my experience from moderating a large subreddit.

Focus on making it not fun to troll. Never acknowledge the disruption. Make all your countermeasures as silent as possible. Never address the script kiddie directly. Don't accidentally create a "leaderboard" or similar by publishing the number of bans/deleted posts, etc.

Eventually it just becomes a waste of time to scream into nothingness and they will go elsewhere.


👤 zaarn
A method a German community once used was the troll throttle. The basic idea is that troll and spam content compresses better than average content.

So you run various compression algorithms against your community content to establish an average/median/other statistical baseline. Try both compressing each post individually and compressing the whole corpus, counting how much the compressed size grows when a post is added. These are your measurement points.

An incoming poster must solve a captcha to be able to post; however, the likelihood of the captcha being accepted is tied to the compressibility of the post.

A compressible post is likely to be spam or ASCII art, so the captcha fails even if the answer was entered correctly. IIRC I used a relationship of 'min(1, sqrt(1/compress_factor)-1.05)'.

A non-compressible post is not only likely to pass the captcha; it might pass even if the answer was actually wrong.

The entire point is that it shifts the balance. Trolls will have to submit their posts a few times and solve captchas repeatedly, which slows them down. Making content that does not compress well across a variety of compression algorithms, especially if you also account for the existing text corpus, is a very hard problem. They'd have to add crap to the post to bloat it up, at which point you can counter with the next weapon.

Repeat all of the above, except instead of compression you estimate entropy. Blocking on high entropy means you can also catch messages padded with compression decoys.
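The compressibility heuristic above can be sketched in a few lines. This is a crude illustration using zlib and a simplistic mapping from compression factor to pass probability (not the exact formula quoted in the comment); the function names are invented:

```python
import zlib

def compress_factor(text: str) -> float:
    """Raw size divided by compressed size; larger means more compressible."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw))

def captcha_pass_probability(text: str) -> float:
    # Crude mapping for illustration: the more compressible the post,
    # the less likely the captcha is allowed to succeed, even when
    # the user answered it correctly.
    return max(0.0, min(1.0, 1.0 / compress_factor(text)))

ascii_art = ("#" * 60 + "\n") * 20          # compresses extremely well
prose = "The ban was for an aimbot detected during Tuesday's match."

assert captcha_pass_probability(ascii_art) < 0.1   # spam-like: nearly always "fails"
assert captcha_pass_probability(prose) > 0.5       # normal prose: usually passes
```

A real deployment would combine several algorithms and the corpus-growth measurement the comment describes, so attackers can't tune their padding against a single compressor.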


👤 TomGullen
We’re a UK company, and we had an incredibly persistent spammer. He’d also send us threatening emails. His persistence and nastiness were draining and, quite frankly, it was impressive how much time he was putting into it all.

I don’t know if it was coincidence, but after some sleuthing I found his real name and filled in the online FBI tip-off form about his emails to us. He had a bad history and may have been on bail.

Stopped pretty promptly after that - guessing he got a phone call.


👤 dougk16
You didn't mention email confirmation in the first place, but I figured I'd mention this for others. I recently ran into a similar situation and had the idea of registrants emailing ME a secret code I give them, instead of confirming they received one. Still technically automatable, but it would definitely throw a curveball to the bots. I confirmed with an Ask HN that this is a secure method: https://news.ycombinator.com/item?id=24116530

👤 SquareWheel
Not the answer you're looking for, but reCaptcha is probably your best option.

I attempted half a dozen mitigation strategies to prevent spam on one forum I ran. I tried honeypots, questionnaires, other captchas, and proxying services to block bots. They slowed the bots at best, but when there's a torrent of bad actors it really doesn't matter if you slow them down 50%.

I finally installed reCaptcha and it solved the problem instantly. Not a single bot has signed up in 6 months. I started getting suspicious that signups were just broken, but I tested it and it was fine.

After that experience, I'm very much on team reCaptcha. I tried hCaptcha as well (on a different project), but found it was much harder to solve.


👤 alexnewman
HCaptcha founder here. I am sorry you had trouble solving captchas. Perhaps your older members might have luck with https://www.hcaptcha.com/accessibility.

👤 obblekk
A client-side, GPU-bound challenge: the user doesn't do anything but wait for a spinner while the JavaScript solves a computationally expensive puzzle that the server can verify cheaply.

It won't block all spammers, but it raises the cost per registration (even via Selenium) to the point where they'd need GPU instances, which are too expensive for a script kiddie.

This is roughly what Cloudflare is doing when it says "verifying your browser".


👤 crazypython
Hey, my game will be in a similar situation. I'm looking into building a CAPTCHA that works by taking submissions from r/notinteresting, r/mildlyinteresting, and r/interestingasfuck, and asking the user to take an image and classify the image into not interesting, mildly interesting, and very interesting. We can distort, crop, and recolor the image to defeat reverse image search. That should be enough of a stopgap to stop them. Contact me via email (in my profile page) if you want to work together on that project.

👤 mey
Some CDN services provide Bot detection. (as well as other DDoS mitigation options).

https://www.cloudflare.com/products/bot-management/

https://www.akamai.com/us/en/products/security/bot-manager.j...

Edit: I didn't see your comment about budget. I expect Akamai may be out of reach, not sure about Cloudflare's options. Most bot detection is going to need to finger print behavior of the interaction to the site (Captcha as well). If that data is handled correctly, (not being sold/made available to a third party/destroyed after use), I believe it can be done ethically. Obviously my ethics are not yours.


👤 kilburn
A more extreme approach that may or may not work for you is to make the community invite-only.

Track the network of invites and shadow-ban linked accounts when you detect the spammer popping up. The spammer will eventually run out of invitees.

You can combine this with "no invitation required" short periods, where you make changes to the signup flow, spam detection, etc. and make the window short enough for the spammer to not have the time to adjust their bots.


👤 tommica
If you can detect them as spammers, then instead of banning them, shadow-ban them: make their posts invisible to others, and slow down the server's responses for them.

There are also alternatives to reCAPTCHA that might be more ethical, for example https://www.phpcaptcha.org/ - there are some image-matching ones too, but I don't know any specific ones.


👤 bawolff
Maybe try borrowing Wikipedia's IP ban list - Wikipedia gets a massive amount of spam, which makes it an easy resource for getting a list of abusive IP addresses.

Their list is a combo of https://en.wikipedia.org/wiki/Special:GlobalBlockList and https://en.wikipedia.org/wiki/Special:BlockList?wpTarget=&wp... and TOR (which is handled automatically) [there is also an api version in json format]
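Pulling that list could look something like the sketch below. It uses the MediaWiki API's `list=globalblocks` query (provided by the GlobalBlocking extension); parameter names may differ between wiki versions, so check the live API docs before relying on this. Only the parsing step is exercised here, against a canned response:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://en.wikipedia.org/w/api.php"

def extract_addresses(payload: dict) -> set:
    """Pull blocked addresses/ranges out of a globalblocks API response."""
    blocks = payload.get("query", {}).get("globalblocks", [])
    return {b["address"] for b in blocks if "address" in b}

def fetch_global_blocks(limit: int = 500) -> set:
    # list=globalblocks comes from the GlobalBlocking extension;
    # verify the parameters against the wiki's api.php documentation.
    params = urlencode({
        "action": "query", "list": "globalblocks",
        "bglimit": limit, "bgprop": "address", "format": "json",
    })
    with urlopen(f"{API}?{params}") as resp:
        return extract_addresses(json.load(resp))

# The parsing step, shown against a canned response:
sample = {"query": {"globalblocks": [{"address": "192.0.2.0/24"},
                                     {"address": "203.0.113.9"}]}}
assert extract_addresses(sample) == {"192.0.2.0/24", "203.0.113.9"}
```

You'd want to refresh the list periodically and pair it with the local `Special:BlockList` data the comment mentions, since the two cover different block types.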


👤 tdrp
Not sure why it's not mentioned but, in addition to technical mitigation, if you know the attacker's general info, then maybe you can also try other avenues such as law enforcement or legal claims.

More work as well but when you whois some of the attacking machines you can find out what the abuse@ email is for them and contact them. That can put the provider on notice if you later also go with some legal action.


👤 MattGaiser
Is there a reason to not use hcaptcha for signup only? Older members are already members, so all you are doing is applying it to the new people.

Or add 2FA with a text message for sign up. That is a lot harder to automate and unless he is willing to spend a ton of money on extra phone numbers, he should run out of them quickly.


👤 awinder
How many people are you registering a day normally? I’m wondering if you shut off signups for a while + handle the inevitable attack & they can’t get back in they might move on. How much money and time do you think they (or you) are willing to commit though, what a crappy tale :-(

👤 niftylettuce
I'm working on https://spamscanner.net, which will be useful very soon for this with a public and free API (which will store zero logs and adhere to same privacy as https://forwardemail.net).

👤 mpol
In my experience JavaScript filters work very well against spambots. For example, you could have 2 honeypot fields, 1 with a certain value, 1 empty. In JavaScript you swap their values, and the server validates that they arrived swapped. Most spambots don't run JavaScript (yet). Another one is a simple timer: again 2 fields with a starting value; you count 1 down and the other up, so on server-side validation there should be a difference of more than 1.

For an example, check a WordPress plugin I made 2 years ago: https://wordpress.org/plugins/la-sentinelle-antispam/

There is also the slider thing on Ali Express, that you could check out. I haven't looked into it, not sure how it exactly works.
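The server-side half of the two tricks above (swapped honeypot values plus the counter timer) might look like this. The field names (`hp_a`, `hp_b`, `counter_up`, `counter_down`) are invented for illustration; the page's JavaScript is assumed to swap the honeypot values and tick the counters, so a bot that never ran the script submits the form untouched:

```python
def looks_like_bot(form: dict) -> bool:
    # Honeypot check: JS is expected to have swapped the two values,
    # leaving hp_a empty and hp_b holding the sentinel.
    swapped = form.get("hp_a") == "" and form.get("hp_b") == "expected"
    try:
        up = int(form.get("counter_up", "0"))
        down = int(form.get("counter_down", "0"))
    except (TypeError, ValueError):
        return True
    # Timer check: counters start equal; after a second or so of the
    # page's JS running, they diverge by more than 1.
    took_time = abs(up - down) > 1
    return not (swapped and took_time)

# A form submitted without running the JS is flagged:
assert looks_like_bot({"hp_a": "expected", "hp_b": "",
                       "counter_up": "0", "counter_down": "0"}) is True
# A browser that ran the JS for a couple of seconds passes:
assert looks_like_bot({"hp_a": "", "hp_b": "expected",
                       "counter_up": "40", "counter_down": "36"}) is False
```

Note this only filters bots that skip JavaScript entirely; the OP's attacker already moved to a Selenium-like setup, so this would be one layer among several rather than the whole defence.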


👤 heartbeats
Try requiring new accounts' first few posts to be manually approved. Then he'll have to make enough quality posts to build up credibility first. This is very difficult, especially for a script kiddie.

Alternatively, you can take away the instant gratification by adding a cooldown of, say, three days for each created account. Then he'll have to register them in bulk and hope the humans don't spot the patterns.

You could also try using Bayesian filtering, but you'd have to block the ASCII art first.


👤 2FA4spam
How about a simple out-of-band confirmation requirement for every account signup?

“Thank you for registering. Please send an SMS to number XXX with code YYY to activate your account.”

Kind of like a reverse 2FA.


👤 hinkley
You could also try detecting Selenium, but that could be cat and mouse as well:

https://stackoverflow.com/questions/33225947/can-a-website-d...

Remember, the goal is to flag accounts for cheap bulk rejection, without telegraphing to the attacker.


👤 NetToolKit
We at NetToolKit have been working on related problems for years and might have two products that directly address what you are looking for.

We launched Shibboleth (a CAPTCHA service) about a year ago, and you can select from a variety of different CAPTCHA types (including some non-traditional types; different types have different strengths and fun factors): https://www.nettoolkit.com/shibboleth/demo There are a variety of options that you can set, and you can also review user attempts to solve CAPTCHAs to see if you want to make the settings more or less difficult.

Recently, we launched Gatekeeper ( https://www.nettoolkit.com/gatekeeper/about ) which competes against Distil and others, but without fingerprinting. Instead, site operators can configure custom rules and draw on IP intelligence (e.g. this visit is coming from Amazon AWS or this IP address has ignored ten CAPTCHAs in two minutes), and Gatekeeper will indicate to your website how it should respond to a request based on your rules. There's also other functionality built in, such as server-side analytics. Some light technical integration is required, but we're happy to help with that if need be.

As with all NetToolKit services, we have priced both of these services very economically ($10 for 100,000 credits, each visit or CAPTCHA display using one credit).

We would very much appreciate a conversation, even if it is only for you to tell us why you think our solutions don't fit what you are looking for. I would be happy to talk to you over the phone if you send me your phone number via our contact form: https://www.nettoolkit.com/contact


👤 HEHENE
This may run afoul of your "no privacy invading methods", but are you able to implement email verification before new users can post? Then once they get bored of trying to attack the site you can go and purge all accounts created in the last n days that haven't been verified yet.

I run a gaming community with several thousand members and we regularly have to fend off attacks on both the community (spam bots in Discord) and the game servers themselves (targeted DDOS attacks usually in the 200-300Gbps range.)

From my experience, they tend to get bored and move on rather quickly, so often whatever we have to implement is more temporary in nature and doesn't really affect the existing community much, if at all.


👤 bo1024
Sorry to hear you're dealing with this. I'm not in the field, but this is a case where I would abstractly be tempted to use javascript blockchain mining or similarly require some amount of useless computation by the browser during signup.

👤 Pick-A-Hill2019
Some good answers in the stuff others have posted (Especially the accessibility one).

You don't provide many details of what you do and do not have at your disposal in terms of skills, tech stack, access to log files, etc., so this is a non-expert cut-and-paste from SO [1]. Yeah, I know (StackOverflow), and it doesn't even relate directly to your problem... but if you read the long bit below, it might give you a bit of blue-sky thinking.

>> The next is determining what behavior constitutes a possible bot. For your stackoverflow example, it would be perhaps a _certain number of page loads in a given small time frame from a single user (not just IP based, but perhaps user agent, source port, etc.)_

Next, you build the engine that contains these rules, collects tracking data, monitors each request to analyze against the criteria, and flags clients as bots. I would think you would want this engine to run against the web logs and not against live requests for performance reasons, but you could load test this.

I would imagine the system would work like this (using your stackoverflow example): The engine reads a log entry of a web hit, then adds it to its database of webhits, aggregating that hit with all other hits by that unique user on that unique page, and record the timestamp, so that two timestamps get recorded, that of the first hit in the series, and that of the most recent, and the total hit count in the series is incremented.

Then query that list by subtracting the time of the first hit from the time of the last for all series that have a hit count over your threshold. Unique users which fail the check are flagged. Then on the front-end you simply check all hits against that list of flagged users, and act accordingly. Granted, my algorithm is flawed as I just thought it up on the spot.

If you google around, you will find that there is lots of free code in different languages that has this functionality. The trick is thinking up the right rules to flag bot behavior. <<

[1] https://stackoverflow.com/questions/6979285/protecting-from-...
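The rate-based flagging engine sketched in that quote boils down to a sliding-window hit counter per unique client. A minimal in-memory version (class name invented; a real system would run this against logs or a shared store, as the quote suggests):

```python
import time
from collections import defaultdict, deque

class BotFlagger:
    """Flag clients exceeding max_hits page loads within window seconds."""

    def __init__(self, max_hits: int = 30, window: float = 60.0):
        self.max_hits = max_hits
        self.window = window
        self.hits = defaultdict(deque)   # client_key -> timestamps

    def record(self, client_key: str, now: float = None) -> bool:
        """Record one hit; return True if the client should be flagged."""
        now = time.monotonic() if now is None else now
        q = self.hits[client_key]
        q.append(now)
        # Drop timestamps that have fallen out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        return len(q) > self.max_hits

# Keyed on user agent + IP, as the quote suggests (not IP alone):
flagger = BotFlagger(max_hits=3, window=10.0)
for t in (0, 1, 2):
    assert flagger.record("ua|1.2.3.4", now=t) is False
assert flagger.record("ua|1.2.3.4", now=3) is True    # 4th hit inside window
assert flagger.record("ua|1.2.3.4", now=20) is False  # old hits expired
```

The hard part, as the quote says, is not the counting but choosing thresholds and keys that separate bots from enthusiastic humans.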


👤 issa
I've had a lot of luck with variants of a honeypot. Add a visually hidden field and any time it is submitted with content, block the post. Super simple and with some creativity, it's hard for the bots to keep up.

👤 boredatworkme
You have received some great suggestions so far.

One of the forums that I frequent has a "newbie" section, which is not visible to full members or guests (who are not logged in). Whoever registers to the website needs to get a predefined set of "Likes" on their posts. Not every post gets a "like" - only those that contribute to the discussion do (not everyone needs to agree, debates are welcome as long as they are civil).

This helps maintain the quality of the forum to an outside viewer and cuts out a large amount of spam.


👤 miki123211
Be sure you do email verification before users are able to post. Block domains of temporary email services (there are lists floating around GitHub; Google is your friend). Only allow one account per address. Figure out what domains the spammer is using for email accounts: if you can, block them entirely; if not, require manual approval just for those domains. Use the other suggested techniques, like shadowbanning etc. Consider requiring or allowing social login or phone-number verification.

👤 Seb-C
It happened to me a long time ago. He was not only spamming and using SQL injections to destroy my community, but also advertising his own competing website.

When I started building scripts myself and took down his website, he basically realized the harm he was doing, apologized, and stopped.

Reminds me of the good old times when you could trap script kiddies on MSN.

"I think you are lying and not capable of hacking my computer. I'm waiting for you, my IP is 127.42.196.8"


👤 freitasm
Cloudflare, perhaps with a firewall rule that blocks bots over a certain threshold? It may fall under fingerprinting, if you need to avoid that.

👤 ve55
I mentioned quite a few alternatives to ReCAPTCHA that often work in situations like yours here: https://nearcyan.com/you-probably-dont-need-recaptcha/

Some of the best solutions include very minimal/quick captchas, or simple checks for things like javascript


👤 phenkdo
I think we might need more creative solutions to the problem of spam, including online reviews, posts, and phone calls. Some kind of PKI identity-verification system that every user signs up with. Sure, that will turn off a lot of users, but the trade-off is that only the true enthusiasts will participate.

👤 Rotten194
Can you enable hcaptcha with a whitelist for known-good accounts? Not ideal but might annoy them enough to give up.

👤 vmception
> Negotiation is not really an option on the table, the last time one of the other volunteers responded at all we got a ~150Gbps volumetric attack

that's hilarious. Have you tried trolling them back, like just enjoying their company? Start saying "pool's closed" and things like that.


👤 codegeek
Could new accounts start with their posts unapproved, so that all new accounts must get manually approved? Or is that not feasible due to the volume you have? I would make sure that the first few posts by a new account are not automatically approved/shown to anyone.

👤 ehutch79
Temporarily charge a dollar to create a new account?

Also, do you have cloudflare in front of you?


👤 methodmi
Do you know if they are using headless browsers? If so, we can block them without fingerprinting and without captchas. https://methodmi.com

👤 aoqooqoqoq
Try using a proof-of-work system. It won't differentiate between humans and bots, but it'll significantly slow down the pace at which they can register accounts, and it is completely transparent to legitimate users.
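A hashcash-style version of that idea: the client burns CPU finding a nonce, and the server verifies it with a single hash. The sketch below is illustrative (function names invented); in practice the solving would run in the browser, with difficulty tuned so it takes a human's browser about a second:

```python
import hashlib
from itertools import count

def solve_pow(challenge: str, difficulty: int = 20) -> int:
    """Find a nonce such that sha256(challenge:nonce) has `difficulty`
    leading zero bits. Each extra bit doubles the expected work."""
    target = 1 << (256 - difficulty)
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

def verify_pow(challenge: str, nonce: int, difficulty: int = 20) -> bool:
    """Server side: one hash to check the client's work."""
    target = 1 << (256 - difficulty)
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < target

# Low difficulty for the demo; a real deployment would tie the challenge
# to the signup session so solutions can't be reused.
n = solve_pow("signup:alice", difficulty=12)
assert verify_pow("signup:alice", n, difficulty=12)
```

One caveat for this thread's scenario: the attacker uses a large pool of compromised machines, so per-registration cost helps but a determined botnet can still pay it; this works best stacked with the shadow-banning suggestions above.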

👤 dawnerd
Are you sure they’re not posting directly to the registration endpoint and bypassing the signup form? We just had this problem with spammers in China and QQ emails. Adding a nonce helped dramatically.

👤 raverbashing
A JS tarpit (not sure this is the right name) might help

You add a JS snippet that does some work, but if you detect a bot you make it do increasingly more work. Think bitcoin mining but not actually that


👤 compsciphd
naive Q: how damaging would it be for you to stop accepting new accounts temporarily? or put differently, how many legitimate accounts are created on a daily basis? (single digits?)

👤 paxys
Do you have any problems with hCaptcha other than its accessibility? It sounds like that is a much easier problem to solve than everything else people are suggesting in this thread.

👤 sktguha
You could try Google reCAPTCHA v3, which does not require any user input and runs in the background. Not 100% sure it helps in your case, but I think it should.

👤 renewiltord
Recaptcha it and wait them out. He can't do it forever. Then unrecaptcha. Just do a month. You'll lose some sign-ups and then you can go back to the old thing.

👤 petre
Add e-mail verification and introduce a random 30-60 min delay before sending the verification e-mails. Then you can cut out disposable e-mail domains and so on.

👤 rexfuzzle
Might be slightly unorthodox, but email the first post to a Gmail account from a random address and see if it is marked as spam, only display the post if not.

👤 GoblinSlayer
Can't he just do that 150Gbps thing to make you do anything? Also, can't you allow aimbots? You could keep them in a separate space and put them on a separate leaderboard.


👤 laksdjfkasljdf
The only winning move against spam is to make it easier to clean up than it is to spam.

Captchas only benefit Google and the like, who couldn't care less about the community or content. A captcha makes honest content (and spam cleanup) more expensive than the spam! It's a losing proposition that only looks good if you don't consider the whole situation.

Make honest content (and spam) easy, but make cleaning up easier. Things like: every user can flag something; after a certain number of flags, also remove other content from the same IP (or the same bundle of users with a close registration-time window) automatically. And of course a feature for admins to ban and erase content from users who wrongly flag honest content.

It's harder than captcha, but it is an actual solution. Captcha is lazy and ableist.