Cloudflare's Browser Integrity Check/Verification/Challenge feature, used by many websites, is denying access to users of non-mainstream browsers like Pale Moon.
User reports began on January 31:
https://forum.palemoon.org/viewtopic.php?f=3&t=32045
This situation occurs at least once a year, and there is no easy way to contact Cloudflare. Their "Submit feedback" tool yields no results. A Cloudflare Community topic was flagged as "spam" by members of that community and was promptly locked with no real solution, and no official response from Cloudflare:
https://community.cloudflare.com/t/access-denied-to-pale-moo...
Partial list of other browsers that are being denied access:
Falkon, SeaMonkey, IceCat, Basilisk.
A 2022 Hacker News post about the same issue brought attention and prompted Cloudflare to patch it quickly:
https://news.ycombinator.com/item?id=31317886
A Cloudflare product manager declared back then: "...we do not want to be in the business of saying one browser is more legitimate than another."
As of now, there is no official response from Cloudflare. Internet access is still denied by their tool.
On one hand, I get the annoying "Verify" box every time I use ChatGPT (and now, due to its popularity, DeepSeek as well).
On the other hand, without Cloudflare I'd be seeing thousands of junk requests and hacking attempts every day, people attempting credit card fraud, etc.
I honestly don't know what the solution is.
The thing is that these tools are generally used to further entrench power that monopolies, duopolies, and cartels already have. Example: I've built an app that compares grocery prices as you make a shopping list, and you would not believe the lengths grocers go to in order to make price comparison difficult. This thing doesn't make thousands or even hundreds of requests - maybe a few dozen over the course of a day. What I thought would be a quick little project has turned out to be wildly adversarial. But now spite-driven development is a factor, so I will press on.
It will always be a cat and mouse game, but we're at a point where the cat has a 46 billion dollar market cap and handles a huge portion of traffic on the internet.
They ignored robots.txt (they claimed not to, but I blacklisted them there and they didn't stop) and started randomly generating image paths. At some point /img/123.png became /img/123.png?a=123 or whatever, and they just kept adding parameters and subpaths for no good reason. Nginx dutifully ignored the extra parameters and kept sending the same image files over and over again, wasting everyone's time and bandwidth.
I was able to block these bots by just blocking the entire IP range at the firewall level (for Huawei I had to block all of China Telecom and later a huge range owned by Tencent for similar reasons).
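For anyone curious what that range-level blocking looks like if you can't do it at the firewall, here's a minimal application-level sketch in Python using the standard ipaddress module. The CIDR ranges below are placeholders (TEST-NET ranges), not the actual China Telecom or Tencent allocations - you'd have to pull those from the relevant ASN announcements:

    import ipaddress

    # Placeholder ranges for illustration only; substitute the CIDR blocks
    # you actually want to block (e.g. looked up per ASN).
    BLOCKED_NETWORKS = [
        ipaddress.ip_network("203.0.113.0/24"),   # TEST-NET-3, stand-in
        ipaddress.ip_network("198.51.100.0/24"),  # TEST-NET-2, stand-in
    ]

    def is_blocked(client_ip: str) -> bool:
        """Return True if the client IP falls inside any blocked CIDR range."""
        addr = ipaddress.ip_address(client_ip)
        return any(addr in net for net in BLOCKED_NETWORKS)

    print(is_blocked("203.0.113.42"))  # True
    print(is_blocked("192.0.2.1"))     # False

Blocking at the firewall is still cheaper, of course; this just shows the matching logic.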
I have lost all faith in scrapers. I've written my own scrapers too, but almost all of the scrapers I've come across are nefarious. Some scour the internet searching for personal data to sell, some look for websites to target with hacking attempts to brute-force bug bounty programs, others are just scraping for more AI content. Until the scraping industry starts behaving, I can't feel bad for people blocking these things even if they hurt small search engines.
This usually includes people making a near-realtime, perfect copy of your site and serving that copy to run scams, middle-man transactions, or commit straight fraud.
Having a clear category of "good bots" from verified or accepted companies would help in these cases. Cloudflare has such a system, I think, but then a new search engine would have to go to each and every platform provider to make deals, and that also sounds impossible.
The solution is good security; Cloudflare only cuts down on the noise. I'm watching junk requests and hacking attempts flow through to my sites as we speak.
Though annoying, it's tolerable; it seems like a fair solution. Blocking doesn't.
Yup!
> I honestly don't know what the solution is.
Force law enforcement to enforce the laws.
Or else, block the countries that don't combat fraud. That means... China? Hey, isn't there a "trade war" being "started"? It sure would be fortunate if China (and certain other fraud-friendly countries around Asia/Pacific) were blocked from the rest of the Internet until/unless they provide enforcement and/or compensation for their fraudulent use of technology.
Robots went out of control, whether the malicious kind, the AI scrapers, or the Clearview surveillance kind; users learned not to trust random websites; SEO spam ruined search, the only thing that made a decentralized internet navigable; nation-state attacks became a common occurrence; people prefer a few websites that do everything (Facebook becoming an eBay competitor). Even if it were possible to set rules banning Clearview or AI training, no nation outside of your own will follow them; an issue which even becomes a national security problem (are you sure, Taiwan, that China hasn't profiled everyone on your social media platforms by now?)
There is no solution. The dream itself was not sustainable. The only options are either a global memorandum of understanding which everyone respectfully follows (wishful thinking, never happening), or splinternetting into national internets with different rules and strong firewalls (which is a deal with the devil, and still admits the vision failed).
To make matters worse, I suspect that not even a splinternet can save it. It needs a new foundation, preferably one that wasn't largely designed before security was a thing.
Federation is probably a good start, but it should be federated well below the application layer.
Countries, whether it be Ukraine or Taiwan, can't risk other countries harvesting their social media platforms for the mother of all purges. I never assume that something that happened historically can never happen again - no Polish Jew would have survived the Nazis with this kind of information theft. Add AI into the mix, and wiping out any population is as easy as baking a pie.
Countries are tired of actual or perceived intellectual property theft. Just ask my grandmother, who has had her designs stolen and mass-produced for sale on eBay. Not just companies - many free and open source projects cannot survive with such reckless competition.
Countries are tired of bad behavior from other countries online. How many grandmothers have to be scammed? How many educational systems holding data on minors have to be breached?
Startups are tired of paying Cloudflare protection money, and trying to evade the endless sea of SEO spam. How can a startup compete with Google with so much trash and no recourse?
Now we have AI: gasoline, and soon dynamite, on the fire. For the first time ever, a malicious country can VPN into the internet of a friendly nation, track down all critics on their social media, and destroy their lives in a real-world attack. We are only beginning to see this in Ukraine - are we delusional enough to believe that the world is past warfare? That the UN can continue keeping countries in line?
What are you protecting, Cloudflare?
Also, they show those captchas when going to robots.txt... unbelievable.
This hostility to normal browsing behavior makes me extremely reluctant to ever use Cloudflare on any projects.
It is either that or keep sending data back to the Meta and Co. overlords despite my not being a Facebook, Instagram, or WhatsApp user...
Turnstile is the in-page captcha option, which, you're right, does affect page load. But they defer the loading of that JS as best they can.
Also, Turnstile is a proof-of-work check, meant to slow down and verify would-be attack vectors. Turnstile should only be used on things like login, email change, "place order", etc.
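For anyone wiring this up: server-side verification of a Turnstile token is a single POST to Cloudflare's siteverify endpoint. A minimal sketch in Python, going from the public docs rather than production code (the secret key, token, and handler names are placeholders):

    import requests

    TURNSTILE_VERIFY_URL = "https://challenges.cloudflare.com/turnstile/v0/siteverify"

    def verify_turnstile(token: str, secret_key: str, remote_ip=None) -> bool:
        """Ask Cloudflare whether the Turnstile token posted by the browser is valid."""
        payload = {"secret": secret_key, "response": token}
        if remote_ip:
            payload["remoteip"] = remote_ip  # optional extra signal
        resp = requests.post(TURNSTILE_VERIFY_URL, data=payload, timeout=5)
        return resp.json().get("success", False)

The idea is to call this in the login / email change / "place order" handler before doing the sensitive action, and reject the request if it returns False.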
It's also a pretty safe assumption that Cloudflare is not run by morons, and they have access to more data than we do, by virtue of being the strip club bouncer for half the Internet.
Absolutely true. But the programmers of these bots are lazy and often don't. So if Cloudflare has access to other data that can positively identify bots, and there is a high correlation with a particular user agent, well then it's a good first-pass indication despite collateral damage from false positives.
They do not - not definitively [1]. This cat-and-mouse game is stochastic at higher levels, with bots doing their best to blend in with regular traffic, and the defense trying to pick up signals barely above the noise floor. There are diminishing returns to battling bots that are indistinguishable from regular users.
1. A few weeks ago, the HN frontpage had a browser-based project that claimed to be undetectable
For now
If you really do have a better way to make all legitimate users of sites happy with bot protections, then by all means, there is a massive market for this. Unfortunately, you're probably more like me: stuck between a rock and a hard place, with no good solution and just annoyance at the way things are.
"'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/132.0.0.0 Safari/537.36'"
That means my browser is pretending to be Firefox AND Safari on an Intel chip.
I don't know what features Cloudflare uses to determine what browser you're on, or if perhaps it's sophisticated enough to get past the user agent spoofing, but it's all rather funny and reminiscent just the same.
I forgot I'd left the script open, polling for about 20 minutes, and suddenly it started working.
So even when sending all the same headers as Firefox, just via cURL, CF seemed to detect automated access, and then eventually allowed it through anyway after it saw I was only polling once a minute. I found this rather impressive. Are they using subtle timings? Does cURL have an easy-to-spot fingerprint outside of its headers?
Reminded me of this attack, where they can detect when a script is running under "curl | sh" and serve alternate code versus when it is read in the browser: https://news.ycombinator.com/item?id=17636032
If it's an https URL: yes, the TLS handshake. There are curl builds[1] which try (and succeed) to imitate the TLS handshake (and HTTP/2 settings) of a normal browser, though.
[1] https://github.com/lwthiker/curl-impersonate
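If you want the same trick from Python rather than the patched curl binary, there's a binding around it (curl_cffi, if I remember the name right) that exposes the impersonation targets. Rough sketch - the exact target string may need to be a specific version like "chrome110" depending on the release:

    # pip install curl_cffi  (wraps curl-impersonate)
    from curl_cffi import requests as curl_requests

    resp = curl_requests.get(
        "https://example.com/",   # placeholder URL
        impersonate="chrome",     # mimic Chrome's TLS/HTTP2 fingerprint
    )
    print(resp.status_code)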
As part of some browser fingerprinting tooling I have access to at work, I can say there are both commercial and free solutions to determine the actual browser being used.
It's quite easy even if you're just going off of the browser-exposed properties. You just check the values against a prepopulated table. You can see some of these values here: https://amiunique.org/fingerprint
Edit: To follow up, one of the leading fingerprinting libraries just ignores useragent and uses functionality testing as well: https://github.com/fingerprintjs/fingerprintjs/blob/master/s...
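To illustrate the table-lookup idea from above: once the client-side script has collected a handful of browser-exposed properties, the check is essentially a dictionary comparison against known combinations. A toy sketch - the property names and table contents here are made up for illustration, not taken from any real product:

    # Toy "check values against a prepopulated table" consistency test.
    KNOWN_PROFILES = {
        "chrome":  {"has_window_chrome": True,  "pdf_viewer_enabled": True},
        "firefox": {"has_window_chrome": False, "pdf_viewer_enabled": True},
    }

    def consistent_with_claim(claimed_family: str, observed: dict) -> bool:
        """True if observed browser-exposed properties match the claimed family."""
        expected = KNOWN_PROFILES.get(claimed_family)
        return expected is not None and all(
            observed.get(key) == value for key, value in expected.items()
        )

    # A client claiming to be Firefox but exposing window.chrome fails the check:
    print(consistent_with_claim("firefox",
                                {"has_window_chrome": True,
                                 "pdf_viewer_enabled": True}))  # False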
I know it happens, but also I've run plenty of servers hooked directly to the internet (with standard *nix security precautions and hosting provider DDoS protection) and haven't had it actually be an issue.
So why run absolutely everything through Cloudflare?
In the past you could ban IPs but that's not very useful anymore.
The distributed attacks tend to be AI companies that assume every site has infinite bandwidth and their crawlers tend to run out of different regions.
Even if you aren't dealing with attacks or outages, Cloudflare's caching features can save you a ton of money.
If you haven't used Cloudflare, most sites only need their free tier offering.
It's hard to say no to a free service that provides features you need.
Source: I went over a decade hosting a site without a CDN before it became too difficult to deal with. Basically I spent 3 days straight banning IPs at the hosting company level, tuning various rate-limiting web server modules, and even scaling the hardware to double the capacity. None of it could keep the site online 100% of the time. Within 30 minutes of trying Cloudflare it was working perfectly.
Very true! Though you still see people who are surprised to learn that CF DDoS protection acts as a MITM proxy and can read your traffic in plaintext. This is of course by design, to inspect the traffic. But admittedly, CF is not very clear about this in the admin panel or docs.
Places one might expect to learn this, but won't:
- https://developers.cloudflare.com/dns/manage-dns-records/ref...
- https://developers.cloudflare.com/fundamentals/concepts/how-...
- https://imgur.com/a/zGegZ00
That said, their Magic Transit and Spectrum offerings (paid) provide L3/L4 DDoS protection without payload inspection.
I incorrectly interpreted your comment as one of the multitude of comments claiming nefarious reasons for proxying without any thought for how an alternative would work.
Magic Transit is interesting - hard to imagine how it would scale down to a small site though, they apparently advertise whole prefixes over BGP, and most sites don't even have a dedicated IP, let alone a whole /24 to throw around.
I do. Many people I know do. In my risk model, DDoS is something purely theoretical. Yes it can happen, but you have to seriously upset someone for it to maybe happen.
A while ago, my company was hiring and conducting interviews, and after one candidate was rejected, one of our sites got hit by a DDoS. I wasn't in the room when people were dealing with it, but in the post-incident review, they said "we're 99% sure we know exactly who this came from".
I've only been here 1.5 years, but it sounds like we usually see one decent-sized DDoS a year, plus a handful of other "DoS" events, usually AI crawler extensions or third parties calling too aggressively.
There are some extensions/products that create a "personal AI knowledge base"; they'll use the customer's login credentials and scrape every link once an hour. Some links are really resource-intensive data or report requests that are very rare in real usage.
Why was that not enough to mitigate the DDoS?
Also, you can buy a cheaper IPv6-only VPS and run it through the free CF proxy to allow IPv4 traffic to reach your site.
The only time I had a problem was when Gitea started caching git bundles of my Linux kernel mirror, which bots kept downloading (things like a full tar.gz of every commit since 2005). The server promptly ran out of disk space. I fixed the Gitea settings to not cache those. That was it.
Never a DDoS. Or at least I (and UptimeRobot) did not notice one. :)
The availability part, on the other hand, is maybe not so business-critical for many, but for targeted long-term attacks it probably is.
So I think for some websites, especially smaller ones, it's totally feasible not to use Cloudflare, but it involves planning the hosting really carefully.
If it were actually a traffic-based DDoS, someone still needs to pay for that bandwidth, which would be too expensive for most companies anyway - even if it kept your site running.
But you can sell a lot of services to incompetent people.
Cloudflare offers protection for free.
If you are just a small startup or a blog, you'll probably never see an attack.
Even if you don't host anything offensive you can be targeted by competitors, blackmailed for money, or just randomly selected by a hacker to test the power of their botnet.
ChatGPT.com is normally quite reliable at triggering Cloudflare prompts, but that page doesn't seem to work in Pale Moon at all, prompts or no prompts. What version of the browser engine does it use these days? Is it still based on Firefox?
For reference I grabbed the latest main branch of Ladybird and ran that, but Cloudflare isn't showing me any prompts for that either.
https://forum.palemoon.org/viewtopic.php?f=3&t=32064
Is it worth giving the internet to them? Is something so fundamentally wrong with the architecture of the internet that we need megacorps to patch the holes?
The Cloudflare tool does not complete its verification, resulting in an endless "Verifying..." loop, so none of the websites in question can be accessed. All you get to see is Cloudflare.
But if someone has a site that is failing, feel free to post it and I will give it a try.
It probably depends on the security settings the site owner has chosen. I'm guessing bot fight mode might cause the issue.
Google itself tried to push crap like Web Environment Integrity (WEI) so websites could verify "authentic" browsers. We got them to stop it (for now), but there was already code in the Chromium sources. What makes Cloudflare MITMing traffic and blocking/punishing genuine users who are just visiting websites any different?
Why are we trusting Cloudflare to be a "good citizen" and not block or annoy certain people unfairly for whatever reason? Or, even worse, serve modified content instead of what the actual origin is serving? I mean in the cases where Cloudflare re-encrypts the data, instead of only being a DNS provider. How can we trust that no third party has infiltrated and compromised their systems? Except "just trust me bro", of course.
I witnessed this! Last time I checked, in the default config, the connection between cloudflare and the origin server does not do strict TLS cert validation. Which for an active-MITM attacker is as good as no TLS cert validation at all.
A few years ago an Indian ISP decided that https://overthewire.org should be banned for hosting "hacking" content (iirc). For many Indian users, the page showed a "content blocked" page. But the error page had a padlock icon in the URL bar and a valid TLS cert - said ISP was injecting it between Cloudflare and the origin server using a self-signed cert, and Cloudflare was re-encrypting it with a legit cert. In this case it was very conspicuous, but if the tampering was less obvious there'd be no way for an end-user to detect the MITM.
I don't have any evidence on-hand, but iirc there were people reporting this issue on Twitter - somewhere between 2019 and 2021, maybe.
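For site owners, one way to sanity-check this before flipping Cloudflare to "Full (strict)" is to connect straight to the origin and see whether it presents a certificate that actually validates. A quick Python sketch (the hostname is a placeholder for your origin's direct address):

    import socket
    import ssl

    ORIGIN_HOST = "origin.example.com"  # placeholder: your origin's direct hostname

    # create_default_context() enables chain + hostname verification, i.e. the
    # kind of check "Full (strict)" mode performs between Cloudflare and origin.
    context = ssl.create_default_context()

    with socket.create_connection((ORIGIN_HOST, 443), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=ORIGIN_HOST) as tls:
            cert = tls.getpeercert()
            print("issuer:", cert.get("issuer"))
            print("subject:", cert.get("subject"))

    # An ssl.SSLCertVerificationError above means the origin cert would not pass
    # strict validation - exactly the gap an on-path attacker can exploit.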
If there were an alternative that would provide the same benefits at roughly the same cost, I would definitely be willing to take a look, even if it meant I needed to spend some time learning a different way to configure the service from the way I configure Cloudflare.
(And if I were doing this on my own, rather than trusting Cloudflare to do it, I would almost surely decide that I don't care enough about Pale Moon users to fix an otherwise good rule that's blocking them as a side effect.)
I agree that this exposes the risk of relying overmuch on a handful of large, opaque, unaccountable companies. And as long as Cloudflare's customers are web operators (rather than users), there isn't a lot of incentive for them to be concerned about the user if their customers aren't.
One idea might be to approach web site operators who use Cloudflare and whose sites trigger these captchas more than you'd like. Explain the situation to the web site operator. If the web site operator cares enough about you, they might complain to Cloudflare. And if not, well, you have your answer.
I do get a "your browser is unsupported" message from the forums.
If I am met with the dreaded Cloudflare "Verify you are a human" box, which is very rare for me, I don't bother and just close the tab.
On the other, Pale Moon is an ancient (pre-quantum) volunteer-supported fork of Firefox, with boatloads of known and unfixed security bugs - some fixes might be getting merged from upstream, but for real, the codebases diverged almost a decade ago. You might as well be using IE 11.
CAPTCHAs are barely sufficient against bots these days. I expect the first sites to start implementing Apple/Cloudflare's remote attestation as a CAPTCHA replacement any day now, and after that it's going to get harder and harder to use the web without Official(tm) Software(tm).
Using Linux isn't what's getting you blocked. I use Linux, and I'm not getting blocked. These blocks are the results of a whole range of data points, including things like IP addresses.
What usually works for me is to close the browser, reload, and try again.
I have not tried less mainstream browsers, just FF and Chrome.
Think about it this way: when a framework (many modern websites) or CAPTCHA/challenge doesn't support an older or less common browser, it's not because someone's sitting there trying to keep people out. It's more likely they are trying to balance the maintenance costs and the hassle involved in supporting however many other platforms there are (browsers in this case). At what point is a browser relevant? 1 user? 2 users? 100? Can you blame a company for accommodating probably >99% of the traffic they usually see? I don't think so, but that's just me.
In the end, site owners can always look at their specific situation and decide how they want to handle it - stick with the default security settings or open things up through firewall rules. It's really up to them to figure out what works best for their users.
"Challenges are not supported by Microsoft Internet Explorer."
Nowhere is it mentioned that internet access will be denied to visitors not using "major" browsers, as defined by Cloudflare presumably. That wouldn't sound too legal, honestly.
Below that: "Visitors must enable JavaScript and cookies on their browser to be able to pass any type of challenge."
These conditions are met.
I'm unsure what part of this isn't clear: major browsers, as long as they are up to date, are supported and should always pass challenges. Pale Moon isn't a major browser, and neither are the other browsers mentioned in the thread.
> * Nowhere is it mentioned that internet access will be denied to visitors not using "major" browsers *
Challenge pages are what your browser is struggling to pass; you aren't seeing a block page or a straight-up denial of the connection. Instead, the challenge isn't passing because whatever update CF has made has clearly broken compatibility with Pale Moon, and I seriously doubt this was on purpose. Regarding those annoying challenge pages: they aren't meant to be used 24/7, as they are genuinely annoying. If you are seeing challenge pages more often than you do on Chrome, it's likely that the site owner is actively flagging your session to be challenged; they can undo this by adjusting their firewall rules.
If a site owner decides to enable challenge pages for every visitor, you should shift the blame to the site owner's lack of interest in properly tuning their firewall.
Because in the end, the result is connection denial. I don't want to connect to Cloudflare, I want to connect to the website.
I read that part. They still do not indicate what may happen, or what is their responsibility -if any- for visitors with non-major browsers.
Not claiming this is "on purpose" or a conspiracy, but if these legitimate protests keep getting ignored then yes, it becomes discrimination. If they can't be bothered, they should clearly state that their tool is only compatible with X browsers. Who is to blame for "an incorrectly received challenge"? The website? The user who chooses a secure, but "wrong" browser not on their whitelist?
Cloudflare is there for security, not as a "major browser" approval pass. They have the resources to improve response times, provide better support, and deal with these incompatibility issues. But do they want to? Until now, they did.
There are actually hundreds of smaller Chromium forks that add small features, such as built-in adblock, and have no issues with either Cloudflare or other captchas.
I use up-to-date Firefox, and was blocked from using the company GitLab for months on end simply because I had disabled some useless new web API in about:config, way before CF started silently requiring it without any feature testing or meaningful error message for the user. Just a redirect loop. The GitLab support forum was completely useless for this, just blaming the user.
So we dropped GitLab at the company and went with basic git-over-https hosting + cgit, rather than pay some company that will happily block us via some user-hostile intermediary without any resolution. I figured out what was "wrong" (lack of feature testing for the web API features CF uses, and lack of meaningful error feedback to the user) only after the move.