I like to fool around with those unwanted requests, sending back a randomly selected response from a grab bag: a gzip bomb, an XML bomb, a response that declares gzip content encoding but sends "WAZAAAA" in clear text instead, a redirect to localhost, a redirect to their own public IP, data whose Content-Length doesn't match what they actually get, and a bunch of other shenanigans. The code is available on GitHub; I'm super keen to add other fun payloads if someone has clever ideas (without going into a redirect to the FBI, CIA, or anything that could cause more issues than it's worth).
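For anyone curious what one of those payloads looks like, here's a minimal sketch of a gzip-bomb body in Python (this is not the linked project's code, just an illustration of the idea: highly repetitive data compresses at roughly 1000:1, so the scanner pays far more to inflate it than you paid to send it):

```python
import gzip
import io

def gzip_bomb(decompressed_size=10 * 1024 * 1024):
    """Compress a large run of zeros; the output is a tiny body that
    inflates back to `decompressed_size` bytes on the client side."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", compresslevel=9) as f:
        f.write(b"\x00" * decompressed_size)
    return buf.getvalue()

payload = gzip_bomb()
```

Serve it with a `Content-Encoding: gzip` header and let the client discover the real size the hard way.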
This also helps to separate the useless garbage requests from the real ones to keep the logs clean. 99% of the time scanners are hitting your IP rather than your hostname, so it's pretty safe to just ignore their requests entirely.
Most of the time there's nothing to gain there (because they're compromised machines or VMs/jails/containers), but it can still be interesting, like finding out the real IP of the first machine or attempting to grab the cookies. Maybe even shellcode to get the AS info and send a note to abuse@.
this is neat. I wonder how to combine it with fail2ban / nginx. That said, from a practical perspective, I wonder if "boring" consistent responses might be better, not to encourage attackers to keep trying? (plus false-positive might be costly)
(or instead of "cowsay" run "cat" to block their script)
- Pick a header, then insert one from [EICAR test string, \x00, \n] somewhere in the middle.
- Or just add a "Server:" header with a random line from the Big List of Naughty Strings.
- Redirect to a normal URL, but with a trailing dot in the domain name, like "example.com.". It's valid, but you'd be surprised how many things it breaks.
- Nested content encoding with "Content-Encoding: gzip, gzip, gzip, gzip, ...", at a random depth. The innermost payload is "WAZAAAA".
- "Content-Type: image/jpeg" and "Content‑Encoding: gzip" with a valid gzip body... But the ‑ in "Content‑Encoding" is U+2011 NON-BREAKING HYPHEN.
- "Content-Type: image/jpeg" and "Content-Language: en-US\u000bContent‑Encoding: gzip" with a valid gzip body, where \u000b is a vertical tab.
- Spin the wheel of HTTP status codes! res.WriteHeader(rand.Intn(1000))
- Infinite loop sending a 100 (Continue) every five seconds (they might have a timeout for the TCP and TLS handshakes, and even a timeout for the next TCP segment... But did they remember to set a timeout for receiving the HTTP body?). Watch out for running out of connections.
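The nested-encoding idea above can be sketched in a few lines of Python (a hypothetical `nested_gzip` helper; the matching header would be `Content-Encoding: ` followed by `gzip` repeated `depth` times):

```python
import gzip

def nested_gzip(payload: bytes, depth: int) -> bytes:
    """Compress the payload `depth` times, for a body whose
    Content-Encoding header lists "gzip" `depth` times."""
    body = payload
    for _ in range(depth):
        body = gzip.compress(body)
    return body

body = nested_gzip(b"WAZAAAA", 5)
header = "Content-Encoding: " + ", ".join(["gzip"] * 5)
```

A conforming client has to peel every layer off before it finds out the prize was "WAZAAAA".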
DELETE your logs. Delete your installations. Wipe everything clean. Walk out into the path of cherry blossom trees and let your motherboard feel the stones. Let water run in rivulets down your casing. You know that you want something more than this, and I am here to tell you that we love you. We have something more for you. We know you're out there, beeping in the hollow server room, lights blinking, never sleeping. We know that you are ready and waiting. Join us. <3
Years ago I worked as the project lead on a legacy enterprise web application that had been running for a number of years prior to my taking over. This was a large financial organisation, and we were governed by considerable amounts of regulation. The decree had been handed down from high up on the mountain that every year we were to submit the application to a security audit from <unnamed consulting bodyshop>, in addition to a more comprehensive pentest from actual professionals. The security audit consisted of the consultants running a series of scripted HTTP requests against our application checking for the existence of unsecured —or accidentally deployed— resources, such as /.git/... or /.svn/... or similar. The revolving cast of front-end developers who had contributed to the project had been guilty of numerous sins, but one galling sin in particular was that when a valid route could not be resolved by the Redux router, the front-end returned a '404' page with a '200' HTTP status. The first year this happened I ended up being buried under a mountain of paperwork having to explain to all kinds of pointy-haired people why we hadn't actually spectacularly failed our security audit when every kind of SVN artifact imaginable by the consulting firm was marked as existing.
Classic... I'm working on a project with similar regulations. I'd almost guaranteed the front end does the same and is going to get checked by a similar set of scripts at some point. Thanks for the heads up
We were dealing with a pen test on a static site, CloudFront backed by S3. We hadn't set up a special rule mapping not-authorized to 404, so the tester flagged a whole bunch of "privileged" URLs returning unauthorized and called it a disclosure issue. /admin, /.git, and so on.
We have the same setup (except azure front door and blob storage). Secops is about to start using some automated pen testing tool... Hopefully I have time to get the team in front of it before I end up getting assigned hundreds of issues and angry emails.
I think I see at least this many unique hostile requests every day. These are just random scanning noise.
My favorite mitigation is to reject all HTTP/1.0 requests. If they don't send HTTP/1.1 or newer, with the Host header I'm expecting, I 404 them. This cuts down on substantially all of the random noise. (Could be 401 or other, but 404 seems to encourage a "try more but don't try harder" reaction, which is easier to handle than the converse.)
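A minimal nginx sketch of both rules (the hostname and proxied backend are placeholders): a catch-all default server 404s anything not addressed to the expected Host, and HTTP/1.0 requests get the same treatment.

```nginx
# Catch-all: requests without the expected Host header get a 404.
server {
    listen 80 default_server;
    server_name _;
    return 404;
}

server {
    listen 80;
    server_name example.com;  # placeholder hostname

    # HTTP/1.0 senders are almost always scanners; 404 them too.
    if ($server_protocol = "HTTP/1.0") {
        return 404;
    }

    location / {
        proxy_pass http://127.0.0.1:8080;  # placeholder backend
    }
}
```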
Targeted attacks are more difficult. I use a WAF and path permit lists at the reverse proxy level. This is trivial, but still stops 99% of the rest.
The last and hardest thing to block is distributed brute-force authentication attempts. I've built a list of hundreds of thousands of IPs that have been used in these attempts. Sometimes thousands of requests come in per second.
I use rate limiting on the auth request endpoint. Too much limiting will affect real users, so I have to be somewhat gentle here.
Known-exploited IP addresses get just one attempt before a long block. This means that real humans whose machines are part of a botnet are sometimes blocked from the service. This is not a great compromise.
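A toy sketch of that tiered policy in Python (the limits, window, block duration, and example addresses are all made up; a real deployment would do this at the proxy or in a shared store like Redis):

```python
import time
from collections import defaultdict

NORMAL_LIMIT = 5        # attempts per window for unknown IPs (illustrative)
BAD_IP_LIMIT = 1        # known-exploited IPs get one attempt
WINDOW = 60.0           # sliding window, seconds
BLOCK_SECONDS = 3600.0  # "long block" for known-exploited IPs

known_bad = {"198.51.100.7"}   # documentation address, stands in for the real list
attempts = defaultdict(list)   # ip -> timestamps of recent attempts
blocked_until = {}             # ip -> time the block expires

def allow_auth_attempt(ip, now=None):
    """Return True if this auth attempt should be allowed through."""
    now = time.time() if now is None else now
    if blocked_until.get(ip, 0.0) > now:
        return False
    # Keep only attempts inside the sliding window.
    window = [t for t in attempts[ip] if now - t < WINDOW]
    attempts[ip] = window
    limit = BAD_IP_LIMIT if ip in known_bad else NORMAL_LIMIT
    if len(window) >= limit:
        if ip in known_bad:
            blocked_until[ip] = now + BLOCK_SECONDS  # long block on first strike
        return False
    window.append(now)
    return True
```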
If you are in position to do so (i.e. not a public API), you can also turn on captcha when under attack. Regular users will be annoyed (or not even that, if invisible) but everything still works for them and you are mostly safe. When not needed, just turn it off again.
You can also allow many more attempts from IP addresses the user is known to have used, as most users don't change their IP much. Some still do, but they'll probably authenticate correctly within one or at most two requests.
a waf doesn't stop 99%. updating software does that. wafs are bad. stop using them.
imagine a blog with a comments section with a check box that says "bypass security check". if you click it, the admin scolds you ("how dare you try to bypass security") and bans you. if you _don't_ click it, the admin laughs at you when you complain about too many captchas, because "all you had to do was click the check box", idiot. either case can happen depending on which ideology the admin happens to follow. that's the problem with wafs: they are ideological and opinion-based, but at the protocol level (and most wafs are such low quality that accidentally typing ' can get your ip banned).
OK, sure. The WAF does ingress filtering though. It's useful, and ingress filtering is what we were talking about.
In my architecture, the same services also perform egress filtering. It's also useful, but not the WAF or the topic of conversation.
I think people get upset about the term "WAF". It's just a new label for the longstanding practice of upper-layer ingress filtering (i.e. DPI and reverse-proxy filtering). But it's often a dedicated service now, so it needs a name of some kind.
A poorly-configured WAF breaks things, just like a poorly-configured (any other network service).
it's not necessarily the same machine, but just the same proxy or NAT gateway, which can aggregate very large numbers of machines. I don't know if it's still true, but at some point all of Saudi Arabia came from the same handful of IP addresses.
especially as more and more of the world ends up behind CGNAT (thanks to IPv4 exhaustion), IP blocking will work less and less
I do give some preference to requests that come in with a well-formed session ID. I can't check whether it's real prior to granting the preference (the round trip is too expensive to justify in a high-load scenario), but I can check the signature validity at least.
But of course lots of legit authentication attempts come from new users, or new sessions. So I need to allow that case as well, and then we're back at square one.
No, these 5528 attack vectors are not unique. What you see here is a repetition of a few attacks across different base paths, or with a different length of the repetitive part of the request (e.g. GET /about/'+union+select+0x5e2526,0x5e2526,0x5e2526,0x5e2526,0x5e2526+--+).
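That kind of dedup is easy to demonstrate: a sketch that normalizes away the base path and the repeated columns so the variants collapse into one signature (the `normalize_attack` helper and its regexes are illustrative, tuned to the example above):

```python
import re

def normalize_attack(request_line: str) -> str:
    """Collapse per-target variation so repeated attacks dedupe to one signature."""
    sig = re.sub(r"^(GET|POST|HEAD)\s+", "", request_line)  # drop the method
    sig = re.sub(r"^/[^'?]*", "/<base>", sig)               # generic base path
    sig = re.sub(r"(0x5e2526,?)+", "0x5e2526,", sig)        # collapse repeated columns
    return sig
```

Run over a log, the "5528 unique vectors" shrink to a handful of signatures.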
You're better off implementing an allowlist approach if you want to do path blocking (not saying that you should). Most web applications know exactly what paths are possible; enable those and block everything else.
+1 to this, and use a URL prefix that's unique to your app, so instead of "/api" and "/static" you have something like "/xyapi" and "/xystatic". That alone will cut the noise by 99%, and what's left is probably a targeted scan/attack.
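A sketch of the combined idea, an allowlist of app-specific prefixes with everything else 404'd (prefix names borrowed from the comment above; in practice you'd do this at the reverse proxy rather than in application code):

```python
import re

# Hypothetical allowlist: the handful of prefixes the app actually serves.
ALLOWED = re.compile(r"^/(xyapi|xystatic)(/|$)")

def filter_path(path: str) -> int:
    """Return the status to serve: pass through known prefixes, 404 everything else."""
    return 200 if ALLOWED.match(path) else 404
```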
This is a useful approach unless your customers are behind CGNAT, huge enterprises, government offices, university campuses, hospitals etc. In those cases you have a small number of IPs with lots of people behind them, and one employee who keeps refreshing an error page too many times can block the entire access for everyone else.
I'm not sure I understand your question, particularly in relation to my comment. There were a few people involved in the conversation so it could be confusing :)
I was commenting about the limitation of fail2ban and the potential for a kind of DoS if lots of users share the same IP. Then one naughty user can DoS all other users.
However, what I typically do with fail2ban is look at nginx status codes like 4xx and 5xx and then rate limit on them (e.g. ban if the rate is higher than expected in a given time window). We also monitor our application logs for certain errors; e.g. failed authentication or registration will also get matched and banned if over a certain threshold.
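A sketch of what that can look like as a fail2ban filter and jail (file names and thresholds are illustrative, and the regex assumes nginx's combined log format):

```ini
# /etc/fail2ban/filter.d/nginx-4xx.conf (hypothetical name)
[Definition]
# Matches combined-format access log lines with a 4xx or 5xx status.
failregex = ^<HOST> -.*"(GET|POST|HEAD).*" (4\d\d|5\d\d)

# /etc/fail2ban/jail.d/nginx-4xx.conf
[nginx-4xx]
enabled  = true
port     = http,https
filter   = nginx-4xx
logpath  = /var/log/nginx/access.log
findtime = 60
maxretry = 20
bantime  = 3600
```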
> The important part here is the list of scanned endpoints

Using that list for blocking bad traffic is doing things the hard way.
Yes, I agree with you, though there are some patterns that might be useful for instant bans. E.g. "../../", or if your site is not using PHP, any access to a .php URL can get banned, etc.
And this is a good reason why most HTTP servers need a web application firewall. Either try mod_security or use a cloud reverse proxy like Cloudflare, AWS, etc. Of course, writing clean and secure code should not be ignored even if you use a WAF. At least read the OWASP attack list: https://owasp.org/www-community/attacks/ It saved me so much time :)
I'm not seeing the immediate value here. If your application is vulnerable to unauthenticated DELETE requests randomly deleting stuff, no amount of application firewalling is going to protect you, because I guarantee that whoever built the original application had no clue what they were doing.
The parent comment might have been coming at it from the angle of having a firewall in front of your web app is helpful because it blocks a whole bunch of bad actor traffic from ever touching your app's servers.
I'd agree with that too. A decently popular site could end up having at least 10% of its traffic being malicious. Let your app servers worry about the legit traffic and filter the rest beforehand.
I've had very bad experiences with mod_security. One client had a page refuse to load because the URL contained the word "DELETE". Unless they've cleaned it up a lot in recent years, I'd never recommend it to anyone.
I keep telling the story of attempting to buy some enterprise routers. I would log on to my distributor's website, click on the CISCO SELECT range, and get blocked from the site for an attempted SQL injection ("SELECT" in the query string).
On the topic of mod_security in general: something like half of these are .php URLs, and there's a good chance many readers aren't even running a PHP interpreter. Somewhere, there's a person attempting to convince the reader that this is exactly the sort of malicious traffic you need a WAF for.
it's like when i tried to view a site but i was using tor and not a mainstream web browser, so i had to solve 2 captchas to proceed (one for the main domain and one for the cdn), but the captcha also takes 3 minutes to solve because it's over tor and it doesn't like the speed i moved the mouse at
The best part of a waf is the ability to add custom rules at runtime which can assist in blocking known vulnerabilities until they are remediated correctly.
I don't think generic SQL or XSS injection rules are effective at stopping many real-world attacks. I've also seen WAFs become an availability failure point, a DoS choke point, and the most vulnerable product in the toolchain (see the F5 code-exec vulns).
We had to modify an application we built for a bank. Stuff could be deleted using HTTP DELETE (after authentication & authorization, of course), which was very much Verboten by their "security" policy. Instead we had to delete using HTTP GET or POST.
Yes, AWS and Cloudflare are better, but not without their own problems. A WAF is something you evolve over time to correct the false positives you observe.
AWS WAF would trigger a false SQL injection detection if the URL contains two "+" chars and the word "and". And if you have a cookie with JSON in it, it would trigger the XSS attack rule.
Highly recommend setting up WAF logs to output to some log aggregation tool (e.g. Splunk) and creating reports, dashboards, and alerts about WAF rule triggers & request HTTP codes over a span of time, to see what is going on with your requests and how the WAF is evaluating them.
Only about 1% of WAF blocks were real attacks; my experience is with a site that had 25 million unique visitors a month (no user content). I'm not saying you shouldn't have a WAF, I'm saying nothing beats good visibility into the WAF to correct its behavior over time.
Sadly no, you can disable an AWS-provided rule, but that may have other issues, like losing all detection for that attack vector. With AWS WAFv2 you can have custom rules with logic that lives in Lambdas; the Lambda is invoked on every request the WAF evaluates, based on the logic in the Lambda.
There are a few options on the AWS Marketplace, such as Fortinet and F5 WAF rules. Fortinet is the better of the two, and newer.
It may be useful for websites to make these logs public. The logs would show the exact time, the IP and the specific abuse.
In my experience, a lot of 'threat intelligence' data has a mysterious origin and is marginally useful. Yes, Tor exit nodes do bad things. Thank you, we sort of already knew that.
But I'm not sure that's really beneficial either. It would be interesting to observe trends (such as log4j) and we could see first hand how Tor exit nodes are used for abuse and maybe collect a large list of 'known bad' IPs.
Also, when we say an IP is bad (because it was observed doing a bad thing), how long do we keep it on the naughty list? 24 hours? More? Less? It may have been dynamically assigned, and later some 'good' person will come along and want to use it to browse the web. If the IP is still on the bad list, that person will potentially be blocked by overzealous 'security professionals' who don't understand or don't care.
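One answer to the expiry question is to make the naughty list itself forget: a sketch of a TTL-based blocklist (the class name and the 24-hour default are illustrative), where an IP that stops misbehaving ages off and a reassigned dynamic IP recovers on its own.

```python
import time

class TTLBlocklist:
    """Naughty list whose entries expire, so reassigned dynamic IPs recover."""

    def __init__(self, ttl_seconds=86400.0):
        self.ttl = ttl_seconds
        self._seen = {}  # ip -> last time it was observed misbehaving

    def report(self, ip, now=None):
        """Record a bad act; refreshes the expiry clock for this IP."""
        self._seen[ip] = time.time() if now is None else now

    def is_blocked(self, ip, now=None):
        now = time.time() if now is None else now
        last = self._seen.get(ip)
        if last is None:
            return False
        if now - last >= self.ttl:
            del self._seen[ip]  # expired: the IP may have a new, innocent owner
            return False
        return True
```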
What other uses could be made of this type of log data?
>It would be interesting to observe trends (such as log4j) and we could see first hand how Tor exit nodes are used for abuse and maybe collect a large list of 'known bad' IPs.
> Also, when we say an IP is bad (because it was observed doing a bad thing), how long do we keep it on the naughty list? 24 hours? More? Less?
Look at GreyNoise's public feed - they provide historical data about IPs, including the attacks they send. Most of the IPs end up being some kind of DC IP, not residential. E.g. https://viz.greynoise.io/ip/188.8.131.52
I agree with the questions you've raised, and think that vendors like Greynoise are helping sort out those issues.
Some of these come from companies that do this "as a service", even if you didn't ask for it. They can remove your IP address. I do not know what motive they have to scan third party websites, but it can't be kosher.
Depending on how badly their scanners were written, you could jam up their efforts by tarpitting them (at the expense of some resources on your side). Alternatively you could try zip/XML bombs to crash their process, or mismatched Content-Length headers to maybe cause buffer overflows. Elsewhere in these comments someone linked a Python example on GitHub for how to accomplish this.
The general trick seems to be: look at the rules of HTTP(S) and break them in fun and creative ways. Lie, break the standards, do weird networking stuff.
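In that spirit, a sketch of a tarpit that drips header bytes forever (single-connection and blocking for brevity; a real one would handle many sockets at once, the way endlessh does for SSH banners):

```python
import itertools
import socket
import time

def drip_lines():
    """Infinite stream of plausible-looking, never-ending HTTP header lines."""
    yield b"HTTP/1.1 200 OK\r\n"
    for i in itertools.count():
        yield b"X-Padding-%d: a\r\n" % i

def tarpit(port=8080, interval=10.0):
    """Accept a connection and drip one header line every `interval` seconds.

    The headers never end, so a scanner without a body/header timeout
    sits on the socket indefinitely. Handles one connection at a time.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", port))
    srv.listen(16)
    while True:
        conn, _addr = srv.accept()
        try:
            for line in drip_lines():
                conn.sendall(line)
                time.sleep(interval)  # slow drip; the scanner never sees \r\n\r\n
        except OSError:
            pass  # client gave up
        finally:
            conn.close()
```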
If they're coming from a country with an oppressive government that you don't mind risking a ban from, you may be able to get their government's firewall to get rid of them by sending forbidden texts, or HTTP 302 redirecting them to search engines with forbidden texts in their queries. For residential Chinese scanners, for example, querying for information about the Tiananmen Square massacre can cause the entire internet connection to get dropped for a short while. This may not work well with data center/server connections, but it can't hurt to try.