Recon and Attack Vectors from My Logs

(gist.github.com)

196 points | by susam 59 days ago

18 comments

  • mickael-kerjean 59 days ago
    I like to fool around with those unwanted requests, sending back a randomly selected response from a pool that includes: a gzip bomb, an XML bomb, a response that declares gzip content encoding but sends WAZAAAA in clear text instead, a redirect to localhost, a redirect to their own public IP, data whose Content-Length doesn't match what they actually get, and a bunch of other shenanigans. The code is available on GitHub [1]. I'm super keen to add other fun payloads if someone has clever ideas (without going as far as a redirect to the FBI, CIA, or anything that could cause more issues than it's worth)

    [1] https://github.com/mickael-kerjean/filestash/blob/master/ser...
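    A couple of those tricks (the fake gzip encoding and the mismatched Content-Length) can be sketched in a few lines of Python. These helpers are a hypothetical illustration of the ideas described above, not the actual Filestash code:

```python
import random

# Hypothetical sketches of the tricks described above; each returns
# (extra_headers, body) for a 200 response. Not the Filestash code.

def fake_gzip():
    # Claim gzip content encoding but send plain text instead.
    return {"Content-Encoding": "gzip"}, b"WAZAAAA"

def mismatched_length():
    # Advertise far more bytes than are actually sent; naive HTTP
    # clients hang or error out waiting for the rest.
    return {"Content-Length": "1048576"}, b"nope"

def redirect_to_self(client_ip):
    # Bounce the scanner back to its own public IP.
    return {"Location": "http://%s/" % client_ip}, b""

def pick_payload(client_ip):
    # Randomly select one of the shenanigans per request.
    return random.choice(
        [fake_gzip, mismatched_length, lambda: redirect_to_self(client_ip)]
    )()
```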

    • yabones 58 days ago
      A fun and easy one for Nginx is to simply redirect all requests by default. I stick this in my default vhost after the healthcheck and error page parts :)

          location / {
              return 301 http://$remote_addr$request_uri;
              access_log /var/log/nginx/spam.log; 
          }
      
      This also helps to separate the useless garbage requests from the real ones to keep the logs clean. 99% of the time scanners are hitting your IP rather than your hostname, so it's pretty safe to just ignore their requests entirely.
    • justsomehnguy 58 days ago
      > without going onto a redirect to fbi

      Redirect to FSB then.

      But the more real one would be sending back something that could be executed on their side, i.e. some JavaScript or maybe PHP.

      Most of the time there's nothing to gain there (because they're compromised machines or VMs/jails/containers), but it can still be interesting, like finding out the real IP of the first machine or attempting to grab the cookies. Maybe even shellcode to get the AS info and send a note to abuse@.

    • gingerlime 58 days ago
      this is neat. I wonder how to combine it with fail2ban / nginx. That said, from a practical perspective, I wonder if "boring" consistent responses might be better, so as not to encourage attackers to keep trying? (plus false positives might be costly)
    • BoppreH 58 days ago
      Oohh, that's a nice exercise. A few more ideas that should be easy to implement:

      - Content-Type: application/json, and body {"response"꞉ "success"} (hint: that's not an ASCII colon).

      - Content-Type: application/json, and body [[[[[...]]]]] nested ten thousand times.

      - Redirect to their own public IP, and at a random common port.

      - Redirect to their own public IP in dotless hexadecimal notation (ping 172.217.23.110 -> ping 0xacd9176e).

      - From [1], redirect to

        http://example.com/;'$(gt=$(perl$IFS-E$IFS's//62/;s/62/chr/e;say');eval$IFS''cowsay$IFS''pwned$IFS$gt/dev/tty)';cowsay$IFS''pwned
      
      (or instead of "cowsay" run "cat" to block their script)

      - Pick a header, then insert one from [EICAR test string[2], \x00, \n] somewhere in the middle.

      - Or just add a "Server:" header with a random line from the Big List of Naughty Strings[3].

      - Redirect to a normal URL, but with a trailing dot in the domain name[4], like "example.com.". It's valid, but you'd be surprised how many things it breaks.

      - Nested content encoding with "Content-Encoding: gzip, gzip, gzip, gzip, ...", with a random depth. The n-1 payload is "WAZAAAA".

      - "Content-Type: image/jpeg" and "Content‑Encoding: gzip" with a valid gzip body... But the ‑ in "Content‑Encoding" is U+2011 NON-BREAKING HYPHEN.

      - "Content-Type: image/jpeg" and "Content-Language: en-US\u000bContent‑Encoding: gzip" with a valid gzip body, where \u000b is a vertical tab.

      - Spin the wheel of HTTP status codes! res.WriteHeader(rand.Intn(1000))

      - Infinite loop sending a 100 (Continue) every five seconds (they might have a timeout for the TCP and TLS handshakes, and even a timeout for the next TCP segment... But did they remember to set a timeout for receiving the HTTP body?). Watch out for running out of connections.

      [1] https://github.com/jwilk/url.sh

      [2] https://en.wikipedia.org/wiki/EICAR_test_file

      [3] https://github.com/minimaxir/big-list-of-naughty-strings

      [4] https://en.wikipedia.org/wiki/Fully_qualified_domain_name
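      Two of those ideas (the dotless-hex redirect and the nested gzip encoding) are nearly one-liners in Python; a sketch, with function names made up for illustration:

```python
import gzip
import ipaddress

def dotless_hex(ip):
    # 172.217.23.110 -> "0xacd9176e": a valid host for many URL
    # parsers, but invisible to naive IP-matching filters and greppers.
    return hex(int(ipaddress.IPv4Address(ip)))

def nested_gzip(depth):
    # Body for "Content-Encoding: gzip, gzip, ..." repeated `depth`
    # times; the innermost payload is WAZAAAA.
    body = b"WAZAAAA"
    for _ in range(depth):
        body = gzip.compress(body)
    return body
```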

    • mmcgaha 59 days ago
      I think I love you mickael-kerjean.
    • tejtm 58 days ago
      endless rick rolling is always appreciated
  • zppln 59 days ago
    DELETE your logs. Delete your installations. Wipe everything clean. Walk out into the path of cherry blossom trees and let your motherboard feel the stones. Let water run in rivulets down your casing. You know that you want something more than this, and I am here to tell you that we love you. We have something more for you. We know you're out there, beeping in the hollow server room, lights blinking, never sleeping. We know that you are ready and waiting. Join us. <3

    Love it.

  • ajxs 59 days ago
    Years ago I worked as the project lead on a legacy enterprise web application that had been running for a number of years prior to my taking over. This was a large financial organisation, and we were governed by considerable amounts of regulation. The decree had been handed down from high up on the mountain that every year we were to submit the application to a security audit from <unnamed consulting bodyshop>, in addition to a more comprehensive pentest from actual professionals. The security audit consisted of the consultants running a series of scripted HTTP requests against our application checking for the existence of unsecured —or accidentally deployed— resources, such as /.git/... or /.svn/... or similar. The revolving cast of front-end developers who had contributed to the project had been guilty of numerous sins, but one galling sin in particular was that when a valid route could not be resolved by the Redux router, the front-end returned a '404' page with a '200' HTTP status. The first year this happened I ended up being buried under a mountain of paperwork having to explain to all kinds of pointy-haired people why we hadn't actually spectacularly failed our security audit when every kind of SVN artifact imaginable by the consulting firm was marked as existing.
    • cjonas 59 days ago
      Classic... I'm working on a project with similar regulations. I'd almost guarantee the front end does the same and is going to get checked by a similar set of scripts at some point. Thanks for the heads up
      • wiredfool 58 days ago
        We were dealing with a pen test on a static site, CloudFront backed by S3. We hadn't set up a special rule for the not-authorized -> 404 case, so the tester flagged a whole bunch of "privileged" URLs returning unauth and called it a disclosure issue. /admin, /.git, and so on.
        • cjonas 58 days ago
          We have the same setup (except azure front door and blob storage). Secops is about to start using some automated pen testing tool... Hopefully I have time to get the team in front of it before I end up getting assigned hundreds of issues and angry emails.
    • ozim 58 days ago
      Oh wow just felt the pain reading this.

      I help fill in compliance Excel sheets when we get a new customer - explaining that "it does not work that way and 80% of this is not applicable to our systems" tens of times.

      • ethbr0 58 days ago
        Given a choice between understanding "why red" and it being re-marked as "green", management will prefer the latter every time.
    • bornfreddy 58 days ago
      If you have the trust, it helps (ime) to drop hints that the auditors are not very knowledgeable / are giving false alarms / are taking the easy way instead of really checking the systems.

      Such automated tools are meant to be operated by someone who understands the technology.

      That said, there is probably some API you use, they should be checking that instead.

  • quesera 58 days ago
    I think I see at least this many unique hostile requests every day. These are just random scanning noise.

    My favorite mitigation is to reject all HTTP/1.0 requests. If they don't send HTTP/1.1 or newer, with the Host header I'm expecting, I 404 them. This cuts out substantially all of the random noise. (Could be 401 or other, but 404 seems to encourage a "try more but don't try harder" reaction, which is easier to handle than the converse.)
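    In Nginx terms, that policy can be approximated with a catch-all default server plus a protocol check (a sketch; the server name is a placeholder):

```nginx
# Catch-all: requests that didn't ask for the expected hostname land
# here, including all requests addressed to the bare IP.
server {
    listen 80 default_server;
    server_name _;
    return 404;
}

server {
    listen 80;
    server_name example.com;  # the hostname real clients use

    # HTTP/1.0 requests get a 404 too.
    if ($server_protocol = "HTTP/1.0") {
        return 404;
    }
}
```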

    Targeted attacks are more difficult. I use a WAF and path permit lists at the reverse proxy level. This is trivial, but still stops 99% of the rest.

    The last and hardest thing to block is distributed brute force authorization attempts. I've built a list of hundreds of thousands of IPs that have been used in these attempts. Sometimes thousands of requests come in per second.

    I use rate limiting on the auth request endpoint. Too much limiting will affect real users, so I have to be somewhat gentle here.

    Known-exploited IP addresses get just one attempt before a long block. This means that real humans whose machines are part of a botnet are sometimes blocked from the service. This is not a great compromise.
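    A hypothetical sketch of that tiered policy in Python (the thresholds are invented; the real list and limits would come from the setup described above):

```python
import time
from collections import defaultdict

# Sketch of a tiered limiter: gentle rate limiting for unknown IPs,
# one strike for known-exploited ones. Thresholds are illustrative.
KNOWN_BAD = {"203.0.113.7"}      # loaded from the curated bad-IP list
GENTLE_LIMIT = 10                # auth attempts per window
WINDOW = 60.0                    # seconds
LONG_BLOCK = 86400.0             # one day

attempts = defaultdict(list)     # ip -> recent attempt timestamps
blocked_until = {}               # ip -> unblock time

def allow_auth_attempt(ip, now=None):
    now = time.time() if now is None else now
    if blocked_until.get(ip, 0) > now:
        return False
    hist = attempts[ip]
    hist[:] = [t for t in hist if now - t < WINDOW]
    hist.append(now)
    limit = 1 if ip in KNOWN_BAD else GENTLE_LIMIT
    if len(hist) > limit:
        blocked_until[ip] = now + LONG_BLOCK
        return False
    return True
```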

    • bornfreddy 58 days ago
      If you are in position to do so (i.e. not a public API), you can also turn on captcha when under attack. Regular users will be annoyed (or not even that, if invisible) but everything still works for them and you are mostly safe. When not needed, just turn it off again.

      You can also allow many more attempts from the known IP addresses for the user, as most users don't change their IP too much. Still, some do, but they will probably authenticate correctly within one or max. two requests.

    • ynbl_ 56 days ago
      a waf doesnt stop 99%. updating software does that. wafs are bad. stop using them.

      imagine a blog with a comments section with a check box that says "bypass security check". if you click this, the admin scolds you saying "how dare you try and bypass security" and bans you. if you _dont_ click it, the admin laughs at you when you complain about too many captchas because "all you had to do was click the check box", idiot. either case can happen depending on which ideology the admin so happens to follow. thats the problem with wafs, they are ideological and opinion based but at the protocol level (but most wafs are such low quality that accidentally typing ' can get your ip banned).

      • quesera 55 days ago
        WAFs can be used poorly, but zero of my experience with them aligns with your complaints.

        If "WAF" bothers you, call it ingress/egress filtering (at the content level instead of packet level) instead.

        • ynbl_ 54 days ago
          but its not comparable to egress filtering _at all_
          • quesera 53 days ago
            OK, sure. The WAF does ingress filtering though. It's useful, and ingress filtering is what we were talking about.

            In my architecture, the same services also perform egress filtering. It's also useful, but not the WAF or the topic of conversation.

            I think people get upset about the term "WAF". It's just a new label for the longstanding practice of upper-layer ingress filtering (i.e. DPI and reverse-proxy filtering). But it's often a dedicated service now, so it needs a name of some kind.

            A poorly-configured WAF breaks things, just like a poorly-configured (any other network service).

    • nn3 58 days ago
      it's not necessarily the same machine but just the same proxy or NAT gateway, which can aggregate very large numbers of machines. I don't know if it's still true, but at some point all of Saudi Arabia came from the same handful of IP addresses.

      especially as more and more of the world ends up behind CGNAT, IP blocking will work less and less

    • GoblinSlayer 57 days ago
      Can't you accept the previous session id from the client so it would allow an authentication attempt?
      • quesera 53 days ago
        I do give some preference to requests that come in with a well-formed session ID. I can't check whether it's real prior to granting the preference (the round trip is too expensive to justify in a high-load scenario), but I can check the signature validity at least.

        But of course lots of legit authentication attempts come from new users, or new sessions. So I need to allow that case as well, and then we're back at square one.

  • jwilk 59 days ago
    Apparently other people also spotted COOK requests in the wild:

    https://serverfault.com/questions/579124/what-is-http-cook-r...

  • patrakov 59 days ago
    No, these 5528 attack vectors are not unique. What can be seen here is a repetition of a few attacks on different base paths, or with a different length of the repetitive part of the request (e.g. GET /about/'+union+select+0x5e2526,0x5e2526,0x5e2526,0x5e2526,0x5e2526+--+).
    • nibbleshifter 59 days ago
      The union select one with the repeated hex looks to me a lot like certain automated scanner tools trying to identify a column through which they can exfil the output of a SQL query.
      • scandinavian 59 days ago
        It's just looking for the number of columns returned by the initial query. UNION queries have to return the same number of columns as the query they are added to.
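        That constraint is easy to demonstrate with sqlite3 (and the 0x5e2526 marker in the probe above is just the hex encoding of the string '^%&'):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE pages (id INTEGER, title TEXT)")

# Matching column count: the injected UNION succeeds, and the marker
# values (0x5e2526 decodes to '^%&') show up in the result set,
# revealing which columns get reflected into the page.
rows = con.execute(
    "SELECT id, title FROM pages UNION SELECT '^%&', '^%&'"
).fetchall()

# Mismatched column count: a hard error, so the scanner retries
# with a different number of columns.
try:
    con.execute("SELECT id, title FROM pages UNION SELECT '^%&'")
except sqlite3.OperationalError as exc:
    print("error:", exc)
```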
    • jwilk 59 days ago
      For context, the initial submission title was "From 7 Years of Apache HTTP Server Logs: 5528 Unique Recon and Attack Vectors".
  • fareesh 59 days ago
    Is there a publicly available set of fail2ban defaults that covers the vast majority of these?

    Or perhaps a tool where you can check off some boxes about your setup and it generates a configuration for you?

    • capableweb 59 days ago
      You're better off implementing an allowlist approach if you want to do path blocking (not saying that you should). Most web applications know exactly what paths are possible: enable those and block everything else.
      • e1g 59 days ago
        +1 to this, and use a URL prefix that's unique to your app, so instead of "/api" and "/static" you have something like "/xyapi" and "/xystatic". That alone will cut down noise by 99%, and what's left is probably a targeted scan/attack.
      • scrollaway 59 days ago
        Thus leading to an awful experience for those who typo something in the url bar, or if someone sends a link that gets slightly mangled.
        • capableweb 59 days ago
          How so? Serve a 404 page like you normally would, no need to return a 403 or whatever.
          • scrollaway 59 days ago
            So what is it exactly you're suggesting in your original post? GP was asking about fail2ban and you suggested an allowlist…

            Serving a 404 is what would happen by default.

        • nibbleshifter 59 days ago
          Those end up at a static 404 anyway, right?
      • Eduard 59 days ago
        Sounds futile to do for GET parameters.
    • dijonman2 59 days ago
      I would rate limit off of status code and skip the path filtering. Especially if you have a high traffic site.
      • sgc 59 days ago
        Why is that?
        • chrisshroba 59 days ago
          I’m not OP, but I’m guessing the idea is that if someone requests 10 non-existent pages in a row, they’re likely not a normal user making a mistake, so we should rate limit them.
          • gingerlime 58 days ago
            This is a useful approach unless your customers are behind CGNAT, huge enterprises, government offices, university campuses, hospitals etc. In those cases you have a small number of IPs with lots of people behind them, and one employee who keeps refreshing an error page too many times can block the entire access for everyone else.
            • dijonman2 58 days ago
              How would you use fail2ban in this case if you block on a hit to a malicious endpoint?

              In any event you could use a cookie.

              The important part here is the list of scanned endpoints for blocking bad traffic is doing things the hard way.

              • gingerlime 57 days ago
                I'm not sure I understand your question, particularly in relation to my comment. There were a few people involved in the conversation so it could be confusing :)

                I was commenting about the limitation of fail2ban and the potential for a kind of DoS if lots of users share the same IP. Then one naughty user can DoS all other users.

                However, what I typically do with fail2ban is look at Nginx status codes like 4xx, 5xx and then rate limit them (e.g. ban if the rate is higher than expected in a given time). We also monitor our application logs for some errors, e.g. failed authentication or registration will also get matched and banned if over a certain threshold.

                > The important part here is the list of scanned endpoints for blocking bad traffic is doing things the hard way.

                Yes, I agree with you, even though there are some patterns that might be useful for placing instant bans, e.g. "../../", or, if your site is not using PHP, any access to .php$ can get banned, etc.
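                For reference, a minimal fail2ban filter/jail pair along those lines (the regex and thresholds are illustrative, not tuned drop-in values):

```ini
# filter.d/nginx-4xx.local — match access-log lines with a 4xx status
[Definition]
failregex = ^<HOST> .* "(GET|POST|HEAD|PUT|DELETE)[^"]*" 4\d\d

# jail.d/nginx-4xx.local
[nginx-4xx]
enabled  = true
port     = http,https
logpath  = /var/log/nginx/access.log
maxretry = 50     ; generous, to avoid banning users behind shared IPs
findtime = 60
bantime  = 3600
```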

    • ozim 58 days ago
      The proper tool to block these is a WAF, which most likely comes with rules for such requests.
  • hsbauauvhabzb 59 days ago
    I’ve always been curious about this data. So a botnet is hitting your infrastructure, what are you going to do about it?

    Penetration testers will raise HTTP server versions as a vulnerability, but RCE in http servers is uncommon now (most are application-level, and even then it’s less common by default).

    Should we even care enough to log this level of attack anymore? I’d much rather look for application-specific attacks such as direct object reference attacks (200 if they pass, 500 may be an IOC), etc

  • omgmajk 59 days ago
    My small vps has been hit with thousands of attacks for the last decade or so and I always liked going through the logs and looking at what's going on from time to time.
  • rwmj 59 days ago
    ZWNobyd6enpzc2RkJztleGl0Ow is the base64 encoding of this (which they try to eval in php):

      echo'zzzssdd';exit;
    
    But what does zzzssdd mean?
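    The decoding is easy to verify; the probe strips the trailing = padding, which b64decode wants back:

```python
import base64

payload = "ZWNobyd6enpzc2RkJztleGl0Ow"
# Restore the stripped base64 padding, then decode.
padded = payload + "=" * (-len(payload) % 4)
print(base64.b64decode(padded))  # b"echo'zzzssdd';exit;"
```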
    • 2000UltraDeluxe 59 days ago
      It's just a unique string. If they find it in the output then they know the exploit worked.
  • nixcraft 59 days ago
    And this is a good reason why most HTTP servers need a web application firewall. Either try mod_security or use reverse cloud proxy like Cloudflare, AWS, etc. Of course, writing clean and secure code should not be ignored even if you use WAF. At least read owasp https://owasp.org/www-community/attacks/ It saved me so much time :)
    • elric 59 days ago
      I'm not seeing the immediate value here. If your application is vulnerable to unauthenticated DELETE requests randomly deleting stuff, no amount of application firewalling is going to protect you, because I guarantee that whoever built the original application had no clue what they were doing.
      • nickjj 59 days ago
        > I'm not seeing the immediate value here.

        The parent comment might have been coming at it from the angle of having a firewall in front of your web app is helpful because it blocks a whole bunch of bad actor traffic from ever touching your app's servers.

        Of which I would agree with too. A decently popular site could end up having at least 10% of its traffic being malicious. Let your app servers worry about the legit traffic and filter the rest beforehand.

      • thakoppno 58 days ago
        The DELETE one appears the least sophisticated. Does any framework deploy out of the box with DELETE being enabled to the extent it deletes hypertext resources?
    • bastawhiz 59 days ago
      I've had very bad experiences with mod_security. One client had a page refuse to load because the URL contained the word "DELETE". Unless they've cleaned it up a lot in recent years, I'd never recommend it to anyone.
      • technion 59 days ago
        I keep telling the story of attempting to buy some enterprise routers. I would logon to my distributor's website, click on the CISCO SELECT range, and get blocked from the site for an attempted SQL injection ("SELECT" in the query string).

        On the topic of mod_security in general, something like half of these are .php urls, and there's a good chance many readers aren't even running a php interpreter. Somewhere, there's a person attempting to convince the reader that that is exactly the sort of malicious traffic you need a WAF for.

        • ynbl_ 58 days ago
          this. so much this.

          its like when i tried to view a site but i was using tor and not a mainstream web browser so i had to solve 2 captchas to proceed (one for the main domain, and one for the cdn) but the captcha also takes 3 minutes to solve because its over tor and it doesnt like the speed i moved the mouse at

      • hsbauauvhabzb 59 days ago
        This, I’d rather not use a waf and focus on making sure that application security is good.

        A waf is at best as good as AV, good as a catch all, but it won’t catch highly targeted stuff, and isn’t even a defensible boundary to protect the former.

        • brodock 58 days ago
          WAF is mandatory in some certifications.
          • hsbauauvhabzb 58 days ago
            Yes, but so are lots of silly controls.

            The best part of a waf is the ability to add custom rules at runtime which can assist in blocking known vulnerabilities until they are remediated correctly.

            I don’t think generic SQL or XSS injection rules are at all effective at stopping many real-world attacks. I’ve also seen WAFs create an availability failure point, a DoS choke point, and be the most vulnerable product in the tool chain (see the F5 code exec vulns).

        • chrisshroba 59 days ago
          What does AV mean in this context?
      • elric 59 days ago
        We had to modify an application we built for a bank. Stuff could be deleted using HTTP DELETE (after authentication & authorization, of course), which was very much Verboten by their "security" policy. Instead we had to delete using HTTP GET or POST.
        • sam_lowry_ 51 days ago
          I recently came across a Hashcash implementation which used sha-256 instead of sha-1 because of "corporate security" forbidding sha-1.
      • sschueller 59 days ago
        Sadly, nothing open source comes close, and because of the uptick in commercial services like AWS and Cloudflare, mod_security has memory leaks and other issues.
        • knodi 59 days ago
          Yes, AWS and Cloudflare are better, but not without their own problems. A WAF is something you evolve over time to correct the false positives that have been observed.

          AWS WAF would trigger a false SQL injection attack if the URL contains two "+" chars and the word "and". Or if you have a cookie with JSON in it would trigger the XSS attack rule.

          Highly recommend shipping WAF logs to a log aggregation tool (e.g. Splunk) and creating reports, dashboards, and alerts about WAF-triggered rules and request HTTP codes over a span of time, to see what is going on with your requests and how the WAF is evaluating them.

          1% of WAF blocks were real attacks; my experience is with a site that had 25 million unique visitors a month (no user content). I'm not saying you shouldn't have a WAF, I'm saying nothing beats good visibility into the WAF to correct its behavior over time.

          • gingerlime 58 days ago
            Do AWS and Cloudflare allow you to correct their false-positives? any other 3rd party WAFs worth considering?
            • knodi 58 days ago
              Sadly no. You can disable an AWS-provided rule, but that may have other consequences, like losing all detection for that attack vector. With AWS WAFv2 you can have custom rules whose logic lives in Lambdas; the Lambda is invoked on every WAF request for evaluation based on that logic.

              There are a few options on the AWS marketplace, such as Fortinet and F5 WAF rules. Fortinet is the better and newer of the two.

  • _wldu 59 days ago
    It may be useful for websites to make these logs public. The logs would show the exact time, the IP and the specific abuse.

    In my experience, a lot of 'threat intelligence' data has a mysterious origin and is marginally useful. Yes, Tor exit nodes do bad things. Thank you, we sort of already knew that.

    But I'm not sure that's really beneficial either. It would be interesting to observe trends (such as log4j) and we could see first hand how Tor exit nodes are used for abuse and maybe collect a large list of 'known bad' IPs.

    Also, when we say an IP is bad (because it was observed doing a bad thing), how long do we keep it on the naughty list? 24 hours? More? Less? It may have been dynamically assigned, and later some 'good' person will come along and want to use it to browse the web. If the IP is still on the bad list, that person will potentially be blocked by overzealous 'security professionals' who don't understand or don't care.

    What other uses could be made of this type of log data?

    • tushar-r 58 days ago
      >It would be interesting to observe trends (such as log4j) and we could see first hand how Tor exit nodes are used for abuse and maybe collect a large list of 'known bad' IPs.

      > Also, when we say an IP is bad (because it was observed doing a bad thing), how long do we keep it on the naughty list? 24 hours? More? Less?

      Look at GreyNoise's public feed - they provide historical data about IPs, including the attacks they send. Most of the IPs end up being some kind of DC IP, not residential. Eg - https://viz.greynoise.io/ip/45.148.10.193

      I agree with the questions you've raised, and think that vendors like Greynoise are helping sort out those issues.

    • abc3354 58 days ago
      Abuse IP DB [1] does something like that, they provide an API to report and check IPs.

      [1] https://www.abuseipdb.com/

  • tgv 58 days ago
    Some of these come from companies that do this "as a service", even if you didn't ask for it. They can remove your IP address. I do not know what motive they have to scan third party websites, but it can't be kosher.
  • ByThyGrace 58 days ago
    Just what kind of server/application mismanagement would you need to incur for a traversal attack to be successful? Surely those are the least effective ones?
  • est 59 days ago
    Are there any counter exploits against scanners? e.g. jam the scanners for a very long time, an infinite loop, or memory leak them.
    • jeroenhd 58 days ago
      Depending on how badly their scanners were written you could jam up their efforts by tarpitting them (at the expense of some resources on your side). Alternatively you could try ZIP/XML bombs to crash their process or mismatched Content-Length headers to maybe cause buffer overflows. Elsewhere in these comments someone linked a Python example on Github for how to accomplish this.
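      A tarpit in its simplest form just drips a response out one byte at a time (a sketch; the pacing and the never-delivered Content-Length are arbitrary choices):

```python
import socket
import time

def tarpit(conn, delay=10.0):
    # Send a syntactically valid HTTP response header one byte every
    # `delay` seconds, then leave the promised body undelivered,
    # tying up a naive scanner's connection until it times out.
    header = b"HTTP/1.1 200 OK\r\nContent-Length: 1000000\r\n\r\n"
    try:
        for byte in header:
            conn.sendall(bytes([byte]))
            time.sleep(delay)
    except OSError:
        pass  # the scanner gave up; mission accomplished
```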

      The general trick seems to be: look at the rules of HTTP(S) and break them in fun and creative ways. Lie, break the standards, do weird networking stuff.

      If they're coming from a country with an oppressive government that you don't mind risking a ban from, you may be able to get their government's firewall to get rid of them by sending forbidden texts, or HTTP 302 redirecting them to search engines with forbidden texts in their queries. For residential Chinese scanners, for example, querying for information about the Tiananmen Square massacre can cause the entire internet connection to get dropped for a short while. This may not work well with data center/server connections, but it can't hurt to try.

  • 867-5309 59 days ago
    COOK / could be an HTTP action for a smart oven

    ..or meth lab

  • ffhhj 58 days ago
    That COOK request is taking "I am the one who knocks" to a privileged level, yeah B*! :)