An interesting thing that I've noticed is that some of the attackers watch the Certificate Transparency logs for newly issued certificates to get their targets.
I've had several instances of a new server being up on a new IP address for over a week, with only a few random probing hits in access logs, but then, maybe an hour after I got a certificate from Let's Encrypt, it suddenly started getting hundreds of hits just like those listed in the article. After a few hours, it always dies down somewhat.
The take-away is, secure your new stuff as early as possible, ideally even before the service is exposed to the Internet.
> The take-away is, secure your new stuff as early as possible, ideally even before the service is exposed to the Internet.
Honestly it feels like you'll need at least something like basicauth in front of your stuff from the first minutes it's publicly exposed. Well, either that, or run on your own CA and use self-signed certs (with mTLS) before switching over.
For example, when some software still has initial install/setup screens where you create the admin user, connect to the DB, and so on, as opposed to specifying everything up front in environment variables, config files, or other more specialized secret-management solutions.
Yes, if you follow the advice of "not exposing anything until you've deployed the security for it", then of course you block password auth before exposing SSH to the internet.
Not everyone is following that advice. Just last week I taught a friend about using tmux for long-running sessions on their lab's GPU server, and during the conversation it transpired that everyone was always sshing in using the root password. Of course plugging that hole will require everyone from the CTO downward to learn about SSH keys and start using them, so I doubt anything will change without a serious incident.
Are we just speculating? SSH scanners are not sources of DDoS. Large companies have SSH bastions on the internet and do not worry about SSH DDoS. It's not really a thing that happens.
You don't need to freak out if you see a bunch of failed ssh auth attempts in your logs. Just turn off password based authentication and rest easy.
You want to keep these things behind multiple locked doors, not just one.
For the servers themselves, you shouldn't be able to get to sshd unless you're coming from one of the approved bastion servers.
You shouldn't be able to get to one of the approved bastion servers unless you're coming from one of the approved trusted sources, on the approved user access list, and using your short-lived sshd certificate that was signed through the use of a hardware key.
And all those approved sources should be managed by your corporate IT department, and appropriately locked down by the corporate MDM process.
And you might want to think about whether you should also be required to be on the corporate VPN. Or, to be using comparable technologies to access those approved sources.
Agreed. Another thing you can do to drastically reduce the amount of bots hitting your sshd is to listen on a port that is not 22. In my experience, this reduces ~90% of the clutter in my logs. (Disclaimer: this may not be the case for you or anyone else)
Just to reduce the crap in the log, and also because I can, I have my SSH servers (not saying what their IPs are) behind a very effective measure: traffic is dropped from the entire world, except for the CIDR blocks (which I keep in ipsets) of the five ISPs, across three countries, that I could reasonably be connecting from when I need to access the SSH servers.
And if I'm really, say, in China or Russia and really need to access one of my servers over SSH, I can use a jump host in one of the three countries that I allow.
So effectively: DROPping traffic from 98% of the planet.
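The membership check behind that kind of ruleset is simple. Here is a Python sketch of the logic (the CIDR blocks are placeholder documentation ranges, not real ISP blocks; in practice the filtering happens in the kernel via ipset/iptables, this just illustrates the test):

```python
import ipaddress

# Hypothetical allowlist: the CIDR blocks of a few ISPs across a few
# countries. These are documentation/example ranges, not real ISP blocks.
ALLOWED_BLOCKS = [
    ipaddress.ip_network("203.0.113.0/24"),   # ISP A
    ipaddress.ip_network("198.51.100.0/24"),  # ISP B
    ipaddress.ip_network("2001:db8::/32"),    # ISP C (IPv6)
]

def is_allowed(source_ip: str) -> bool:
    """True if the source address falls inside any allowed block."""
    addr = ipaddress.ip_address(source_ip)
    # Membership across mismatched IP versions is simply False.
    return any(addr in net for net in ALLOWED_BLOCKS)

print(is_allowed("203.0.113.42"))  # True: inside ISP A's block
print(is_allowed("192.0.2.1"))     # False: the other 98% of the planet
```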
This is the way, outbound only connections so you can stop all external unauthenticated attacks. I wrote a blog 2 years back comparing zero trust networking using Harry Potter analogies... what we are describing is making our resources 'invisible' to silly muggles - https://netfoundry.io/demystifying-the-magic-of-zero-trust-w...
I used to have an iptables config that dropped everything by default on the SSH port, plus a DNS server that, when queried with a magic string, would allow my IP to connect to SSH. It helped that the DNS server actually managed a domain and saw real traffic, so you couldn't isolate my magic queries easily.
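A minimal sketch of the parsing half of that trick: extract the query name from a raw DNS packet and check it for a magic label. The label is hypothetical, and the parser is deliberately simplified (UDP query only, no compression pointers):

```python
import struct

MAGIC = "open-sesame"  # hypothetical secret label

def encode_query(name: str) -> bytes:
    """Build a minimal DNS query packet for `name` (A record, IN class)."""
    header = struct.pack(">HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in name.split("."))
    return header + qname + b"\x00" + struct.pack(">HH", 1, 1)

def qname_from_query(packet: bytes) -> str:
    """Extract the query name from a raw DNS query (no compression pointers)."""
    pos, labels = 12, []  # skip the fixed 12-byte header
    while packet[pos] != 0:
        length = packet[pos]
        labels.append(packet[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    return ".".join(labels)

def should_whitelist(packet: bytes) -> bool:
    # If the magic label shows up, add the source IP to the allow ipset.
    return MAGIC in qname_from_query(packet).split(".")

print(should_whitelist(encode_query("open-sesame.example.com")))  # True
print(should_whitelist(encode_query("www.example.com")))          # False
```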
Yes, better to make your bastion 'dark' without being tied to an IP address. This is how we do it at my company with the open source tech we have developed - https://netfoundry.io/bastion-dark-mode/
Until a junior from another project enables password-based root logins, because the Juniper team that was on site to help them install a beta version of some software they collaborated on asked them to.
A few days later, they asked to redirect an entire subnet to their rack.
And yes, you still need to remember to disable password logins, or at least pick serious passwords if you need them. It helps to have no root login over SSH, and normal users that aren't the defaults for some distro...
I am not blaming this on SSH (also, no longer in that org for many years).
I am just pointing out (as in a few other, off-site discussions) that one should not even think of exposing a port before finishing locking it down.
Because sometimes people forget, even experienced people (myself included), and sometimes that's enough (I think someone submitted a story a few weeks ago about getting pwned through an accidentally exposed Postgres?).
And there are enough people who get it wrong, for various reasons, that the lowest of low script kiddies can profit by buying ready-made extortion kits on Chinese forums, getting a single Windows VM to run them, and extorting money from gambling/gameserver sites. Not to mention all the fun stuff you find if you search for open VNC/RDP.
Your security is only as good as the people running your system. Unfortunately, not everyone has teams of the best of the best. Sometimes you get the junior dev assigned to things. They do not know any better and just do as they are told. It is the deputized-sheriff problem.
In that case it wasn't even the junior's fault - they were following experts from Juniper who were supposed to be past-masters on installing that specific piece of crap (as someone who later accidentally became developer of that piece of crap for a time, I feel I have the basis for the claim).
And those people told him the install system didn't support SSH keys (hindsight: it did) and got him to enable root logins with passwords. Passwords that weren't particularly hard to guess, because their only expected and planned use was for the other team to log in for the first time, via the BMC, and set their own before the machines were exposed to the internet.
I wish. I use basicauth to protect all my personal servers, the problem is Safari doesn't appear to store the password! I always have to re-authenticate when I open the page. Sometimes even three seconds later.
Was looking into Certificate Transparency logs recently. Are there any convenient tools/methods for querying CT logs? E.g., searching for domains within a timeframe.
Cloudflare’s Merkle Town[0] is useful for getting overviews, but I haven’t found an easy way to query CT logs. ct-woodpecker[1] seems promising, too
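One low-effort option is crt.sh, which exposes an unofficial JSON endpoint (e.g. https://crt.sh/?q=%25.example.com&output=json). Here is a sketch of pulling unique DNS names out of such a response; the `name_value` field layout is an assumption based on what that endpoint has returned, so verify it before relying on it:

```python
import json

def domains_from_crtsh(json_text: str) -> set:
    """Collect unique DNS names from a crt.sh JSON response body."""
    names = set()
    for entry in json.loads(json_text):
        # name_value appears to hold newline-separated DNS names
        # from each certificate (assumption about the unofficial API).
        for name in entry["name_value"].splitlines():
            names.add(name.strip().lower())
    return names

# Hand-crafted sample in the same shape as a crt.sh response:
sample = json.dumps([
    {"name_value": "dev.example.com\napi.example.com"},
    {"name_value": "api.example.com"},
])
print(sorted(domains_from_crtsh(sample)))
# ['api.example.com', 'dev.example.com']
```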
It seems like the principle of least power would apply here. There's value in restricting capability to no more than strictly necessary. Consider the risk of a compromised some-small-obscure-system.corporate.com in the presence of a mission-critical-system.corporate.com when both are issued wildcard certs.
Wildcard certs are indeed a valuable tool, but there is no free lunch.
You'd usually put a reverse proxy exposing the services and terminating TLS with the wildcard cert.
The individual services can still have individual non-wildcard internal-only certs signed by an internal CA. These don't need to touch an external CA or appear in CT logs - only the reverse proxy/proxies should ever hit these, and can be configured to trust the internal CA (only) explicitly.
A compromised wildcard certificate has a much higher potential for abuse. The strong preference in IT security is a single-host or UCC (SAN) certificate.
Renewing a wildcard is also unfun when you have services which require a manual import.
Using them like that never occurred to me. I was thinking multiple sites on one host or vanity hostnames: dfc.example.com / nullindividual.example.com. etc.
Unless you're running some sort of automated system to churn out vanity host names (like an Azure or AWS would to provide you an OOTB URI), a UCC/SAN cert is a better choice.
More restrictive is better than less restrictive when it comes to certificates.
> I've had several instances of a new server being up on a new IP address for over a week, with only a few random probing hits in access logs, but then, maybe an hour after I got a certificate from Let's Encrypt, it suddenly started getting hundreds of hits
I host so many services, but I gave up totally on exposing them to the internet. Modern VPNs are just too good. It lets me sleep at night. Some of my stuff is, for example, photo hosting and backup. Just nope all the way.
If you're the only one accessing those services, then why use a VPN instead of port mapping those services to localhost of the server, and then forwarding that localhost port to your client machine's localhost port via SSH?
I am in the same situation as the grandparent. I don't even expose the SSH port to the outside. The only open port is WireGuard's UDP port, which only accepts packets signed by the correct key. Everything works perfectly, no issues with NAT, and I even give my mobile devices an IPv6 address that my ISP allocates.
Tunneling through SSH is significantly worse, because you encapsulate a TCP connection inside a TCP connection, and it runs in userspace.
I have also set up WireGuard, but I changed my model and only use it to troubleshoot.
The reason is privacy. I use a VPN to obfuscate my IP, which means I would have to route my entire network through it. Unfortunately, this has proven surprisingly difficult to do properly, meaning with appropriate performance (MTU), IPv6, no blocking (exit IP reputation), etc.
Hence I switched to Argo/cloudflare tunnels for pretty much everything.
I work as a security engineer and, yes, the CT logs are extremely useful not only for identifying new targets the moment you get a certificate but also for identifying patterns in naming your infra (e.g., dev-* etc.).
A good starting point for hardening your servers is CIS Hardening Guides and the relevant scripts.
Fun anecdote - I wrote a new load balancer for our services to direct traffic to an ECS cluster. The services are exposed by domain name (e.g. api-tools.mycompany.com), and the load balancer was designed to produce certificates via letsencrypt for any host that came in.
I had planned to make the move the next day, but I moved a single service over to make sure everything was working. The next day, as I'm testing moving traffic over, I find that I've been rate limited by Let's Encrypt for a week. I check the database, and I had provisioned dozens of certificates: vpn.api-tools.mycompany.com, phpmyadmin.api-tools.mycompany.com, down the list of anything you can think of.
There was no security issue, but it was very annoying that I had to delay the rollout by a week and add a whitelist feature.
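For illustration, the whitelist check can be as small as this sketch (the hostnames and patterns are hypothetical):

```python
import fnmatch

# Hypothetical allowlist of hostnames the load balancer may request
# certificates for; everything else is refused before Let's Encrypt is
# ever contacted, and before any rate limit is burned.
ALLOWED_HOSTS = [
    "api-tools.mycompany.com",
    "*.static.mycompany.com",
]

def may_issue_cert(hostname: str) -> bool:
    """True if the hostname matches an allowlisted name or pattern."""
    hostname = hostname.lower().rstrip(".")
    return any(fnmatch.fnmatch(hostname, pattern) for pattern in ALLOWED_HOSTS)

print(may_issue_cert("api-tools.mycompany.com"))             # True
print(may_issue_cert("phpmyadmin.api-tools.mycompany.com"))  # False
```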
On censys.io you can search by domain, for example. Some internet-facing appliances generate certificates automatically with Let's Encrypt but use a central DNS server, meaning every one of these appliances is on the same domain, using random subdomains.
Once you've figured out what the domain is, you can easily build a list of IPs out of the certificate transparency log, and if there is ever an exploit for this specific type of appliance, attackers now have a bespoke list of IPs to hack: a dream come true.
I don't see a solution for this particular use case; I would argue self-signed certs would be more secure in this case.
Same! As soon as a new cert is registered for a new subdomain, I get a small burst of traffic. It threw me off at first; I assumed I had some tool running that was scanning it.
> The take-away is, secure your new stuff as early as possible, ideally even before the service is exposed to the Internet.
What? Ideally..before? Seriously? It is 2024.. and this was true even decades ago, absolutely mandatory.
(Still remembering the dev who discovered file sharing happening in his exposed, passwordless Mongo instance (yes, that!! :D) only hours after putting it up.. "but how could they know the host, it is secret!!" :D ).
Back when I started managing self-hosted sites, I would look through access logs as well. We even had an IDS for while that would aggregate the data and flag incoming attack attempts for us.
Eventually I stopped proactively reviewing logs and stopped paying for the IDS. It was a waste of time and a distraction.
It's not hard to find really useful content that summarizes common vulnerabilities and attacks, and just use that to guide your server management. There are a ton of best practices guides for any common web server technology. Just executing these best practices to 100% will put you way ahead of almost all attackers.
And then the next best use of your time and resources is to prioritize the fastest possible patching cadence, since the vast majority of attacks target disclosed vulnerabilities.
Where logs are super helpful is in diagnosing problems after they happen. We used log analysis software to store and search logs and this was helpful 2-3 times to help find (and therefore address) the root cause of attacks that succeeded. (In every case it turned out to be a known vulnerability that we had been too slow to patch.)
> In every case it turned out to be a known vulnerability that we had been too slow to patch.
Yes. This is why relying on "patching" is bound to fail at some point. Maybe it's a 0-day, or maybe the attackers are just quicker.
The solution to this is defence in depth, and it's very easy for most services, especially when self-hosting personal things. A few tips most people can follow:
Put up a firewall in front or put it behind VPN/tailscale.
Hide it in a subfolder. The automated attacks will go for /phpmyadmin/ ; putting it in /mawer/phpmyadmin/ means 99.9% of the attackers won't find it. (This is sometimes called security by obscurity, and people correctly say you should not rely on it, but as an additional layer it's very useful.)
Sandbox the app, and isolate the server. If the attackers get in, make it hard or impossible for them to get anywhere else.
Keep logs, they allow you to check if and how you got attacked, if the attack succeeded and so on.
Depending on the service, pick one or more of these. Add more as necessary.
The key thing is that you should not rely on any ONE defence, be it keeping it patched or firewalled, because they will all fail at some point.
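The subfolder trick above can be sketched with Python's stdlib HTTP server: anything probing the well-known paths directly just sees a 404. The /mawer prefix is the arbitrary example from the list, and a real deployment would normally do this in the reverse proxy config instead:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Obscure prefix from the example above; pick your own random string.
SECRET_PREFIX = "/mawer"

class ObscuredHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if not self.path.startswith(SECRET_PREFIX + "/"):
            # Bots probing /phpmyadmin/ etc. see nothing here.
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"admin panel placeholder\n")
```

Run it with `HTTPServer(("", 8080), ObscuredHandler).serve_forever()`. The other layers (auth, sandboxing, patching) still apply; this only filters the automated noise.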
> This is sometimes called security by obscurity and people correctly say you should not rely on it, but as an additional layer it's very useful.
Anyone who thinks “security by obscurity” is useless should try reverse engineering some properly obfuscated executables (or even code). Obscurity is absolutely useful; definitely not a complete solution by itself, but a very useful component to a security solution.
The term "security by obscurity" or "security through obscurity" implies that ONLY obscurity is being used to provide the security in question. Like leaving a totally unsecured server on your network with root access available with no password, and telnet or sshd open on a high numbered port instead of the regular ones.
Obscurity is a useful tool to be added on top of real security, and can help reduce the random baseline doorknob jiggling attacks, where people are just scanning the standard ports. But obscurity by itself is not enough to provide any real security beyond that.
Level 1 : I don't know what I'm doing, so I'll invent stuff only I know in the expectation that'll be enough. This person is told (with good reason) that security by obscurity is no security at all.
Level 2. They got the above message, so do everything right. Setups, firewalls, permissions, and so on. They are proud of their expertise and lecture level 1s all day long.
Level 3. Understand that all the fundamentals need to be done right. Add additional obscurity on top of that, because it doesn't hurt and can filter out some useless traffic. (These folk should also lecture level 1 with the simplified message, but can explain the benefits to level 2.)
The problem with HN threads like these is that I don't know who's giving the advice: level 1, 2, or 3. Equally, readers could be any level. Which might be dangerous if they are level 1.
IF you ARE level 1, learn the correct way to secure things first. THEN feel free to add obscurity onto that if you like.
I dunno, it feels to me that people are too ready to tell "white half-truths" to simplify the message, and assume everyone else is too dense to understand nuance...
Why can't we just outright say to everyone: "learn the correct way to secure things first. THEN feel free to add obscurity onto that if you like"? There's probably some way to convert that message into a catchy phrase, instead of just demonizing security by obscurity, yet having a small sect of elites in the know who break their own "rules"...
FWIW, this (rant) applies to a broader scope as well, beyond "security by obscurity".
When educating anyone about anything we tell half-truths all the time. It's a necessary path to understanding.
(For example, we teach kids the earth is round, then we tell them it's an oblate spheroid, then we tell them even that is an approximation.)
In a forum like this, the message is often distilled because one doesn't know the makeup of the audience. The simple rule is a good starting point for entry-levels.
Nuance can be hard to convey, because it's usually a combination of context and resources that determines how far down a rabbit hole you can go.
I agree that this happens all the time, in every field. I'm dumb when it comes to my car, so my mechanic gives me simple rules to follow. Doctors tell everyone to eat less, exercise more, but in truth you can eat too little, and exercise too much.
The -real- problem happens when a simplification becomes obsolete. We see this in security a lot. Someone wrote the company policies 20 years ago, and they're insisting on, say, no paste in the password field, because that's in the policy, even though it's since been shown to weaken security.
Yeah, but why is it a saying, while "security by anything-else is not good" isn't a saying?
What layer of security is sufficient on its own? None. Why is "obscurity" being singled out?
It always feels a little condescending, like "oh, I've got a cool saying that applies to most noobs in security, let's use it", implying one assumes the other party in the conversation knows less.
How would you apply this "anything else"? What "anything else" did you have in mind?
If you can come up with some good examples, maybe we can make that a thing.
Until then, we know that "Security by obscurity" is really bad, if that's the only thing you're relying on for security.
Otherwise, that would be "Security by something else plus obscurity", which might or might not be a bad thing, depending on what the "something else" happens to be.
Access lists alone are not sufficient
Authentication alone is not sufficient
IDS alone is not sufficient
Encryption in transit alone is not sufficient
Etc.
I detest this phrase right up there with "fake it till you make it". All security is by definition obscurity. Just a meaningless platitude that rhymes.
I suppose "un-formalized, un-proven security practices will eventually be broken" or "counting on hackers not to investigate your system will get you hacked" just don't roll off the tongue, though.
Can you expound on this? For example, if I add a 30 second lockout after failed authentication attempts I don’t see how that comes under any non-tortured definition of “obscurity”.
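For what it's worth, such a lockout is plain rate limiting, not obscurity: it slows an attacker down even when they know exactly how it works. A hypothetical sketch:

```python
import time

LOCKOUT_SECONDS = 30   # the 30-second lockout from the comment above
MAX_FAILURES = 3       # arbitrary threshold for this sketch

class LoginThrottle:
    """Lock an account out for a fixed window after repeated failures.

    The mechanism can be fully public and still work; nothing here
    depends on the attacker not knowing how it operates.
    """

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.failures = {}  # username -> (failure count, time of last failure)

    def is_locked(self, user: str) -> bool:
        count, last = self.failures.get(user, (0, 0.0))
        return count >= MAX_FAILURES and self.clock() - last < LOCKOUT_SECONDS

    def record_failure(self, user: str) -> None:
        count, _ = self.failures.get(user, (0, 0.0))
        self.failures[user] = (count + 1, self.clock())

    def record_success(self, user: str) -> None:
        self.failures.pop(user, None)

throttle = LoginThrottle()
for _ in range(3):
    throttle.record_failure("alice")
print(throttle.is_locked("alice"))  # True
print(throttle.is_locked("bob"))    # False
```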
The lockout doesn't exist without the authentication system in the first place, which exists to keep people from knowing the information. It's a subset of something that is needed for obscuring the information in the system from anyone who should not see it.
I really feel dirty trying to justify this level of nerdy pedantry. I'm sure you can poke some holes in my off-the-hip internet comment if you really want to; I'm not trying to be academic. I mostly fueled this comment with my distaste for that other platitude.
I think you miss the point of security through obscurity. It's not about keeping the information itself obscure (in your example, login information), but rather the method. For example, your password hashing mechanism. If you have a strong password hash function, you don't need to obscure which hash function you use (otherwise, open source software couldn't even exist in certain areas). However, if your security relies on obfuscating your broken, home-made hash function that only hashes the first three letters of the password, you're not really secure. Security through obscurity is an attempt at securing an otherwise insecure system by hiding or disguising the implementation.
That being said, obscuring parts of an otherwise secure system is fine as an additional layer, especially if you just want to deter script kiddies that always hammer the same endpoints
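A toy demonstration of that distinction, using the "first three letters" scheme described above (SHA-256 stands in for a real password KDF here; real systems should use bcrypt/scrypt/argon2):

```python
import hashlib

def weak_hash(password: str) -> str:
    # Broken home-made scheme: only the first three characters ever
    # influence the digest, so most of the password is ignored.
    return hashlib.sha256(password[:3].encode()).hexdigest()

# Two very different passwords collide, so the scheme is only "secure"
# while the implementation stays hidden, i.e. it relies on obscurity.
print(weak_hash("hunter2") == weak_hash("hungry-hippo"))  # True: both start "hun"

# Hashing the full password, in contrast, does not collide here:
print(hashlib.sha256(b"hunter2").hexdigest()
      == hashlib.sha256(b"hungry-hippo").hexdigest())     # False
```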
"and if at all possible keep them off the public internet": this is the way. I would recommend going beyond a VPN and implementing zero trust networking, which uses outbound-only connections so that it's impossible to be subject to external network attacks. Tailscale does part of that; others exist, such as the open source project I work on - https://github.com/openziti
> And then the next best use of your time and resources is to prioritize the fastest possible patching cadence, since the vast majority of attacks target disclosed vulnerabilities.
Just curious, do you leverage any tools to decide when to patch or is it time-interval based? We currently attempt[0] to update our packages quarterly but it would be nice to have a tool alert us of known vulnerabilities so we can take action on them immediately.
[0] "Attempt" meaning we can't always upgrade immediately if the latest version contains difficult-to-implement breaking changes or if it's a X.0.0 release that we don't yet trust
So then you set up something like a cron job to scan everything for you and email the results once a week or whatever if you don't want to monitor things actively.
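A script like the following could be what cron runs and mails: it summarizes failed password attempts per source IP from sshd log lines. Log formats vary by distro and OpenSSH version, so treat the regex as an assumption to adapt:

```python
import re
from collections import Counter

# Typical OpenSSH failure lines look roughly like the samples below;
# adjust the pattern to match your distro's actual format.
FAILED = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")

def summarize(log_lines):
    """Count failed password attempts per source IP."""
    per_ip = Counter()
    for line in log_lines:
        m = FAILED.search(line)
        if m:
            per_ip[m.group(2)] += 1
    return per_ip

sample = [
    "Jan 1 sshd[101]: Failed password for root from 198.51.100.7 port 4242 ssh2",
    "Jan 1 sshd[102]: Failed password for invalid user admin from 198.51.100.7 port 4243 ssh2",
    "Jan 1 sshd[103]: Accepted publickey for deploy from 203.0.113.5 port 22 ssh2",
]
print(summarize(sample).most_common())  # [('198.51.100.7', 2)]
```

Point it at /var/log/auth.log (or `journalctl -u ssh`) and pipe the output into mail from a weekly cron entry.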
I just setup GVM / OpenVAS[0] to work out where I need to put in some maintenance work, how does that rate as worthwhile in comparison to those you've listed above? (which I will also look into).
(not a fan of the effort Greenbone have gone to for hiding their community edition and promoting their commercial products)
Thanks for these. I should have clarified, I'm more interested in something that will alert me when newly-discovered vulnerabilities surface. The systems I maintain are protected [enough] today but a new hack could drop that doesn't make mainstream media and I may not hear about it. We have annual security audits but it would be nice to patch things immediately. Aside from subscribing to a security forum/discord/slack, I'm wondering what other methods folks are employing to solve this.
Keep your pentesting tools up-to-date. Run them against yourself on every single deployment, if you can. Don't just run them quarterly because that's all that your PCI-DSS requirements say you have to do.
Integrate security code scanning tools into your CI/CD process. Tools like Dry Run Security, or something comparable.
There's much more, but that has to do with how to run your CI/CD systems and how to do your deployments in general, and less to do with security aspects thereof.
The simplest is to only use packages from a distribution like Debian and run unattended-upgrades or equivalent for the security-updates repository. They usually fix vulnerabilities in less than a day.
The author says they aren’t a security person, so to correct a minor thing: the first examples are credential and configuration discovery, not directory traversal. The latter, to the best of my knowledge, is reserved for techniques where the attacker “escapes” the webroot or otherwise convinces the server to serve things outside of its normal directories.
The elephant in the room is that–at least in my experience–a lot of these attacks come from hostile nation states. This is going to be controversial, but one may find it useful to block entire IP ranges of problematic states that you cannot do business with. I was able to block 100% of probes to one of my new services by doing this.
American websites don't typically have customers dialing in from North Korea. Just saying that IP blocking is something more businesses should consider. Traffic can also be routed to different subdomains for businesses that need to provide a subset of services to a region.
15+ years ago I was reviewing server logs for small, local businesses we hosted and came to the conclusion we should just block all APNIC IP addresses.
Is it? Which of these "attackers" are you most concerned with? I hesitate to even call them "attackers." They're jiggling your front door knob, at worst.
This is actually a decent analogy because a person sneaking around the neighborhood trying to open doors should be considered a threat. Nothing good happens when someone like that gets inside.
Eh, yeah, but at the same time, can you jiggle doorknobs from halfway around the world, and is it so overwhelmingly common that within minutes of every door being built, dozens of people come by just to jiggle the knob?
It's just so unbelievably common and so frequently harmless that it's hard to take all that seriously. But you're right, it is a threat, I won't deny that.
Can you even sell stuff to North Korea or Iran? Aren't they under some embargo?
A friend told me that someone in his very big company sold some random stuff to North Korea, and now 1) there is obligatory training about not selling there, and 2) they have to go through obligatory training on non-proliferation of nuclear weapons.
I work in application/product security and have managed WAFs for multi-billion dollar companies for many many years.
Move DNS to Cloudflare and put a few WAF rules on your site (managed challenge if bot score less than 2 / attack score == x). I doubt you'll even pay anything, and it will resolve a lot of your problems. Just test it before moving it to production, please (maybe set up a test domain). Remember, a WAF is not an end-all, be-all; it's more of a band-aid. If your app isn't hardened to handle attacks, no amount of advanced WAF/bot protection will save it.
Selfhoster here. I use mutual TLS rules with CloudFlare's WAF to filter out everyone but my known-good callers. Works great. Since the only folks with access are my family, it was pretty easy to setup as well (everyone gets a unique cert that I can revoke if need be).
Usually I only manage internal-facing applications these days, which greatly reduces the attack surface compared to public ones.
But since you seem to have a lot of knowledge in this area: have you managed solutions that also include infrastructure in Azure combined with Cloudflare?
And if so, any suggestions on things people usually miss? Beyond the usual stuff like OWASP and whatnot.
Yes, that's just what the internet needs: even more websites centralized behind Cloudflare. Why do we even bother with TLS anymore if we're going to give them unencrypted access to practically all of our internet traffic?
Hacker news is so funny, they complain about the amount of power we've allowed Google, Amazon, and Microsoft to have, and then go right around and recommend putting everything behind Cloudflare.
Once Cloudflare starts using attestation to block anyone not on Chrome/iOS Safari it'll be too late to do anything about it.
Can you please not post in the flamewar style? It's not what this site is for, and destroys what it is for.
You're welcome to make your substantive points thoughtfully but it needs to be within the rules. If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.
We should be suggesting self hosted and decentralized solutions to website hosting and file hosting.
On that note, does anyone have a secure method of serving a file from your computer to anyone with a phone/computer that doesn't require them to download/install something new? Just a password or something? Magic-wormhole almost seems great, but it requires the client to install wormhole (on a computer, not a phone), and then type specific commands along with the password.
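Not a ready-made tool, but the "just a password" requirement can be met with a stdlib-only sketch: an HTTP file server behind basic auth, so recipients need nothing but a browser. The filename, username, and password below are placeholders, and you'd want TLS in front (a reverse proxy, or ssl wrapping the socket), since basic auth over plain HTTP sends the password in the clear:

```python
import base64
from http.server import BaseHTTPRequestHandler, HTTPServer

FILE_TO_SHARE = "shared.bin"   # hypothetical file to hand out
PASSWORD = "correct-horse"     # share this out of band

class ShareHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        creds = base64.b64encode(("guest:" + PASSWORD).encode()).decode()
        if self.headers.get("Authorization") != "Basic " + creds:
            # A 401 makes the browser pop up its built-in password prompt.
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="file share"')
            self.end_headers()
            return
        with open(FILE_TO_SHARE, "rb") as f:
            data = f.read()
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

def serve(port: int = 8443) -> None:
    HTTPServer(("0.0.0.0", port), ShareHandler).serve_forever()
```

Call `serve()` and hand out the URL plus the password; anyone with a browser can fetch the file.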
> Once Cloudflare starts using attestation to block anyone not on Chrome/iOS Safari it'll be too late to do anything about it.
That's just plain bs...
E.g.:
1) they have customers, and their customers want protection with minimal downsides.
2) Cloudflare is the only one with support for Tor. I'm 100% sure you didn't know that.
What "examples" do you have to blame them for something they aren't doing? Based on what?
I'm getting tired of people blaming Cloudflare for providing a service that no one else can provide for free to small website owners => DDOS protection.
Could you please stop breaking the site guidelines? You've unfortunately been doing it repeatedly. It's not what this site is for, and destroys what it is for.
You're of course welcome to make your substantive points thoughtfully while staying within the rules.
Which circumvents the bad reputation of certain exit nodes:
> Due to the behavior of some individuals using the Tor network (spammers, distributors of malware, attackers), the IP addresses of Tor exit nodes may earn a bad reputation, elevating their Cloudflare threat score.
> Hacker news is so funny, they complain about the amount of power we've allowed Google, Amazon, and Microsoft to have, and then go right around and recommend putting everything behind Cloudflare.
It’s almost as if those saying contradictory things are actually different people despite being on the same website. But it can’t be that, surely? Truly a perplexing phenomenon that I hope someone can one day explain.
Fair, although I know quite a few people who hold both of these opinions simultaneously, because I've met them in person. Only after I point out their hypocrisy do they even realize what a danger Cloudflare poses to the free and open internet.
I suspect it's because hating on Google is in vogue, and so is recommending Cloudflare.
I'm going to try to provide / justify my potentially hypocritical viewpoint:
I use Cloudflare (free tier) in front of the very few and almost entirely unused websites that I run. I believe that the service they provide is useful for protecting the IP addresses of the servers on which the content is hosted, whilst also providing some amount of protection from malicious traffic.
I also agree that centralisation of services is a big problem for the future of the internet.
My position is that, whilst there seem to be increasing voices / examples of Cloudflare (potentially) acting against the nebulous notion of the "spirit of the internet", for me they certainly haven't reached the "evil" stage. I'm also of the understanding that it's Cloudflare customers who choose to block access from Tor or VPS IP address ranges and / or add captchas or other bothersome verification. True, Cloudflare enables it and makes it possible, but the administrators of the website that you're trying to visit have made the choice to make it more difficult for you to access their content; not Cloudflare themselves.
I would prefer there to be similar-scale alternatives to Cloudflare as a kind of a middle-ground decentralisation of centralisation. I'm sure there are alternatives, but I'm not yet motivated enough to even consider starting the research process.
If Cloudflare ever starts selling visitor analytics to data brokers, however, it will be a very fast goodbye.
Thank you! I've been self-hosting for about a year running a 400-line http/s server of my own design, and it's remarkable all the attacker traffic my 3 open ports (22, 80, 443) get, although I've never taken the time to analyze what the attackers are actually trying to do! This post fills in a LOT of blanks.
Would be cool to do the same thing for the weird stuff I see in /var/log/auth.log!
It's crazy that attackers would bother with me since the code is entirely open source and there is no server-side state. The best outcome for an attacker would be root access on a $5/mo VPS, and perhaps some (temporary) defacement of the domain. A domain no-one visits!
These are all automated bots. No one is “bothering”. You open the 3 most well known ports you’re going to get connections. They don’t know what you’re running nor do they care.
By "bothering with me" I mean "add my IP to the long list of IPs they are scanning".
By the way, I find it annoying that my logs get filled with this kind of trash. It has the perverse effect of making me long for something like Google Analytics since they rarely if ever bother running a javascript runtime.
That long list isn’t curated, it’s every publicly routable IPv4 address. It really does not take long to run some canned probes against 3.7 billion addresses.
Making your service IPv6 only tends to cut down on this traffic.
You’re anthropomorphizing a script on some botnets.
This isn’t entirely true. Many scanners do preference specific IP ranges such as cloud providers. Cloud IPs receive substantially more scanning traffic than darknet IPs or even random corporate IPs.
Access to your VPS is a great way to launch attacks on other machines, and it adds another layer covering the attacker's tracks. Not to mention hosting malware to be downloaded elsewhere, or even running a crypto miner.
I set up a honeypot once and logged the passwords of created accounts. I then used 'last' to find the incoming ip.
I then used ssh to try and connect to the originator (from an external box). I went back 5 jumps until I got to a windows server box on a well known hosting service that I could not get into.
Lots of what looked like print servers and what looked like linux machines connected to devices. Maybe just the exploit at the time.
Consider blocking port 22 except for a whitelist containing your own IP. My ISP changes my IP rarely in practice, and when it does I can log into the hosting web admin panel and update the rule.
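A minimal nftables sketch of that setup (203.0.113.7 is a placeholder for your own IP; adjust the web ports to whatever you actually serve):

```
# /etc/nftables.conf sketch: drop everything inbound by default,
# allow SSH only from the whitelisted address, web ports from anywhere.
table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        iif "lo" accept
        ip saddr 203.0.113.7 tcp dport 22 accept
        tcp dport { 80, 443 } accept
    }
}
```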
I make a point of running fail2ban on my servers and will even add custom jails to catch attacks specific to the types of functionality I may be exposing on the site(s) hosted on them. But it’s been a long time since I checked whether fail2ban’s other defaults are still comprehensive enough to block the most common attacks. I guess I’ll bookmark this link for when I get around to doing that.
If it makes you feel good, do it. It can also cut down on log noise a bit, for when you’re really looking for something.
But in general, I’ve given up on caring about the routine “attacks” listed in all the logs. If you have good security, they don’t matter. And if you don’t, they don’t matter either.
I have recently started using my own sledgehammer-subtle approach of detecting (what I refer to as) Uninvited Activity on any port not offering a service, and straight-up banning the source IP (indefinitely at the moment) from accessing any actual service ports.
Over the few months I've had it running I've needed to progressively create failsafes for IP addresses that I know are trustworthy so I don't lock myself out. I've also started tiering the importance of blocking based on different sets of ports which are being probed. I've also discovered that there's a significant amount of Uninvited Activity coming from "security" companies in their pro-active scanning of the entire IPv4 space - which I don't trust at all and ban with prejudice.
(I'm aware of various limitations and footguns inherent in this un-subtle approach but, as another commenter elsewhere alluded to, "it makes me feel better". I also think that a fair bit of processing volume can be taken off IDS' if a heap of "known garbage" traffic is blocked prior - it's all about tiers).
I also check the access logs collected by my self-hosted services, and I think there's a detail that's conspicuously absent from this analysis: the bulk of these malicious requests are made by regular people running plain old security scanners that are readily available from sites such as GitHub. These are largely unsophisticated attacks, consisting of instances of one of these projects just hammering a server without paying any attention to responses, or even to whether they have been throttled.
Some attacks don't even target the IP and instead monitor a domain and its subdomains, periodically running the same scans from the exact same range of IP addresses.
For example, at a previous job we had a recurring scan made over and over again from a single static IP address located in Turkey. Our team started to refer to it as "the Turkish guy", and our incident response started featuring a preliminary step for weird request patterns that was basically checking whether it was the Turkish guy toying with our services.
If anyone is receiving these types of logs on AWS, please do yourself a favor and place AWS WAF in front of your VPC.
It's not expensive and can significantly help you, saving you from many headaches in situations like this. While it might not block everything that arrives at your service, it can be a great help!
This is a good suggestion, but careful with the default rulesets. We turned on AWS WAF (in our case, the motivation was just SOC 2 compliance).
There were a few overzealous rules that subtly broke parts of our app.
There were request body rules that did things like block requests that contained "localhost" in the request body. There was also a rule that blocked requests without a User-Agent header, which we were not previously requiring on API requests, so we broke our entire API for a few users until we figured that out.
In my experience WAFs are not something that one should ever "just turn on".
Complete due diligence is required to fully understand and realise the impact of the rules and should be tested like any software change by going through a testing phase.
Ideally software teams should be fully trained and be responsible for their lifecycle.
Yes you need to be familiar with the rulesets being applied, and prepared to closely monitor what is being blocked. Ideally I think I’d roll it out one ruleset at a time to limit the number of potential issues being introduced at once.
Had a fun one after turning on AWS WAF with some default rules: a small number of users reported they couldn't upload new logo images anymore. Turned out some Adobe product was adding XML metadata to the image files, which the WAF picked up and blocked.
Agree. There are so many ways WAF rules can unintentionally block legitimate traffic. From very long URLs (is that a DoS attempt?), to special characters in a POST with a file upload (is that = part of a SQL injection attempt or is that just part of a base64 encoded file?) and so on.
I use the Azure equivalent of the AWS WAF but I have no direct experience with AWS WAF. Azure WAF leverages the OWASP ruleset[0] and many of those rules throw false-positives, SQL-related rules being one of the top offenders.
As you note, it requires adjustment due to overzealous rules. OWASP has Paranoia Levels[1] which allow you to be more targeted.
It is not that easy: using the AWS WAF with default rules for our application led to many valid requests and IPs being blocked. You need to know what is being blocked and verify it first, or in some cases you will be losing customers.
Your best plan is to start with all the rules in count mode. Let that sit for a while and analyze anything that was counted. As you get comfortable, slowly start to move things into block mode.
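In AWS WAF terms, count mode means overriding a rule group's action to Count in the web ACL. A hedged sketch of one such rule in WAFv2 JSON (the rule name and metric name are illustrative):

```json
{
  "Name": "CommonRuleSetInCountMode",
  "Priority": 0,
  "OverrideAction": { "Count": {} },
  "Statement": {
    "ManagedRuleGroupStatement": {
      "VendorName": "AWS",
      "Name": "AWSManagedRulesCommonRuleSet"
    }
  },
  "VisibilityConfig": {
    "SampledRequestsEnabled": true,
    "CloudWatchMetricsEnabled": true,
    "MetricName": "CommonRuleSetCount"
  }
}
```

Matches then show up in the sampled requests and CloudWatch metrics without being blocked, which is what you review before flipping the override off.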
In actuality, WAFs hurt more than help. They give a false sense of security since they are so easily bypassable, plus they have a significant performance cost and a significant chance of blocking legitimate traffic: https://www.macchaffee.com/blog/2023/wafs/
WAF tends to ban widely, sometimes for dubious reasons. For example, researchers at my university study Twitter data, and the mere fact of following links from a small random sample of tweets means that our university's IPs are blocked by most WAF.
Ah, tarpit refers to a system that purposely slows down answers, while honeypot is a system that _looks like_ it's delivering the goods but it is just a trap.
I'm sure they mostly refer to the same thing, though.
It does waste their time if your honeypot is constantly responding with legitimate looking but fake credentials. Presumably the hacker is going to try to use them?
It’s the same idea used by anti-spam activists back in the day with software that would flood spam website forms with fake but realistic looking info, so the real data would be buried in the noise.
From what I understand the automated part is sometimes a first pass. Just to see if you are there and have something. They will then wait some period of time and come back later with the real attack. Sort of like the war dialer from the movie wargames. Basically try everything get a list of interesting targets. Then go at them. Now some are fully automated and just try whatever exploits they are trying right then and there.
Tangentially, the sad thing about the whole cybersec space is that well-resourced APTs (like, say, APT-29 / Cozy Bear) have enough resources to run labs where they deploy all the top endpoint solutions and validate their offensive tools against them.
Fail2ban doesn't help when the attacker/abuser rotates their IP address constantly. I now look for aberrations in a few http headers they often neglect to spoof in their attempt to act like an honest human.
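The header-aberration idea can be sketched like this. The heuristic below is purely illustrative (real scanners vary, so you'd tune it against your own traffic), but it captures the pattern: bots often omit headers that real browsers virtually always send.

```python
# Illustrative heuristic only: flag requests whose headers don't look like
# a normal browser's. Thresholds and header choices are assumptions.
def looks_suspicious(headers: dict) -> bool:
    ua = headers.get("User-Agent", "")
    if not ua:
        return True  # many bots send no User-Agent at all
    if "Accept-Language" not in headers:
        return True  # browsers almost always send this
    if "Mozilla/" in ua and "Accept-Encoding" not in headers:
        return True  # claims to be a browser but skips a standard header
    return False

print(looks_suspicious({"User-Agent": "zgrab/0.x"}))  # True: no Accept-Language
```

In practice you'd feed this from your access-log parser and ban or tarpit on repeated hits.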
fail2ban has calmed attacks (ssh dictionary mostly) from 100-1000s per day to about a dozen, on my private rpi thing. I assume most attackers are looking for low hanging fruit with little (cheap) effort.
But they don't. Most of the SSH attacks come from the same ~10 addresses on China Telecom. Sometimes they don't change for months. Maybe it is some state-sponsored attack organisation, or maybe just carrier-grade NAT.
Rate-limit after x failed logins on either source IP, username and password. Just provide a realworld sidechannel escape hatch for legitimate users (ex: phone or email). Barely anyone will actually use it.
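A minimal sketch of that sliding-window idea, keyed by whatever you choose (source IP, username, etc.); the thresholds here are illustrative:

```python
import time
from collections import defaultdict, deque

# Throttle after N failures per key within a time window.
class FailureLimiter:
    def __init__(self, max_failures=5, window_seconds=300):
        self.max_failures = max_failures
        self.window = window_seconds
        self.failures = defaultdict(deque)  # key -> timestamps of failures

    def record_failure(self, key, now=None):
        now = now if now is not None else time.time()
        self.failures[key].append(now)

    def is_blocked(self, key, now=None):
        now = now if now is not None else time.time()
        q = self.failures[key]
        while q and now - q[0] > self.window:  # expire old entries
            q.popleft()
        return len(q) >= self.max_failures

limiter = FailureLimiter(max_failures=3, window_seconds=60)
for _ in range(3):
    limiter.record_failure("203.0.113.9", now=100.0)
print(limiter.is_blocked("203.0.113.9", now=110.0))  # True: 3 failures in window
print(limiter.is_blocked("203.0.113.9", now=200.0))  # False: window expired
```

The real-world escape hatch (phone/email) then lives outside this path, so a blocked legitimate user isn't locked out forever.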
We had to go further. The attacks were from arbitrary IPs, and were working through a huge list of leaked username/password combinations.
The attacker even went to the trouble of spoofing some custom headers in our API client.
Our eventual solution was to a) attack our own auth credentials first, identify any users with leaked creds (from other services) and force a password reset for them. b) disallow users from setting common leaked passwords. c) make the auth checking request as low-cost as possible, and scalable separately from the main application. d) when an attack request is detected, bypass the relatively-expensive real cred check but return the same failure response (including timing) as a real failure. e) build a secondary requirement in the auth flow that can be transparently enabled when under high volume attack.
This works, so far. It sheds the volume to the application, and has low-to-zero impact on legit users. This took a couple weeks away from feature development though!
This interests me a lot as we do security but don't run a big service ourselves and so don't have data on what motivated attackers' behavior is exactly.
How many active users (to an order of magnitude; no need for precise numbers of course) does this service have? 100k IPs sounds pretty costly to burn, so I'm curious how important one needs to be before that's considered worth it.
And could you say what type of IP addresses those were? Did it look residential, such as from compromised computers (a botnet), do they rent lots of IP addresses temporarily from AWS/netcup/Alibaba, or is it a mix such that neither category has an overwhelming majority?
If it's all server IP ranges for a service where end users normally log in, you could apply entirely different rate limits to those than to residential IPs, for example. Hence I'm wondering how these cred-stuffing attacks are set up.
It was a mix. Lots of apparent botnets (residential service scattered across many providers and networks), but also healthy chunks of colo/SaaS netblocks as well.
IMO, you shouldn't do anything special. They're all very low-skill automated attacks. Just design and deploy stuff well, do the basic stuff correctly before you make anything publicly accessible, and don't worry about the noise. If you're doing things properly, none of it will work. Whatever fancy thing anyone suggests to try to reduce the noise likely won't slow down any actual determined human attacker much, and will just cause you more hassle to deploy and maintain.
Updates as the other commenter says. Also isolation technology like docker containers, chroots, bsd jails, protections that systemd offers, or virtual machines. While not perfect, it means that the attackers must have the ability to chain exploits in order to break out of the compromised application to the rest of the host system.
Docker is great, but it is easy to shoot yourself in the foot if you use it for convenience without actually understanding it.
A common mistake is to unknowingly publish Docker ports on all interfaces (e.g. `5432:5432`), which makes your Docker container reachable by everyone. It is common to see this in Docker tutorials and pre-made Docker Compose files. Coupled with UFW, it may give you a false sense of security, because Docker manages its own iptables rules and its published ports bypass UFW's.
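A minimal sketch of the safer binding in a Compose file (the Postgres image is just an example):

```yaml
services:
  db:
    image: postgres:16
    ports:
      # "5432:5432" would publish on 0.0.0.0 and, via Docker's own iptables
      # rules, sail right past UFW; binding to loopback keeps it host-only.
      - "127.0.0.1:5432:5432"
```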
I make a habit of not exposing ports and just using a reverse proxy for the container. Of course, you'll need a bridged network between the reverse proxy container and the target container, but that's fine. I'm sure there are more clever ways around that.
Instead of exposing your applications externally, you create a private network that uses UDP hole punching.
This isn't completely self-hosted, as you need some server to auth / broadcast connection details with. Self-hosting might be possible on ZeroTier, but I'm not familiar enough to say for sure.
Don't expose your services publicly unless it is necessary. If you're self-hosting services that are meant to be accessed only by you then consider accessing them exclusively over a VPN like Wireguard (Tailscale is nice) and firewall everything else.
I am not aware of a vulnerability in popular web server software that has affected a basic auth login screen in over a decade. Assuming you use a proper password and don't typo the domain and end up on a typosquatted site run by someone who wants to phish you specifically, it's about as solid as SSH or WireGuard.
They serve different use-cases but I wouldn't say that VPN is strictly better than HTTP auth or vice versa. Recommending to double up for a self-hosted little something, not a big target like 4chan or Gmail, is overkill
Well, I did mean or; sometimes just sticking httpd in front of the application with a user:pass over https is fine, and also much easier if the client can't run a VPN client or doesn't want to.
Why? Code is open source and I don't check config changes into .git. Do you mean like .htpasswd which is already default disallowed (on web servers that make use of it in the first place; I think Nginx doesn't block it by default but also doesn't use it so it wouldn't grant any access)?
Use packages from a distribution like Debian and run unattended-upgrades or equivalent for the security-updates repository. They usually fix newly reported vulnerabilities in less than a day.
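On Debian/Ubuntu that's roughly the following (a sketch; these are the stock package and file names, enabled interactively via `dpkg-reconfigure unattended-upgrades`):

```
# apt-get install unattended-upgrades
# /etc/apt/apt.conf.d/20auto-upgrades
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
```

By default the upgrader only pulls from the security suites, which is exactly the behavior described above.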
As someone who knows very little about security, this is really interesting, thanks! A question though: how would one know if there has been a breach? These examples look relatively easy to detect, but I guess there would be more complex cases?
I also know very little, but something that struck me upon reading your question: if a breach is successful, the logs can't be relied upon for detection/analysis if they're on the same server. It's important to ship them elsewhere.
This is why some people run a honeypot in their network... and even those won't necessarily catch everything if the honeypot only mimics services that the attacker isn't probing for. You can set up tripwires on access and egress of sensitive data but that's only part of the surface area (and if the system gets attacked those tripwires could be disabled, if the attacker either knows what to look for or has a plan for a side channel for exfiltrating data).
Really the only good answer is defense in depth and keep looking for any indicators of odd behavior, and wall out unrelated systems entirely from each other, keep the DMZ and public facing bits as simple as possible.
You can use a honeypot that baits hackers. I'm running a non-intrusive one where you put baits on your servers or laptop; when hackers find them, they'll try to use them.
Look for IOCs (indicators of compromise), but if you know little it is always advisable to hire someone to go through it on demand or periodically, as there's no one trick to rule them all.
I’m looking for a secure authentication/proxy application to put in front of the webservers that have to be exposed to the internet. The application will authenticate the user with SSO or hardware key, before forwarding the traffic to the right internal address.
Cloudflare Tunnels are great, but CF doesn’t allow the TLS pass through. CF man in the middles and decrypts the traffic.
So far I have looked into Teleport, authelia, authentik, and keycloak, perhaps combined with Traefik.
Any feedback on the level of security of these tools for being exposed to the public internet?
Granted we are a Microsoft-based shop, but Microsoft Entra Application Proxy has worked out great for exposing our internal web based apps to the outside for mobile/home workers.
This seems to be very similar to the Cloudflare tunnels. They are reverse proxies in the cloud, with TLS termination. The traffic is terminated and scanned in the cloud.
Curious enumerations on the most common items. There's definitely a topical bias. By far the most common attack attempt I see on all the various webhosts I administer is WordPress-oriented (despite not present on any of the hosts), which doesn't even get an honorable mention by the author. Perhaps he hosts WordPress content and didn't discern attacks from legitimate traffic.
Does anyone else have the experience that almost every single one of the 'attacks' is using HTTP and not following a 301 redirect to HTTPS? I have my internet-facing web server set up to redirect all HTTP -> HTTPS and this thwarts almost every 'attack'. I have to say, they're not very smart about it.
I do that as well, but nothing gets through anyways because I'm not running Wordpress or a version of Apache from 2012. I also have nginx set up to reject connections without the expected 'Host' header, and that quickly bounces a few attackers. But many still end up getting all the way to a 404 or 400 for their attack URL.
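A minimal nginx sketch of that setup (the hostname is a placeholder):

```nginx
# Catch-all: any request whose Host doesn't match a configured server_name
# lands here; 444 is nginx's "close the connection without responding".
server {
    listen 80 default_server;
    server_name _;
    return 444;
}

# Real site: bounce plain HTTP to HTTPS.
server {
    listen 80;
    server_name example.com;
    return 301 https://$host$request_uri;
}
```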
Hyper-naive take: Couldn't nearly all of these attacks be blocked by a white-list approach, essentially hiding every file or directory from the internet except a very controlled list of paths and escaping all text sent so it can't contain code?
I somehow always imagine these types of hacks to be more clever, like, I dunno, sending harmless-looking stuff that causes the program receiving it to crash and send some instructions into unprotected parts of RAM or whatever. This all looks like "echo ; /bin/cat /etc/passwd" and somehow the server just spitting it out. Is that really the state of web security?
> Couldn't nearly all of these attacks be blocked by a white-list approach, essentially hiding every file or directory from the internet except a very controlled list of paths and escaping all text sent so it can't contain code?
This is basically how things work.
For convenience, instead of itemizing each filename, the webserver root is a subdirectory and anything underneath it is fair game. The webserver can use the OS "chroot" facility to enforce this restriction. What you are seeing are ancient exploitation strings from 30 years ago that haven't worked on any serious webserver since that time, but a) keeping the test in the attacker's library is essentially free, and b) there are some unserious webservers, typically in cheap consumer hardware.
Webservers pass plain text to the app server. It is the app server/framework's responsibility to understand the source of the request body and present it to the application in a clear way, possibly escaped. But the app needs to process this and sometimes through poor coding practices, fails to respect the untrusted nature of the data. This again is more typical in historical systems and low-cost consumer products where software is not a marketing advantage.
> ancient exploitation strings from 30 years ago that haven't worked on any serious webserver since that time
Unfortunately, there are plenty of serious (business critical) servers that _ARE_ vulnerable to these types of attacks. I find and remediate things like this all the time. One very common example I've seen of the `.env` issue is Django servers exposed to the internet with DEBUG=True. There are probably thousands, if not tens of thousands, of servers leaking credentials this way on the internet right now.
Beyond that, companies often have internal systems that do not meet the same security standards that external systems require, and sometimes those systems get shifted around, maybe it's moved to a new subnet, maybe a third-party needs access and the CIDR range gets fat fingered in the firewall. Regardless - now that "internal system" is exposed to the internet with all the dangerous configuration.
There are different types of web security vulnerabilities and the attacks you see from automated scanners are likely to be far less sophisticated than targeted web attacks. Specifically these scanners are going to spam out widespread and common CVE's that might grant privileged access to the server or dump credentials in some fashion.
The more sophisticated attack you described is essentially an overflow, and most modern web servers are usually written in memory-safe languages making it very unlikely to see that type of attack on the web. More often it's the underlying OS, servers, or communication stacks (bluetooth, TCP, nginx, etc) that have these types of vulnerabilities since they are often written in low level non memory safe languages like C and C++.
Attacks that exploit the HTTP and HTTPS protocol are a little more interesting. Request smuggling lets you trick certain load balancers and webservers by sending an HTTP request "smuggled" inside of another HTTP request.
There's really a lifetime's worth of knowledge on web security and the type of stuff you see in scans is just trying to hit the low hanging fruit.
Portswigger has loads of free challenges and information about different web security topics.
Security through obscurity is like a ninja tiptoeing in a room full of laser beams; make one loud move and you'll reveal that your entire protocol hinges on no one sneezing!
WAF seems to be an essential piece for any website with even a little bit of visibility / traffic on the net. Some questions:
* Comparison of AWS WAF vs Cloudflare vs Others?
* Many services like EC2 charge for data transfer [1], so how much of your monthly/yearly costs of hosting goes toward fending scans like these? Does AWS count any traffic blocked by the WAF toward the transfer limits?
If you're on AWS the AWS WAF is pretty low cost. You can expect to pay less than $10 / month and still get an ok amount of value on a decently popular site.
The problem is you have to manually configure a lot. The rate-limiting aspect is way worse than Cloudflare's, and while the AWS WAF can geolocate an IP address and block by country, it does not send the country code back to you in a header, whereas Cloudflare does. The last one stings because it's super handy to have an accurate country code attached to each request, especially when it's something you don't need to think about or waste I/O time on by calling out to a 3rd-party service in the background to backfill that data later.
This is helpful! I found some CDK libraries that allow for connecting a load balancer or CloudFront to the WAF with a few lines of code. I'll give it a try! [1] [2].
Yep, that's one of the values of the WAF, it can be associated with your ALB which means you can match rules on headers, cookies, etc. after the traffic has been decrypted.
Fastly WAF (Signal Sciences), Imperva (Distil Networks), Akamai (StackPath): a few companies started independently in this ecosystem and found their evolution path with the major CDN networks, the ones that had failed to provide a good internal alternative themselves.
Human Security (PerimeterX), HAProxy WAF, and DataDome are other players serving different target audiences.
If you have a good control over the app exposed, maybe you only need a WAF in the sense of stop stuff outside your infra. The sql injections, weird urls trying the way to /etc/passwd or related things look from the past and only makes noise nowadays. The real issue is when someone hits you in a rate impossible to manage with your resources or when it cost you more than the securing layer.
Nice analysis! You should protect your infra to avoid this kind of scanning:
- Disable password login for SSH, use keys instead.
- Limit access to known IPs (with a managed vpn)
- Use Cloudflare: Their WAF is really good
- Forward logs to another service that can analyze them (Datadog is nice)
shameless plug: started a small honeypot service[1] if anyone needs a last resort to catch hackers in your servers. Feedback appreciated!
I always find it fun to respond in strange ways to "malicious" requests on all my webservers derived from a defcon talk[0] that I watched a while ago, there's a lot of great fun to be had in honeypots and other things of the nature.
I know many people decry security through obscurity, but sometimes it works. For example, several years ago I changed the SSH port on a VPS I was running to a non-standard port. Logged login attempts from strange IP addresses went from dozens per day to zero.
% ssh example.com
Last failed login: Sun Jan 28 16:59:35 UTC 2024 from 180.101.88.233 on ssh:notty
There were 5385 failed login attempts since the last successful login.
Last login: Sat Jan 27 13:33:30 2024 from xxx.xxx.xxx.xxx
5.3k failed attempts in ~30 hours. I know, I should be setting up fail2ban.
fail2ban will reduce your log noise but it's another thing to manage, you can end up locking yourself out also, and if you're using good passwords (or better, public key auth) it's not really providing any additional safety.
Ideally you don't need ssh open to the whole world anyway, and can restrict it to a certain subnet or set of addresses. Then your attacks will drop to nearly zero.
fail2ban is not very performant and it will only reduce the amount of attempts.
An alternative is to add an nftables rule (or iptables or whatever firewall). Something like:
table inet filter {
    chain input {
        type filter hook input priority 0; policy accept;
        tcp dport 22 ct state new limit rate over 2/minute drop
        tcp dport 22 accept comment "Accept SSH"
    }
}
(Note the rate-limit drop has to come before the accept rule, or it never matches.)
But even with rate limiting, the logs are still polluted by auth attempts. Changing the port does little. The only solution we found was to configure port knocking (with direct access from a few whitelisted IPs).
Most of my servers are IPv6 only and there are no failed ssh attempts on those. I install fail2ban just in case and firewall the IPv4 address, since I don't SSH via IPv4.
I’m not a fan of fail2ban. A simple but quite effective approach would be permitting remote login only from certain IP ranges. I know that it looks like a bad trade for self hosted web apps, but it is very easy to setup on many cloud providers.
Also I normally set up a jump host first - a smallest instance that only runs ssh and everything else would not open ssh port to the outside at all. One nice effect is having to search just one auth log if something about ssh looks concerning.
No, that does nothing if you followed best practices and disabled password login. fail2ban is just another Denial of Service risk that has the added bonus of bloating your firewall table and slowing down all your new connections
Interesting. Came to similar conclusions when analyzing my (httpd) access logs for https://turmfalke.httpd.app/demo.html ... but so far nothing really out of the ordinary.
Yeah, same thing with the websites I have; even the personal one gets tens of these attacks on average, and again it seems they are after .env and other PHP-related directories too. A big portion of these attacks come from the Tor network as well.
As a thought experiment - if there was public money to back this, would it be making us all safer to run a series of honeypot servers that automatically start DDOSing the various C&C servers that attempt to compromise them?
Such a response would generally be a crime; existing cybercrime legislation generally does not have any clauses permitting retaliation as "self defence", and also very often you'd be DDoSing an innocent third party, another victim whose compromised device is abused to route traffic, and also affecting their neighbors on the same network.
I would be interested in reading something similar but focused on less common ports and services, like game servers that run on high ports and more typically use UDP.
I wonder if that’s more or less of a “safe” situation.
There are quite a few examples in there of “the bad guy” trying to find files (like .env) that are accidentally left somewhere. From my PHP days I remember that indeed Apache/PHP just serves up any file from anywhere as a static file if it isn’t PHP. My memory is pretty vague on this and I don’t remember if this behaviour is configurable but I guess it must be. Having done mostly Ruby on Rails since, it feels so strange now that a web server can be some kind of a file browser into your code base. Am I remembering this correctly? Are there other languages/web servers that work like this?
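Yes, it is configurable. A hedged Apache 2.4 sketch of the usual mitigation, denying dotfiles such as .env and dot-directories such as .git (anything else under DocumentRoot that isn't handled by PHP is still served as a static file):

```apacheconf
# Deny any file whose name starts with a dot.
<FilesMatch "^\.">
    Require all denied
</FilesMatch>

# Also cover dot-directories such as .git.
<DirectoryMatch "/\.(git|svn)">
    Require all denied
</DirectoryMatch>
```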
Most of them are probably bots running without the knowledge of the IP owner. There’s little benefit to sharing those IPs with anyone other than the provider who owns them.
Is there a potential benefit in larger sharing of malicious-traffic IP addresses in that if more sites blocked these known-bad IP addresses then there'd be a higher chance that a victim would notice something is wrong as their online services have started blocking them?
It feels to me like we're too polite, so we're letting the infected walk amongst us. It might not be their fault, but I'd be guessing it'd also be better for victim to find out sooner rather than later if they're pwned.
I suppose the slippery slope / end game of this would then concentrate all intentional malicious traffic onto VPNs, proxies, Tor and the like, which would become useless due to being blocked.
Why not already publish the current state while testing? If the readme says it's currently in testing then it's still more inviting to look into than an empty repo