@_elena@mastodon.social @stefano@mastodon.bsd.cafe How do I say "I knew it! I'm surrounded by assholes!" in Italian?
@stefano I wasn't aware of these kinds of problems with internal monitoring, or of the importance of external monitoring. However, I think it is more important to monitor the monitoring server, or to have a heartbeat for the monitoring system (external or internal), because the external monitoring system could also fail without anyone being aware of it.
@stefano
I just want to say, this is one of those long, esoteric, fascinating, entertaining threads like you used to see on Reddit, and it's great to see here on the Fedi, minus all the Reddit bullshit. Good job everyone!
@stefano
Even my new home alarm is coupled with an external monitoring alarm center that recognizes tampering/sabotage in addition to the "normal" alarms based on sensors etc. It costs a yearly subscription, but having had a break-in in the past, we considered it worthwhile when we renovated our home.
@stefano I must repeat this: never trust onsite backups either. Fire will destroy those. And RAID is not backup.
You know this but it bears repeating!
In the first sentence you mention a "data center", but such an attack would not work against a real data center: to be one, you need two buildings with independent power supplies, at a safe distance, etc. I think this was at best a hosting room, not a data center.
@lorenzo @stefano
I think Stefano, the mild mannered barista of the BSD Cafe who posts pictures of sunsets and from his walks in nature is just a cover, and in reality he is a tough-as-nails secret military agent who's chasing cybercriminals around the globe.
See also his comment to my blog post about "just telling people to call the Barista" to make them crap their pants... this Barista has a secret! 🕵️
Internal monitoring can go dark.
External monitoring tells the truth.
Great example of why both matter.
@stefano AFAIK, professional alarm systems should function based on the principle that "if it doesn't send periodic alerts saying that everything is ok, and there's no scheduled downtime, then something clearly isn't ok, and somebody needs to be sent to investigate it asap."
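For anyone who wants to see that principle spelled out, here is a minimal Python sketch of the "silence means trouble" check. The function name `needs_investigation` and its parameters are made up for illustration, not taken from any real alarm product; it assumes the caller already records the time of the last "everything is ok" signal and knows the scheduled downtime windows.

```python
from datetime import datetime, timedelta


def needs_investigation(last_ok, now, max_silence, maintenance_windows=()):
    """Return True when the "everything is ok" signal has been silent too long.

    last_ok: datetime of the last received "ok" signal (hypothetical input).
    max_silence: timedelta the system may stay quiet before someone is sent.
    maintenance_windows: iterable of (start, end) datetimes for scheduled downtime.
    """
    # Silence during scheduled downtime is expected, so it never raises an alarm.
    for start, end in maintenance_windows:
        if start <= now <= end:
            return False
    # Outside scheduled downtime, a missing "ok" signal is itself the alarm.
    return now - last_ok > max_silence


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    print(needs_investigation(now - timedelta(minutes=10), now, timedelta(minutes=5)))
```

The point of the design is that the check runs on the receiving side: the monitored site cannot suppress the alarm by going dark.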
@stefano The true horror part of this story:
> The office was closed for the holidays, but I contacted the IT manager anyway. He was home sick with a serious family issue, but he got moving.
Home for the holidays, sick, serious family issue?? Who cares! You know what's more important?? Keeping that data center up and running!
Glory to sacrificing yourself for the system!!
Or maybe get someone else next time.
@stefano zapping the power lines, eh? Looks like the perfect solution to my nuisance neighbors with the big loudspeakers.
@stefano And while not relying on internal monitoring, make sure your external monitoring doesn't share anything with the monitored systems:
Different ISP, different cloud provider if in the cloud, no shared infra at any level
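One rough way to sanity-check this is to write down what each side depends on and look for overlap. The Python sketch below is only an illustration: the layer names and provider labels are hypothetical placeholders, and it assumes you maintain such an inventory by hand.

```python
def shared_dependencies(monitor, monitored):
    """Return the layers where monitor and monitored rely on the same provider."""
    common_layers = monitor.keys() & monitored.keys()
    return {layer: monitor[layer] for layer in common_layers
            if monitor[layer] == monitored[layer]}


if __name__ == "__main__":
    # Hypothetical inventories; replace the labels with your own providers.
    monitor = {"isp": "isp-b", "cloud": "cloud-y", "dns": "dns-1"}
    monitored = {"isp": "isp-a", "cloud": "cloud-x", "dns": "dns-1"}
    # Any entry printed here is a single point of failure for both sides.
    print(shared_dependencies(monitor, monitored))
```

An empty result only means the listed layers differ; it says nothing about dependencies nobody thought to write down.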
@stefano Thanks for all the info about the company's internal setup.
@stefano
Hey! Thanks for the inside story! I love happy endings.
@stefano Great story and appropriate setup!
@stefano
Wow! Cool story
@stefano that's impressive. meanwhile I accidentally stumbled on your website:
You have shared many useful items in a thoughtful way. I appreciate it, and am glad to let you know. 😀
@stefano This immediately brought to mind coming into the office after a holiday weekend in 2005 and finding “my” computer room dark. I found our infrastructure manager, who told me that they had an unexpected power outage over the weekend. Confused, I said “But how is that possible? We have multiple feeds and a huge uninterruptible power supply!”
I will never forget his response, delivered in his thick Scottish brogue: “Yes, we do. But it doesn’t do much good when the UPS catches fire.” 😳
@thegaffer @stefano That reminds me of an incident that happened at work. We have multiple sources of electricity and generators, but none of that matters if the room with the UPS and power controller where all the power sources meet floods from an overflowing toilet a floor above 🙃😅
Whoopsie daisy!
I had just finished bypassing all the network switches in the closets from that circuit when they managed to bypass it, and catastrophe was averted.
That was a fun night! /s
@stefano This is pretty important knowledge to have!
@stefano thanks for sharing this.
@stefano thank you for this knowledge, I have boosted it for reference for others. 🤗
@stefano Cool story bro, but it's too fictional, I'd say.
First off, as a Ukrainian, I know that powerlines can survive "the spikes" by just cutting the power at the very input. No damage to equipment behind the input electric circuit breaker, nope. You just get damaged input.
Next, I used to work in a bank. And there we had a clear requirement for a data storage center: more than one power input is a must.
@stefano
Third, given it's a data center, power consumption is probably tens of kW. The "gang" could probably have been killed in action playing with it.
Fourth, if there is a power spike and cut off, it won't go unnoticed by those who control power lines. They will be the first on site to see what happened.
@stefano There was an attack a few years back near here where they dropped burning rubbish into manholes around a data centre; the theory at the time was it was to try and cut off some CCTV or alarm monitoring for something. Well caught!
@stefano I wonder how they generate a big enough power surge.
@stefano 10+ years ago I started volunteering at a festival. Everything was new that year, including the small outdoor racks for the area field routers (Juniper MX80). They barely fit, but we managed. The racks were left in the sun in the summer. It was only when we enabled Observium (LibreNMS predecessor), which graphs almost everything it gets from SNMP, that we discovered the inlet temperature was getting close to 80 degrees C. #monitorallthethings
@stefano About 15 years ago, the place I worked had a supercomputer. One night, the aircon in the machine room failed. The machine kept computing, and the temperature rose. It rose *quite a lot*.
Sadly, the first thing to fail from the heat was the core switch for the room. You know, the one that handles all of the network for everything in the room. Including the temperature alerts.
It was finally spotted about 8am when the security patrol wondered why the door shutters were so hot.
@stefano so refreshing to read a quality tech tale on Mastodon. Thanks for sharing!
@stefano Uptime Kuma instance from waaaaay downtown!!!
You are the hero I aspire to be!
@stefano That’s a rather cool war story. Great for a lecture.
@stefano Sounds like a case of either good design or *very* good luck too that the UPS took the brunt of it.
We can't protect against everything, but we *can* have an idea for what to do when the unimagined happens.
@stefano that advice also applies to monitoring scheduled backup jobs (or any other automated process). I use a service that emails me if I don't hit a specific URL roughly every 24 hours, and I hit that at the end of my backup job if it was successful.
Better than finding out the hard way at some point in the future that something happened with my backup job, preventing it from running for the last month.
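A minimal Python sketch of that pattern, for anyone who wants to copy it: run the job, then hit the check-in URL only if the job exited cleanly. The function `run_and_ping`, the backup command, and the URL are all placeholders, not the API of any particular service.

```python
import subprocess
import urllib.request


def run_and_ping(command, ping_url, ping=None):
    """Run `command`; hit `ping_url` only if it exits with status 0.

    `ping` can be swapped out for testing; by default it does a plain HTTP GET.
    """
    if ping is None:
        def ping(url):
            # Plain GET with a timeout; the response body is ignored.
            urllib.request.urlopen(url, timeout=10).close()
    result = subprocess.run(command)
    if result.returncode == 0:
        # Success: check in, so the external service stays quiet.
        ping(ping_url)
        return True
    # Failure: stay silent and let the missed check-in trigger the email.
    return False


if __name__ == "__main__":
    # Placeholder command and URL; substitute your own backup job and check-in address.
    run_and_ping(["true"], "https://example.invalid/ping", ping=print)
```

If the GET itself fails, the exception propagates and the check-in is simply missed, which is the desired outcome: silence is what raises the alert.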
@stefano Only in BSDcafé can you read actual techno thrillers like this.
@stefano nice story! And, yeah, internal monitoring is a must, but you also need an external one, operated by someone other than yourself.