
This is your sysadmin speaking: please expect some turbulence.

A few months back I blogged about my HP DL320e Gen8’s (in)compatibility with the outside world, and someone suggested I solve the problem by replacing the P420i RAID controller with an LSI-something, which would offer broader compatibility.

Others suggested replacing the hard drives (again) instead, and someone was even pushing me to swap this “hobby” for something healthier and just go cloud instead*.

For the first time in my life I decided to listen to friends, so I replaced the RAID controller with an LSI 9300i HBA (I’m using mdraid anyway)…

…well, not really: I also replaced the chassis, motherboard, CPU, RAM banks, fans, PSUs and drive caddies.

Meet “ZA Rev2”**:

This is how it evolved:

  • HP -> Supermicro (yay!)
  • Xeon E3-1240 v2 -> Xeon E3-1240 v6
  • 4×8 GB DDR3 RAM -> 2×16 GB DDR4 RAM (2 slots free for future upgrades)
  • HP P420i -> LSI-9300i
  • 2x SSD Samsung 850 EVO 250 GB -> no change
  • 2x HGST SATA 7.2k 1 TB -> no change

D-Day for the replacement is April 18th (taking a day off from my job to go and do the same things, just as a hobby, feels really weird, yes), with a 6 AM wake-up call, a flight to AMS, 8-10 hours to do everything, and a flight back to LON (LTN to be precise, because I didn’t double-check before hitting “Buy”).

Now to the sad part: there is no (easy) way to just move the drives to the new server and have everything work, so I have to reinstall it from the ground up. This means my stuff (including this blog, because loose coupling is a thing but I decided to run its DB and NFS from another country… …for some reason) will be down (or badly broken) during that window and possibly longer, depending on how much I manage to get done while I’m onsite.
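
To be clear, the mdraid bit is the least of my worries: recreating the two mirrors from scratch on the new HBA boils down to a couple of mdadm invocations. Here is a rough sketch of just that step; the device names and md numbers are placeholders, not the real layout, and you’d normally do this on partitions from the installer or a rescue shell.

    #!/usr/bin/env python3
    """Rough sketch: recreate the two RAID1 mdraid mirrors on the new HBA.
    Device names and md numbers are placeholders, not the real layout."""
    import subprocess

    MIRRORS = {
        "/dev/md0": ["/dev/sda1", "/dev/sdb1"],  # the two Samsung 850 EVO SSDs
        "/dev/md1": ["/dev/sdc1", "/dev/sdd1"],  # the two HGST 1 TB SATA drives
    }

    for md, members in MIRRORS.items():
        # mdadm may ask for confirmation if it finds old metadata on the members
        subprocess.run(
            ["mdadm", "--create", md, "--level=1", "--raid-devices=2", *members],
            check=True,
        )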

The timing couldn’t be better for a clean start: over the last few months I had been considering moving away (escaping) from Proxmox (which, as an example, is so flexible that its management port number is hardcoded everywhere and can’t be changed) to something else, most likely oVirt or OpenNebula. I haven’t made a decision yet, but I’ve really fallen in love with the latter: it’s perfect for cloud-native minds and runs on Debian, whereas oVirt would force me over to the RPM side of the world.

I apologise deeply in advance for the rants I’ll post on Twitter while I try to accomplish this mission. Stay tuned.

Giorgio

 

* I.AM.100%.CLOUD. There are two things you can’t (yet) do in the cloud: keep a physical backup of the assets that live in the cloud, and test stuff that requires VT extensions. That’s what I’m doing here: ZA is my bare-metal lab.

** This is not actually ZA Rev2. It was supposed to be, but it arrived with a faulty backplane, so I pushed for it to be replaced entirely. I don’t have a picture of the new one at the time of writing, but… yeah, it looks exactly the same (with better cable management).

Don’t buy servers.

No, please don’t. Not even for personal use.

Let me start from the beginning: during my relocation last year, I left my desktop computer behind. It hadn’t been my primary machine for a while, and I was probably powering it on only once a month, but it was still my core repository for backups and long-term storage.

As I went 100% cloud years ago (no USB drives, no external HDDs, etc.), my “current” dataset is now online, synchronised with my laptop(s). Still, there are a few hundred GBs of “cold” (as in: I will probably never need them again) pictures/docs/archives that I want to be able to access, even remotely, at any time. After exploring some mid-range NAS solutions, I realised that, despite having a reliable internet connection, my flat was not the best location for hosting one, so I started looking around for a decent colocation space.

It didn’t take much time to figure out that space and power in a datacenter are so expensive that a NAS is neither a suitable nor a cost-effective option for this purpose.

As a consequence…

…meet MY-ZA*.

MY-ZA is an HP DL320e Gen8 server, equipped with an E3-1240 v2 CPU, 32 GB of RAM, and 2×250 GB SSDs + 4×1 TB SATA drives. Dual PSU, P420i hardware RAID controller, iLO4, etc… …yes: a real server.

I’m sure you’re now wondering what the hell I’m doing here. The answer is easy, and anybody with an engineering mindset can probably confirm it: sometimes we need to spend time and energy on experiments even when we know they will fail, because what we want to figure out is exactly how they will fail.

To be honest, even though I knew this choice was sub-optimal, to say the least, I was like: “Hey, what could go wrong? It’s just a server”.

Well, now I know the answer: anything (and if you combine that with Murphy’s law…).

My background is in traditional IT, but it looks like I quickly forgot the pain of dealing with bare metal. To make sure this doesn’t happen again, here’s a quick reminder that might also help you all:

  • Servers are expensive: this is a $2,800 machine (I paid roughly 50% of that), which will cost around $70-80 per month just for colocation and bandwidth. Moreover, in two years’ time it will be obsolete.
  • Bare metal servers are… …heavy: arranging shipping back and forth costs time. And money, of course.
  • They’re slow, reaaaaaaalllllly slooooooww. This thing wastes 10+ minutes just to get to the operating system boot. Don’t forget this if you’re doing something that requires a lot of reboots (like trying different RAID configurations, updating a newly installed Windows, etc.). We’re now in an era where an instance boots in less time than it takes you, slow and inefficient human, to copy and paste the connection details into your SSH client.
  • What about the risk? Well, it’s huge. I have onsite support, but no spare parts. So, should something bad happen, the downtime will be measured in hours, at least.
  • They don’t scale. This “thing” has already reached the maximum amount of RAM it can hold. What if I need more? I have two options: double the colocation space (and thus the cost) and buy a second, similar server, or buy a larger one to replace it and start a slow, complex and painful migration.
  • Agility? What? You have to manage it as you would a pet: if something breaks, you repair it; if the OS is out of date, you upgrade it. In a world where a broken instance is simply replaced by spinning up a new one, having to fix an OS doesn’t feel appropriate.
  • SSDs have a well-defined lifespan. This is not something you care about when you’re using a cloud hosting service, but here you have to keep it in mind, as they will eventually die. Probably both at the same time, since their load is nearly identical (a quick way to keep an eye on wear is sketched right after this list).
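
Since SSD wear came up: a quick and dirty way to keep an eye on it is to poll the SMART attributes every now and then. Below is a minimal sketch, assuming smartmontools is installed and that the SSDs expose a Wear_Leveling_Count attribute (the name Samsung drives typically use); the device paths are placeholders, not my real layout.

    #!/usr/bin/env python3
    """Quick SSD wear check via smartctl (needs smartmontools and root).
    Device paths and the attribute name are assumptions: Samsung SSDs
    usually report Wear_Leveling_Count, other vendors use different names."""
    import subprocess

    DRIVES = ["/dev/sda", "/dev/sdb"]  # placeholder paths for the two SSDs

    def wear_line(device, attribute="Wear_Leveling_Count"):
        # 'smartctl -A' prints the vendor-specific SMART attribute table
        out = subprocess.run(["smartctl", "-A", device],
                             capture_output=True, text=True).stdout
        for line in out.splitlines():
            if attribute in line:
                return line.strip()
        return None

    for dev in DRIVES:
        print(dev, "->", wear_line(dev) or "wear attribute not found")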

After spending the last seven days (evenings, to be fair, as I have a job during the day) on this project, I think I have definitively debunked the theory that the cloud isn’t effective for personal workloads.

Project failed, time to terminat…

…no, wait, you can’t terminate a bare-metal server: it’s an investment, a long-term decision; you can’t just roll it back as you would a cloud instance.

Oh, God.

 

* Don’t even try to understand my host naming convention. There is no standard; the names are just random letters. Servers are cattle, not pets, right?