
My second year at AWS: down the rabbit hole

We all know, time flies. I posted about my first year at Amazon Web Services just 12 months ago and now I’m already celebrating my second AWS-birthday.

It’s been a year of both personal and professional growth: it began in December, when I became head of a project that has been helping some of our customers run their platforms at massive scale, and went all the way up to my promotion to Senior TAM a few months back. Needless to say, the customers I’m looking after have made giant leaps too, as part of transformation processes that start from the infrastructure and can reach up to their corporate culture.

My post about “Year One” was organised by successive phases, but this won’t really make sense from the second year onward: I’ve been involved in a bunch of projects, every one of them with its own life. Why not try, then, to recap the last 12 months by picking the project (more details on it here, with a cameo appearance) that took most of my time and checking whether our slogan “Work Hard, Have Fun, Make History” has truly been met in it? Let’s start here.

Work Hard

This is how it begins. It might seem obvious, as no one will ever pay us to do anything other than work hard, but it’s not. Working hard in AWS means taking responsibility, being effective, facing challenges and turning every opportunity into a huge success.

“Hard” as in pushing our brains to 100%, not necessarily as in working 16 hours a day. True, we carry pagers, and might end up having late evening calls with the teams in Seattle or doing late night debugging sessions from our hotel room, but this only happens in exceptional situations.

I personally find this extremely rewarding: when you focus on a project with all your energy, then the sense of achievement when it’s done is super strong.

…check ✓

Have Fun

“Having Fun” is something we keep reading in job offers: it’s a “new economy” concept, meant as enjoying what you do and finding personal motivation in addition to the obvious business one.

I find this kind of comes by itself: if you work effectively on something and achieve results, then customers will trust you, the relationship will become more friendly and relaxed and you will end up having a lot of fun with them, even in the day to day.

Check ✓

Make History

Last one, and possibly just another consequence. Is there any other way a successful project can finish?

Sometimes we might not realise how big a given change can be. We might focus on some virtual machines becoming EC2 instances and some hard drives becoming S3 buckets, but there’s much more behind the curtain: you will see a quickly changing and moving world there.

Never underestimate the importance of small actions and small steps as they can quickly prove to be giant leaps.

(in the picture below you can see me helping with the final step of a cloud migration: loading a massive storage array on a truck after datacenter decommissioning and shutdown) …check ✓

So What?

Two years in, and for me it still feels like it’s Day One. Learning something new every day, consciously jumping into rabbit holes every other day just to re-emerge stronger and wiser later on. Being surrounded by the smartest people on earth makes you feel extremely small sometimes, but also guarantees you endless opportunities for growth.

This is what I’ll keep doing.

(want to join the band? just ping me!)

Customer experience and discrimination on Gumtree

In a world where Customer Experience is key, Gumtree, the leading online classifieds website in England, is telling me to f*ck off. Something that I will do indeed as their TOS are crystal clear (they can restrict access with no explanation due), but not without telling the story first.

It all started this morning: after signing up with my 10-year-old Google account and my real name (which, even if I’m not Donald Trump, should have a decent reputation and trust online), I posted an ad for my HP DL320e (and even paid to have it featured). The location was my real postcode (which you can verify in public records) and I paid with a UK credit card (just another way for them to verify my identity).

Good experience up to that point: nice and easy UI, clean workflow. The ad went into the moderation queue, but after a while it moved straight into the “Removed” state:

Allegedly, I’ve broken some posting rules and should have expected an email with some explanations. Except the email never came in (no, it’s not my spam filter, I got other emails from them) and the link to the posting rules leads to a blank page (there is a menu on the left, but every single item leads to a blank page).

I asked for support on Twitter, genuinely thinking it must have been a mistake by some automated fraud prevention system (a very cheap one, probably):

The only thing I managed to get back was this response via DM, which qualifies as the worst and most unnecessarily rude answer I have ever got from anyone’s customer care department:

I tried to appeal by sending an email to their support department, hoping for a deeper review and consideration.

And it happened: someone came back to me, apologising and explaining that my ad and account were absolutely fine, that no rule had been broken and that the block was the result of a mistake. It would be lifted immediately.

The end of an odyssey, you would think. Well, no: two hours later my account got blocked again, and this email landed in my mailbox:

In short, I’m now permanently banned. They won’t tell me why and won’t answer any further query on the matter. I would love to dig and figure out what’s wrong, but I’m in front of a brick wall.

Let me make a couple of assumptions: I was blocked before my first ad was ever published, so I can’t have been reported by other users. The only things Gumtree knows about me are:

  • My full name
  • My email address
  • Where I live
  • My Credit Card details

We already have a word for when you are denied something based on those four parameters: discrimination – and this is what’s happening here.

If anybody from Gumtree wants to get in touch and explain, feel free: you have my contact details.

This is your sysadmin speaking: please expect some turbulence.

A few months back I blogged about my HP DL320 Gen8’s (in)compatibility with the outside world, and someone suggested I solve the problem by replacing the P420i RAID controller with an LSI-something which would ensure wider flexibility.

Others suggested replacing (again) the hard drives instead, and someone was even pushing me to swap this “hobby” for something healthier and go cloud instead*.

For the first time in my life I decided to listen to friends, so I replaced the RAID controller with an LSI 9300i HBA (I’m using mdraid anyway)…

…well, not really: I also replaced the chassis, motherboard, CPU, RAM banks, fans, PSUs and drive caddies.

Meet “ZA Rev2″**:

This is how it evolved:

  • HP -> Supermicro (yay!)
  • Xeon E3-1240 v2 -> Xeon E3-1240 v6
  • 4×8 GB DDR3 RAM -> 2×16 GB DDR4 RAM (2 slots free for future upgrades)
  • HP P420i -> LSI-9300i
  • 2x SSD Samsung 850 EVO 250 GB -> no change
  • 2x HGST SATA 7.2k 1 TB -> no change

D-Day for replacement is April 18th (taking a day off from my job to go and do the same things, just as a hobby, feels really weird, yes), with a 6 AM wake up call, flight to AMS, 8 to 10 hours to do everything and a flight back to LON (LTN to be precise, because I didn’t double check before hitting “Buy”).

Now to the sad part: there is no (easy) way to just move the drives to the new server and have everything working, so I have to reinstall it from the ground up. This means my stuff (including this blog, because loose-coupling is a thing but I decided to run its DB and NFS from another country… …for some reason) will be down (or badly broken) during that time window and possibly longer, depending on how much I manage to do while I’m onsite.

The timing couldn’t be better for a clean start, as in the last few months I had been considering the option to move away (escape) from Proxmox (which, as an example, is so flexible that its management port number is hardcoded everywhere and can’t be changed) to something else, most likely oVirt or OpenNebula. Haven’t taken a decision yet, but I’ve really fallen in love with the latter: it’s perfect for the cloud-native minds and runs on Debian, whereas oVirt would force me to move to the RPM side of the world.

I deeply apologise in advance for my rants on Twitter while I try to accomplish this mission. Stay tuned.

Giorgio

 

* I.AM.100%.CLOUD. There are two things you can’t (yet) do in the cloud: physical backup of your assets that live in the cloud and testing stuff which requires VT extensions. This is what I’m doing here: ZA is my bare-metal lab.

** this is not ZA Rev2. It was supposed to be, but it came in with a faulty backplane so I pushed for it to be entirely replaced. I don’t have a picture of the new one with me at the time of writing but… yeah, it looks exactly the same (with better cable management).

Story of a journey: my first year at Amazon Web Services

Exactly one year ago today I was sitting in a room in Amazon’s London Holborn office, attending the New Hire induction and waiting for my manager to pick me up and introduce me to the rest of the Technical Account Managers team.

It has been one year already – it’s about time to tell my story, and share my experience in this (amazing) reality.

(this is me at this year’s London Summit, looking for something, somewhere)

Looking back at the first year (or, in Amazonian terms: “those first 365 Day Ones”), I can easily highlight a few different phases. Here they are, in more or less chronological order.

Phase 1: “lost” (in a hexagonal office)

Technical Account Managers (TAM) spend a lot of time with customers, and only drop into the AWS office when required. As a new starter this can be a little daunting, especially when trying to get set up – configuring your mobile, using the vast array of internal tools you have at your fingertips and the simple things, like finding the toilet.

The good news is: everybody is always happy to help you. Literally: everybody. In my first days I had phone calls with most of my team mates, shadowing sessions in front of customers, and even asked a mix of random people in the office for various kinds of help: they always guided me, as if we were one big family. That helped a lot, and I never really felt lost (yeah, I know, but it looked like a good title for this chapter…).

(about the toilet, if you’re wondering: I realised that as our office was hexagonal – or kind of -, everything was “straight on and then on the left”)

I’ll skip phase 1.5, the official training: we spend about two to three weeks in classes with Support Engineers before getting hands on with the day to day job. The training is what you’d expect from training, but it provides a great opportunity to meet and learn from tenured colleagues. This is also when I personally went from getting lost in the London office to getting lost in the Seattle campus (every. single. time.).

Phase 2: the ramp up (aka: “OMG I don’t know anything”)

The ramp up that comes after the training is exciting: you’re back, you’ve had two to three weeks to learn as much as possible, and you think you know what you are doing – you’ve learnt the theory, you know how to use the tools, you think you know what to do when, and you’re ready to get on with it.

In theory.

What you realise at this point is that yes, it’s true: you’re working with Amazon Web Services. If you work with cloud, you hear this name daily, and becoming part of it simply doesn’t feel real for a while.

One of the first things I understood was that the only thing I was bringing with me into AWS was my brain: your past experience can definitely help, but Amazon is so different from other companies that you have to learn almost everything, literally from scratch. If you’ve been hired it’s because you share the mindset, so it’s not hard and it’s not an obstacle, it’s just something to keep in mind.

The main differences? First, and by far, is our “Customer Obsession”. We obsess over our customers, and not over our technology: every discussion we have ends up focusing on what’s best for our customers, and how we can improve their experience. We work every day to make sure we help them do what’s best for their platforms – not for us – and we spend our time listening to them and trying to figure out how to make their lives easier.

The second one is definitely what’s summarised in our “Everyday is Day One” motto, which is much more tangible than you would expect from something that is written on every wall in an HQ. Our customers and we are moving so quickly that you must always be ready to wake up and start as if you were in a completely new world. You learn new things daily, and the technology you were using / evangelising three months before might no longer be the best one for a given use case.

This is all about change and how it becomes part of your daily routine.

Phase 3: the First Customer

After a few months you’re ready to onboard your first customer. I had spent some time shadowing and helping a more tenured colleague, and in November I was ready for onboarding my first “very own” account.

At that point in time I was confident in my daily tasks, had already had to deal with critical situations, and everything was looking good. But the first customer you onboard onto AWS Enterprise Support is just different: you’re starting a journey together, with some pre-defined goals and some others that will eventually show up.

It’s a journey of change, a journey toward continuous improvement and optimisation.

It’s just a matter of weeks, and you will start knowing your customer’s team members by first name, and recognising who’s logging a support case just by looking at their writing style.

Yes, that’s a very close relationship: some of my colleagues love to say that we work for Amazon, but on behalf of our customers.

Phase 4: the first event

You don’t really feel part of the customer’s team until you go through your first event. An event could be anything, from a planned traffic spike or feature launch, to, ehm, yes, an unplanned downtime.

Let’s pick a feature launch: it’s something big, the customer’s development teams have been working on it for months, the marketing team is pushing heavily and the operational teams have a single focus: making sure everything will work smoothly.

This is where our teams become glued together with the customer’s: we share a goal, we share a focus, we set up “war rooms” and make sure everything is in place and properly architected for when the big day arrives. The TAM acts here as a customer-facing frontman for an army of Support Engineers, Subject Matter Experts, Service Team Engineers, and many more – and during these kinds of events, everyone comes together.

And then it happens – detailed and obsessive planning ensures everything works smoothly and meets expectations, leaving plenty of time to celebrate – and to realise that none of this would be possible without the super close relationship we develop with our customers.

Phase 5: personal development

This is not really a phase (mainly because it never ends), but after you’ve been in the company for 6/8 months you begin having really clear ideas on how things work, where you want to go and what you want to do.

AWS is a world of opportunities, for any kind of person: in this first year I joined a team which is helping our customers with the migration of strategic workloads and presented at the AWS Summit in London.

I’m currently trying to decide what to target next.

Phase 6: retrospective

As said, technology is evolving quickly, and so are we and our customers. When you reach the one-year mark, you try to look back and this is when you really understand where you used to be, and where you are now.

Where your customers were, and where they are now: the distance they have most likely covered in a single year looks unbelievable.

Phase 7: writing a blog post about your first year

Come on, I’m just joking.

Time to wrap up: I’m enjoying my new working life, my team, my mentor(s), my manager(s) and the extended Enterprise Support team. I have the opportunity every day to work with exciting customers, to actually be part of my customers’ teams and to experience the latest innovations first hand.

There is a question I get asked a lot, especially by people who know my background: do I miss being hands-on and dealing with operations? Not really. First, we have time and business needs for testing and using any new product we launch, so I still spend some time actually “playing” with stuff. Second, despite the name, this role is super-technical – we get to see a lot of operations, development and devops.

 

If you are reading this and looking for a new and interesting challenge, or would like to consider joining the AWS team, then get in touch.

Giorgio

Don’t buy servers.

No, please don’t. Not even for personal use.

Let me start from the beginning: during my relocation last year, I left my desktop computer behind. It hadn’t been my primary machine for a while and I was probably powering it on only once a month, but it was still my core repository for backups and long term storage.

As I went 100% cloud years ago (no USB drives, no external HDDs, etc) my “current” dataset is now online, synchronised with my laptop(s). Still, there are some hundreds of GBs of “cold” (as in: I will probably never need them again) pictures/docs/archives that I want to be able to access, even remotely, at any time. After exploring some mid-range NAS solutions, I ended up realising that despite having a reliable internet connection, my flat was not the best location for hosting it, so I started looking around for a decent colocation space.

It didn’t take much time to figure out that space and power in a datacenter are so expensive that a NAS is neither suitable nor effective for this purpose.

As a consequence…


…meet MY-ZA*.

MY-ZA is an HP DL320e Gen8 server, equipped with an E3-1240 v2 CPU, 32 GB of RAM, and 2×250 GB SSD + 4x1TB SATA drives. Dual PSU, P420i hardware RAID controller, iLO4, etc… …yes: a real server.

I’m sure you’re now wondering what the hell I am doing here. The answer is easy, and anybody with an engineering mindset can probably confirm: sometimes we need to spend time and energy in experiments even if we know they will fail, because what we want to figure out is how exactly they will fail.

To be honest, even if I knew this choice was sub-optimal at the very least, I was like: “Hey, what could go wrong? It’s just a server”.

Well, now I know the answer: anything – (and if you cross this with Murphy’s law…).

My background is in traditional IT, but it looks like I quickly forgot about the pain of having to deal with bare metal. To make sure this doesn’t happen again, here’s a quick reminder that might also help you all:

  • Servers are expensive: this is a $2800 machine (I’ve paid roughly 50% of that), that will cost around $70-80 per month just for colocation and bandwidth. Moreover, in two years’ time it will be obsolete.
  • Bare metal servers are… …heavy: arranging shipping back and forth costs time. And money, of course.
  • They’re slow, reaaaaaaalllllly slooooooww. This thing wastes 10+ minutes just to get to the operating system boot. Don’t forget this if you’re doing something that requires a lot of reboots (like trying different RAID configurations, updating a newly installed Windows, etc). We’re now in an era where the boot time of an instance is shorter than the time it takes you, slow and inefficient human, to copy and paste connection details into your SSH client.
  • What about the risk? Well, it’s huge. I have onsite support, but no spare parts. So, should something bad happen, the downtime will be counted in hours, at least.
  • They don’t scale. This “thing” has already reached the maximum amount of RAM it can hold. What if I need more? I have two options: double the colocation space (and thus cost) and buy a similar second server, or buy a larger one to replace it and begin a slow, complex and painful migration.
  • Agility? What? – You must manage it as you would do with a pet. If something breaks, repair it, if the OS is out of date, upgrade it. Well, in a world where if an instance is broken you immediately spin up a new one, having to fix an OS doesn’t seem appropriate.
  • SSDs do have a well defined lifespan. This is not something you care about if you’re using a cloud hosting service, but here you should keep it in mind, as they will eventually die. Both at the same time, as their load will be similar.
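
To put a number on the first point of the list, here is a rough back-of-the-envelope sketch in Python. The list price, the discount and the monthly colocation range are the figures quoted above; the 24-month lifespan and the zero resale value are my own simplifying assumptions (read from "obsolete in two years"), and shipping, spare parts and my own time are deliberately left out.

```python
# Rough two-year cost of the server, using the figures from the list above.
# Assumptions (not measured): 24-month lifespan, no resale value,
# shipping and spare parts ignored.

LIST_PRICE_USD = 2800            # quoted list price
DISCOUNT = 0.5                   # "roughly 50%" paid
COLO_USD_PER_MONTH = (70, 80)    # colocation + bandwidth, low and high estimate
LIFESPAN_MONTHS = 24             # assumption: written off after two years


def total_cost(colo_per_month, months=LIFESPAN_MONTHS):
    """Purchase price actually paid plus colocation over the whole lifespan."""
    purchase = LIST_PRICE_USD * (1 - DISCOUNT)
    return purchase + colo_per_month * months


low, high = (total_cost(c) for c in COLO_USD_PER_MONTH)
print(f"Two-year cost: ${low:,.0f} to ${high:,.0f}")
print(f"Per month:     ${low / LIFESPAN_MONTHS:,.0f} to ${high / LIFESPAN_MONTHS:,.0f}")
```

Under those assumptions the box ends up costing somewhere around $3,080 to $3,320 over two years, i.e. roughly $128 to $138 per month, before anything breaks.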

After having spent the last 7 days (evenings to be fair, as I have a job during the day) on this project, I think I have definitely debunked the theory about cloud not being effective for personal workloads.

Project failed, time to terminat…

…no, wait, you can’t terminate a bare metal server: it’s an investment, it’s a long term decision, you can’t just roll back as you would do with a cloud instance.

Oh, God.

 

* don’t even try to understand my host naming convention. There are no standards, names are just random letters. Servers are cattle, not pets, right?