2007-01-11

I am the MFM

So today is day one of our new help desk at work.

I'm no longer on the help desk - cause I gotz mad server skillz 'n' shit.

I am, now, a level three Intel Engineer, which means the level one and level two help desk monkeys rely on me to fix what they can't.

At the place I worked before, we had an automated ticketing system. The level one person would get the phone call or email. If she couldn't (or wouldn't) fix it, she'd assign it to me, and I got an email and/or page depending on how critical the issue was.

If I needed to escalate the issue up to level three because I couldn't (or wouldn't) fix it, I updated the ticket with my info, and reassigned it. The person I assigned it to received an email or page, again, depending on how critical the issue was.

This new account ... it was all manual. We'll use a ticket being assigned to me as an example.

Ticket would come in to level one at the regional help desk. The regional level one would enter it into the tracking system. If they couldn't resolve it, it would get escalated to the regional level two tech. If the regional level two couldn't fix it, they'd escalate it to our combined level two help desk.

Here's where things get messed up and slowed down.

The old regional combined level two help desk would CALL ME ON THE PHONE and say, "Eric, this is Grover from Tier II. I have a server down that needs rebooted. Can I assign you the ticket?"

Um, what? Can you? Um, sure. That's my job, right? Just send it over.

I swear I spent at least 40 minutes a day on the phone talking to people giving them permission to assign me tickets that it is my job to fix.

Yep, wasted effort. Seriously. Wasted. Time and Effort.

So today we start the new system - which is all automated. Which I like. Ticket gets assigned to me, I get a page and an email. I acknowledge the ticket, and start working on it. No need to take or make phone calls to say it's okay to give me the ticket. Just here's the ticket now get to work.

Here's where I prove just why I'm the MFM.

MFM is a term we use at work for whoever has the best idea of the day to solve the biggest problem of the day. Some days, that's shit so far above my head I'm relegated to being the dumbest guy in the room. And I'm okay with that because I'm learning, a lot.

Other days, the MFM is the guy who has the best suggestion on where to eat lunch. So the criteria are flexible.

Today, as we started the new help desk procedures, which require us to acknowledge all tickets within a specified time frame based on the severity of the ticket - we were all wondering who was gonna miss the first "Severity One" ticket and have the bosses gather around for the beatdown.

For three days I've been watching a server die. The customer said, basically, let it limp along. When it dies, we'll migrate over to the backup and then you can fix it.

All of this was documented in the ticket in the old system, a ticket I'm not allowed to re-create in the new system because it already exists in the old one.

This morning - about 4:55 a.m. - it died. I got emailed by the customer (the regional level two tech) saying, "It's dead. You can take it offline and fix it." So I pull the server out of the rack at 8:15 and try to fix it with existing spare parts so there's no additional cost to the region. No love. Spares are bad. We figured as much, but hoped maybe we'd luck out.

Nope. So I go back to my desk and weed through the tickets, and pages, and emails, glad I don't have to talk on the phone. Because I hate talking on the phone. And at 12:15, four hours after I turned it off, I get a SEVERITY ONE ticket saying the server that I had performed the autopsy and board replacement on was not responding.

No shit. It's in pieces on my workbench, its guts intermingled with innards of two other servers.

So me and the guys I work with had a good laugh about the fact it took our new 'state of the art' monitoring tool four hours to realize this server was not working.

Unfortunately for me, when our 'state of the art' monitoring system told me the server was dead, I forgot to acknowledge it within the 15 minute time frame I have to respond. All it wanted was a "Yeah, I know. I got a ticket. Yay me." or something a bit more corporate like, "I am aware of this issue and working on the ticket."

So 15 minutes later, the secondary guy gets a page about a sev one issue that's not been acknowledged by the primary guy. He's at lunch.

So 15 minutes later, my manager (and his manager, and probably his manager, too) get a page because it's been 30 minutes since a SEVERITY ONE ticket was entered and it's not been acknowledged. Since the three of them are in Cincinnati, cell phones ignite and marching orders are issued ... and their footsoldiers beat a path to my desk.
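
For the code-minded, that escalation ladder can be sketched in a few lines of Python. This is only a rough model of what the story describes, assuming each tier gets paged exactly 15 minutes after the last one; the function name, the tier labels, and the timing rule are all made up for illustration and say nothing about how the real tool works.

```python
from datetime import datetime, timedelta

# Assumption from the story: an unacknowledged sev one ticket escalates
# one tier for every 15 minutes it sits there. All names are hypothetical.
ACK_WINDOW = timedelta(minutes=15)
ESCALATION_CHAIN = ["primary", "secondary", "management"]


def who_gets_paged(opened_at, now):
    """Return every tier paged so far for a ticket nobody has acknowledged."""
    # Whole 15-minute windows that have expired since the ticket opened.
    missed_windows = max((now - opened_at) // ACK_WINDOW, 0)
    # The primary is paged right away; each missed window adds one tier.
    tiers_paged = min(missed_windows + 1, len(ESCALATION_CHAIN))
    return ESCALATION_CHAIN[:tiers_paged]


opened = datetime(2007, 1, 11, 12, 15)
for minutes in (10, 15, 30):
    print(minutes, who_gets_paged(opened, opened + timedelta(minutes=minutes)))
```

Acknowledging the ticket inside the first window is the only thing that stops the ladder, which is exactly the step that got skipped.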

Who's the first guy to blow the clock on a Severity One ticket?

Me.

Cause I'm the MFM.

Fuck yeah.
