Wednesday, April 11, 2007

A Heavy Blow

I was assigned a task, and I had been preparing for it for a month. I had no worries about it, since this was not my first time doing a similar job and I had also tested it on the testing site. A 4-hour maintenance window was approved by the big boss in the US. On the day of the implementation, I hit an idiotic Oracle bug that only showed up at the very last phase of the whole implementation, and I had no choice but to roll back everything. At that point I had only 30 minutes left to recover the database, damn, and by then I had been working for 4 hours since 12am. I tried my best to keep my brain working so that I would not make a single mistake in recovering the database. In the end I overran the downtime window by 1.5 hours... Shit.. you will never believe how much this 1.5 hours of downtime costs, it is like a cab's meter counting every minute in USD... damn it, SLA.

Then after the rollback, I was busy preparing all the reports to explain it to management. I found that this bug only occurs on certain servers, and I was so "lucky" to have picked this "special" server for the task. Imagine, I worked from 12am till 8am, then went to bed like a zombie.. but I couldn't sleep well: my support phone kept ringing, and I had to drag myself out of my cozy bed to reply to all the emails sent by those BIG BOSSES and clean up all the shit. Well, it was a great blow to me, and my manager was definitely mad about this.

Then today, 2 days after this incident, I caused trouble again. I backed up some files, and when I needed to restore them to the server, it returned "checksum error"!!!! Gosh, not again, this is another production server with SLA impact. Why do all these gao gao issues only happen when I do something on a production server!!! So this is another 2 hours of downtime while I request the offshore tape backup.. I think even if I spend my whole life in this company, I will not be able to pay the SLA penalty. This really reminds me of Elizabethtown.

I was so depressed. Why am I being so careless? I should have thought further about a backup plan in case of any failure. I shouldn't be so over-confident and just take everything easy. It is not easy, man!! IT IS A PRODUCTION SERVER!!! My manager has told me that same sentence a few times over the week.. I am not new in the company anymore, I should be aware of the SLA impact, and there is no excuse for me to hide from the blame. I am so ashamed to face my manager now, I think I really disappointed him.

1 comment:

David said...

Hey, chin up ya... I am sure next time you can do a better job one :)