Single Point of Failure
I cannot name the company for obvious reasons, except to say it was a very large enterprise. I worked in the IT department on middleware apps. One day the applications came to a halt: not just the middleware apps but many, many others. A P1 incident was created and emergency alerts were sent out. We could not get back to normal work for nearly 3 full days. Here is what happened. It had come to the attention of the IT department that there were not enough UPS (uninterruptible power supply) units backing up the main data center, so a contractor came in to bolster the backup power. He was assured by the data center team that advanced failover measures were in place, so he went to work.
He found that one switch did not have a UPS at all. He unplugged it, plugged the UPS in and plugged the switch into the UPS. All is well, correct? Nope. Come to find out, the architecture drawings were incorrect. Hundreds of thousands of dollars later (in lost revenue and time), it was found that the high-powered switch he unplugged brought down over 3000 servers and virtual servers. Each had to be rebooted in sequence to be brought back online. Yes, this was a nightmare!!! Lesson learned: never allow a single point of failure to exist on your enterprise network. In this case, providing UPS units was a good thing. Ironically, the very thing being supplied to thwart an outage became the source of the outage. Submitted by Mike H.
Clean Up on Aisle Intranet
In 2000, due to a communication gap between my team and the IT team, I wasn't aware that the files supporting our company's intranet had been restructured. When I ran a report to clean up old files on servers, I mistakenly thought these 10K files were no longer in use and had them deleted. Net result - our entire intranet had been deleted and it took many resources several days' work to restore it. I was SURE I was going to be fired, but instead it gave us the opportunity to review and fix what went wrong in our processes and I continued working there for another 10 years. Submitted by Yolanda J.
When a Patching Server Just Wants to Do its Job
Some years back, we had a change window set up to patch the Windows servers. Everything had been done according to policy: Change Management had approved our change and the timing, and notification had been sent out that we would be doing this after hours to minimize impact. At the start of the change window the admin logged onto the patch management server, opened the application, and kicked off the patching job. The patch management server immediately crashed and would not reboot. We cancelled the change per policy and contacted hardware support to come out and troubleshoot the patching server. Due to the level of support that we had, the field tech did not arrive until business hours the following day. After running diagnostics he determined that a memory module in the patching server had failed. He replaced the memory module and booted the server. A few minutes later the monitoring console lit up like a Christmas tree and pagers started going off all around the server and application admins' aisles. Servers were going down all across the enterprise. Senior managers started calling to find out what the heck was going on. Customers were being impacted as applications became unavailable. After a little investigation we realized what had happened. The patching server had set a flag to run the patching jobs before it crashed. When it was repaired and restarted, it dutifully picked up where it had left off the night before, kicked off those patching jobs and started patching all the servers - in the middle of the business day! Lesson learned: don't do work in the middle of the business day, even if it is not supposed to cause impact. You just never know.
Submitted by Derrick B.
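The failure mode in this story, a queued job resuming after a restart outside its approved window, can be guarded against by checking the window before any queued job is allowed to run. What follows is only a minimal sketch in Python, not how the patch management product in the story worked; the names `ChangeWindow` and `should_resume`, and the dates, are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ChangeWindow:
    """Approved start and end of a change (hypothetical structure)."""
    start: datetime
    end: datetime


def should_resume(window: ChangeWindow, now: datetime) -> bool:
    """Return True only if a queued job may still run at `now`.

    A job that was flagged to run before a crash is resumed only while
    the approved window is still open. Otherwise it should be cancelled
    and rescheduled through change management.
    """
    return window.start <= now < window.end


# Illustrative dates: a window approved from 22:00 to 02:00 overnight.
window = ChangeWindow(datetime(2020, 1, 6, 22, 0), datetime(2020, 1, 7, 2, 0))

# Server repaired and rebooted at 10:30 the next business day.
print(should_resume(window, datetime(2020, 1, 7, 10, 30)))  # False

# The same check made during the approved window.
print(should_resume(window, datetime(2020, 1, 6, 23, 0)))  # True
```

The design choice here is that the queued flag alone is never enough to start work; the clock has to agree with the approval as well.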
Add your IT horror story on Twitter!
A Computer "Guru" Who Built Her Own Anti-Virus Program and Has Never Heard of Google
From 2004 to 2008 I worked for an IT services company that was part of a large corporation. The corporation also owned a cable and high-speed internet company, and the company I worked for provided tech support for them, which was always interesting.
One evening, I had a client call up looking for help with her e-mail. She then proceeded to tell me that she knew almost everything about computers, and how she had her Master CNE designation & computer science degree. My first thought was, why are you calling then? But, being the good help desk techie I am, I proceeded to troubleshoot her issues, verify settings, etc., etc.
Halfway through she started cursing the pop-up ads that kept coming up, so I asked if she had run a virus scan on her computer. She said no, so I suggested she consider it and asked whether she had Norton, McAfee or AVG installed. She said none, she wrote her own.
I asked her to confirm which anti-virus program she had installed and she replied, “None, as I wrote my own software.” WTF?? So I suggested some anti-spyware tools and she asked which ones. I suggested three and she said okay, she might reverse engineer them after seeing the code. The call then took a turn for the weird.
She asked me how to find them and I told her to go to Google.com and "Google it." ... She then said that she had never heard of Google, that they must have sprung up overnight and how she was going to consult a lawyer about suing them over the name.
I asked her if there was anything else I could do for her and she asked me what qualifications I had for the job, to which I replied "My qualifications and experience are only the concern of my boss & the HR department." I then asked her where she got her computer science degree from and her reply was a “F*** you” and the phone slammed down in my ear. Submitted by Donald M.
Servers and Oil
I was building three new 2012 servers to create a security network and performing the rack and stack. On the day the systems were set to go live I walked into the server room, which was in a storage room in a parking garage. The room had an inch of oil over everything. The elevator seals holding the hydraulic fluid had burst in the next room, flooding both rooms. To my shock, the servers were still running because the hydraulic fluid is not conductive. I was very lucky: I could save the hard drives, and since this was a three-location project I used the servers intended for the future locations. Submitted by Eric M.
Never Forget Computer Basics
A B2B e-commerce project was launched with the best RAC servers running an Oracle 10g database. It was a cross-breed of applications: a Windows Server platform, an Oracle database cluster with PL/SQL cartridges, the early days of Java 1.3, PKI security using card authentication and a mainframe backend. Everyone was excited for the first business internet product of the year 2000, which would eliminate physical cash transfers on the road between the bank and the client. The B2B e-commerce product was a game-changer for electronic fund transfers, payroll payouts, cash collection and more, so all eyes were on it. The initial setup and installation was a success, with the physical servers mounted in the data center and cabled, and all the prerequisite software installed.
Day 2 was the big event of the deployment, with the DBAs, developers, operations, project manager and head of IT all inside the locked data center. It was 10pm and everyone had their installation scripts and application packages ready for the 3-hour installation. The deployment went smoothly and completed ahead of time, ending at 1am, so everyone was ready to call it for QA testing and about to head home... until something strange happened.
The PL/SQL packages started getting corrupted, the Java processes were not processing all the transactions and the Windows Server started logging system events. The DBA was immediately engaged to determine whether there was data corruption, and there was: about 10,000 records were affected and the number was growing. The operations team checked the event logs and saw intermittent I/O errors on the mapped drives used by the database. Developers were seeing various Java exceptions in the logs and could not figure out whether a missing Java package had caused them. Oracle Support was called out on-site after 15 minutes and raised it to a high-priority incident, requiring an immediate database backup and recovery. Surprisingly, even the backup process failed, so they dug deeper and tried rebuilding the corrupted database from the database index and structure files. It was already about 4am and the big launch was set for 8am, in front of the press with the bank CEO presenting. This should never have happened after 5 cycles of QA testing and audit, and everyone was scrambling to work out what was happening.
As the Head of IT was about to call the deployment backout and face the dreaded shame, and the whole IT team was thinking of next job opportunities, a young operations guy was staring at the back of the server. He had a basic understanding of server infrastructure based on his home computer, with its power supply, video card, PS/2 keyboard and mouse, audio card and network adapter card. He looked closely; something seemed out of place. And there it was...
The network adapter card was vibrating and the LAN cable was loose. The network clip was broken and the network adapter card was not secured to the frame. He inserted it back in securely and all of a sudden all the errors went away. All the experts looked at him and said "What did you do?" The pale-faced young ops guy said, "I noticed the network cable was not secured and plugged it back in." Everyone went to the back of the server and confirmed that it was indeed a bad network connection.
Everyone was cheerful with the project saved, the corrupted records were immediately restored and QA/audit completed at 6am. The team went home happy with a night to remember, and the CEO never heard about the frustration of that night. It has been 20 years and I still remember this experience, and it is why I always look at the back of the server. Submitted by Jorge G.
Every IT Professional's Fear
Every day is a horror story, because nobody really knows what we do. Everyone specializes in something different and when a Network Specialist gets a question about a user’s cell phone, it hurts. Submitted by Natasha A.
The Good Ol’ “I Love You” Email From Your Manager
Well, I didn't actually witness it but my older brother tells a cautionary tale that goes something like this: Several years ago, while he was working for a tech company, his manager poked her head into his office one day and said "If you receive an email from me with 'I love you' in the subject line, don't open it."
Well, a short time later, wouldn't you know an email appeared that read "I Love You". Needless to say, curiosity got the best of him (Did his boss actually LOVE him?) and he opened it up, only to realize it was a virus which immediately began sending the email out to everyone on his contact list. At this point in the story he acts out how he lunged from his desk and yanked the cable out of the wall. I bet that was entertaining to watch. Submitted by Julia S.
Hot Dates Get In the Way of Weekly Backups
When I was asked to hire a computer operator I was told to hire the first person who walked in the door, so I did. He worked out well for a while, backing up our data on big reel-to-reel tapes: incremental backups on weekdays, with full backups every Friday. Then one day our removable 300 MB disk drive crashed, and we found out that the operator had been overwriting the full backups with incremental backups. He explained, "I have hot dates on Friday nights and the full backups took too long." We had to recreate our order entry system because it hadn't been changed in a long time and so wasn't on any incremental backup.
Luckily the consultant who wrote the original programs was still in the business, but I estimate it took 2 weeks of full-time programming for both of us to finish the work. Needless to say, I verified every backup every day for the rest of my career. Submitted by Angel L.
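The daily verification described above can be partly automated. The following is a minimal sketch in Python, assuming a backup catalog is available as a list of dated records labeled "full" or "incremental"; the record format, the function names and the 7-day threshold are assumptions made for illustration, not details from the story.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple

# Each record is (backup_date, kind), where kind is "full" or "incremental".
BackupRecord = Tuple[date, str]


def latest_full_backup(records: List[BackupRecord]) -> Optional[date]:
    """Return the date of the most recent full backup, or None if there is none."""
    fulls = [day for day, kind in records if kind == "full"]
    return max(fulls) if fulls else None


def full_backup_is_current(
    records: List[BackupRecord], today: date, max_age_days: int = 7
) -> bool:
    """Return True if a full backup exists and is at most max_age_days old."""
    latest = latest_full_backup(records)
    return latest is not None and today - latest <= timedelta(days=max_age_days)


# Illustrative catalog: the weekly full on 2020-01-10 was replaced
# by an incremental, so the newest full backup is from 2020-01-03.
catalog = [
    (date(2020, 1, 3), "full"),
    (date(2020, 1, 6), "incremental"),
    (date(2020, 1, 7), "incremental"),
    (date(2020, 1, 10), "incremental"),
    (date(2020, 1, 13), "incremental"),
]

print(full_backup_is_current(catalog, today=date(2020, 1, 14)))  # False
```

A check like this only confirms that a full backup is recorded; it does not replace a test restore, which is the only way to know the tape is actually readable.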
The Reply-All Email That Crashed The Server
Years ago I worked for a software development company that had just migrated from Lotus Notes to Exchange 4.0 for email, and the users had some issues getting used to the new email client. The administrators also took some time to get used to the Exchange backend. After about a month of use, the new email server crashed: the hard drive was full. Turns out, no one had thought to back up the email server, and as a result the database logs grew too large too quickly. Once we figured out the problem, we started backing up the server, but we soon realized that we were greatly underprovisioned in hard drive space and ordered new drives. We sent out a message to all users on our system asking them to refrain from sending large attachments until we solved the issue. One of our sales managers thought that this was a perfect time to use the fancy new mail client and "reply all" with a video of a disgruntled user bashing in his monitor... yep, he took the server down for good with that one! Two days later we got the disk we needed to bring the server back online. Thankfully I was user support, so I wasn't held responsible for the server crash and kept my job... the network admin wasn't so lucky. Submitted by Tina I.
Not IT nightmares, but too good not to share
I Learned Computers Have a "Secret" CD Drive
A user complained her CD drive was eating her CDs. When I opened her CD drive there was nothing in it. She was amazed at what I did; she had no idea THAT was the CD drive. She showed me the small slot between the 5.25" CD drive and the empty plastic cover on the second 5.25" bay, and pushed a CD into that slot. It made a horrid noise when it hit the bottom of the case, as one would assume it would. I popped the side of her case open and roughly a dozen CDs were at the bottom of the case. Submitted by Eddie N.
Excel Files are Heavy
I was asked by my way-cool CEO to deliver a mini and a full-size laptop to a new bigwig arriving from Germany the next day, so she could choose which she preferred. This goes back about 10-15 years, so, yes, a CD-ROM drive was a big deal.
It went something like this:
"How much do these laptops weigh? Is there a big difference?"
"I'd guess 3 to 5lbs maybe, the larger unit has a CD drive built-in, and this smaller unit does not, is that a concern or request?"
"I think so... can a CD be added to the smaller unit?"
"Kind of, you'd need to carry this sleeve, the CD drive itself, and 2 additional cables though.... and in the long run, may be the same weight, only now it'd be easier to lose or misplace the parts"
"Oh yes I agree.... so, 5 pounds huh? How much more is it going to weigh when you add my data to it, like my emails, Word and Excel files.... I have loooots of Excel files."
At this point I used every muscle in my face not to smile, let alone burst out laughing... but just know I was dying to say, "Any PowerPoints or PDFs? Because those are the worst offenders." Submitted by Mark M.