Mirror Mirror on the wall – Where’s my Backup!
I know you have heard this before: back up the system, make sure you have some redundancy, check your tapes once in a while, and maybe spend a little money on that upgrade to install some new recovery technologies.
Let’s face it. Most of us know that we have to back up. We know we have to be able to get the business back online quickly if there is a system problem. But few of us really do much more than run a backup every night.
The AS/400 is a wonderful machine with arguably the best operating system in the world. Built into it is the ability to back up and, unlike Windows NT, you can actually restore the system to where you were – assuming, of course, that your backup strategy is a sound one. This is where many shops fail and, unfortunately, they don’t know it until it is too late.
Recently, a shop I know had a disk crash. The AS/400, being the nice system it is, gave them a warning that there was a “pending disk failure”. Quick to respond, they kicked everyone off the system and attempted to do a backup right at that moment, hoping to get a final snapshot of the system. Of course, it failed. The disk crashed.
In come the hardware guys – try to pump the drive to a new one. No good – the bad drive was unrecoverable. Now they have been down for a day (slow hardware support didn’t help – remember, you get what you pay for – cheap support gets you – well – cheap support).
The next day – new disk in place – they loaded up the backup tape. But wait – the previous backup from the weekend (this happened on Monday morning) did not run because of a software problem with their month end. That means they had no backup of their data from the weekend or from Friday’s work. Well, that sometimes happens, so now they go and get Thursday’s backup. It’s there and they can read the tape. Whew! But wait – not all the files are there. In fact, the bulk of their primary files are not there. What happened?
Back one more day on the tapes. This one they can’t read – media errors. Back one more day – victory, they can read it and all the files are there. Meanwhile, the new disk is installed and the hardware guys have cleared all the disks, getting the system ready for a new install.
Feeling a little calmer, they say to themselves, “Now, where the heck is that system backup tape? You know, the one we do every … wait, we back up the data every night” (or thought they did) “but the system – isn’t the nightly one enough?”
Well, no, it is not enough. The only system backup found was from the middle of 1997. In case you were sleeping, it’s 1999. The system backup was not even from the same operating system release as the one they were currently on. So what to do? Actually, there was not much choice. They installed the operating system from the current CDs, loaded the current PTFs, and then restored users and configurations from the 1997 tape. Not exactly ideal, but better than recreating everything. That being finished, the data from Tuesday of the previous week (it is now Thursday of the new week) is restored. So the system is back up and ready for entry.
While all this tape chasing was going on, the users tracked back as much activity from the previous week as they could. Most had some report or document that had been generated and could re-enter data from the hard copy. Not exactly fun, but it kept the business in business. But wait (again) – they take orders over the phone. Unless a pick paper was created, there may not be any hard copy record of the transaction. Oh sure, they create daily audit reports, but those are so big that they archive them on disk – and then move them to tape.
Oops – the disk crashed and the tape didn’t back anything up. Lots of research went into getting things back to normal. It took two weeks, and even then there is no way to know if something was lost until a customer calls asking for their merchandise. A quick excuse will be made, and a new order will be generated. But they certainly do not want their customers to know that they have had a computer problem that lost data. That does not exactly instill confidence in your business.
All right – don’t laugh. This story is quite true. Murphy certainly had a hand in it. Now the company is looking into RAID, mirroring, journaling and new tape technologies. They have also revamped their backup strategies and now look at the silly messages that come out of the backup jobs at night. They also got rid of the old CL program that they had doing the backup, and went to the IBM backup facility, which has been greatly enhanced over the last few releases.
Unfortunately, this happens all too often. Excuses include:
-Tape drives are expensive (so I can’t get two.)
-RAID is expensive.
-I don’t have enough horsepower for mirroring.
But going through this once will fix all those problems – a little money (or a lot of money) spent up front will save a lot of aggravation and real business losses.
The way this story should have read is:
The company lost a disk drive today. Our mirroring strategy took over, and a message was sent to our hardware support vendor. They showed up within 4 hours and replaced the drive while the rest of the system continued to process orders. The users did not even know a problem had occurred that day.
John Bussert is president of Swift Technologies (Marengo, Ill.), a company specializing in AS/400 and Windows NT software.