‘Twas the night before Christmas, when all thro’ the house
Not a creature was stirring, not even a mouse;
Save for the IT manager at a German TV channel attempting to restore some footage from their PresStore backup archive in time for a New Year’s Day TV show…
To this day we are not sure what was not working, but working it was not. The catalogue of the PresStore backups had been cleared because it was thought that the 200 x LTO4 tapes would no longer be needed, so now a frantic effort was being made to re-create the infrastructure and scan 200 tapes.
With a single LTO4 drive it was already going to be a matter of luck whether the right tape was found in time, as even working 24 hours a day you could hope to get through only 40 or so tapes in the time available, and the re-cataloguing kept failing with configuration errors. And so came the call for help, followed by an IT manager in a Volvo bearing LTO tapes and an expression of angst.
One of the benefits of writing one’s own software for various backup formats is that you don’t need to worry about configuration; processing is pretty much a function of tape drive numbers. With 20 LTO5 drives running, in less than 2 days the PresStore backups had been re-scanned, a catalogue created and the required data identified.
Restoration of the required data from the PresStore backup took about an hour, and so with time to spare our hero was racing back under the Channel to save the day.
I have no idea who the footage was of, someone I am too old to appreciate I am told, so I apologise now to any Bavarian parent whose New Year peace we were responsible for disrupting, but the happy smiling face of the IT manager made it all worthwhile.
Attempting to process large numbers of tapes natively (i.e. using the originating backup application and IT environment) can prove to be a daunting task when large volumes of data are needed quickly. Most infrastructures are designed around getting data written out to tapes, but only sometimes needing to restore. This is not an adverse criticism; if daily restores are needed then either it is time to stop getting IT equipment from fire sales or there are some staff training issues to address.
However, when a large volume of data is needed quickly from a broad cross-section of backup tapes, perhaps a legal case requiring email data spanning several years, how can it be done?
In the case of one UK insurance company they had over 1000 LTO tapes, LTO1 to LTO3, containing Backup Exec backups from a number of Windows-based systems. These included MSSQL backups, Exchange IS backups, and user data file backups from a number of servers running SIS (single-instance-store). They were not in a desperate hurry, but they knew a requirement to disclose data was likely in the next 4 months and that it could include email and users’ file data, so they were looking for a 3rd party tape restoration service.
As fortune would have it we had just released the latest version of our ADR Tape Restoration software suite, complete with a new module to handle SIS backups without the need to have SIS installed, and the Exchange and file data were duly restored to USB disks. Our processing for this, to meet the deadline, meant using multiple drives restoring in parallel on standalone PC systems, with no need for Exchange servers or Windows servers with client systems. Each time a set of 4TB USB3 drives became full they were shipped post-haste to the customer so they could introduce the data to their new archive management system, and by the time they needed it they had it all in place.
Being a generally optimistic person and not believing in bad luck, not even if someone turns up with an OnStream cartridge, the prospect of 13 x LTO2 tapes containing ARCserve backups was not of particular concern. Then the additional information was provided: they had been to another company for restoration and there was a problem with the data, the backups were corrupted and could not be restored, “was there anything we could do”?
Not having seen the tapes it was not honest to give more than a cautiously optimistic opinion. No-one here had encountered a corrupted ARCserve backup since some problems with Adaptec 1542 cards and MSDOS with too much memory installed back in the early 1990s. It seemed more likely we would find that they were either not ARCserve at all, or else were encrypted.
When the tapes arrived it all became clear. They were ARCserve with multiplexing, which means that the backup data from several backups can be interleaved and any attempt to proceed in a linear manner without first loading and interpreting the ARCserve MUX (multiplex) tables is going to end in tears, or at least with worthless data. The next challenge was restoration within an average life-span. Restoring a single backup would be relatively straightforward once the MUX tables had been correctly interpreted, but with over a hundred backups per tape the idea of restoring one set at a time was not overly appealing, as each tape would have to be read over 100 times, effectively turning a 13 tape restore into a 1300+ tape restoration exercise. This is where the benefits of developing software for tape restoration come to the fore: we were able to modify the code to process all backups simultaneously, so each tape took less than 3 hours to read.
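The difference between reading a tape once per backup and reading it once for all backups can be illustrated with a minimal sketch. It does not reflect ARCserve's real on-tape layout or MUX tables; the `(session_id, payload)` block shape and the `demultiplex` function are assumptions made purely for illustration.

```python
from collections import defaultdict


def demultiplex(blocks):
    """Split interleaved blocks into one byte string per backup session.

    `blocks` is an iterable of (session_id, payload) pairs in tape order.
    This block shape is hypothetical; a real multiplexed tape needs its
    multiplex tables interpreted first to know which block belongs where.
    """
    sessions = defaultdict(bytearray)
    for session_id, payload in blocks:
        # Every session is appended to as its blocks go past, so the
        # tape is read exactly once regardless of how many sessions exist.
        sessions[session_id].extend(payload)
    return {sid: bytes(data) for sid, data in sessions.items()}


# Hypothetical interleaved tape contents: three sessions mixed together.
tape_blocks = [(1, b"AAA"), (2, b"bb"), (1, b"AA"), (3, b"c"), (2, b"bbb")]
sessions = demultiplex(tape_blocks)

# Tape reads for the job described above (13 tapes, about 100 backups each).
reads_one_set_at_a_time = 13 * 100
reads_single_pass = 13
```

Restoring one session per pass would instead filter `tape_blocks` for a single `session_id` and repeat the whole read for each of the others, which is where the 1300+ figure comes from.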
Once the tapes had been catalogued the required Exchange email data was located. Anyone fancy a guess at which number tape it was on? Sorry, that would have been too poetic, it was on tape 7.
3,500 backup tapes containing Commvault Galaxy backups from which selected emails are required within 30 days might seem like a tall order, until the tapes arrive and turn out to be 1400 TSM backups on LTO2, 1200 Galaxy on LTO4, 900 NetBackup on LTO1 and LTO2, along with a selection of additional DATs and AIT tapes of unknown origin (it transpired that these were AS/400 SAVLIB). The water having been muddied it now turned to sludge as the court deadline to get the data turned out to be 30 days from 18 days earlier, so there were 12 days until the deadline. One other small detail, the email system in use had changed at some point from Notes to Exchange.
Planning around formats such as NetBackup and Galaxy, where there is at least the option to position along the tape to filemarks and get backup set information without having read every block of data, is one thing; for TSM there was no option but to read every block of every tape and identify all of the files present.
Under such circumstances using the originating backup applications is not an option. For NetBackup and Galaxy, where this would be possible, the infrastructure set-up required prior to starting work would take us past the deadline. With TSM it is just not an option. To meet the deadline tapes had to be “spinning” from day 0.
This is where the benefit of having written your own “non-native” restoration software and having spent years proving it in live situations reaps rewards. Rather than needing media servers to host drives & backup servers to host backup software, we were able to process the tapes using single PC systems each with 4 tape drives attached and scale up to 60+ drives running simultaneously on a 24/7 basis, filtering the file information as we went to identify Exchange backups and Notes files and where found process the tapes in question and restore the data. The deadline was met, not easily, but a day early.
Whilst there are cases where using the originating “native” backup application is the way to go, in a case like this being able to scale up processing with the relatively simple addition of Windows PCs each with multiple tape drives and no requirement for additional servers is what made it possible.
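The filtering step described above, picking out tapes that hold Exchange or Notes data while the file information streams past, might look something like the sketch below. It assumes mail stores appear in the scanned file list under their usual extensions (`.edb` and `.stm` for Exchange, `.nsf` for Notes), which will not hold for every backup format, and the catalogue shape and the `tapes_with_mail` name are hypothetical rather than part of any real tool.

```python
from pathlib import PureWindowsPath

# Assumed mapping from file extension to mail system.
MAIL_EXTENSIONS = {".edb": "Exchange", ".stm": "Exchange", ".nsf": "Notes"}


def tapes_with_mail(catalogue):
    """Return {tape_label: set of mail systems found on that tape}.

    `catalogue` is an iterable of (tape_label, file_path) pairs, a
    hypothetical shape for the file information gathered during scanning.
    """
    hits = {}
    for tape_label, file_path in catalogue:
        # PureWindowsPath accepts both backslash and forward-slash paths.
        suffix = PureWindowsPath(file_path).suffix.lower()
        system = MAIL_EXTENSIONS.get(suffix)
        if system is not None:
            hits.setdefault(tape_label, set()).add(system)
    return hits


# Made-up sample entries for illustration only.
sample_catalogue = [
    ("T001", r"C:\Users\a\report.doc"),
    ("T002", r"D:\Exchsrvr\mdbdata\priv1.edb"),
    ("T003", "/local/notesdata/mail/jsmith.nsf"),
    ("T002", r"D:\Exchsrvr\mdbdata\priv1.stm"),
]
mail_tapes = tapes_with_mail(sample_catalogue)
```

Only the tapes that appear in `mail_tapes` would then need a full restore pass; the rest can be set aside once scanned.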
The restoration of data that was backed up from NetApp and EMC appliances, when those appliances have been retired, has long been a source of angst for IT departments. Do you have to retain appliances in case data is needed? Do you just accept that data is lost or that legacy hardware will have to be re-commissioned if a restoration from an NDMP backup is required?
Altirium’s “restore-on-demand” service now provides a solution, with NDMP support being an integral part of Altirium’s much vaunted ADR Suite tape restoration software. Whether you have NetWorker NDMP backups from a NetApp filer or NetBackup backups from an EMC Celerra, files can be restored from your tapes direct to USB disk and returned to you quickly.
Contact Mark Sear or Laura Sangster on 01296 658737 to find out how your access to your NDMP backups can be retained without the pain of maintaining legacy systems.
If you’ve ever wondered what tools a technical data recovery engineer might use on a daily basis, then here are my top five tools, although they may not be quite what you’re expecting.
If you think that data recovery is just about running a bunch of software tools on hard disks, RAIDs or backup tapes, then for a professional technical company this is far from the truth. As a data recovery expert my job at Altirium involves interrogating raw data and writing software to solve often complex problems. This is one thing that I think sets us apart from some of the other companies in the data recovery industry. Yes, we use “off the shelf” recovery software where it’s appropriate, but often such tools are found lacking: they don’t give the best results or properly report their findings. Therefore, to recover data where there are no tools available, we develop them in-house.
Some of the achievements we’ve delivered in the past 12 months using my top five tools include:
- Reverse engineering the MS SQL Server data structures and writing software to recover data from dropped tables, where off the shelf tools failed.
- Extending our Tivoli Storage Manager recovery software capabilities, adding extractors for more data sources and software compressed data.
- Identifying and implementing the processing of many undocumented structures in the Microsoft Tape Format, used in software such as Symantec BackupExec.
- Reverse engineering Atempo Time Navigator backup format including processing of software compressed data.
All of this has contributed to solving genuine data recovery issues and has saved the companies that have come to us thousands of pounds in lost revenue, many hours of support time and countless terabytes of potentially “unrecoverable” data.
Here are the top five tools that I use pretty much every day during the course of my job.
Many off-the-shelf Microsoft SQL recovery tools state that they can recover from corrupt files and deleted data, and some claim to recover from dropped tables, so the recent arrival of an MSSQL recovery into the lab (all of the tables within the database had been dropped) gave us the ideal opportunity to undertake some tests. From looking at the MDF file of the database, the data was still present, yet out of the 4 or so packages we tried “NONE” of them could retrieve any of the dropped tables. We were still able to recover the required data for our client.
I’ve often heard it said, “the RAID has been rebuilt – the data cannot be recovered” and often this is the case. With RAID5, if the configuration is changed, and new parity is calculated, then there will be a significant loss of any data that was previously stored on the RAID.
As Hamlet so eloquently put it, “There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy.” Just because something is outside of our normal experience does not mean that it is not possible.
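For readers less familiar with why recalculated parity matters, here is a minimal sketch of the XOR parity that RAID5 relies on. It assumes a single stripe of three data blocks and one parity block; the block values and the `xor_blocks` helper are made up for illustration and do not model any particular controller.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)


# One hypothetical stripe: three data blocks plus their parity block.
d0, d1, d2 = b"\x11\x22", b"\x33\x44", b"\x55\x66"
parity = xor_blocks(d0, d1, d2)

# If the disk holding d1 is lost, XORing the survivors gives d1 back.
rebuilt_d1 = xor_blocks(d0, d2, parity)
```

The reconstruction only works while the surviving blocks and the parity still belong to the same original stripe. A rebuild under a different configuration writes freshly calculated parity wherever the new layout expects it, which can include locations that previously held data.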
Tivoli Storage Manager (“TSM”) provides a sophisticated heterogeneous data storage environment within which large volumes of data can be held. These might include email backups, user documents and SQL databases; in fact, all of the information that might be just a little bit useful in a computer forensic investigation or a tape data discovery exercise.
So, you are an investigator who has been handed a case containing 25 LTO4 cartridges from a TSM archive, now what?
(With apologies to Mark Twain)
The release of LTO5 by Quantum Corporation brings 1.5TB native/3TB compressed tape to the market, and it is a sure-fire bet that IBM and HP will shortly follow with their own offerings. This means that for the past 20 years or so, a technology many said was going the way of the Dodo has managed to more than keep pace with competing technologies, and has seen quite a few off (remember how optical disk was the future of storage back in the late 1980s?).