To Be Sure, To Be Sure
A tape archive can grow with impressive yet daunting speed, and with it the volume of data it holds. With the pressure to respond to internal restore requests or regulatory demands for data, knowing that files can be restored from the archive is essential. Since a single copy of the data at one location is vulnerable to tape failure, fire or flood, having the tapes tested and a second copy made can give great peace of mind.
In today's world, where the focus of data storage is very much on reducing the amount of data you store and need to back up in the day-to-day backup or archival regime, it may appear strange to some people that you would want to duplicate your set of backup tapes.
The reason is quite simply the one for which you back data up in the first place: to ensure that, if there is a requirement to restore the data, there is more than one copy to recover or restore from.
The requirement in this instance was not only to ensure that a duplicate copy of the data was produced but also to provide a full electronic file listing of the files that were stored on the tapes.
The Duplication - Copying 200 near full LTO3 backup tapes
Altirium received an enquiry regarding the duplication and processing of approximately 200 LTO3 data cartridges containing Symantec BackupExec backup sets. The initial requirement was simply to produce a duplicate set of tapes. After consultation with the client a further requirement to produce a file level listing of the data stored on the tapes was identified to ensure that the data they believed to be on the tapes was actually within the backups.
The client had also scheduled additional processing of the data on the tapes with a third party, so there was added urgency to get the tapes duplicated before this work took place.
The LTO3 tapes were filled close to capacity, so each would take approximately 2 to 3 hours to process. For 200 tapes this amounts to between 400 and 600 hours of running time. Even running 24 hours a day, reading the tapes one at a time could have taken up to 25 days, nearly a month, which was outside the required timescale.
Because each tape could be treated individually, processing one tape did not depend on processing any other. This meant that the processing time could be significantly reduced by utilising more hardware and scaling up the data processing power, allowing multiple tapes to be processed at the same time.
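The effect of adding hardware on the elapsed time can be sketched with a few lines of Python. This is only an illustration of the arithmetic: the figures of 200 tapes and 2 to 3 hours per tape come from the job described here, while the drive counts and the function name estimate_days are hypothetical and do not describe the hardware actually used.

```python
import math

def estimate_days(tapes, hours_per_tape, drives, hours_per_day=24):
    """Elapsed days to read `tapes` tapes on `drives` drives working in parallel."""
    # Tapes are independent, so each drive works through its own share in turn.
    tapes_per_drive = math.ceil(tapes / drives)
    return tapes_per_drive * hours_per_tape / hours_per_day

# Hypothetical drive counts, chosen only to show how the estimate scales.
for drives in (1, 4, 8):
    low = estimate_days(200, 2, drives)
    high = estimate_days(200, 3, drives)
    print(f"{drives} drive(s): {low:.1f} to {high:.1f} days")
```

The estimate covers a single read of each tape only; any further pass over the tapes adds to the total.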
An additional pass of the tapes was also required to ensure that the duplicate set of tapes was completely readable and that it matched the original set.
The Processing - Listing the contents of 80 terabytes of BackupExec data
Many mainstream backup software packages are very good at producing printed or visual reports on the status of your backup and restore jobs, or visually representing the file storage structure of the data within your backups in a human friendly tree view display. But if you've ever tried to produce a file listing to a structured text file, depending on the software used, this may have been a problem.
Altirium develop software in-house for the purpose of both data migration and data recovery, and have an extensive working knowledge of the backup format used by BackupExec, which is also used in the backup software incorporated within Windows NT, Windows 2000 and Windows XP. This meant that we could provide a list of the files that were actually backed up on the tapes rather than relying on the originating backup software's own methods or on-tape catalogues. The production of the file listing could also be integrated with other data validation and processing tasks.
To produce a duplicate set of tapes and to ensure that the data on the duplicate copy was an exact match to the original, Altirium used its own tape duplication and processing software. During the duplication process, MD5 hash values were calculated both for individual sections of the data and for the entire data contents of each tape, along with data block counts for each tape file.
Post-process validation of the duplicate tape data was performed to ensure that the duplicate set of tapes was readable and that the MD5 checksums and block counts matched those of the original tapes.
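The idea behind this kind of check can be shown with a minimal Python sketch. It is not Altirium's software: the function names fingerprint_tape and tapes_match are hypothetical, and a tape is modelled simply as a list of tape files, each a list of byte blocks held in memory, standing in for data read from a real drive.

```python
import hashlib

def fingerprint_tape(tape_files):
    """Return (whole-tape MD5, [(tape-file MD5, block count), ...]).

    `tape_files` is an iterable of tape files, each an iterable of
    bytes blocks (an in-memory stand-in for blocks read from a drive).
    """
    whole = hashlib.md5()
    sections = []
    for blocks in tape_files:
        section = hashlib.md5()
        count = 0
        for block in blocks:
            section.update(block)  # hash for this tape file only
            whole.update(block)    # running hash for the entire tape
            count += 1
        sections.append((section.hexdigest(), count))
    return whole.hexdigest(), sections

def tapes_match(original, duplicate):
    """True when all hashes and block counts agree between the two tapes."""
    return fingerprint_tape(original) == fingerprint_tape(duplicate)

# Made-up sample data: two tape files per tape.
original = [[b"header"], [b"block-1", b"block-2"]]
good_copy = [[b"header"], [b"block-1", b"block-2"]]
bad_copy = [[b"header"], [b"block-1", b"block-X"]]

print(tapes_match(original, good_copy))
print(tapes_match(original, bad_copy))
```

Keeping the block counts alongside the hashes means a copy whose bytes are identical but whose block boundaries differ is still flagged, because the hash alone would not reveal that difference.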
Altirium's data processing software was customised to produce file listings in a format required by the client. Each tape was processed to produce a complete set of file listings in a text file format and any anomalies within the backups, such as corrupt or incomplete files, were reported.
We were able to perform this task on time and within the client's budget due to our ability to understand, design, develop and deliver the right solution for the job.
Last Updated (Thursday, 18 June 2009 15:35)