Preventing a Disaster: My Methodology Part 4


This is my last post in a series called Preventing a Disaster: My Methodology. In Part 1 I discussed the importance of running DBCC CHECKDB on your databases and provided tips on how to do this in VLDB and very busy systems.  In Part 2: Backups, I discussed the importance of a DBA knowing the RPO/RTO (Recovery Point Objective/Recovery Time Objective) of the business.  It is the RPO/RTO of a company that should determine your backup policies and procedures.  The 3rd installments discussed Off-site Locations and the fact that your backup strategy is not complete until you have a copy of your backups off the physical site.

In this installment I would like to discuss my last point in Preventing a Disaster and that is to ask yourself, “are my backups valid?”  Earlier, in post 2, the concept of “verifying” your databases was introduced and it should be a part of your backup process.  This basically verifies that what was written to the disk is equal to what is in the database.

To properly validate your backups, a DBA must perform a RESTORE of that backup to ensure that 1) it can be done, 2) that the RESTORE process works and 3) to validate the backup process.

This can be done to a development box, your desktop, a VM server that you can blow away later, it really doesn’t matter.  The point is to restore the database to a SQL Server. Typically, after a restore is complete you should run DBCC CHECKDB on the database to validate the integrity of the database.

To properly test your entire backup procedures, a DBA should get a backup file or files from archive, tape or off-site: i.e from the final resting place and restore that backup.  Restoring last nights backups is only half the process.

I once worked in a shop where a SQL server backed up to a network share where the 3rd party file backup solution then archived it tape.  Their policy was to keep 1 month’s worth of tape.  So just for kicks, I asked to get access to a 29 day old backup file to test restores.  Unfortunately, the Backup Administrator did not know how to retrieve a file from tape and place it on a network share. He was competent in getting all the necessary files to write to tape; but was unsure how to retrieve data (in his defense, he was new and was never asked to do a restore from tape).  The “backup procedures” as a whole was broken.

In Summary

In Preventing Disasters: My Methodology, I hope I explained what DBAs should do and why it is important to not skip a step. Be aware of who else is involved in the process and work closely with them to execute and test the process.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s