This is my last post in a series called Preventing a Disaster: My Methodology. In Part 1 I discussed the importance of running DBCC CHECKDB on your databases and provided tips on how to do this in VLDB and very busy systems. In Part 2: Backups, I discussed the importance of a DBA knowing the RPO/RTO (Recovery Point Objective/Recovery Time Objective) of the business. It is the RPO/RTO of a company that should determine your backup policies and procedures. The 3rd installments discussed Off-site Locations and the fact that your backup strategy is not complete until you have a copy of your backups off the physical site.
In this installment I would like to discuss my last point in Preventing a Disaster and that is to ask yourself, “are my backups valid?” Earlier, in post 2, the concept of “verifying” your databases was introduced and it should be a part of your backup process. This basically verifies that what was written to the disk is equal to what is in the database.
To properly validate your backups, a DBA must perform a RESTORE of that backup to ensure that 1) it can be done, 2) that the RESTORE process works and 3) to validate the backup process.
This can be done to a development box, your desktop, a VM server that you can blow away later, it really doesn’t matter. The point is to restore the database to a SQL Server. Typically, after a restore is complete you should run DBCC CHECKDB on the database to validate the integrity of the database.
To properly test your entire backup procedures, a DBA should get a backup file or files from archive, tape or off-site: i.e from the final resting place and restore that backup. Restoring last nights backups is only half the process.
I once worked in a shop where a SQL server backed up to a network share where the 3rd party file backup solution then archived it tape. Their policy was to keep 1 month’s worth of tape. So just for kicks, I asked to get access to a 29 day old backup file to test restores. Unfortunately, the Backup Administrator did not know how to retrieve a file from tape and place it on a network share. He was competent in getting all the necessary files to write to tape; but was unsure how to retrieve data (in his defense, he was new and was never asked to do a restore from tape). The “backup procedures” as a whole was broken.
In Preventing Disasters: My Methodology, I hope I explained what DBAs should do and why it is important to not skip a step. Be aware of who else is involved in the process and work closely with them to execute and test the process.
Welcome back to part 3 of my 4 part series on Preventing a Disaster. In Part 1, I discussed the importance of Database Integrity checks and possibly ways to run them against VLDB (very large databases). The second article was an explanation on Backups in general.
In this entry, I would like to discuss the concept of “Off-Site Locations”.
It is my belief that to truly say you have a effective disaster recovery plan, your plan needs to include how to get the SQL backups off-site.
There are a couple of options that I prefer and then there are less desirable options that are still technically effective.
Several Cloud vendors proved storage containers, i.e. hard drive space. Of course MS SQL works best with MS Azure. SQL Server 2012 SP1-CU2 and SQL Server 2014 provided the ability to backup directly to an Azure storage container. This pretty much combines steps 2 and 3 into one efficient step. In April of 2014 Microsoft provided a secondary tool that allows previous versions of SQL to backup directly to Azure as well.
However, there are some downsides in my opinion. Your server obviously needs an internet connection to the outside world and you have to have purchased an Azure account with the appropriate storage size. And as that storage blob grow, so does your monthly bill.
The benefits of using Azure storage include: compression, encryption and seamless integration into SQL Server.
Option 2 – SAN Replication
One option that I have seen to be successful is what I am calling SAN Replication. If your company has a backup datacenter in a different location, then chances are you have a SAN storage array their.
In this configuration, you would use native SQL compressed backups to a local SAN, other than your data SAN of course! Then that SAN is replicated to your secondary Data Center SAN, either using SAN snapshot or true block-by-block replication.
This method can be very effective in getting your data off-side. This method may take a little longer however it usually is effective. The major downside to this is cost. The cost of running a 2nd datacenter and a 2nd comparable SAN is enormous. That is one of the reasons why cloud storage is becoming a more viable option as time goes by.
Option 3 – Personal Relocation
This is personally my favorite! (just kidding) In this method you would use native SQL backup with compression targeted to an external drive. And then at the end of the day personally take that external drive off-site.
Now, I am sure some of you are laughing at this method but with the cost of consumer hard drive rapidly decreasing this is a very viable option for some smaller companies. I actually knew a company that did this every Friday and the IT manager relocated the USB external drive to a bank vault. The company purchased 5 drives large enough to hold a weeks worth of backup files and rotated them out weekly. This method allowed them to keep 30 days worth of backups at all times off-site.
This method is probably the cheapest; however not necessarily the safest.
I am sure there are many other scenarios that are effective to getting SQL backups off-site, these are the ones that I have seen work in the real world successfully.
The important thing to remember and what to take away from this post, is to get your backups off-site. In the event of your primary datacenter crashing, you need to be able to get your data restored ASAP. And if your most recent backup is on a server/SAN in that datacenter, your recovery time has just been exponentially increased.
Your backup plan is not complete until your backups are off-site!
Last night I had the opportunity to once again speak at our local Baton Rouge SQL User Group meeting. And as usual, it was a blast.
The topic of choice was “Indexes and Execution Plans: Using them together for the better!” This was somewhat of a 101 class designed to show how you can read and use Execution Plans to build your indexing strategies.
An introduction into basic terms with minimal slides and then I was on to demos. My demos were all based on a Phone Book database that I created with a White Pages table and a Yellow Pages table. I populated the data using this nifty website Mokcaroo. One bit of new knowledge I learned during the process. The tSQL phrase of INSERT INTO ….VALUES… has a 1000 row limit. Who knew?
I believe meaningful discussion was had by all and without hesitation the “peanut gallery” were in top form! Enjoyable to say the least!
My last point of the discussion was “how to determine if the indexes were useful”. I demonstrated the code I use to list all indexes and their usefulness. My code to do this can be found in my previous blog post in the 8 Weeks of Indexes series: “Determining what you have now.”
During my presentation, I mentioned one of my favorite books on Execution Plans by Grant Fritchey (b|t) called SQL Server Execution Plans, Second Edition and ITS FREE so there is no excuse not to get it and read it!
Since my slide deck was very minimal because the focus was on the indexing demos, I really don’t see the point of posting it; but I will include the demo scripts just in case.
In my previous post I introduced My Methodology in Disaster Recovery along with Step 1: Integrity Checks.
Today, let’s talk about Step 2: Backups; but first we have to answer 2 questions. How much data can are you willing to lose and how long are you willing to be down after a disaster? In this line of business we call these Recovery Point Object and Recovery Time Object (RPO/RTO).
Recovery Point Objective
RPO is the “line in the sand” where you say, I can’t lose XXXX amount of data. Of course this line is determined by what type of data it is, how often the data is changed and is it OK to lose any of the data or can the data be rebuilt.
Example: A hospital would not want to lose any of their data. I would like to believe my medical history is safe and I do not have to keep my own copies. This type of data is crucial to keep, so basically acceptable loss of data is Zero to none! This type of RPO usually requires High Availability Disaster plans which is outside the scope of this article, but lets just say you will take many, many transactional log backups!
Recovery Time Objective
RTO is the “line in the sand” whey you say, “I can’t be down more than XXXX amount of time”. How long does it take you to get a restorable server up and running. Again, the type of data determines this as much as anything else.
Using a previous example of a hospital: electronic medical records could possibly be down a little while because there are such things a paper medical records that can be scanned/typed in later. It is not ideal situation; but hey following a disaster I would be willing nothing is working ideally!
Backing up your databases will (not may) save your job one day, it has mine! The most important thing to remember about backups is to ensure that they are done. They second most important thing to remember about backups is to ensure that they are “verified”. And finally ensure your backups are valid.
First let’s talk about methods of backing up databases. Then I will discuss my thoughts on where to put backup files.
There are two main ways to backup databases: Native and 3rd Party tools. I hope to break down the pros and cons of each.
Steps of the Backup Process
1. Perform the Backup – using what ever method your company uses; make sure the backups are being done.
2. Verify the Backup — SQL provides a way to verify the file written to the drive is verified against the live data file. This process basically ensures what was read was truly written to the disk. This doesn’t verify IF your data is valid; that is where Step 1 of Disaster Recovery Integrity Checks.
USE master GO RESTORE VERIFYONLY [MyDatabase] FROM DISK = N'C:\Backups\>MyDatabase.bak' GO
Native SQL Backups
Native backups are built into SQL Server, i.e. no further costs. SQL Server is expensive enough you shouldn’t have to spend more money to just backup your data. Native Backups (NBs) have been around since at least SQL 2000, that’s the beginning of my SQL experience. NBs functionality has grown up right along with SQL. The ability to write to network shares, ability to compress, ability to verify are all options that make NBs an excellent choice.
If your SQL Server is experiencing CPU pressure, you may want to reconsider using compression. The compression is actually done with CPU cycles before writing the data to a file; so using compression will increase your CPU usage during your backup window.
3rd Party Backup Tools
Prior to SQL Server 2008 R2, I could make an argument for using 3rd Party Tools because of the importance of saving space with backup compression. However, since 2008 R2, backup compression is an available feature in all but Express editions.
Some 3rd Party Tools, are nothing more than a windows service that manages backups. Some 3rd Party Tools are actually a windows service that will perform proprietary backups these are the ones that usually provide a variety compression levels to choose from.
If your SQL Server is experiencing CPU pressure, you may want to reconsider using compression. The compression is actually done with CPU cycles before writing the data to a file; so using compression will increase your CPU usage during your backup window.
Your choice of methods is completely up to you and your business requirements. I personally see no reason to spend more money on a product that another product already does. My method of choice is Native Backups.
Location of Backups
This is always a hotly contested arguments. Do I backup my files to a local drive, attached SAN drive, network share or cloud?
To start with, I think the majority of the SQL Community will agree that the faster you get the BAK and TRN files off of the SQL Server the better off you will be.
Using a network share as your backup folder destination is just as valid as using a local drive or an attached SAN drive or heaven forbid an attached external USB drive. Remember, the destination should not be the same drive that your MDF and/or LDF files are on.
The real point of the discussion of Location of Backups should not be where my backups should be stored; but what is the final resting place of those backups? Does my company have a plan to move the files off site to a secure location? Is that plan automatic or manual?
Once your daily backup routines are complete, there should be a process in place to archive these flat files off of the SAN and relocate them off site. I have seen everything from an external USB drive carried home to SAN replication to a completely different data center on the other side of the country. Either method is functional in getting the backups off site.
One very reasonable and possible location is cloud storage. This concept, if enough space is purchased, kills the preverbal “two birds with one stone”. It not only backups up your files off of the same drive as your live data but it also by its very nature ensures that the backups are indeed offsite.
My Personal Ideal Scenario
My personal opinion is to use native SQL Backup with Compression and write the files to a network share, preferably a different SAN/NAS than your data. This does two things: 1) compression reduces the size of the backup files going across the network and 2) physically writes the files off of your SQL server as fast as possible and allows your immediate backups to be stored on a different physical drive than your live data.
Some will argue and say “it will blow up my network”; however, with the current specs of modern NIC cards and routers, it is highly unlikely that you will cause network issues.
The most difficult part of this planning is how to get the backup files offsite. Replicated SANs can be VERY expensive, archiving to Tape is an option, cloud storage is available as an option or just copying files to external drive. The thing to remember is the amount of time it takes you to get the files to their final resting place is the amount of time it takes to recover them (RTO).
I believe as cloud storage gets cheaper and cheaper (except for you Microsoft), the feasibility of backing up directly to the cloud will be more and more popular.
Know your RPO/RTO of your company, have a plan to accomplish those goals and know how to make that happen in the cheapest, fastest, safest way possible.
I was hoping a career in consulting would possibly spark blog ideas!
One thing I am passionate about with SQL Server is Disaster Recovery. Having worked for a hospital during hurricane season as a DBA, I truly had some sleepless night wondering if I could bring a 2nd system up successfully in the event of a total disaster. Medical data is of the utmost importance when it comes to the field of health care. And I am sure, everyone has the stance that “their data is the most important data”! That is why you must protect it at all costs.
Either way, as a DBA it is my job to be able to stand up a 2nd server as soon as possible in the event of a total disaster. If it was only as easy as pushing a button, our job as DBAs would be much easier. Unfortunately though, proper disaster recovery requires fore thought, planning and testing.
I have a 4-fold plan of Disaster Recovery and I would like to discuss my thoughts on this. Part one will be discussed here with later parts in subsequent posts.
- Integrity Checks
- Off-Site duplicates
- Recovery Testing
1. Integrity Checks
Most everyone is aware of DBCC CHECKDB commands, and they are vital to the stability of your database. If you are not familiar with them, then please for the love of the SQL Gods learn about it!
It is of utmost importance to run these as often as feasible. I typically run integrity checks once a day during non-business hours or downtime. This process will be resource intensive which is why it needs to be done during downtime.
Now some of you are going to say, I tried running DBCC CHECKDB on my 350 GB database and it brought my server to its knees so I stopped doing them. All I can say is I hope your data pages are not corrupt.
Per MSDN DBCC CHECKDB does the following:
Runs DBCC CHECKALLOC on the database.
Runs DBCC CHECKTABLE on every table and view in the database.
Runs DBCC CHECKCATALOG on the database.
Validates the contents of every indexed view in the database.
Validates link-level consistency between table metadata and file system directories and files when storing varbinary(max) data in the file system using FILESTREAM.
Validates the Service Broker data in the database.
Now that’s a whole lot of checking! If CHECKDB command does all these commands, then possibly we can shorten the duration by manually executing CHECKALLOC one night, then CHECKTABLE another, and maybe CHECKCATALOG a 3rd night. It is a thought.
There are many options to integrity checks that can shorten the execution time. Many blogs posts by Paul Randal, Aaron Bertrand (to name a few) have written many articles about DBCC CHECKDB and how to effectively use the different options and actually the different commands of DBCC to shorten the duration of the integrity check.
The one thing to remember here is if you databases is corrupt, so are your backups! SQL Backups are only copies of what is in the databases, if the data pages are corrupt so is your backup! This is why it is impetrative to regularly perform integrity checks on your databases.
One of my favorite methods because I can script it out right after Step #4 of My Methodology is off-load integrity check. If your databases are too large or too busy to do checks in production, after you have test restored your backups (hopefully you are doing this), that is an ideal time to run DBCC CHECKDB. The restored database is ideally on another server that will not have any impact on users if you run the DBCC CHECKDB commands.
In my next post, I will discuss my thoughts on SQL Backups: discussing native vs. 3rd Party and local vs. network backups.
No, I am not talking about Brittany Spears; one blog post could not possibly define or explain that child!
I am talking about SQL User Groups! I opened my mouth for a brief second and sure enough I was shocked and surprised by what came out of it.
“I will present at the November meeting.”
What the heck did I just say? What did I just volunteer my self to do? I really need to learn to take my wife’s advice. “Just say NO” LOL
Any way, on to picking a topic to present. One of the most intriguing things for me as a SQL DBA is performance tuning. The ability to push the limits and make queries go faster! I get excited when I figure out how to shave milliseconds of off of a query.
For me the best way to approach performance is to try and develop your queries and indexes correctly from the start. My presentation, “Indexes and Execution Plans”, will begin with a brief overview of indexes and execution plans; how to read them and to use them to your advantage in development. I will then give code examples on how to use execution plans to determine the best need (if any) for indexes.
My presentation and code will be available after the user group meeting.
On Monday, September 14th, I began a new chapter in my career as a SQL DBA Consultant for a local IT Service/Consultant firm here in Baton Rouge, LA.
After almost 10 years of being a “jack-of-all-trades/master-of-none”, I have decided to concentrate my efforts on DBA work and do my best to help others. Over the last 10 years, I have been a SharePoint Admin, .NET web developer, .NET desktop developer, a BI Analysts, a SQL Developer and a SQL DBA. It was time to dedicate my career.
The company I am working for now believes in the concept of “transferring knowledge”; not just fixing their problems. I like that approach and I like the idea of something different everyday! Hopefully, this transferring of knowledge translates to more blog posts!
The concept of “billable time” is definitely something to get used to; but aren’t there new things about every new job?
This is post #5 of my 8 Weeks of Indexes; these are not posted weekly; but I am getting faster at posting them. (Side note: I am making it a personal effort to use time at work [hey it’s learning right] to write my blogs)
When you are given access to a database and are told to manage it; what is the best way to figure out what indexes are available, are they being used, and how big are they?
Let us figure this out step by step.
What Indexes do I have?
Any time I have to determine what indexes a database has, I always use the following script. It is quick, easy and returns most of the information you need to identify an index and it’s configuration.
The below picture is a sample of the results for the above query.
Next, I usually want to know IF the indexes are actually being used. Believe it or not; there are sometimes indexes placed on databases that never get used. And that is just a waste of I/O and disk space.
This query provides the information you need to determine how often the indexes are used and if the SQL Engine is using SEEK, SCAN or LOOKUP to find data (this will be discussed in my next article). If the index is not listed with these results; it has not been used since last index rebuild or SQL Engine restart.
And that is how I determine what indexes a server has, how they are being used and how big they are.
Next week (hopefully), I will go into how to determine if the indexes are effective!
Late last night, I got the call that no DBA wants to get.
“Our Data Center has crashed”
It appears, both controllers in the VM Ware SAN crashed at the same time bringing every VM server down. Once the controllers were replaced, servers were brought back online. But that’s not the real point of this article, it is just the background of what happened.
Typically when I install a SQL server, I use a domain Security Group as the Sys Admin. Members of this group include myself, my monitoring service account and domain admins. Because I am a member of this group, I typically remove my individual login. This has not been an issue until last night; well really this morning.
SQL Agent Jobs
I had SQL Agent jobs failing all over the network once my servers were back up and running from the earlier crash and I couldn’t figure out why.
The job failed. The owner () of job SystemDB.Backup-Full does not have server access.
The owner()…does not have server access. Why did it not list the owner? Is there an owner?
I understand the importance of having non-domain accounts “owning” SQL Agent jobs; now more than ever. As much as I try to replace myself as job owner when I edit a job; it just sometimes slips my mind.
Every job owned by my network account was failing across the company. When researching I tried to get a listing of all jobs and their owners using the script below; but not all the jobs were listed. Why weren’t all jobs being listed?
Digging deeper, I remembered that syslogins is actually a view of sys.server_principles. Reading the entries there, my network login was not a principle which made sense because I removed my network login principle. But my login was a job owner; apparently SQL will use the current logged in user as the new owner ever if it is not a principle. I tested this by removing my login from a test server, creating a new job and there she blows!
wanted needed a quick way to change job ownership of all SQL Agent Jobs. So I developed the following script.
This script will be added to my “toolbox” for sure. I may set this up as a SQL Agent job itself randomly checking, or maybe a SSIS package using @Variables for Server Names, but for now, I will manually run this on severs that need it.
This is the 4th post in my series of 8 Weeks of Indexes
In this installment, I hope to describe what makes up an index, how it is created and some of the more common options.
The Following table will be used to demonstrate my examples.
If you look at the structure of the table above, I already have a Unique Clustered Index built in because of the CONSTRAINT/PRIMARY KEY clause. So my table physically will be stored in numerical order on the ID column. This type of structure and index are useful for pulling individual records, if you know the ID of the row.
My first example will be of a non-clustered index. You can have up to 249 non-clustered indexes per table in SQL 2005 and SQL 2008 increased that limit to 999.
This is a simple index on the Patients table so you can query by Last Name. This tells me very plainly that I want to create an index called “IX_LastName” on the Patients table using the column “LastName” in Ascending alphabetical order and store that index on the [PRIMARY] file group. There are different options (MSDN) with creating a non clustered index. The more common options are: (all quotes are from MSDN)
- INCLUDE – “specifies the non-key columns to be added to the leaf level of the nonclustered indexes”. Usually these are other columns in your SELECT statements
- WHERE – the ability to filter the index by a certain criteria to limit the number of rows returned
- WITH – this option has many different settings that can be applied to the index: these include the more common options PAD_INDEX, FILLFACTOR, SORT_IN_TEMPDB, DROP_EXISTING and ONLINE (see MSDN article for complete list)
The last command ON is a very cool feature. This particular feature tells SQL where you want to physically store the index based on filegroups. On very busy systems, it may be advantageous to store these on a separate drive in a separate file group.
The above code uses 3 common options and 4 common parameters for the WITH options.
INCLUDE – adds the two columns to the leaf level so the query can easily return those results with index
WHERE – this will effectively return only the rows that are included in this filter. Only patients in Baton Rouge will be included in the index.
WITH – The following index options applied are as follows: FILLFACTOR will force SQL to only use 80% of space of the index page. It is recommended to use PAD_INDEX=ON with FILLFACTOR because this will allow SQL to force let’s say 81% of the index page if 80% is not enough to store one row. The ONLINE=ON is only available with Enterprise and Developer edition but this reduces the locks on the table to allow the data to be available. And the last option is SORT_IN_TEMPDB=ON. This does exactly what it say, it will copy the index to the tempdb, sort it there than write it back to its original location. This feature is VERY helpful if your tempdb is on a separate drive because SQL can READ/WRITE to 2 separate drives at the same time but not the same drive.
Well that is enough for today. My mind can only take so much. Next week I will be out of the office and away from computers so there is no chance to get a post out (not like I have been doing them every week anyway).
But I will endeavor to have Post #5 of my 8 Weeks of Indexes out the week of Aug 10th.