Rolling in the Deep

I was at SQL Saturday in Boston this past weekend and I sat in on a session given by Paresh Motiwala: “Why do we shun using tools for DBA job?”

It’s a decent question and sometimes rolling your own (hence the title here and yes I’m listening to Adele right now) is the right answer.

But often, it’s not.

A case in point are DBAs who avoid using the built-in SQL Server maintenance plans, especially for simple tasks such as Backups.

Now, I’ll start out by saying straight up, sometimes they’re not the optimal solution. (You may want to backup certain partitions on a different a rotation schedule and the like as one example.)

Below is a recent situation I came across.

But, if you do decide to roll your own, please do NOT do the following:

  1. First write a sproc that has to be placed into each database as it’s created in order to run the backup. (If you DO decide to go this route, please make this a scripted part of your release procedure so it’s not missed.)
  2. If you DO in fact ignore suggestion #1, I will point out that it doesn’t do much good to make the first step of your sproc to check to see if the database exists. BIG HINT: If the database doesn’t exist, the sproc won’t run in the first place! Yes, I know you’re saying, “Obviously” but this is the situation I just came across.
  3. If you DO in fact ignore suggestion #2, you probably don’t need to call your own special “alert sproc”.
  4. But if you DO in fact ignore suggestion #3, make sure your “alert sproc” does more than call a “email sproc” and pass it a few parameters.
  5. And if you DO in fact ignore suggestion #4, please make sure your “email sproc” does a bit more than build a message and call the built-in SQL Server stored proc to send an email.

Now granted, there may be reasons to do some of the above. Perhaps your alert sproc also calls some sort of 3rd party monitoring tool. Or your email needs are very specific.

In the case I just rectified, none of that was true.  So there was a lot of additional complexity (which really didn’t work anyway) for no good reason.

Another problem this roll your own backup setup had was that it used the same filename every time for its backups. i.e. On Monday the backup name was M:\Backups\FOO_FULL.BAK.  On Tuesday it was M:\Backups\FOO_FULL.BAK, etc.

In theory (and generally in practice) each of these would be backed-up to a 3rd party so it was in theory possible to find the backup for a specific day, but that was an added complexity; and probably not one you want in an actual DR situation.  Also, if for some reason the backup to the the 3rd party failed (but the local backups continued) they’d definitely lose the ability to restore specific days of backups.

In addition, the person who built this procedure setup differential backups for MOST databases to run every 15 minutes.  Now, I wouldn’t necessarily call that a terrible idea, but in this case, almost certainly not the best approach in my opinion.  However, again, the same file name was used each time.

This means that in a DR event, the company could restore the previous nights backup and if they wanted, the most recent Diff backup and that was it.  If they wanted to restore to a point in time in between, that was impossible. And in my experience this is far more common than most other restore needs.

Finally, the developer who wrote all this clearly did not understand what the CHECKPOINT command did. He had scheduled a checkpoint job to run every 30 minutes.  Again, in general, not only not necessary, but probably a bad idea. However in this case it not only was not necessary, the reason given in the job comments was completely wrong.  He seemed to think it would keep the transaction logs from growing.

This is of course NOT what it does and sure enough on the one database still with FULL RECOVERY enabled the transaction log was far larger than the actual database. (Fortunately it was a lightly used database or the disk might have filled up years ago.)

Since discovering all this, I’ve gone and replaced all this complexity with a set of maintenance jobs.  These will guarantee each system database is backed up weekly (for their needs this should be fine) with unique names.  User databases will be backed up nightly and retained for 4 nights (and perhaps extended once we determine fully how much disk space we want to set aside for this.)  Transaction logs will be performed every 15 minutes. These too will have unique names.

Now the customer can restore to any point in time in the last 4 days (if they go to their 3rd party backup, even further back) up to 15 minutes before a failure (and in some cases if the log is still available and they can backup the tail of the log, up to the instant before the failure).

If they add additional databases, they don’t have to worry about remembering to put in 3 separate sprocs for each database added and adding new jobs to the SQL Server Agent.

Now they not only have a far more robust backup plan, they have one that is far easier to maintain.  Oh and one that will actually send an email if there’s a problem with the backup.

The morale is: Don’t make things more complex unless you absolutely need to and if you do, make sure you actually achieve the goals you’re trying to achieve.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s