Erik Darling – Brent Ozar Unlimited®

The case for Query Store in tempdb


Query Store is so cool

Billed as a flight data recorder for SQL Server, the Query Store is a repository of execution plan information, like the plan cache, except a bit more actionable. And it has a GUI.

"Try looking into that place where you dare not look! You'll find me there, staring out at you!"

“Try looking into that place where you dare not look! You’ll find me there, staring out at you!”

You can read all about what it does and what you can do with it around the internet. You can be suitably impressed and enchanted by the promise of data that’s persisted between restarts, being able to quickly and easily address plan regression issues, and so forth.
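Here’s a minimal sketch of turning it on for a user database and peeking at what it collects. It assumes you have a copy of the Stack Overflow database restored; the catalog views are the real ones, the query is just a starting point.

-- Works just fine on a user database
ALTER DATABASE [StackOverflow] SET QUERY_STORE = ON;

-- Peek at what it has collected so far
SELECT TOP ( 10 )
        [qt].[query_sql_text] ,
        [rs].[avg_duration]
FROM    [sys].[query_store_query_text] AS [qt]
JOIN    [sys].[query_store_query] AS [q]
ON      [qt].[query_text_id] = [q].[query_text_id]
JOIN    [sys].[query_store_plan] AS [p]
ON      [q].[query_id] = [p].[query_id]
JOIN    [sys].[query_store_runtime_stats] AS [rs]
ON      [p].[plan_id] = [rs].[plan_id]
ORDER BY [rs].[avg_duration] DESC;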

I’m here to bum you out when you’re done with all that.

You can’t use it in tempdb

On tempdb? Whatever.

What? Really? With all the stuff that goes on there? Aside from the many selfish demo reasons I have for wanting it, I know lots of people who make tempdb their default database for user logins so they don’t go messing things up in the real world, or more fragile system databases. I’m also aware of several vendor apps that dump users into tempdb and then use fully qualified names to point them to application data.

I found this all out by accident. I had just nabbed CTP 3 of SQL Server 2016, and was settling in for an evening of staring contests with execution plans and new features.

That’s when tragedy struck: I got an error message:

ALTER DATABASE tempdb SET QUERY_STORE = ON; 

Query Data Store cannot currently be enabled on tempdb

Msg 12438, Level 16, State 1, Line 1
Cannot perform action because Query Store cannot be enabled on system database tempdb.
Msg 5069, Level 16, State 1, Line 1
ALTER DATABASE statement failed.

And yeah, it’s an official no-go


Oh, but model and msdb are fair game?

I’m all disillusioned and stuff, man.

This is going to be frustrating out in the wild for a lot of people. Not just fancy-pants camo shorts consultants who like to make demos where they treat tempdb like it owes them money. Real people. Who pay real money, for real servers.

Anyway, I started a Connect item to get this changed. Vote for it. Or don’t. Just don’t vote more than once, or if you’re dead.


Trace Flag 2330: Who needs missing index requests?


Hey, remember 2005?

What a great year for… not SQL Server. Mirroring was still a Service Pack away, and there was an issue with spinlock contention on OPT_IDX_STATS or SPL_OPT_IDX_STATS. The KB for it is over here, and it’s pretty explicit that the issue was fixed in 2008, and didn’t carry over to any later versions. For people still on 2005, you had a Trace Flag: 2330.

Like most things, there are trade-offs. When you enable it, SQL stops collecting missing index requests. Probably not a great idea unless you’re working with extreme edge cases where you’ve already tuned your indexes really well. Most people will never fall into that category, though many of them will think they have.

The issue that it quite commonly addressed was around the creation of many objects in tempdb. You probably don’t need missing index details on a bunch of temp tables. What a lot of people didn’t realize was that it also made SQL stop collecting them for every other database.

Here in 2015

That Trace Flag can still be enabled with the same effect. Some people may not be aware that it’s not fixing anything, and still hurting things. Below is a script to reproduce the lousiness in a user database.

First we’ll clear out missing index requests by rebuilding the index (you know this happens, right?), and verify that there are no missing index requests with sp_BlitzIndex®. Running it in @mode = 3 will generate only missing index request details.

USE [StackOverflow]

ALTER INDEX [PK_Users] ON [dbo].[Users] REBUILD;

EXEC [master].[dbo].[sp_BlitzIndex] 
    @DatabaseName = N'StackOverflow' , -- nvarchar(128)
    @Mode = 3

Once we verify that’s empty, we’ll run a query that will generate a missing index request in the Stack Overflow database, and verify it registers with sp_BlitzIndex® again.

DECLARE @id INT

SELECT @id =  id 
FROM [dbo].[Users] AS [u]
WHERE [u].[LastAccessDate] > 2015-01-01
ORDER BY [u].[LastAccessDate]

EXEC [master].[dbo].[sp_BlitzIndex] 
    @DatabaseName = N'StackOverflow' , -- nvarchar(128)
    @Mode = 3

Whaddya know? We got ourselves a missing index.


You look helpful! What’s your name?

Now comes the ugly

We’ll enable TF 2330 globally, and use some GO magic to run the same query 5x consecutively, then check back in on sp_BlitzIndex®.

DBCC TRACEON(2330, -1)

GO 

DECLARE @id INT

SELECT @id =  id 
FROM [dbo].[Users] AS [u]
WHERE [u].[LastAccessDate] > 2015-01-01
ORDER BY [u].[LastAccessDate]
GO 5

EXEC [master].[dbo].[sp_BlitzIndex] 
    @DatabaseName = N'StackOverflow' , -- nvarchar(128)
    @Mode = 3

The compiles ticked up one, but SQL stopped counting requests for the index.

Bummer.


There is a light

But hey! We can turn that off. And… Well, you know the drill.

DBCC TRACEOFF(2330, -1)
GO 

DECLARE @id INT

SELECT @id =  id 
FROM [dbo].[Users] AS [u]
WHERE [u].[LastAccessDate] > 2015-01-01
ORDER BY [u].[LastAccessDate]
GO 5

EXEC [master].[dbo].[sp_BlitzIndex] 
    @DatabaseName = N'StackOverflow' , -- nvarchar(128)
    @Mode = 3

We have stopped breaking SQL!


1 + 5 = 6, FYI

What did we learn?

2005! What a lousy year for hair, and Trace Flags, and, uh, I don’t remember much else about 2005. Feel free to tell me what you hated about 2005 in the comments. Just check all your servers to make sure no one turned on this Trace Flag as an April Fool’s Day joke and then never came back from lunch, first.
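Checking takes about ten seconds:

-- Lists globally enabled trace flags; 2330 showing up here is your culprit
DBCC TRACESTATUS(-1);

-- Or ask about that flag specifically
DBCC TRACESTATUS(2330, -1);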

Thanks for reading!

Tracking tempdb growth using Extended Events


Everyone knows tempdb is a weird place

Brent refers to it as a public toilet. I agree with that sentiment. It will let anyone in to do whatever they want.

Recently I was trying to track down what was causing tempdb’s log to grow. I was thinking about using a Trace to do it, but then I remembered that it’s at least 2015, and I should be using Extended Events. Don’t worry, there won’t be any PowerShell. You can keep reading.

I use the below command to fire up my session. It worked on my 2012 and 2014 instances. Anything older or newer than that, and YMMV. You will likely have to change the output directory to whatever exists on your server. Most people have c:\temp, though.

CREATE EVENT SESSION [PublicToilet] ON SERVER
ADD EVENT [sqlserver].[database_file_size_change] (
    ACTION ( [sqlserver].[session_id], [sqlserver].[database_id],
    [sqlserver].[client_hostname], [sqlserver].[sql_text] )
    WHERE ( [database_id] = ( 2 )
            AND [session_id] > ( 50 ) ) ),
ADD EVENT [sqlserver].[databases_log_file_used_size_changed] (
    ACTION ( [sqlserver].[session_id], [sqlserver].[database_id],
    [sqlserver].[client_hostname], [sqlserver].[sql_text] )
    WHERE ( [database_id] = ( 2 )
            AND [session_id] > ( 50 ) ) )
ADD TARGET [package0].[asynchronous_file_target] (  SET filename = N'c:\temp\publictoilet.xel' ,
                                                    metadatafile = N'c:\temp\publictoilet.xem' ,
                                                    max_file_size = ( 10 ) ,
                                                    max_rollover_files = 10 )
WITH (  MAX_MEMORY = 4096 KB ,
        EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS ,
        MAX_DISPATCH_LATENCY = 1 SECONDS ,
        MAX_EVENT_SIZE = 0 KB ,
        MEMORY_PARTITION_MODE = NONE ,
        TRACK_CAUSALITY = ON ,
        STARTUP_STATE = ON );
GO

ALTER EVENT SESSION [PublicToilet] ON SERVER STATE = START;

With that fired up, let’s kick tempdb around a little. Don’t do this in production, it’s going to suck. We’re going to shrink files and run an insert and update in a loop. You can increase the number of loops if you want, but 3 is good enough to illustrate how this works.

USE [tempdb]
SET NOCOUNT ON

IF (OBJECT_ID('tempdb..#Users')) IS NOT NULL
DROP TABLE [dbo].[#Users]

DBCC SHRINKFILE('tempdev', 1)
DBCC SHRINKFILE('tempdev2', 1)
DBCC SHRINKFILE('tempdev3', 1)
DBCC SHRINKFILE('tempdev4', 1)

DBCC SHRINKFILE('templog', 1)

CREATE TABLE [#Users](
  [Id] [int] NULL,
  [Age] [int] NULL,
  [DisplayName] [nvarchar](40) NULL,
  [AboutMe] [nvarchar](max) NULL
  )

DECLARE @i INT = 0;
DECLARE @rc VARCHAR(20) = 0

WHILE @i < 3
      BEGIN

            INSERT  [#Users]
                    ( [Id] ,
                      [Age] ,
                      [DisplayName] ,
                      [AboutMe] )
            SELECT TOP ( 75000 )
                    [Id] ,
                    [Age] ,
                    [DisplayName] ,
                    [AboutMe]
            FROM    [StackOverflow].[dbo].[Users];

            UPDATE  [u1]
            SET     [u1].[DisplayName] = SUBSTRING(CAST(NEWID() AS NVARCHAR(MAX)), 0, 40) ,
                    [u1].[Age] = ( [u2].[Age] * 2 ) ,
                    [u1].[AboutMe] = REPLACE([u2].[AboutMe], 'a',
                                             CAST(NEWID() AS NVARCHAR(MAX)))
            FROM    [#Users] [u1]
            CROSS JOIN    [#Users] [u2];

      SET @rc = @@ROWCOUNT
      RAISERROR ('Current # rows %s', 0, 1, @rc) WITH NOWAIT

            SET @i += 1;			
      RAISERROR ('Current # runs %i', 0, 1, @i) WITH NOWAIT
      END;

Now you can either stop the XE session, or keep it running. I don’t care. It’s your ~~~DEVELOPMENT SERVER~~~ and definitely ~~~NOT PRODUCTION~~~, right?

Right.
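When you’re done watching, stopping the session is the same ALTER, pointed the other way:

ALTER EVENT SESSION [PublicToilet] ON SERVER STATE = STOP;

-- Optional: remove the session definition entirely
-- DROP EVENT SESSION [PublicToilet] ON SERVER;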

Let’s see what kind of germs are on the toilet seat

SELECT  [evts].[event_data].[value]('(event/action[@name="session_id"]/value)[1]',
                                    'INT') AS [SessionID] ,
        [evts].[event_data].[value]('(event/action[@name="client_hostname"]/value)[1]',
                                    'VARCHAR(MAX)') AS [ClientHostName] ,
        COALESCE([evts].[event_data].[value]('(event/action[@name="database_name"]/value)[1]',
                                           'VARCHAR(MAX)'), ';(._.); I AM KING KRAB') AS [OriginatingDB] ,
        DB_NAME([evts].[event_data].[value]('(event/data[@name="database_id"]/value)[1]',
                                            'int')) AS [GrowthDB] ,
        [evts].[event_data].[value]('(event/data[@name="file_name"]/value)[1]',
                                    'VARCHAR(MAX)') AS [GrowthFile] ,
        [evts].[event_data].[value]('(event/data[@name="file_type"]/text)[1]',
                                    'VARCHAR(MAX)') AS [DBFileType] ,
        [evts].[event_data].[value]('(event/@name)[1]', 'VARCHAR(MAX)') AS [EventName] ,
        [evts].[event_data].[value]('(event/data[@name="size_change_kb"]/value)[1]',
                                    'BIGINT') AS [SizeChangeInKb] ,
        [evts].[event_data].[value]('(event/data[@name="total_size_kb"]/value)[1]',
                                    'BIGINT') AS [TotalFileSizeInKb] ,
        [evts].[event_data].[value]('(event/data[@name="duration"]/value)[1]',
                                    'BIGINT') AS [DurationInMS] ,
        [evts].[event_data].[value]('(event/@timestamp)[1]', 'VARCHAR(MAX)') AS [GrowthTime] ,
        [evts].[event_data].[value]('(event/action[@name="sql_text"]/value)[1]',
                                    'VARCHAR(MAX)') AS [QueryText]
FROM    ( SELECT    CAST([event_data] AS XML) AS [TargetData]
          FROM      [sys].[fn_xe_file_target_read_file]('c:\temp\publictoilet*.xel',
                                                        NULL, NULL, NULL) ) AS [evts] ( [event_data] )
WHERE   [evts].[event_data].[value]('(event/@name)[1]', 'VARCHAR(MAX)') = 'database_file_size_change'
        OR [evts].[event_data].[value]('(event/@name)[1]', 'VARCHAR(MAX)') = 'databases_log_file_used_size_changed'
ORDER BY [GrowthTime] ASC;

Look, I never said I was good at this XML stuff. If you are, feel free to tidy up all the VARCHAR(MAX) and BIGINT types into something more sensible. If you don’t like my King Krab emoji, blame Jeremiah.

But hey! The results are in. That didn’t take long.

The first file to get it is the log file. All sorts of helpful stuff got collected. If you can’t figure out what was going on from this, go home. Seriously. The query text is right there.

Meaningful data.


Yay. There’s more. The log file wasn’t the only one that blew up. All four tempdb files got used, as well. Is there an important lesson here? Probably. Even on my laptop, multiple tempdb files help. Don’t let me catch you out there with one tempdb data file. Or two tempdb log files. I’m watching you.

Happy Birthday!


In case you were wondering…

Yes, it will capture shrinking files, too. So if anyone is being horrible, you can catch them, and throw them out several windows.

For the love of Milton Berle, please stop shrinking files.


You can use this on other databases as well, just change the database ID in the XE session definition. Just, you know, use it sparingly. There’s overhead for any type of monitoring, and XE is no XcEption (GET IT?). If you have a growth-heavy environment, capturing them all and the accompanying information could be a real burden on your server.
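If you don’t feel like memorizing database IDs, a quick lookup gets you the number to swap into the session’s WHERE clause:

-- tempdb is always database_id 2; for anything else, look it up
SELECT DB_ID(N'StackOverflow') AS [database_id];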

Building a better BAND-AID®

Stuff like this is good to diagnose problems short term, but it’s not meant to be a replacement for a full time monitoring solution. A lot of people spent a lot of time making nice ones. Go try them out and find one you like.

Thanks for reading!

Filtered Indexes: Just Add Includes


I found a quirky thing recently

While playing with filtered indexes, I noticed something odd. By ‘playing with’ I mean ‘calling them horrible names’ and ‘admiring the way other platforms implemented them‘.

I sort of wrote about a similar topic in discussing indexing for windowing functions. It turns out that a recent annoyance could also be solved by putting the column my filter expression is predicated on in the included columns definition. That’s the fanciest sentence I’ve ever written, BTW. If you want more, don’t get your hopes up.

Ready for Horrible

Let’s create our initial index. As usual, we’re using the Stack Overflow database. We’ll look at a small group of users who have a Reputation over 400k. I dunno, it’s a nice number. There are like 8 of them.

CREATE UNIQUE NONCLUSTERED INDEX [Users_400k_Club] 
ON dbo.[Users] ([DisplayName], [Id]) 
WHERE [Reputation] > 400000;

With that in place, we’ll run some queries that should make excellent use of our thoughtful and considerate index.

--Will I Nill I?
SELECT [u].[Id], [u].[DisplayName]
FROM [dbo].[Users] AS [u]
WHERE [u].[Reputation] > 400000

SELECT [u].[Id], [u].[DisplayName]
FROM [dbo].[Users] AS [u]
WHERE [u].[Reputation] > 400000 AND [u].[Reputation] < 450000

SELECT [u].[Id], [u].[DisplayName]
FROM [dbo].[Users] AS [u]
WHERE [u].[Reputation] > 400001

SELECT [u].[Id], [u].[DisplayName]
FROM [dbo].[Users] AS [u]
WHERE [u].[Reputation] > 500000

If you were a betting organism, which ones would you say use our index? Money on the table, folks! Step right up!

Yes, Sorta, No, No.


That didn’t go well at all. Only the first query really used it. The second query needed a key lookup to figure out the less than filter, and the last two not only ignored it, but told me I need to create an index. The nerve!

Send me your money

Let’s make our index better:

CREATE UNIQUE NONCLUSTERED INDEX [Users_400k_Club] 
ON dbo.[Users] ([DisplayName], [Id])
INCLUDE([Reputation]) 
WHERE [Reputation] > 400000
WITH (DROP_EXISTING = ON)
;

Run those queries again. You don’t even have to recompile them.


Can’t you tell by the way I run every time you make eyes at me?

They all magically found a way to use our New and Improved index.

What was the point?

When I first started caring about indexes, and filtering them, I would get so mad when these precious little Bloody Mary recipes didn’t get used.

I followed all the rules!
There were no errors!

But why oh why didn’t SQL use my filtered indexes for even smaller subsets of the filter condition? It seemed insane to me that SQL would know the index is filtered on (x > y), but wouldn’t use it for a query filtered on (x > z), even when z is larger than y and the query’s rows are a strict subset of the index.

The solution was to put the filtered column in the include list. This lets SQL generate statistics on the column, and much like getting rid of the predicate key lookup, allows you to search within the filtered index subset for even more specific information.

Improved diagnostics for query execution plans that involve residual predicate pushdown


I love stuff like this!

Even though it’s not on my list of dream features, it’s pretty neat. Getting new views into what SQL is doing when queries execute is pretty cool. You can read the short and gory details at the KB here: Improved diagnostics for query execution plans that involve residual predicate pushdown in SQL Server 2012

The wording might seem a little odd at first. The phrase “predicate pushdown” might make your face scrunch like you just stepped in something wet when the only thing you have on your feet are your lucky socks, and “residual” might just remind you of the time you left Tupperware in your desk drawer before a long weekend.

I would imagine this is coming to SQL Server 2014 in an upcoming CU/SP, but it hasn’t yet. As of this writing, the newest CTP for 2016 is 3.1, and it is working there as well.

You’re rambling, man. Get on with it.

Let’s take a look at what this ol’ dog does, and how we can use it to troubleshoot query problems. We’ll start with a small-ish table and some indexes.

USE [tempdb]

SET NOCOUNT ON

IF OBJECT_ID('dbo.RowTest') IS NOT NULL
    DROP TABLE [dbo].[RowTest];

;WITH E1(N) AS (
    SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT NULL UNION ALL 
    SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT NULL UNION ALL 
    SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT NULL UNION ALL 
    SELECT NULL  ),                          
E2(N) AS (SELECT NULL FROM E1 a, E1 b, E1 c, E1 d, E1 e, E1 f, E1 g, E1 h, E1 i, E1 j),
Numbers AS (SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM E2)
SELECT
        IDENTITY (BIGINT, 1,1) AS [ID] , 
        ISNULL(SUBSTRING(CONVERT(VARCHAR(255), NEWID()), 0, 9) , 'AAAAAAAA') AS [PO] ,
        ISNULL(CONVERT(DATE, DATEADD(HOUR, -[N].[N], GETDATE())),     '1900-01-01') AS [OrderDate] ,
        ISNULL(CONVERT(DATE, DATEADD(HOUR, -[N].[N], GETDATE() + 1)), '1900-01-01') AS [ProcessDate] ,
        ISNULL(CONVERT(DATE, DATEADD(HOUR, -[N].[N], GETDATE() + 3)), '1900-01-01') AS [ShipDate] 
INTO [RowTest]
FROM    [Numbers] [N]
ORDER BY [N] DESC;

ALTER TABLE [dbo].[RowTest] ADD CONSTRAINT [PK_RowTest] PRIMARY KEY CLUSTERED ([ID]) WITH (FILLFACTOR = 100)

CREATE UNIQUE NONCLUSTERED INDEX [IX_RowsRead] ON dbo.RowTest ([PO], [OrderDate], [ProcessDate]) WITH (FILLFACTOR = 100)

A million rows and nothing to do

Before the addition of this element, the only way to get any information like this right in SSMS was to SET STATISTICS IO ON, but that only gave us a partial story about the pages read by our query. Let’s look at a couple examples.

SELECT [rt].[ID]
FROM [dbo].[RowTest] AS [rt]
WHERE [rt].[PO] LIKE '%0%'
AND [rt].[OrderDate] >= CAST('2015-01-01' AS DATE)
AND [rt].[ProcessDate] <= CAST('2015-12-31' AS DATE)

Running the above query with the actual execution plan turned on, the tool tip that pops up over the index scan looks like this:


We read all million rows! Bummerino duder.

This is something Brent wrote about years ago. Great post, Brent! It’s further evidenced here, by the new Number of Rows Read line. We had to scan the index, all million rows of it, to search out the double wildcard LIKE predicate.

If we alter the query slightly, we can cut down dramatically on the number of rows we’re looking at. I get that this changes the logic, but really, you need to take care when allowing people to search full strings like above.

SELECT [rt].[ID]
FROM [dbo].[RowTest] AS [rt]
WHERE [rt].[PO] LIKE '0%'
AND [rt].[OrderDate] >= CAST('2015-01-01' AS DATE)
AND [rt].[ProcessDate] <= CAST('2015-12-31' AS DATE)

 


Note that we’re no longer scanning the index, either.

Granted, getting away from double wildcard searches and towards more sane search methods is a big leap. What if we just tighten our last query’s predicates up a bit? Say that we only needed POs that start with ’00’, and we only needed results since June. We’re filtering on [ProcessDate] to make sure that the order was actually fulfilled, or something. It’s a dream!

SELECT [rt].[ID]
FROM [dbo].[RowTest] AS [rt]
WHERE [rt].[PO] LIKE '00%'
AND [rt].[OrderDate] >= CAST('2015-06-01' AS DATE)
AND [rt].[ProcessDate] <= CAST('2015-12-31' AS DATE)

That’s way less, boss.

Now we’re down to reading just a few thousand rows to find what we need.

So what?

If you’re on SQL 2012, or, for some reason on CTP 3.1 of SQL 2016, you should take a new look at troublesome queries. Perhaps you can track down similar predicate inefficiency using this new tool. You may be reading way more data than you think. Anything you can do to cut down on data movement will very likely speed things up. The queries, in order, ran for an average of 110ms, 47ms, and 3ms respectively. Small changes can make differences.

To get this to work, I had to be on the SSMS November preview. It wasn’t showing up in other versions of SSMS for me.

I’d also like to thank everyone who voted for my Connect Item. It’s nice to know that 162 of you are at least as weird as I am about tempdb.

What happens to transaction log backups during full backups?


TL;DR

  • Unless you’re on SQL 2000, don’t worry about scheduling log backups during full backups
  • Log backups during full backups won’t truncate the transaction log
  • You want to keep taking log backups in case your full backup fails

The first time I ever set up backups

Was, unfortunately, using a maintenance plan. All of the databases were in simple recovery. It was largely used for staging and tenderizing data during ETL. No log backups need apply.

Fast forward a bit, and I’m setting up backups for a server where losing 30 minutes of data could set a project back several hours. We’re now hiring log backups.


You’re hired!

So there I was, dutifully creating extravagant maintenance plans, pondering the miracle of the differential backup, and the grand eloquence of log backups. They were running every 10 minutes, those log backups.

Every 10 minutes.

Even during full backups.

I’m a developer and what is this?

  • What is SQL going to do with those?
  • Do I have to restore them?
  • Should I pause log backups during full backups?
  • Will this break something?
  • How much Laphroaig do I have to drink to forget I thought about this?

This was confounding to me. So I did some digging. Back in the SQL 2000 days, this could have gotten weird. But I was, thankfully(?) on 2005. Heh. Yeah. 2005. I know.

After 2005, you can totally, absolutely, 100% take log backups during full backups. You can take them and nothing will break and you won’t have to restore them (unless your full backup fails).

Wanna see a demo?

Of course you wanna see a demo. This one is a little more complicated than usual. It will require several-SSMS-tab technology.

ALTER DATABASE [StackOverflow] SET RECOVERY FULL

BACKUP DATABASE [StackOverflow]
TO DISK = 'D:\Backup\Crap\StackOverflow\StackOverflow.bak'
WITH CHECKSUM, COMPRESSION, STATS = 1

I usually keep my SO database in simple, because I do horrible things that I don’t want to fully log. Fun fact: if you switch from simple to full recovery model and don’t take a full backup, you’re basically still in simple recovery. You should think about that for a minute and then take steps to avoid getting fired. Backing up SO takes me about 3.5-4 minutes.
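If you want to check whether a database is stuck in that pseudo-simple limbo, here’s a rough sketch: in FULL recovery, a NULL last_log_backup_lsn means the log backup chain hasn’t started yet.

SELECT  [d].[name] ,
        [d].[recovery_model_desc] ,
        [drs].[last_log_backup_lsn]
FROM    [sys].[databases] AS [d]
JOIN    [sys].[database_recovery_status] AS [drs]
ON      [d].[database_id] = [drs].[database_id]
WHERE   [d].[name] = 'StackOverflow';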

In another window, kick off some log backups 1 minute apart. Note that doing this won’t overwrite log backups, it will stack them all within a single file.

You can verify this behavior by running the RESTORE HEADERONLY command at the end of this block. If you want to restore a particular backup out of a file with multiple backups in it, you use the position column and specify it with FILE = [n], which you can read more about here.
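For example, if RESTORE HEADERONLY says the backup you’re after is in position 3 (a made-up position, obviously), the restore looks something like this:

RESTORE LOG [StackOverflow]
FROM DISK = 'D:\Backup\Crap\StackOverflow\StackOverflow_log.trn'
WITH FILE = 3, NORECOVERY;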

BACKUP LOG [StackOverflow]
TO DISK = 'D:\Backup\Crap\StackOverflow\StackOverflow_log.trn'
WITH COMPRESSION, CHECKSUM, INIT, FORMAT

WAITFOR DELAY '00:01:00.000'

BACKUP LOG [StackOverflow]
TO DISK = 'D:\Backup\Crap\StackOverflow\StackOverflow_log.trn'
WITH COMPRESSION, CHECKSUM

WAITFOR DELAY '00:01:00.000'

BACKUP LOG [StackOverflow]
TO DISK = 'D:\Backup\Crap\StackOverflow\StackOverflow_log.trn'
WITH COMPRESSION, CHECKSUM

WAITFOR DELAY '00:01:00.000'

BACKUP LOG [StackOverflow]
TO DISK = 'D:\Backup\Crap\StackOverflow\StackOverflow_log.trn'
WITH COMPRESSION, CHECKSUM

WAITFOR DELAY '00:01:00.000'

BACKUP LOG [StackOverflow]
TO DISK = 'D:\Backup\Crap\StackOverflow\StackOverflow_log.trn'
WITH COMPRESSION, CHECKSUM

WAITFOR DELAY '00:01:00.000'

BACKUP LOG [StackOverflow]
TO DISK = 'D:\Backup\Crap\StackOverflow\StackOverflow_log.trn'
WITH COMPRESSION, CHECKSUM

RESTORE HEADERONLY FROM DISK = 'D:\Backup\Crap\StackOverflow\StackOverflow_log.trn'

Just so you know I’m not pulling any shenanigans, let’s generate some log activity. This will dump 100 rows into a dummy table every 30 seconds. It is neither pretty nor elegant.

USE [StackOverflow]

IF OBJECT_ID('dbo.YouBigDummy') IS NOT NULL
DROP TABLE [dbo].[YouBigDummy]

CREATE TABLE [dbo].[YouBigDummy]
(
ID INT IDENTITY(1,1),
[Name] VARCHAR(50) DEFAULT 'You',
CreateDate DATETIME DEFAULT GETUTCDATE()
)

INSERT dbo.[YouBigDummy]
        ( [Name], [CreateDate] )
DEFAULT VALUES
GO 100
WAITFOR DELAY '00:00:30.000'
GO 
INSERT dbo.[YouBigDummy]
        ( [Name], [CreateDate] )
DEFAULT VALUES
GO 100
WAITFOR DELAY '00:00:30.000'
GO 
INSERT dbo.[YouBigDummy]
        ( [Name], [CreateDate] )
DEFAULT VALUES
GO 100
WAITFOR DELAY '00:00:30.000'
GO 
INSERT dbo.[YouBigDummy]
        ( [Name], [CreateDate] )
DEFAULT VALUES
GO 100
WAITFOR DELAY '00:00:30.000'
GO 
INSERT dbo.[YouBigDummy]
        ( [Name], [CreateDate] )
DEFAULT VALUES
GO 100
WAITFOR DELAY '00:00:30.000'
GO 
INSERT dbo.[YouBigDummy]
        ( [Name], [CreateDate] )
DEFAULT VALUES
GO 100
WAITFOR DELAY '00:00:30.000'
GO 
TRUNCATE TABLE [dbo].[YouBigDummy]

When that’s all done, you can run something like this to see what happened. You’ll probably have to replace the date. I wrote this, like, two or three weeks ago by now.

SELECT  [b].[database_name] ,
        [b].[backup_start_date] ,
        [b].[backup_finish_date] ,
        [b].[type] ,
        [b].[first_lsn] ,
        [b].[last_lsn] ,
        [b].[checkpoint_lsn] ,
        [b].[database_backup_lsn]
FROM    [msdb].[dbo].[backupset] AS [b]
WHERE   [b].[database_name] = 'StackOverflow'
        AND [b].[backup_start_date] >= '2015-12-02 17:30:00.000'
ORDER BY [b].[backup_start_date];

When you take a full backup, the first thing it does is issue a checkpoint. That’s why the full and all subsequent log backups have the same checkpoint LSN. The first four log backups all have the same database backup LSN because they occurred during the full backup. That doesn’t change until the full is done.


RED SQUARES AT NIGHT

For toots and snickers, I ran this all a second time, and cancelled the full backup halfway through. The full backup issued a new checkpoint, so the checkpoint LSN changes, but the database backup LSN never changes, because it got canceled. That means taking log backups during full backups is totally useful. If your full backup fails for whatever reason, these things keep the chain alive.


NOTHING CHANGES

If the third time is to be a charm, and it is, the same thing occurs as the first run. New checkpoint LSN, and the database backup LSN runs through until the backup finishes. You can verify that by looking at the start and end times columns.


I’m always convinced that whoever came up with the term LSN Chains really liked 90’s Grunge.

If you still don’t believe me

Just look at sys.databases while you’re running a full backup.

SELECT  [name],
        [log_reuse_wait] ,
        [log_reuse_wait_desc]
FROM    [sys].[databases]
WHERE [name] = 'StackOverflow';

Yes, I made you read all that to get here.

The result is acknowledged, though not documented, here. This does indeed mean that log truncation will not occur during a full backup even if you take log backups. It will happen when you take the first log backup after the full finishes. You may want to consider this when scheduling maintenance items that may hog up log space alongside full backups.

Recap

Backups are beautiful things. You should take full ones, and probably differential ones, and if you’re in full recovery model, definitely log ones. How often you take them is up to you and your boss. Or maybe their boss. But it’s definitely not up to you. Unless you’re your boss.

Log backups during full backups won’t hurt anything, and may end up helping things if your full backup fails, and you need to restore something.

Log backups during full backups will not truncate the log. That has to wait until the first log backup after the full finishes.

Does index fill factor affect fragmentation?


Everybody wants to know about index fragmentation

It is an inescapable vortex of malaise and confusion. Like that swamp in The Neverending Story that killed the horse. Sorry if you haven’t seen that movie. The horse wasn’t that cool, anyway.

Neither is index fragmentation, but it’s not worth losing sleep over. Or a horse.

I see a lot of people messing with the fill factor of their indexes. Sometimes you gotta. If you use GUIDs for a clustering key, for example. If you don’t lower fill factor from 100, you’re going to spend a lot of time splitting pages when you insert records. GUIDs are hard to run out of, but they’re even harder to put in order.

Setting fill factor under 100 tells SQL to leave free space on index pages at the leaf level, so new records can be added to them. If you don’t, and a record needs to be added to a full page, it will do about a 50/50 split to two other pages.

When does it hurt?

Like most things, not at first. To prove it, let’s rebuild an index at different fill factors, and insert some fragmentation information into a table. It’s pretty easy: create a table, rebuild the index, insert a record into the table, repeat. I could have done this in a loop, but I’m kind of lazy today.

CREATE TABLE [dbo].[FillFactor](
  [TableName] [nvarchar](128) NULL,
  [IndexName] [sysname] NULL,
  [index_type_desc] [nvarchar](60) NULL,
  [avg_fragmentation_in_percent] [float] NULL,
  [page_count] [bigint] NULL,
  [fill_factor] [tinyint] NOT NULL
) ON [PRIMARY]
GO

INSERT [dbo].[FillFactor]
        ( [TableName] ,
          [IndexName] ,
          [index_type_desc] ,
          [avg_fragmentation_in_percent] ,
          [page_count] ,
          [fill_factor] )
SELECT  OBJECT_NAME([ddips].[object_id]) AS [TableName] ,
        [i].[name] AS [IndexName] ,
        [ddips].[index_type_desc] ,
        [ddips].[avg_fragmentation_in_percent] ,
        [ddips].[page_count],
    i.[fill_factor]
FROM    [sys].[dm_db_index_physical_stats](DB_ID(N'StackOverflow'),
                                           OBJECT_ID('dbo.Votes'), 
                       NULL, 
                       NULL,
                                           'LIMITED') AS [ddips]
JOIN    [sys].[tables] AS [t]
ON      [t].[object_id] = [ddips].[object_id]
JOIN    [sys].[indexes] AS [i]
ON      [i].[object_id] = [ddips].[object_id]
        AND [i].[index_id] = [ddips].[index_id];

ALTER INDEX [PK_Votes] ON [dbo].[Votes] REBUILD WITH (FILLFACTOR = 100)
ALTER INDEX [PK_Votes] ON [dbo].[Votes] REBUILD WITH (FILLFACTOR = 80)
ALTER INDEX [PK_Votes] ON [dbo].[Votes] REBUILD WITH (FILLFACTOR = 60)
ALTER INDEX [PK_Votes] ON [dbo].[Votes] REBUILD WITH (FILLFACTOR = 40)
ALTER INDEX [PK_Votes] ON [dbo].[Votes] REBUILD WITH (FILLFACTOR = 20)

SELECT [ff].[TableName] ,
       [ff].[IndexName] ,
       [ff].[index_type_desc] ,
       [ff].[avg_fragmentation_in_percent] ,
       [ff].[page_count] ,
       [ff].[fill_factor]
FROM [dbo].[FillFactor] AS [ff]
ORDER BY [ff].[fill_factor] DESC

Put on your thinking cap. Fragmentation percent doesn’t budge. Granted, we rebuilt the index, so that’s expected. But look at page counts. Every time we reduce fill factor, page count gets higher. Why does that matter? Each page is 8kb. The more pages are in your index, the more you’re reading from disk into memory. The lower your fill factor, the more blank space you’re reading from disk into memory. You could be wasting a lot of unnecessary space both on disk and in memory by lowering fill factor.


Space Age Love Song is probably the best Flock of Seagulls song, just so you know.

Let’s do math!

Because everyone loves math. Let’s take page count, multiply it by 8, and then divide it by 1024 twice to get the size of each index in GB.

SELECT [ff].[TableName] ,
       [ff].[IndexName] ,
       [ff].[index_type_desc] ,
       [ff].[avg_fragmentation_in_percent] ,
       [ff].[page_count] ,
       [ff].[fill_factor],
  ([ff].[page_count] * 8.) / 1024. / 1024. AS [SizeGB]
FROM [dbo].[FillFactor] AS [ff]
ORDER BY [ff].[fill_factor] DESC

Even reducing this to 80 takes up about an extra 600MB. That can really add up. Granted, disk and memory are cheap, but they’re not infinite. Especially if you’re on Standard Edition.

Ack! Numbers!


It’s in everything

It’s not just queries that reading extra pages can slow down. DBCC CHECKDB, backups, and index and statistics maintenance all have to deal with all those pages. Lowering fill factor without good reason puts you in the same boat as index fragmentation does, except regular maintenance won’t “fix” the problem.

You can run sp_BlitzIndex® to help you find indexes that have fill factor set to under 100.
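If you just want a quick and dirty look without the stored procedure, the setting is sitting in sys.indexes. Zero means the default, which is the same as 100:

SELECT  OBJECT_NAME([i].[object_id]) AS [table_name] ,
        [i].[name] AS [index_name] ,
        [i].[fill_factor]
FROM    [sys].[indexes] AS [i]
WHERE   [i].[fill_factor] > 0
        AND [i].[fill_factor] < 100
ORDER BY [i].[fill_factor];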

Still Serial After All These Years


With each new version of SQL comes a slew of new stuff

Some changes are cosmetic, others are bewildering, and the rest fall somewhere between "who cares about JSON?" and "OH MY GOD TAKE MY MONEY!" But not really my money, because I only buy developer edition. Aaron Bertrand has done a better job finding, and teaching you how to find, new features than I could. Head over to his blog if you want to dive in.

What I wanted to look at was something much more near and dear to my heart: Parallelism.

(It’s so special I capitalized it.)

In the past, there were a number of things that caused entire plans, or sections of plans, to be serial. Scalar UDFs are probably the first one everyone thinks of. They’re bad. Really bad. They’re so bad that if you define a computed column with a scalar UDF, every query that hits the table will run serially even if you don’t select that column. So, like, don’t do that.

What else causes perfectly parallel plan performance plotzing?

Total:

  • UDFs (Scalar, MSTVF)
  • Modifying table variables

Zone:

  • Backwards scan
  • Recursive CTE
  • TOP
  • Aggregate
  • Sequence
  • System tables

Other:

  • CLR functions (that perform data access)
  • Dynamic cursors
  • System functions

We’ll just look at the items from the first two lists. The stuff in Other is there because I couldn’t write a CLR function if I had Adam Machanic telling me what to type, cursors don’t need any worse of a reputation, and it would take me a year of Sundays to list every internal function and test it.

I’m going to depart from my normal format a bit, and put all the code at the end. It’s really just a mess of boring SELECT statements. The only thing I should say up front is that I’m leaning heavily on the use of an undocumented Trace Flag: 8649. I use it because it ‘forces’ parallelism by dropping the cost threshold for parallelism to 0 for each query. So if a parallel plan is possible, we’ll get one. Or part of one. You get the idea.

Just, you know, don’t use it in production unless you really know what you’re doing. It’s pretty helpful to use as a developer, on a development server, to figure out why queries aren’t going parallel. Or why they’re partially parallel.

All of this was run on 2016 CTP 3.1, so if RTM comes along, and something here is different, that’s why. Of course, backwards scans are probably close to 15 years old, so don’t sit on your thumbs waiting for them to get parallel support.

Backwards scan!

This is what happens when your ORDER BY is the opposite of your index.


Scalar!

Not only do they run serial, but they run once for every row returned. Have a nice day!


Table with computed column


Hint: the Id column isn’t the computed one.

MSTVF

Multi-statement Table Valued Garbage


Table Variable Insert


You probably can’t name one good reason to use a table variable.

Top


Top o’ the nothin’!

Aggregating


You’re making me wanna use Excel, here.

Row Number (or any windowing/ranking function)


You’re gonna get all those row numbers one by one.

Accessing System Tables


Twisted SYS-ter

The recursive part of a recursive CTE


This part was fun. I liked this part.

Picture time is over

Now you get to listen to me prattle on and on about how much money you’re wasting on licensing by having all your queries run serially. Unless you have a SharePoint server; then you have… many other problems. If I had to pick a top three list that I see people falling victim to regularly, it would be:

  1. Scalar functions
  2. Table variables
  3. Unsupported ORDER BYs

They’re all relatively easy items to fix, and by the looks of it, we’ll be fixing them on SQL Server 2016 as well. Maybe Query Store will make that easier.

Thanks for reading!

Das Code

USE [StackOverflow];
SET NOCOUNT ON;
GO

/*Scalar*/
CREATE FUNCTION [dbo].[ScalarValueReturner] ( @id INT )
RETURNS INT
       WITH RETURNS NULL ON NULL INPUT,
            SCHEMABINDING
AS
    BEGIN
        RETURN @id * 1;
    END;
GO

/*MSTVF*/
CREATE FUNCTION [dbo].[MSTVFValueReturner] ( @id INT )
RETURNS @Out TABLE ( [Returner] INT )
       WITH SCHEMABINDING
AS
    BEGIN
        INSERT  INTO @Out
                ( [Returner] )
        SELECT  @id;
        RETURN;
    END;
GO

SELECT  [Id]
FROM    [dbo].[Users]
WHERE   [Id] BETWEEN 1 AND 10000
ORDER BY [Id] DESC
OPTION  ( QUERYTRACEON 8649, RECOMPILE );


SELECT  [Id] ,
        [dbo].[ScalarValueReturner]([Id])
FROM    [dbo].[Users]
WHERE   [Id] BETWEEN 1 AND 10000
OPTION  ( QUERYTRACEON 8649, RECOMPILE );
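/* Not in the original script: a sketch of the computed column test from the
   "Table with computed column" screenshot above. The column name is made up.
   Once the column exists, even queries that never touch it stay serial. */
ALTER TABLE [dbo].[Users]
ADD [IdTimesOne] AS [dbo].[ScalarValueReturner]([Id]);
GO

SELECT  [Id]
FROM    [dbo].[Users]
WHERE   [Id] BETWEEN 1 AND 10000
OPTION  ( QUERYTRACEON 8649, RECOMPILE );

ALTER TABLE [dbo].[Users] DROP COLUMN [IdTimesOne];
GO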

SELECT  [Id] ,
        [ca].[Returner]
FROM    [dbo].[Users]
CROSS APPLY [dbo].[MSTVFValueReturner]([Id]) AS [ca]
WHERE   [Id] BETWEEN 1 AND 10000
OPTION  ( QUERYTRACEON 8649, RECOMPILE );

DECLARE @t TABLE ( [Id] INT );
INSERT  @t
SELECT  [Id]
FROM    [dbo].[Users]
WHERE   [Id] BETWEEN 1 AND 10000
OPTION  ( QUERYTRACEON 8649, RECOMPILE );

SELECT TOP ( 10000 )
        [u].[Id] ,
        [ca].[Id] AS [Returner]
FROM    [dbo].[Users] AS [u]
CROSS APPLY ( SELECT    [u2].[Id]
              FROM      [dbo].[Users] AS [u2]
              WHERE     [u2].[Id] = [u].[Id] ) AS [ca]
WHERE   [u].[Id] BETWEEN 1 AND 10000
OPTION  ( QUERYTRACEON 8649, RECOMPILE );

SELECT  AVG(CAST([dt].[Reputation] AS BIGINT)) AS [WhateverMan]
FROM    ( SELECT    [u].[Id] ,
                    [u].[Reputation]
          FROM      [dbo].[Users] [u]
          JOIN      [dbo].[Posts] [p]
          ON        [u].[Id] = [p].[OwnerUserId]
          WHERE     [p].[OwnerUserId] > 0
                    AND [u].[Reputation] > 0 ) AS [dt]
WHERE   [dt].[Id] BETWEEN 1 AND 10000
OPTION  ( QUERYTRACEON 8649, RECOMPILE );

SELECT  [Id] ,
        ROW_NUMBER() OVER ( PARTITION BY [Id] ORDER BY [Id] ) AS [NoParallel]
FROM    [dbo].[Users]
WHERE   [Id] BETWEEN 1 AND 10000
OPTION  ( QUERYTRACEON 8649, RECOMPILE );


SELECT TOP ( 10000 )
        [sc1].[column_id]
FROM    [master].[sys].[columns] AS [sc1]
CROSS JOIN [master].[sys].[columns] AS [sc2]
CROSS JOIN [master].[sys].[columns] AS [sc3]
CROSS JOIN [master].[sys].[columns] AS [sc4]
CROSS JOIN [master].[sys].[columns] AS [sc5]
OPTION  ( QUERYTRACEON 8649, RECOMPILE );

CREATE TABLE [dbo].[NorseGods]
    (
      [GodID] INT NOT NULL ,
      [GodName] NVARCHAR(30) NOT NULL ,
      [Title] NVARCHAR(100) NOT NULL ,
      [ManagerID] INT NULL ,
      CONSTRAINT [PK_GodID] PRIMARY KEY CLUSTERED ( [GodID] )
    );

INSERT  INTO [dbo].[NorseGods]
        ( [GodID], [GodName], [Title], [ManagerID] )
VALUES  ( 1,    N'Odin',  N'War and stuff', NULL )
,       ( 2,  N'Thor',    N'Thunder, etc.', 1 )
,       ( 3,  N'Hel',     N'Underworld!',   2 )
,       ( 4,  N'Loki',    N'Tricksy',       3 )
,       ( 5,  N'Vali',    N'Payback',       3 )
,       ( 6,  N'Freyja',  N'Making babies', 2 )
,       ( 7,  N'Hoenir',  N'Quiet time',    6 )
,       ( 8,   N'Eir',    N'Feeling good',  2 )
,       ( 9,   N'Magni',  N'Weightlifting', 8 );

WITH    [Valhalla]
          AS ( SELECT   [ManagerID] ,
                        [GodID] ,
            [GodName],
                        [Title] ,
                        0 AS [MidgardLevel]
               FROM     [dbo].[NorseGods]
               WHERE    [ManagerID] IS NULL
               UNION ALL
               SELECT   [ng].[ManagerID] ,
                        [ng].[GodID] ,
            [ng].[GodName],
                        [ng].[Title] ,
                        [v].[MidgardLevel] + 1
               FROM     [dbo].[NorseGods] AS [ng]
               INNER JOIN [Valhalla] AS [v]
               ON       [ng].[ManagerID] = [v].[GodID]
             )
    SELECT  [Valhalla].[ManagerID] ,
            [Valhalla].[GodID] ,
      [Valhalla].[GodName] ,
            [Valhalla].[Title] ,
            [Valhalla].[MidgardLevel]
    FROM    [Valhalla]
    ORDER BY [Valhalla].[ManagerID]
  OPTION (QUERYTRACEON 8649, RECOMPILE)

Another reason why scalar functions in computed columns are a bad idea


As if there weren’t enough reasons

In my last blog post I talked about different things that cause plans, or zones of plans, to execute serially. One of the items I covered was computed columns that reference scalar functions. We know that they’ll keep queries from going parallel, but what about other SQL stuff?

Oh no my index is fragmented

If you’re running Expensive Edition, index rebuilds can be both online and parallel. That’s pretty cool, because it keeps all your gadgets and gizmos mostly available during the whole operation, and the parallel bit usually makes things faster.

That is, unless you have a computed column in there that references a scalar function. I decided to write my test function to not perform any data access so it could be persisted. It’s dead simple, and I’m tacking it on to a column in the PostLinks table of the Stack Overflow database.

CREATE FUNCTION dbo.PIDMultiplier (@pid int)
RETURNS INT
    WITH RETURNS NULL ON NULL INPUT,
         SCHEMABINDING
AS
    BEGIN
        DECLARE @Out INT;
        SELECT  @Out = @pid * 2;
        RETURN @Out;
    END;
GO

ALTER TABLE [dbo].[PostLinks]
ADD [Multiplied] AS dbo.[PIDMultiplier]([PostId]) PERSISTED

For this one, all we have to do is turn on actual execution plans and rebuild the index, then drop the column and rebuild again.

ALTER TABLE [dbo].[PostLinks] REBUILD WITH (ONLINE = ON)

ALTER TABLE [dbo].[PostLinks] DROP COLUMN [Multiplied]

Here are my execution plans. The rebuild I ran when the table had my computed column in it stayed serial.


Hi, I’m garbage.

Parallel, sans computed column:


Dude, you’re getting a parallel.

But there’s a bigger fish in the pond

Probably the most important maintenance item you should be doing, aside from backups, is running DBCC CHECKDB. Seriously, if you’re not doing them both, start today. Ola Hallengren has basically done all the work for you. Back when I had a real job, I used his scripts everywhere.

Before we were so rudely interrupted by a soap box, we were talking about parallelism. This part was a little bit more complicated, but don’t worry, you don’t have to follow along. Just look at the pretty pictures. Sleep now. Yes. Sleep.

The first couple times I tried, the DBCC check never went parallel. Since I’m on my laptop, and not a production server, I can set Cost Threshold for Parallelism to 0. You read that right, ZE-RO! Hold onto your drool dish.
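For the record, that looks like this. It is very much a laptop-only setting:

EXEC [sys].[sp_configure] N'show advanced options', 1;
RECONFIGURE;

EXEC [sys].[sp_configure] N'cost threshold for parallelism', 0;
RECONFIGURE;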

With that set, I fire up Ye Olde Oaken sp_BlitzTrace so I can capture everything with Extended Events. You’ll need all three commands, but you’ll probably have to change @SessionId, and you may have to change @TargetPath. Run the first command to start your session up.

EXEC [dbo].[sp_BlitzTrace] @SessionId = 61 , @Action = 'start' , @TargetPath = 'c:\temp\' , @TraceParallelism = 1 , @TraceExecutionPlansAndKillMyPerformance = 1 

EXEC [dbo].[sp_BlitzTrace] @Action = 'stop' 

EXEC [dbo].[sp_BlitzTrace] @Action = 'read'

With that running, toss in your DBCC command. I’m only using DBCC CHECKTABLE here to simplify. Rest assured, if you run DBCC CHECKDB, the CHECKTABLE part is included. The only checks that DBCC CHECKDB doesn’t run are CHECKIDENT and CHECKCONSTRAINT. Everything else is included.

DBCC CHECKTABLE('dbo.PostLinks') WITH NO_INFOMSGS, ALL_ERRORMSGS

ALTER TABLE [dbo].[PostLinks]
ADD [Multiplied] AS dbo.[PIDMultiplier]([PostId]) PERSISTED

Run DBCC CHECKTABLE, add the computed column back, and then run it again. When those finish, run the sp_BlitzTrace commands to stop and read session data. You should see execution plans for each run, and they should be way different.


Hell Yeah.


Hell No.

So even DBCC checks are serialized. Crazy, right? I’d been hearing about performance hits to varying degrees when running DBCC checks against tables with computed columns for a while, but never knew why. There may be a separate reason for regular computed columns vs. ones that reference scalar functions. When I took the equivalent SQL out of a function, the DBCC check ran parallel.

ALTER TABLE [dbo].[PostLinks]
ADD [Multiplied] AS [PostId] * 2 PERSISTED

Of course, those online index rebuilds running single threaded might be a blessing in disguise, if you haven’t patched SQL recently.

I don’t have much of a grand closing paragraph here. These things can seriously mess you up for a lot of reasons. If you’re a vendor, please get away from using scalar functions, and please please don’t use them in computed columns.

Thanks for reading!

A funny thing happened on my way to set up Mirroring…


I’ve set up Mirroring about a billion times

I’m not bragging about that. I’d rather say that I set up a billion AGs, and not one of them ever failed. But then I’d be lying to you; those things fail like government programs. One thing I’d never done, though, is set up Mirroring with a Witness. I never wanted automatic failover, because it’s only one database at a time. If for some reason one database out of all that I had mirrored ever turned Ramblin’ Man and failed over to another server, there would understandably be some application consternation. Not to mention any maintenance and internal operations. They don’t react well to sudden database unavailability.

Of course, doing anything for the first time is horrible. Just ask my second wife.

Here’s where things got awkward

I have my databases! This is my top secret development environment. Stack Overflow is in an AG, and I had set up two other Mirrors: one synch and one asynch. I wanted to have a variety of setups to test some scripts against.


Everything looks good!

Alright, let’s set up Mirroring…


Configuring stuff is cool, right?


Yeah yeah next next next


Service accounts whatever BORING


GREEN LIGHT GO!

This is so easy. Seriously. Why doesn’t everyone do this? Why do you complicate your short, short lives with Availability Groups? Are they AlwaysOn? Are they Always On? WHO KNOWS? Not even Microsoft.


I’m hitting this button and jumping into a mile of Laphroaig.

HIGH FIVES ALL ARO-


Y THO?

This is the error text:

The ALTER DATABASE command could not be sent to the remote server instance 'TCP://ORACLEDB.darling.com:5022'. The database mirroring configuration was not changed. Verify that the server is connected, and try again.

Super Sleuth

Alright, that’s silly. I used the GUI. Instead of going to bed I’ll spend some time checking all my VM network settings. BRB.

I’m back. They were all correct. I could ping and telnet and set up linked servers and RDP. What in the name of Shub-Niggurath is going on with this thing?

I can even see the Endpoint! So close, and yet so far~~


The Endpoint is showing up on the Witness and what is this?

Where are we now?

This is a good time for a quick recap

  1. Mirroring is up and running synchronously
  2. The endpoint is configured on the witness
  3. We get an error when we try to connect the witness

TO THE ERROR LOG!


I should have done this hours ago.

Well whaddya know? That’s a really good clue. Encryption and stuff. There’s no compatible algorithm. Ain’t that somethin’? You’d think that Microsoft would be cool about setting up the same kind of encryption across all the different Endpoints, if using different encryption would cause the setup to fail. Right guys? Heh. Right? Hey, hello?

NOPE.

Alright, let’s see what I need to be a matchmaker.
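If you’d rather not click through the GUI on three different servers, the same information is sitting in a catalog view on each instance:

-- Run this on the principal, the mirror, and the witness, then compare
SELECT  [name] ,
        [role_desc] ,
        [state_desc] ,
        [encryption_algorithm_desc]
FROM    [sys].[database_mirroring_endpoints];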


The Cure – Primary


Oh. AES. Okay. Cool. Thanks.


RC4 WHY WOULD YOU DO THAT?

Since we have them both scripted out already, let’s just drop and re-create the Witness Endpoint with the right encryption algorithm.
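Roughly what that looks like on the witness. The endpoint name and port here are from my setup, so adjust to taste:

DROP ENDPOINT [Mirroring];
GO

CREATE ENDPOINT [Mirroring]
    STATE = STARTED
    AS TCP ( LISTENER_PORT = 5022, LISTENER_IP = ALL )
    FOR DATABASE_MIRRORING ( ROLE = WITNESS, AUTHENTICATION = WINDOWS NEGOTIATE, ENCRYPTION = REQUIRED ALGORITHM AES );
GO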


INTO THE TREES

That did not result in a forest fire. I’m hopeful. Sort of. It’s been a long night and I think I can see tomorrow from here.


Scriptin’

Meanwhile, back on the Primary…


I AM SO SMRT

It worked! Now I have a Witness, and I can shut all my VMs down. That was so much fun.

What did we learn?

Microsoft hates you and doesn’t want you to sleep. Just kidding. Mostly. But seriously, why would they do that?

It mostly goes to show that it’s always a smart idea to use that little script button at the top of (most) GUIs in SSMS. Who knows what kind of foolishness you’ll find? A little reading can save you a lot of time troubleshooting errors that make you feel insane.

Thanks for reading!

New York City: The Data That Never Sleeps


I love living in the city

Blog posts about people’s favorite data sets seem to be popular these days, so I’m throwing my hat in the ring.

NYC has been collecting all sorts of data from all sorts of sources. There’s some really interesting stuff in here.

Another personal favorite of mine is MTA turnstile data. If you’re a developer looking to hone your ETL skills, this is a great dataset, because it’s kind of a mess. I actually had to use PowerShell to fix inconsistencies with the older text files, which I’m still recovering from. I won’t spoil all the surprises for you.

Of course, there’s Stack Overflow.

You can’t go wrong with data from either of these sources. They’re pretty big. The main problem I have with Adventure Works is that it’s a really small database. It really doesn’t mimic the large databases that people deal with in the real world, unless you do some work or run a script to make it bigger. The other problem with Adventure Works is that it went out of business a decade ago because no one wanted to buy yellow bikes. I’ve been learning a bit about Oracle, and their sample data sets are even smaller. If anyone knows of better ones, leave a comment.

Anyway, get downloading! Just don’t ask me about SSIS imports. I still haven’t opened it.

Thanks for reading!

No but really, how big should my log file be?


Most of you are going to hate this

And TL;DR, there’s a script at the end of the post. But like The Monster At The End Of This Book, it’s worth it not to skip the middle.

There are about a billion but-what-ifs that could come into play. I can’t possibly answer all of those for you. But that’s not the point of this post, anyway! If you’re in a special circumstance, using some fancy features, or doing something utterly deranged to your database, this isn’t the post, or script, for you.

I mean really, unless the size of your log file is causing you some dramatic pain, leave it alone. You should probably go invent cold fusion if log file size is the worst issue in your database. Congratulations.

This is also a lousy place to ask me if you can shrink your log file. I have no idea how or why it got that size. There’s free space now because you’re using FULL recovery model and you took a log backup, or you’re in SIMPLE and your database hit a CHECKPOINT. No magic there. It may very well grow to that size again, so shrinking it could be a really dumb idea.

So what’s the point? Lots of people ask me this question: clients, Office Hours attendees, random passerby on the street who recognize me (even without my Robot). I usually give them the same answer and explanation, unless I have ample evidence that their fancy and/or deranged ways require a different estimate.

From the ivory tower

A good STARTING POINT for your log file is twice the size of the largest index in your database, or 25% of the database size. Whichever is larger.

Why?

If the largest object in your database is larger than 25% of your database, you are likely running some type of maintenance. Index rebuilds require the size of the object being rebuilt in log space. I usually rule of thumb twice that space, in case you’re doing anything else while you’re doing that maintenance, like ETL, reports, dragging data to and fro, purging data, whatever. If you’re only ever reorganizing the largest object, you may not need all that space. Are you sure you’re ONLY ever reorganizing that? I’ll wait.

But 25% seems so random!

Well, kinda. But you’re here for a starting point. If you’re not Super DBA and taking baselines and trending your database file sizes over time, random is better than nothing. It buys you some leeway, too.

  • If you miss a log backup (maintenance plans got you down?)
  • If you’re not taking frequent enough log backups (can I interest you in RPO/RTO insurance?)
  • If you run other long/large transactions (SSIS won’t save you)

You’ll have a fair amount of room to do your dirty work. Most sane and rational people consider this to be a positive thing.

But what if my log file still grows?

Well, then you found out you need a bigger log file. Or you need to take log backups more frequently. Perhaps those hourly log backups aren’t working out as you planned, hm?

And if your log file never grows, you’ll look really smart. And you’ll never have to wait for your log file to expand. They don’t benefit from Instant File Initialization the way data files do.

Show me the script already

It’s all right under here. Don’t forget to change the USE statement. All sizes are in GB. If your database is smaller than 1GB, you’re one of those lucky DBAs who can take vacations and stuff. Go do that. Life is short.

If your database is under 1GB, and your log file is over 1GB, start taking log backups. I’m pretty sure you’re not.

USE [StackOverflow] 
--You'll probably want to use your own database here
--Unless you work at Stack Overflow
--No, I'm not writing this to loop through all of your databases

;
WITH    [log_size]
          AS ( SELECT TOP 1
                        SCHEMA_NAME([t].[schema_id]) AS [schema_name] ,
                        [t].[name] AS [table_name] ,
                        [i].[name] ,
                        [p].[rows] AS [row_count] ,
                        CAST(( SUM([a].[total_pages]) * 8. ) / 1024. / 1024. AS DECIMAL(18,
                                                              2)) AS [index_total_space_gb] ,
                        ( SUM([a].[total_pages]) * 8 ) / 1024 / 1024 * 2 AS [largest_index_times_two_(gb)] ,
                        ( SELECT    ( SUM([mf].[size]) * 8 ) / 1024 / 1024
                          FROM      [sys].[master_files] AS [mf]
                          WHERE     [mf].[database_id] = DB_ID() ) AS [database_size_(gb)] ,
                        ( SELECT    CAST(( SUM([mf].[size]) * 8 ) / 1024
                                    / 1024 AS INT)
                          FROM      [sys].[master_files] AS [mf]
                          WHERE     [mf].[database_id] = DB_ID()
                                    AND [mf].[type_desc] = 'LOG' ) AS [current_log_size_(gb)] ,
                        ( SELECT    CAST(( SUM([mf].[size]) * 8 ) / 1024
                                    / 1024 * .25 AS INT)
                          FROM      [sys].[master_files] AS [mf]
                          WHERE     [mf].[database_id] = DB_ID()
                                    AND [mf].[type_desc] = 'ROWS' ) AS [25%_of_database_(gb)]
               FROM     [sys].[tables] [t]
               INNER JOIN [sys].[indexes] [i]
               ON       [t].[object_id] = [i].[object_id]
               INNER JOIN [sys].[partitions] [p]
               ON       [i].[object_id] = [p].[object_id]
                        AND [i].[index_id] = [p].[index_id]
               INNER JOIN [sys].[allocation_units] [a]
               ON       [p].[partition_id] = [a].[container_id]
               WHERE    [t].[is_ms_shipped] = 0
               GROUP BY SCHEMA_NAME([t].[schema_id]) ,
                        [t].[name] ,
                        [i].[name] ,
                        [p].[rows]
               ORDER BY [index_total_space_gb] DESC)
     SELECT * ,
            CASE WHEN [ls].[largest_index_times_two_(gb)] > [ls].[25%_of_database_(gb)]
                 THEN [ls].[largest_index_times_two_(gb)]
                 ELSE [ls].[25%_of_database_(gb)]
            END AS [maybe_this_is_a_good_log_size(gb)]
     FROM   [log_size] AS [ls]
OPTION  ( RECOMPILE );
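
Once you have a number you like, actually setting the log to that size is a one-liner. This is just a sketch — the logical file name below is a guess, so check sys.master_files for yours first, and if you care about VLF counts, grow to the target in a few chunks rather than one giant leap.

-- Find the logical name and current size of your log file
SELECT [name], [type_desc], ( [size] * 8 ) / 1024 / 1024 AS [size_gb]
FROM [sys].[master_files]
WHERE [database_id] = DB_ID();

-- Then size it up front instead of letting autogrowth dribble it out
ALTER DATABASE [StackOverflow]
MODIFY FILE ( NAME = N'StackOverflow_log', SIZE = 100GB );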

Getting Started With Oracle: Working With Dates


Contrary to popular belief

You will not burst into eternal flames if you’re a SQL Server guy or gal, and you happen to be within 100 feet of an Oracle database. You might feel lost and confused for a while, but you probably felt the same way when you opened up SSMS for the first time.

I’ve been trying to expand my horizons a bit, so I followed some reasonably simple download instructions to get my feet wet in the ocean of gold hundred dollar bills and lobster/veal hybrids that is Oracle. The next thing I needed was some sample data, so I grabbed the HR stuff. I don’t need anything too complicated yet.

One thing you’ll notice right away is that Oracle handles multi-tenancy a bit differently than SQL Server does, at least up until 12c and the introduction of pluggable databases. Where in SQL Server you create databases, in Oracle you create users and schemas. Creating an Oracle database happens when you install the software. It’s weird at first. Like when you move into a new apartment and keep reaching to the wrong side to switch the bathroom light on.
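
If you want to poke at that yourself, carving out a place to play looks something like this. Consider it a hedged sketch — the user name, password, and USERS tablespace are all just assumptions for a default-ish install.

CREATE USER demo_user IDENTIFIED BY SomePassword1; --Roughly a "login plus schema" in SQL Server terms
GRANT CREATE SESSION, CREATE TABLE TO demo_user;   --Let it connect and create tables
ALTER USER demo_user QUOTA UNLIMITED ON USERS;     --Give it room in the USERS tablespace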

If you want to see what data exists for other users, you expand the Other Users node to view tables, indexes, and views, etc.

Sympathy for the Larry

Why start with dates?

No reason! We’ll talk about numbers and strings later.

Another minor quirk when working with Oracle is that you can’t do what you do in SQL Server:

SELECT GETDATE()

In Oracle, you have to reference the DUAL table; it’s just one column called DUMMY and one row with an X in it.

SELECT * FROM DUAL;

Retrieving system dates goes like this:

SELECT SYSDATE, SYSTIMESTAMP
FROM DUAL;

SYSDATE gives you the date and time down to the second (even though the default display only shows the date), while SYSTIMESTAMP gives you the date, time, fractional seconds, and time zone offset.

Deep breaths!

The first thing you’ll notice is that Oracle presents dates differently — DD-MON-YY — which is something that, curiously, I have seen pop up as a requirement for people to do with SQL Server. Because SQL Server is a great presentation layer.

(It’s really not.)
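
If the default display format bugs you, you don’t have to dress up every query, either — you can change it for just your session. A minimal sketch, and the format mask here is just one reasonable choice:

ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD HH24:MI:SS';

SELECT SYSDATE FROM DUAL; --Now displays in the format above, time and all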

Format is everything

If you do want to make your dates look a little prettier, you have to do a little wrangling with the TO_CHAR function. It allows you to specify string formatting of date and time data.

SELECT 
TO_CHAR( SYSDATE, 'YYYY-MM-DD HH:MI:SS') AS "SYSDATE Formatted" ,
TO_CHAR( SYSTIMESTAMP, 'YYYY-MM-DD HH:MI:SS.FF') AS "We need milliseconds",
TO_CHAR( CAST(SYSTIMESTAMP AS TIMESTAMP(3) ), 'YYYY-MM-DD HH:MI:SS.FF') "Maybe only three milliseconds",
TO_CHAR( CAST(SYSTIMESTAMP AS TIMESTAMP(3) ), 'YYYY-MM-DD HH24:MI:SS.FF') "And 24 hour time"
FROM DUAL;
Modern Art.

Which naturally brings us to comparing dates

You didn’t look at all that stuff and think you’d get off easy, did you? SQL Server jumps through a lot of hoops to let you pass in date values as predicates. Oracle, not so much. You have to do some hand holding.

Below are some queries where I tried to pass in date literals with varying degrees of success. Well, okay, the degrees of success were pretty binary. Not a lot of grey area when queries error out, huh?

--NO, hated this all day long.
SELECT *
FROM HR.EMPLOYEES HR
WHERE HR.HIRE_DATE >= '2000-01-01'

--YES, date passed in using standard Oracle DD-MON-YY format
SELECT *
FROM HR.EMPLOYEES HR
WHERE HR.HIRE_DATE >= '01-JAN-00';

--YES, close enough to above, I suppose.
SELECT *
FROM HR.EMPLOYEES HR
WHERE HR.HIRE_DATE >= '01-JAN-2000';

--NO, MM-DD? DD-MM? How is it supposed to know?
SELECT *
FROM HR.EMPLOYEES HR
WHERE HR.HIRE_DATE >= '01-01-2000';

--YES, prefixing the string with DATE brings an uneasy truce.
SELECT *
FROM HR.EMPLOYEES HR
WHERE HR.HIRE_DATE >= DATE '2000-01-01';

--YES, explicitly casting the string to a date with format works.
SELECT *
FROM HR.EMPLOYEES HR
WHERE HR.HIRE_DATE >= TO_DATE('2000-01-01', 'YYYY-MM-DD');

The main point to take away here is that Oracle is much pickier about this than SQL Server is. In SQL Server, it’s totally fine to pass in ‘2000-01-01’ with a lot of different variations in formatting and delimiting and still get valid results back. Whether that’s good or bad is up to you. I like the ease and flexibility of SQL Server, but I also like that there’s far less ambiguity in Oracle.

I’m learning at the same time some of you are

So if there are any mistakes, misgivings, trick shots, or better ways to do things than I have written, feel free to leave comments.

Join me next time where I’ll look at some date math examples!

Thanks for reading!

Brent says: wow, I’m surprised it doesn’t take yyyy-mm-dd without the “DATE” prefix. I’ve been trying to break my American habits and use the yyyy-mm-dd format in my demos to work better with a worldwide audience, and now I feel like I’m behind again.

Join us in person for The Senior DBA Class or SQL Server Performance Tuning.

Getting Started With Oracle: Date Math


As soon as you store a date value

Someone is going to want to do something with it. Bucket it into a 30/60/90 day window, figure out how many times something happened between it and current, filter on it, or send you creepy anonymous cards on your birthday full of doll hair and coupons for lotion.

Since Oracle is apparently no slouch, it can do all this stuff for you. No more awkwardly counting days on your fingers or messing up your day-at-a-time Far Side desk calendar trying to look 6 months into the future. Not for you, smarty pants. We’re gonna use some SQL.

Just like home

If you add a number to a date value, Oracle adds that many days to the date. It’s exactly the same as GETDATE() + whatever in SQL Server. Below will add and subtract 365 days from the hire date of each employee.

SELECT 
HIRE_DATE, 
HIRE_DATE + 365,
HIRE_DATE - 365
FROM HR.EMPLOYEES;
Check out those headers!

One thing I really like that Oracle does, that SQL Server doesn’t do, is replace calculated column headers with the syntax behind them. It can be annoying if there’s really long or complicated syntax behind it, but it’s generally nicer to have something there. SQL Server just says “No column name” when you do the same thing. Both scenarios bring their own good reasons to alias columns to the table. Get it? Table. HEH!

One big difference you’ll find is that subtracting one date value from another in Oracle leaves you with a number, not a date. The number you’re left with is the number of days between the two dates.

SELECT FIRST_NAME, HIRE_DATE, SYSDATE, SYSDATE - HIRE_DATE
FROM HR.EMPLOYEES;
This is some old sample data.

It’s really easy to get the number of months between two dates, or add months to a date. You can use the MONTHS_BETWEEN and ADD_MONTHS functions to do that.

SELECT FIRST_NAME, MONTHS_BETWEEN(SYSDATE, HIRE_DATE) AS "A LONG TIME"
FROM HR.EMPLOYEES;

SELECT FIRST_NAME, HIRE_DATE, ADD_MONTHS(HIRE_DATE, 12) AS "FIRST ANNIVERSARY"
FROM HR.EMPLOYEES;

These are areas where I think SQL Server is a bit ahead of Oracle. DATEADD and DATEDIFF offer you a lot more flexibility and range of motion to pull out or add different intervals. I know there are Oracle workarounds with some additional math operators, but they’re not as straightforward to me.
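
For comparison, here’s roughly what those two calculations look like in T-SQL — assuming you had a copy of the HR employees table sitting in SQL Server as dbo.EMPLOYEES. One gotcha: DATEDIFF counts month boundaries crossed, so you get a whole number instead of the fractional months MONTHS_BETWEEN returns.

SELECT FIRST_NAME,
DATEDIFF(MONTH, HIRE_DATE, GETDATE()) AS [A LONG TIME],
DATEADD(MONTH, 12, HIRE_DATE) AS [FIRST ANNIVERSARY]
FROM dbo.EMPLOYEES;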

A couple other neat options are NEXT_DAY and LAST_DAY. NEXT_DAY tells you the next time the day of the week you choose occurs after the date you pass in, and LAST_DAY gives you the last day of the month for the date you pass in.

The NEXT_DAY function would be useful in a situation where the Monday following someone getting hired, they had to get their picture taken for their badge, or whatever.

SQL Server has EOMONTH (2012+), but curiously, neither platform has FIRST_DAY, or SOMONTH functions, to give you the first day of the month. That seems odd to me considering the amount of syntax that goes into getting back to the first day of the month (SELECT DATEADD(MONTH, DATEDIFF(MONTH, 0, @date), 0)).

You can also add years and months to a date using the TO_YMINTERVAL function. Um. Confetti?

SELECT EMPLOYEE_ID, HIRE_DATE, NEXT_DAY(HIRE_DATE, 'MONDAY') AS "FOLLOWING_MONDAY"
FROM HR.EMPLOYEES;

SELECT EMPLOYEE_ID, HIRE_DATE, LAST_DAY(HIRE_DATE)
FROM HR.EMPLOYEES;

SELECT EMPLOYEE_ID, HIRE_DATE, HIRE_DATE + TO_YMINTERVAL('10-00') AS "TENTH_ANNIVERSARY"
FROM HR.employees;

Finally, if you need to get individual parts of a date, you can use the EXTRACT function:

SELECT EMPLOYEE_ID,HIRE_DATE,
EXTRACT(DAY FROM HIRE_DATE),
EXTRACT(MONTH FROM HIRE_DATE),
EXTRACT(YEAR FROM HIRE_DATE)
FROM HR.EMPLOYEES;

This is pretty close to the DATEPART function in SQL Server.
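
The rough T-SQL equivalent, again assuming that dbo.EMPLOYEES copy of the table, would be:

SELECT EMPLOYEE_ID, HIRE_DATE,
DATEPART(DAY, HIRE_DATE) AS HIRE_DAY,
DATEPART(MONTH, HIRE_DATE) AS HIRE_MONTH,
DATEPART(YEAR, HIRE_DATE) AS HIRE_YEAR
FROM dbo.EMPLOYEES;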

Rudimentary Can’t Fail

Both Oracle and SQL Server store dates and times in much the same way, though how they’re presented and some of the ways we work with them vary between the two platforms. They’re not too far off from each other, and both have strengths and weaknesses. SQL Server makes it really easy to pass in dates, which can be error prone, while Oracle is far less ambiguous about which dates it will compare. Either way, if you’re developing an application to work with either platform, or if you need to port code over, you have your work cut out for you when it comes to working with dates.

Thanks for reading!

Brent says: those headers, are you kidding me? It puts the plain English formula as the field name? That’s amazing.

Join us in person for The Senior DBA Class or SQL Server Performance Tuning.

Getting Started With Oracle: Working With Strings


The almighty string

It’s so good for holding all sorts of things. Chicken roulade, beef roulade, salmon roulade. It’s also the way you should store phone numbers. If I could go back in time to when I first started working with SQL, that’s what I’d tell myself. Stop. Just, please, for the love of Codd, STOP TRYING TO STORE PHONE NUMBERS AS INTEGERS. It will only end in heartbreak and weird exponential notation.

But we’re here to talk about Oracle. And strings. And stuff you can do with them. I spent one Christmas writing a text parser that my wife still brings up when we go shopping. So this is an important area to me.

First things first, man

Oracle ships with a function to put your strings in proper case. It sounds trivial to most people, until you look at the amount of time, energy, and forum posts that have gone into getting the same behavior out of SQL Server. Why this isn’t built in is beyond me. I guess we really needed CHOOSE and IIF instead. Those are super helpful. Game changers, the both.

But check this out!

SELECT 
INITCAP('BRENT OZAR UNLIMITED'),
INITCAP('brent ozar unlimited'),
INITCAP('BrEnT oZaR uNlImItEd')
FROM DUAL;
I LOVE YOU ALL CAPS

But it doesn’t end there! I’m going to skip over the CONCAT function, because it only takes two arguments. In a situation where you need to do something like ‘Last Name, First Name’ you have to concatenate the comma + space to the last name. I think you actually end up typing more than just writing the whole thing out. So how do you do that?

SELECT HR.FIRST_NAME, HR.LAST_NAME, HR.LAST_NAME || ', ' || HR.FIRST_NAME AS "FULL_NAME"
FROM HR.EMPLOYEES HR;

You use the double pipes (||) to tell Oracle to mush everything together. A slightly more complicated example is if you needed to generate a ‘password’ based on some different bits of data.

SELECT 
EXTRACT(MONTH FROM HR.HIRE_DATE) || 
HR.LAST_NAME || 
EXTRACT(DAY FROM HR.HIRE_DATE) || 
HR.FIRST_NAME || 
EXTRACT(YEAR FROM HIRE_DATE) AS "TOP SECRET CODE"
FROM HR.EMPLOYEES HR;

Oracle is nice enough to let you put it all together without whinging about data types:

Hint: don’t actually generate passwords like this.

Oracle also has easy ways to pad strings, using LPAD and RPAD. This beats out most methods I’ve seen in SQL Server, using RIGHT/LEFT and some concatenation inside. It’s another situation where if you mix data types, the conversion happens automatically.

SELECT EMPLOYEE_ID, 
LPAD(EMPLOYEE_ID, 6, 0), --Padding with zero
RPAD(EMPLOYEE_ID, 6, '*') --Padding with asterisk
FROM HR.EMPLOYEES HR;

What you get is about as expected: strings padded to 6 digits with the character of your choosing.

Rubber Room
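
For contrast, here’s a sketch of the usual SQL Server approach to the same padding — you have to convert the number to a string yourself before you can glue anything onto it (same assumed dbo.EMPLOYEES table as before):

SELECT EMPLOYEE_ID,
RIGHT('000000' + CAST(EMPLOYEE_ID AS VARCHAR(6)), 6) AS PADDED_WITH_ZERO,
LEFT(CAST(EMPLOYEE_ID AS VARCHAR(6)) + '******', 6) AS PADDED_WITH_ASTERISK
FROM dbo.EMPLOYEES;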

SQL Server offers RTRIM and LTRIM to remove leading and trailing spaces from strings. If you need to remove other stuff, you have the additional step of replacing it or calculating substrings. Oracle’s TRIM function gives you several different ways to have it operate. Oracle also has REPLACE, but whatever. It does what you think it does.

SELECT 
      '    ' || PHONE_NUMBER || '    ' AS "PHONE WITH SPACES",
      TRIM('    ' || PHONE_NUMBER || '    ') AS "PHONE WITH SPACES REMOVED",
      TRIM(TRAILING '6' FROM PHONE_NUMBER) AS "TRAILING 6S REMOVED",
      TRIM(LEADING '6' FROM PHONE_NUMBER) "LEADING 6 REMOVED",
      TRIM(BOTH '6' FROM PHONE_NUMBER) AS "LEADING AND TRAILING REMOVED"
FROM HR.EMPLOYEES
WHERE PHONE_NUMBER = '603.123.6666';

TRIM by itself will remove trailing and leading spaces. You can also hint it to only do trailing, leading, or both, and additionally specify which character you want to remove. In the example above, I found a phone number that started with a 6 and ended with four 6s. Below is how each run of the TRIM function worked:

Several Sixes

It’s much more flexible and useful than SQL Server’s trimming functions, because it can be used to trim off things other than spaces.

Second String

Oracle has a really cool function for searching in strings, cleverly titled INSTR. It takes a few arguments, and sort of like CHARINDEX, you can tell it where in the string to start searching. The real magic for me is that you can also specify which OCCURRENCE of the string you want to find. The only downside is that it appears to be case sensitive.

SELECT LAST_NAME, 
      INSTR(LAST_NAME, 'A'), --First A
      INSTR(LAST_NAME, 'A', 1, 2) --Second occurrence of A starting at first char
FROM HR.EMPLOYEES;

That runs, but it only gets us a hit for people whose names start with A:

A is for Ack

All those B names have an a for the second character. It’s easy to solve, though: just use the UPPER function (or LOWER, whatever).

SELECT LAST_NAME, 
      INSTR( UPPER(LAST_NAME), 'A'), --First occurrence
      INSTR( UPPER(LAST_NAME), 'A', 2), --First occurrence starting at second char
      INSTR( UPPER(LAST_NAME), 'A', 1, 2) --Second occurrence starting at first char
FROM HR.EMPLOYEES;

Just like CHARINDEX or PATINDEX, it returns the position of the string you’re searching for. The last column is where it’s super interesting to me. I love that kind of flexibility. Here are the results:

I want all positions!

And just like with SQL Server, you can use substrings and character searching to parse out text between two characters. This is made a bit easier in Oracle by being able to specify the nth occurrence of a character, rather than having to nest a separate call to CHARINDEX.

SELECT PHONE_NUMBER, 
SUBSTR(PHONE_NUMBER, --Phone number
INSTR(PHONE_NUMBER, '.') + 1, --To the first period + 1
INSTR(PHONE_NUMBER, '.', 1, 2) -1 --To the second period - 1
- INSTR(PHONE_NUMBER, '.') ) --Minus the characters prior to the first period
AS "Middle Three"
FROM HR.EMPLOYEES;

And you get that there string parsing magic to just the middle three digits.

Middling.

That’s a lot harder when the strings aren’t uniform. We could just as easily have specified the constant positions in SUBSTR. But hey, declaring variables in Oracle is hard work. Seriously. There’s a lot of typing involved. And a colon. It’s weird.
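
For the curious, here’s a rough T-SQL take on the same middle-three extraction, with nested CHARINDEX calls standing in for INSTR’s occurrence argument (dbo.EMPLOYEES assumed again):

SELECT PHONE_NUMBER,
SUBSTRING(PHONE_NUMBER, --Phone number
CHARINDEX('.', PHONE_NUMBER) + 1, --To the first period + 1
CHARINDEX('.', PHONE_NUMBER, CHARINDEX('.', PHONE_NUMBER) + 1) - 1 --To the second period - 1
- CHARINDEX('.', PHONE_NUMBER) ) --Minus the characters prior to the first period
AS [Middle Three]
FROM dbo.EMPLOYEES;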

Strung Out

This is as far as I’ve gotten with string manipulation in Oracle. I tried to figure out the stuff that I used to have to do in SQL Server a lot first. That usually makes the rest make more sense down the line. This is one area where I think Oracle clearly wins out. Though there’s a lot of conversation, and rightly so, about whether the database is the proper place to do string manipulations like this.

Many people don’t have a choice. SQL either acts as the presentation layer, or the DBAs/developers who have to make these changes only have access to the data at the database level. It’s not rare that someone only knows some form of SQL, either.

Whatever your philosophy on the matter, it’s likely going to keep happening. Oracle makes it much easier to remedy some pretty common issues. One item I didn’t cover is the LISTAGG function, which is another thing SQL Server’s workarounds for are hacky, error prone, and involve XML. I can see why having to implement that instead of a simple function call would be a pretty big turn-off for most people.
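
To give you a taste, here’s a hedged sketch of the two approaches side by side — LISTAGG in Oracle against the FOR XML PATH trick in SQL Server, with the usual assumed dbo.EMPLOYEES copy on the SQL Server side:

--Oracle: one function call
SELECT DEPARTMENT_ID,
LISTAGG(LAST_NAME, ', ') WITHIN GROUP (ORDER BY LAST_NAME) AS "EMPLOYEE_LIST"
FROM HR.EMPLOYEES
GROUP BY DEPARTMENT_ID;

--SQL Server: the XML workaround
SELECT d.DEPARTMENT_ID,
STUFF(( SELECT ', ' + e.LAST_NAME
        FROM dbo.EMPLOYEES AS e
        WHERE e.DEPARTMENT_ID = d.DEPARTMENT_ID
        ORDER BY e.LAST_NAME
        FOR XML PATH('') ), 1, 2, '') AS EMPLOYEE_LIST
FROM dbo.EMPLOYEES AS d
GROUP BY d.DEPARTMENT_ID;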

Thanks for reading!

Brent says: Oracle’s string functionality is a good example of why it’s so hard to port apps from one platform to another. It’s not just a matter of mapping functionality exactly, but also finding simpler ways to write the same queries.

Join us in person for The Senior DBA Class or SQL Server Performance Tuning.


Getting Started With Oracle: Working With Numbers


Math is math is math

I haven’t found any tremendous differences working with numbers between Oracle and SQL Server. Both offer pretty standard functions to calculate your calculations. Oracle has MOD(), SQL Server uses %. This is likely something you’ll want to be aware of if you’re working cross-platform, but nothing earth shattering.
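
The difference really is that small — MOD is a function call instead of an operator:

--Oracle
SELECT MOD(10, 3) FROM DUAL; --Returns 1

--SQL Server
SELECT 10 % 3; --Also returns 1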

One really interesting thing I’ve found with Oracle is that it only has one data type for numbers when you create a table. SQL Server has tinyint, int, bigint, decimal, numeric, float, and money (does REAL count?).

Oracle, well, it just uses NUMBER. The difference here is that you have the option to specify precision and scale. You can only do that in SQL Server for types more precise than integers.

One area where Oracle has a leg up on SQL Server is when it comes to concatenating numbers and strings together. It will implicitly convert numbers to strings, where SQL Server will just throw awful red text at you. Wah, error converting data types. At minimum, it makes your code way more clean. How many times have you written some dynamic-ish SQL only to have to CAST/CONVERT a bunch of values to strings? Probably a million. Maybe more. This also goes for dates.
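
A quick sketch of what I mean, with the usual assumption of a dbo.EMPLOYEES copy on the SQL Server side:

--Oracle: the number converts to a string on its own
SELECT 'Employee number ' || EMPLOYEE_ID FROM HR.EMPLOYEES;

--SQL Server: awful red text unless you convert it yourself
SELECT 'Employee number ' + CAST(EMPLOYEE_ID AS VARCHAR(10)) FROM dbo.EMPLOYEES;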

I mean, you can try to use CONCAT or FORMAT in SQL Server.

If you hate your server.

And performance.

And baby animals with big eyes.

Don’t you lose my number

Working with numbers is usually one of the easier things to do in databases. Just match your data types and you’re good to go. As long as you’re not covering up years of “accounting irregularities” it should go pretty smoothly.

One major area of difference is, as I mentioned, that Oracle will do you a solid and convert numbers to strings when you’re concatenating. I think this is in large part because Oracle differentiates the concatenation operator (||) from the addition operator (+). We’ll look more closely at that next time, because there’s a bit more to say about working with strings in general.

Thanks for reading!

Brent says: wow, only one datatype for numbers sounds awesome, actually. The simpler, the better there.

Join us in person for The Senior DBA Class or SQL Server Performance Tuning.

SQL Server 2005 End Of Support: Keep Calm And Do This Stuff


Ask not for whom the bell tolls

It tolls for SQL Server 2005. Which means in a few short months, you’ll need to get out of Dodge. I’m sure you all have your upgrade ducks in a row, you’ve studied the breaking changes, the upgrade path, and maybe even downloaded the latest upgrade advisor. Smarty pants you are! I’m proud of you.

But chances are, that database of yours has been around for a while. If you’re not regularly upgrading SQL Server, you may forget to do a few things after you get that fossil to its new museum.

Page, but verify

Turning this on is a good step towards catching corruption early. This will put SQL Server in the habit of giving all your data pages a checksum as it writes them to disk, that it will validate when it reads them back into memory. If something spooky happened to a page, it will ring the alarm. Follow the link for some scripts to figure out if it’s turned on, and turn it on if it’s not.
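
If you just want the gist, here’s a minimal sketch — the database name is a placeholder, and existing pages only pick up a checksum the next time they’re written:

--Anything not already using CHECKSUM?
SELECT [name], [page_verify_option_desc]
FROM [sys].[databases]
WHERE [page_verify_option_desc] <> 'CHECKSUM';

--Fix the stragglers
ALTER DATABASE [YourDatabase] SET PAGE_VERIFY CHECKSUM;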

Mind your compatibility level

When going to 2014 (as of today, 2016’s RTM hasn’t been announced yet), you’ll have to decide whether or not the new cardinality estimator suits you. There’s no cut-and-dried answer; you’ll have to test it on your workload. If you’d like some of the more modern SQL features added to your arsenal, you can bump yourself up to 2012-level compatibility to get the majority of them.
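
Setting it is the easy part; deciding on it is the hard part. A hedged example, with a made-up database name:

--2012-level features, without the 2014 cardinality estimator
ALTER DATABASE [YourDatabase] SET COMPATIBILITY_LEVEL = 110;

--Full 2014 behavior, new cardinality estimator included
ALTER DATABASE [YourDatabase] SET COMPATIBILITY_LEVEL = 120;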

Who owns this thing?

Most people don’t log in as sa. Right? Heh. NO SERIOUSLY. Okay, your Windows login is a sysadmin anyway, so it doesn’t matter. Cool. But when you restore a database, the owner becomes whoever restored it. It’s generally considered pretty smart to make sa the owner. This goes for your Agent jobs, too!
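
Here’s a quick sketch for finding and fixing the owners — database and job names are placeholders:

--Who owns what?
SELECT [name], SUSER_SNAME([owner_sid]) AS [owner]
FROM [sys].[databases];

--Hand the database back to sa
ALTER AUTHORIZATION ON DATABASE::[YourDatabase] TO [sa];

--Same idea for Agent jobs
EXEC msdb.dbo.sp_update_job @job_name = N'YourJob', @owner_login_name = N'sa';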

You’ve gotten so big!

Careful about how you’re growing out your files. The majority of SQL users aren’t standing around with calipers and a magnifying glass deciding how much to manually grow files by. That’s special. Chances are you’re relying on percent autogrowth for your data and log files. Problem is, the bigger your files get, the more they grow by. What’s 10% of 5TB? BASICALLY A RHINOCEROS!
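
A sketch for hunting down percent growth and swapping in a fixed increment — names are placeholders, and the right growth size depends on your storage and workload:

--Find files still growing by percent
SELECT DB_NAME([database_id]) AS [database_name], [name], [growth], [is_percent_growth]
FROM [sys].[master_files]
WHERE [is_percent_growth] = 1;

--Switch to a fixed increment instead
ALTER DATABASE [YourDatabase]
MODIFY FILE ( NAME = N'YourDatabase_log', FILEGROWTH = 512MB );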

Does anyone know what’s in here?

Updating stats is important enough on its own, but Microsoft has put it in the upgrade steps for both 2014 and 2016. Perhaps they’re being collected differently in newer versions? I haven’t seen much of a “why” on this, but it’s not something I’d want to get caught out there on. But I’m sure that you, studious as you are, are updating them regularly anyway.
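
The post-upgrade refresh can be as simple as this — sp_updatestats uses the default sampling, so hit the tables you really care about with FULLSCAN if your window allows (table name is a placeholder):

--Quick pass over everything that's been modified
EXEC sys.sp_updatestats;

--Heavier-handed, for the tables that matter most
UPDATE STATISTICS dbo.YourBigTable WITH FULLSCAN;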

Since you’ve got this whole new server

Now is a great time to check out our Download Pack! We have a setup guide, and all our scripts are in there too. You can really dodge a lot of the common mistakes and oversights people make when setting up a new server.

Thanks for reading!

Brent says: when Microsoft doesn’t support something, we don’t support it either. This means we won’t do our SQL Critical Care® on SQL 2005, and our First Responder Kit script will be SQL 2008+ only. (This makes our life easier when it comes to testing and support.)

Join us in person for The Senior DBA Class or SQL Server Performance Tuning.

How often should I run DBCC CHECKDB?


There’s an old DBA saying…

May you already have a backup restored
A half hour before your boss knows there’s corruption

What? There’s no such thing as old DBA sayings? Well, maybe if you all said something other than “no” once in a while, you’d be more quotable. Hmpf.

Anyway, this is a serious question! And there are a lot of things to consider

  • Do I have a different RTO for corruption?
  • What’s my backup retention policy?
  • How much data do I have?
  • How long are my maintenance windows?
  • Do I have a server I can offload checks to?

Recovery Time Objectification

When you’re setting these numbers with management, you need to make them aware that certain forms of corruption are more serious than others, and may take longer to recover from. If system tables or clustered indexes become corrupt, you’re potentially looking at a much more invasive procedure than if a nonclustered index gets a little wonky — something you can disable and rebuild pretty easily.

Either way, you’re looking at an RTO of at least how long it takes you to restore your largest database, assuming the corruption isn’t present in your most recent full backup. That’s why backup checksums are important. They’re not a replacement for regular consistency checks by any means, but they can provide an early warning for some types of page corruption, if you have page verification turned on, and your page is assigned a checksum.

If you use a 3rd party backup tool that doesn’t allow you to use the backup checksum option, stop using it. Seriously, that’s garbage. And turn on Trace Flag 3023 until you find a replacement that does.
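
If you’re running native backups, both of those are one-liners — a hedged sketch with placeholder names and paths:

--Checksum the pages as they're read, and the backup file itself
BACKUP DATABASE [YourDatabase]
TO DISK = N'X:\Backups\YourDatabase.bak'
WITH CHECKSUM, COMPRESSION, STATS = 10;

--Make CHECKSUM the default for backups that don't ask for it
DBCC TRACEON (3023, -1);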

Notice I’m not talking about RPO here. But there’s a simple equation you can do: the shorter your RTO for corruption, the longer your RPO. It’s real easy to run repair with allow data loss immediately. The amount of data you lose in doing so is ¯\_(ツ)_/¯

Which is why you need to carefully consider…

Backup retention

The shorter the period of time you keep backups, the more often you need to run DBCC CHECKDB. If you keep backups for two weeks, weekly is a good starting point. If you take weekly fulls, you should consider running your DBCC checks before those happen. A corrupt backup doesn’t help you worth a lick. Garbage backup, garbage restore. If your backups only go back two weeks, and your corruption goes back a month, best of luck with your job search.

Of course, keeping backups around for a long time is physically impossible depending on…

How much data YOU have

The more you have, the harder it is to check it all. It’s not like these checks are a lightweight process. They chew up CPU, memory, disk I/O, and tempdb. They don’t cause blocking, the way a lot of people think they do, because they take the equivalent of a database snapshot to perform the checks on. It’s transactionally consistent, meaning the check is as good as your database was when the check started.

You can make things a little easier by running with the PHYSICAL_ONLY option, but you lose out on the logical checks. The more complicated approach is to break DBCC checks into pieces and run them a little every night. This is harder, but you stand a better chance of getting everything checked.
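
A hedged sketch of both approaches — the lighter-weight check, and one way to split the work across nights (object names are placeholders):

--Physical checks only: faster, but skips the logical checks
DBCC CHECKDB (N'YourDatabase') WITH PHYSICAL_ONLY, NO_INFOMSGS;

--Or break it into pieces
DBCC CHECKALLOC (N'YourDatabase') WITH NO_INFOMSGS;
DBCC CHECKCATALOG (N'YourDatabase') WITH NO_INFOMSGS;
DBCC CHECKTABLE (N'dbo.YourBiggestTable') WITH NO_INFOMSGS;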

Especially if you have terabytes and terabytes of data, and really a short…

Maintenance window

Are you 24×7? Do you have nights or weekends to do this stuff? Are you juggling maintenance items alongside data loads, reports, or other internal tasks? Your server may have a different database for different customer locations, which means you have a revolving maintenance window for each zone (think North America, Europe, APAC, etc.), so at best you’re just spreading the pain around.

Or you could start…

Offloading checks

This is my absolute favorite. Sure, it can be a bear to script out yourself. Automating rotating backups and restores can be a nightmare; so many different servers with different drive letters.

Dell LiteSpeed has been automating this process since at least version 7.4, and it’s not like it costs a lot. For sure, it doesn’t cost more than you losing a bunch of data to corruption. If you’re the kind of shop that has trouble with in-place DBCC checks, it’s totally worth the price of admission.

But what about you?

Tell me how you tackle DBCC checks in the comments. You can answer the questions at the beginning of the post, or ask your own questions. Part of my job is to help you keep your job.

Thanks for reading!

Brent says: if you’re using NetApp SAN snapshots, they’ve also got great tooling to offload corruption checks to your DR site. Licensing gotchas may apply – for both SQL Server and NetApp writeable snaps.

Join us in person for The Senior DBA Class or SQL Server Performance Tuning.

Moving Databases with ALTER DATABASE


True story

A long time ago, I had to actually do stuff to databases. One thing I had to do was move data files around. Maybe some knucklehead had put system databases on the C: drive, or a LUN was filling up, or we got a new LUN. You know, whatever. Natural curiosity oft leads one to the internet. If one does not succumb to food and cats, one may find useful information. Or anonymous message boards. Sort of a toss up. What I found was this article. Weird, right? 2009. Brent said to use ALTER DATABASE. It’s new and pretty and smart people do it. What Brent didn’t do was explain how it’s done. Or link to how it’s done. I felt cold and alone. Abandoned. Afraid. “Great post, Brent”, I said sarcastically, and set out to figure out how to work this magic on my own.

I turned to BOL, the destination of all self-loathing people. If you scroll down to the bottom, way down at the bottom, the syntax is there. Of course, moving system databases is a horse of a different color. But hopefully you don’t need that one. For user databases, it’s rather simpler:

  1. Alter the file metadata to the new path
  2. Set the database offline
  3. Physically move the file
  4. Set the database back online

Easy enough!

Run ALTER DATABASE with the new location. We’re moving the data file. If we were moving the log file, it would probably end in “_log” or something. You can find all this information in sys.master_files, except where you’re moving the file to. Just don’t actually move it to C:\Whatever. You may run into problems later. Also, you need the filename. If you don’t include it, SQL won’t complain until you try to set the database back online. Yay!

ALTER DATABASE [Sample] MODIFY FILE ( NAME = Sample, FILENAME = 'C:\Whatever\Sample.mdf' );

This is the part that you need to think through. People have to be cool with the database being offline while you move the physical file. This is not a seamless transition. If you’re moving large enough databases, you may want to consider an alternate method, like Mirroring or Log Shipping. They take more work, but you get the whole near-zero-downtime thing out of it. You may want to stage a mock file move to test LUN to LUN copy speeds. See how many GB you can move per minute. That way you’ll at least be able to estimate how long the outage will last. Assuming all that is cool, go ahead and take the database offline.

ALTER DATABASE [Sample] SET OFFLINE;

Now you gotta hurry up and get that file moved. How you do that is up to you. You may prefer to just use Windows Explorer, since it has a status bar, and tells you copy speeds. Good stuff to know if people ask for updates, right? Just to fill space, here’s a PowerShell command. I still hate PowerShell.

Move-Item -Path "D:\Data\Sample.mdf" -Destination "C:\Whatever" -Force

Once that finishes, put your database back online.

ALTER DATABASE [Sample] SET ONLINE;

If you find yourself having to do this often, or if you have to migrate a group of databases, it’s probably worth scripting out.
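
If you do go the scripting route, sys.master_files has nearly everything you need. Here’s a rough sketch that generates the MODIFY FILE statements for one database — the target path is made up, and you still have to handle the offline/copy/online steps yourself:

SELECT 'ALTER DATABASE ' + QUOTENAME(DB_NAME([mf].[database_id]))
+ ' MODIFY FILE ( NAME = ' + QUOTENAME([mf].[name])
+ ', FILENAME = N''C:\Whatever\'
+ RIGHT([mf].[physical_name], CHARINDEX('\', REVERSE([mf].[physical_name])) - 1)
+ ''' );' AS [move_me]
FROM [sys].[master_files] AS [mf]
WHERE [mf].[database_id] = DB_ID('Sample');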

There you have it

It’s that easy to do. Just make sure you have adequate backups, in case something goes wrong. I take no responsibility for what happens to your data files when they copy across your SAN, or anywhere else.

Thanks for reading!

Comment responses will be delayed this week - we're out of the office for our 2016 company retreat.

Changes to auto update stats thresholds in SQL Server 2016


TL;DR

As of CTP 3.3, it’s the same behavior as Trace Flag 2371, in 2008 R2 SP1 and onward. That basically means that the bigger your table is, the fewer rows need to be modified before an automatic statistics update occurs.
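
If you want to eyeball what that means: the old threshold was 500 rows plus 20% of the table, and the dynamic one is commonly described (and documented for compatibility level 130) as the square root of 1,000 times the row count. A quick back-of-the-napkin comparison:

SELECT [rows_in_table],
500 + ( 0.20 * [rows_in_table] ) AS [old_threshold],
SQRT(1000. * [rows_in_table]) AS [dynamic_threshold]
FROM ( VALUES (10000), (1000000), (100000000) ) AS [t] ([rows_in_table]);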

Slightly longer…

The change was announced over here. At first I thought, woah, cool, they thought about this and made big changes. But no, much like Trace Flags 1117 and 1118 being enabled for tempdb, it’s just…

Remember in Mortal Kombat, when getting to fight Reptile made you cool? Then Mortal Kombat 2 came out, and he was a playable character, and everyone would call your wins cheap if you picked him? That’s sort of what this reminds me of. If you’re new to SQL, you probably won’t appreciate the differences these Trace Flags make. If you’ve been using it for a while, you’ll probably start sentences with “back in my day, we had to add startup parameters…” and chuff off to write miserably long blog posts about unique indexes.

As of 02/23/2016, it sounds like Trace Flag 8048 is also enabled by default in 2016. See quote about soft NUMA at the link.

Moderate testing

I ran tests on some fairly large tables. I tried to run them on tables from 100 million to 1 billion rows, but I blew out the data drive of our AWS instance. So, uh, if you have a bigger server to test stuff out on, be my guest.

The basic concept was:

  1. Load a bunch of data into a table
  2. Update it 1000 rows at a time (I know, I know, but updating less than that took FOREVER)
  3. Run a query against it to invalidate stats
  4. If they reset, add a bunch more data and start over

What I ended up with was, well…

Eighth place.

Here’s an abridged version of 10-20 million and 30-40 million rows, and how many modifications they took before a stats update occurred. If you follow the PercentMod column down, the returns diminish a bit the higher up you get. I’m not saying that I’d prefer to wait for 20% + 500 rows to modify, by any stretch. My only point here is that there’s not a set percentage to point to.

And, because you’re probably wondering, turning on Trace Flag 2371 in 2016 doesn’t make any difference. Here’s what 10-100 million look like, in 10 million row chunks.

And then SQL crashed.

If you can guess which side TF 2371 was on for, I’ll give you one merlin dollhairs.

Great expectations

This improvement is certainly welcome as a default, though it’s not all that ‘new’. My 2014 instance comes up with the same thresholds with 2371 enabled. Unless you’re working with pretty big tables, or used to managing statistics updates on your own, you likely won’t even notice the change.

Thanks for reading!

Brent says: It’s kinda like Microsoft is treating trace flags as alpha/beta tests for new features now. That’s right in line with how .com startups use feature flags.

Comment responses will be delayed this week - we're out of the office for our 2016 company retreat.
