Table partitions can bring a lot of benefits to the manageability and performance of your SQL Server – that is if the partitions are configured and designed correctly. If you’ve got a large table with hundreds of millions of records, and you need to maintain a portion of that table only and not necessarily the whole table, table partitioning allows you to perform that maintenance on a partition-by-partition basis. And, also, when you enable lock escalation on the partition level, many of your locking and blocking woes might be fixed. If you have issues of locking/blocking on a big table in an application that performs high concurrency, table partitioning might help. Brent Ozar Unlimited has some good resources on SQL Server Table Partitioning.
What I want to focus on this post is the splitting of partition. It is very important to note that splitting a populated partition would cause the log file to bloat and, in some cases, the TempDB to run out of space. But you might have to deal with this if you’re adding partitioning in an existing table.
If you are still in the designing phase of your application development and you might be thinking of implementing table partitioning in the backend, then adding a task that would add partitions dynamically as your table grows should be on top of your to-do list. Creating the partitions in advance and staying few steps ahead of your data can save you from a lot of stress and headaches.
Here’s an example of how easy it is to implement a task that will automatically create partitions.
Test Table Partitions
Let’s create a test table. For this demo, we’re going to use a single filegroup (Primary) for convenience purposes.
CREATE TABLE PartitionedTable ( ID INT NOT NULL, SalesDate SMALLDATETIME NOT NULL, ColA VARCHAR(100) NULL, ColB VARCHAR(100) NULL, )
At this point, without any records inserted yet to the table, the sys.partitions
Object Catalog View would show something like this:
You can pull this metadata with this query:
SELECT o.name objectname , i.name indexname , partition_id , partition_number , [rows] FROM sys.partitions p INNER JOIN sys.objects o ON o.object_id = p.object_id INNER JOIN sys.indexes i ON i.object_id = p.object_id AND p.index_id = i.index_id WHERE o.name = 'PartitionedTable';
Partition Function and Scheme
Let’s create the Partition Function and Scheme. Again, for convenience purposes, let’s create the Scheme on a single filegroup, which is the Primary filegroup in this case. Let’s supposed we want to partition our data into month buckets. Since this a date, I’m using Left Range which means the partition uses the upper boundary and includes all the points to its left. If you use Right Range instead, then you have to have a logic to handle the 31st day of some of the months. Just to illustrate that:
LowerBoundaryValue | UpperBoundaryValue |
NULL | 1/1/2017 |
1/1/2017 | 2/1/2017 |
CREATE PARTITION FUNCTION pfPartitionedTableDate (SMALLDATETIME) AS RANGE LEFT FOR VALUES ('20170101'); CREATE PARTITION SCHEME schPartitionedTableDate AS PARTITION pfPartitionedTableDate ALL TO ([PRIMARY]);
Upon successful creation of the Partition Function and Scheme, you’ll get a message something like this:
Partition scheme 'schPartitionedTableDate' has been created successfully. 'PRIMARY' is marked as the next used filegroup in partition scheme 'schPartitionedTableDate'.
Add Indexes
Let’s also create indexes for demo purposes.
ALTER TABLE PartitionedTable ADD CONSTRAINT PK_PartitionedTable PRIMARY KEY CLUSTERED (ID) ON [PRIMARY] CREATE NONCLUSTERED INDEX IX_PartitionedTable_ID ON PartitionedTable (SalesDate) ON schPartitionedTableDate(SalesDate)
Populate Test Table
Although I said that you may want to avoid splitting populated partitions, for the purpose of this blog post, let’s create a single partition, which we already did above, then populate the table with data and add another partition on the populated table.
I am using Redgate SQL Data Generator to populate the test table with these settings on our Partition Column (SalesDate):
You may remember that we created a partition (1/1/2017). Now with the 1,000 records added, our Table Partition looks like this:
Create Additional Partition
So, let’s create 4 additional partitions:
ALTER PARTITION SCHEME schPartitionedTableDate NEXT USED [PRIMARY] ALTER PARTITION FUNCTION pfPartitionedTableDate() SPLIT RANGE('20170201') ALTER PARTITION SCHEME schPartitionedTableDate NEXT USED [PRIMARY] ALTER PARTITION FUNCTION pfPartitionedTableDate() SPLIT RANGE('20170301') ALTER PARTITION SCHEME schPartitionedTableDate NEXT USED [PRIMARY] ALTER PARTITION FUNCTION pfPartitionedTableDate() SPLIT RANGE('20170401') ALTER PARTITION SCHEME schPartitionedTableDate NEXT USED [PRIMARY] ALTER PARTITION FUNCTION pfPartitionedTableDate() SPLIT RANGE('20170501')
Insert Additional Records
Let’s insert few new records under April.
INSERT INTO PartitionedTable VALUES (1001, '2017/04/01','Product1001','ProductDescription1001'), (1002, '2017/04/01','Product1002','ProductDescription1002'), (1003, '2017/04/01','Product1003','ProductDescription1003')
So far, we have 5 partitions for each month (January, February, March, April, and May). So we can have a better picture of our table partitions with the additional partitions and data, let’s query the underlying objects:
SELECT OBJECT_NAME(p.object_id) AS ObjectName , i.name AS IndexName , p.index_id AS IndexID , ds.name AS PartitionScheme , p.partition_number AS PartitionNumber , fg.name AS FileGroupName , prv_left.value AS LowerBoundaryValue , prv_right.value AS UpperBoundaryValue , CASE pf.boundary_value_on_right WHEN 1 THEN 'RIGHT' ELSE 'LEFT' END AS PartitionFunctionRange , p.rows AS Rows FROM sys.partitions AS p INNER JOIN sys.indexes AS i ON i.object_id = p.object_id AND i.index_id = p.index_id INNER JOIN sys.data_spaces AS ds ON ds.data_space_id = i.data_space_id INNER JOIN sys.partition_schemes AS ps ON ps.data_space_id = ds.data_space_id INNER JOIN sys.partition_functions AS pf ON pf.function_id = ps.function_id INNER JOIN sys.destination_data_spaces AS dds ON dds.partition_scheme_id = ps.data_space_id AND dds.destination_id = p.partition_number INNER JOIN sys.filegroups AS fg ON fg.data_space_id = dds.data_space_id LEFT OUTER JOIN sys.partition_range_values AS prv_left ON ps.function_id = prv_left.function_id AND prv_left.boundary_id = p.partition_number- 1 LEFT OUTER JOIN sys.partition_range_values AS prv_right ON ps.function_id = prv_right.function_id AND prv_right.boundary_id = p.partition_number WHERE p.object_id = OBJECT_ID('PartitionedTable');
Which will give us this (click on the picture to enlarge it):
Dynamic Table Partition Split
So, let’s pretend it’s the month of April 2017 and this is the partition currently populated. Based on the query above, aside from the current partition bucket, we also have another available bucket month for May.
Say we want to maintain 3 available buckets at any given time. The next available bucket is May, so that means we need 2 more partitions to cover for June and July.
To do that, let’s do something like this:
--Variable Declaration DECLARE @currentpartition INT , @chunk INT , @partvalue SMALLDATETIME , @maintainbucket INT , @availablepartition INT , @partitiontocreate INT , @lastavailablepartid INT , @lastavailablepartval INT , @sql NVARCHAR(4000) ,@sqlmaxdate NVARCHAR(4000) ,@maxdate SMALLDATETIME , @pscheme VARCHAR(400) ; SET @pscheme = 'schPartitionedTableDate' ; SET @chunk = 1 ; --1 Month - Preset Partition Size/Chunk SET @maintainbucket = 3 ; --Number of Partitions to maintain any given time --Prep: Create and Populate Temp Table #Partitions WITH cte_part AS ( SELECT i.index_id , prv_left.value AS LowerBoundaryValue , prv_right.value AS UpperBoundaryValue , CASE pf.boundary_value_on_right WHEN 1 THEN 'RIGHT' ELSE 'LEFT' END AS PartitionFunctionRange , p.partition_number AS PartitionNumber , p.rows AS Rows --In case the table has multiple indexes , ROW_NUMBER() OVER (PARTITION BY prv_left.value, prv_right.value, p.rows ORDER BY i.index_id,prv_left.value , prv_right.value ) AS row_num FROM sys.partitions AS p INNER JOIN sys.indexes AS i ON i.object_id = p.object_id AND i.index_id = p.index_id INNER JOIN sys.data_spaces AS ds ON ds.data_space_id = i.data_space_id INNER JOIN sys.partition_schemes AS ps ON ps.data_space_id = ds.data_space_id INNER JOIN sys.partition_functions AS pf ON pf.function_id = ps.function_id LEFT OUTER JOIN sys.partition_range_values AS prv_left ON ps.function_id = prv_left.function_id AND prv_left.boundary_id = p.partition_number - 1 LEFT OUTER JOIN sys.partition_range_values AS prv_right ON ps.function_id = prv_right.function_id AND prv_right.boundary_id = p.partition_number WHERE ds.name = @pscheme --Partition Scheme Name ) SELECT cte_part.index_id , cte_part.LowerBoundaryValue , cte_part.UpperBoundaryValue , cte_part.PartitionFunctionRange , cte_part.PartitionNumber , cte_part.Rows INTO #Partitions FROM cte_part WHERE cte_part.row_num = 1 ORDER BY cte_part.LowerBoundaryValue , cte_part.UpperBoundaryValue , cte_part.PartitionFunctionRange SET @sqlmaxdate = 'SELECT @maxdate = MAX(SalesDate) FROM PartitionedTable' ; EXEC sp_executesql @sqlmaxdate , N'@maxdate SMALLDATETIME OUTPUT' , @maxdate = @maxdate OUTPUT ; SELECT @currentpartition = $PARTITION.[pfPartitionedTableDate](@maxdate) ; --Count the number of available Partitions SET @availablepartition = ((SELECT MAX( PartitionNumber ) FROM #Partitions) - @currentpartition) - 1 /* Minus 1 excluding unpartitioned in the count*/ ; SET @partitiontocreate = @maintainbucket - @availablepartition ; SET @lastavailablepartid = ((SELECT MAX( PartitionNumber ) FROM #Partitions) - 1) ; /* Minus 1 excluding unpartitioned in the count*/ SET @lastavailablepartval = (SELECT CONVERT(VARCHAR(8),UpperBoundaryValue,112)FROM #Partitions WHERE PartitionNumber = @lastavailablepartid) ; BEGIN IF @maintainbucket = @availablepartition OR @maintainbucket < @availablepartition BEGIN PRINT 'No Action Needed. Available Partition is: ' + CONVERT( VARCHAR(MAX), @availablepartition ) + ' and Number of Partition to Maintain is: ' + CONVERT( VARCHAR(MAX), @maintainbucket ) ; END ; ELSE BEGIN SET @partvalue = CAST(CAST(@lastavailablepartval as VARCHAR(8)) AS DATETIME); WHILE @partitiontocreate > 0 BEGIN SET @partvalue = DATEADD(MONTH, @chunk, @partvalue) ; -- Plus @chunk = 1 Month SET @sql = 'ALTER PARTITION SCHEME schPartitionedTableDate NEXT USED [PRIMARY] ALTER PARTITION FUNCTION [pfPartitionedTableDate]() SPLIT RANGE (''' + CONVERT( VARCHAR(10), @partvalue, 112 ) + ''') --Range Partition Value' ; EXEC sp_executesql @statement = @sql ; SET @partitiontocreate = @partitiontocreate - 1 ; END ; END END --Cleanup DROP TABLE #Partitions ;
Now we can see the additional partitions created:
Run the same code without changing the number of buckets to maintain (3) and you will get this message:
No Action Needed. Available Partition is: 3 and Number of Partition to Maintain is: 3
To make this truly automated, create a job that will run this code on monthly basis.
Enjoy!
Pingback: Automatic Partition Splitting – Curated SQL