sql server table partition function
sql server table partition function

How To Split Table Partitions Automatically

Table partitions can bring a lot of benefits to the manageability and performance of your SQL Server – that is if the partitions are configured and designed correctly. If you’ve got a large table with hundreds of millions of records, and you need to maintain a portion of that table only and not necessarily the whole table, table partitioning allows you to perform that maintenance on a partition-by-partition basis. And, also, when you enable lock escalation on the partition level, many of your locking and blocking woes might be fixed. If you have issues of locking/blocking on a big table in an application that performs high concurrency, table partitioning might help. Brent Ozar Unlimited has some good resources on SQL Server Table Partitioning.

What I want to focus on this post is the splitting of partition. It is very important to note that splitting a populated partition would cause the log file to bloat and, in some cases, the TempDB to run out of space. But you might have to deal with this if you’re adding partitioning in an existing table.

If you are still in the designing phase of your application development and you might be thinking of implementing table partitioning in the backend, then adding a task that would add partitions dynamically as your table grows should be on top of your to-do list. Creating the partitions in advance and staying few steps ahead of your data can save you from a lot of stress and headaches.

Here’s an example of how easy it is to implement a task that will automatically create partitions.

Test Table Partitions

Let’s create a test table. For this demo, we’re going to use a single filegroup (Primary) for convenience purposes.

CREATE TABLE PartitionedTable (
 ID INT NOT NULL,
 SalesDate SMALLDATETIME NOT NULL,
 ColA VARCHAR(100) NULL,
 ColB VARCHAR(100) NULL,
 )

At this point, without any records inserted yet to the table, the sys.partitions Object Catalog View would show something like this:
sys.partitions None Partitioned Table You can pull this metadata with this query:

SELECT o.name objectname ,
i.name indexname ,
partition_id ,
partition_number ,
[rows]
FROM sys.partitions p
INNER JOIN sys.objects o ON o.object_id = p.object_id
INNER JOIN sys.indexes i ON i.object_id = p.object_id
AND p.index_id = i.index_id
WHERE o.name = 'PartitionedTable';

Partition Function and Scheme

Let’s create the Partition Function and Scheme. Again, for convenience purposes, let’s create the Scheme on a single filegroup, which is the Primary filegroup in this case. Let’s supposed we want to partition our data into month buckets. Since this a date, I’m using Left Range which means the partition uses the upper boundary and includes all the points to its left. If you use Right Range instead, then you have to have a logic to handle the 31st day of some of the months. Just to illustrate that:

LowerBoundaryValue UpperBoundaryValue
NULL 1/1/2017
1/1/2017 2/1/2017
CREATE PARTITION FUNCTION pfPartitionedTableDate (SMALLDATETIME)
AS RANGE LEFT FOR VALUES ('20170101');

CREATE PARTITION SCHEME schPartitionedTableDate
AS PARTITION pfPartitionedTableDate ALL TO ([PRIMARY]);

Upon successful creation of the Partition Function and Scheme, you’ll get a message something like this:

Partition scheme 'schPartitionedTableDate' has been created successfully. 'PRIMARY' is marked as the next used filegroup in partition scheme 'schPartitionedTableDate'.

Add Indexes

Let’s also create indexes for demo purposes.

ALTER TABLE PartitionedTable ADD CONSTRAINT PK_PartitionedTable PRIMARY KEY CLUSTERED (ID)
ON [PRIMARY]

CREATE NONCLUSTERED INDEX IX_PartitionedTable_ID ON PartitionedTable (SalesDate) 
 ON schPartitionedTableDate(SalesDate)

Populate Test Table

Although I said that you may want to avoid splitting populated partitions, for the purpose of this blog post, let’s create a single partition, which we already did above, then populate the table with data and add another partition on the populated table.

I am using Redgate SQL Data Generator to populate the test table with these settings on our Partition Column (SalesDate):

Redgate SQL Data Generator

 

You may remember that we created a partition (1/1/2017). Now with the 1,000 records added, our Table Partition looks like this:

populated table partition

Create Additional Partition

So, let’s create 4 additional partitions:

 ALTER PARTITION SCHEME schPartitionedTableDate 
 NEXT USED [PRIMARY]
 ALTER PARTITION FUNCTION pfPartitionedTableDate() SPLIT RANGE('20170201')

 ALTER PARTITION SCHEME schPartitionedTableDate 
 NEXT USED [PRIMARY]
 ALTER PARTITION FUNCTION pfPartitionedTableDate() SPLIT RANGE('20170301')

 ALTER PARTITION SCHEME schPartitionedTableDate 
 NEXT USED [PRIMARY]
 ALTER PARTITION FUNCTION pfPartitionedTableDate() SPLIT RANGE('20170401')

 ALTER PARTITION SCHEME schPartitionedTableDate 
NEXT USED [PRIMARY] 
ALTER PARTITION FUNCTION pfPartitionedTableDate() SPLIT RANGE('20170501')

Insert Additional Records

Let’s insert few new records under April.

INSERT INTO PartitionedTable
VALUES (1001, '2017/04/01','Product1001','ProductDescription1001'),
(1002, '2017/04/01','Product1002','ProductDescription1002'),
(1003, '2017/04/01','Product1003','ProductDescription1003')

So far, we have 5 partitions for each month (January, February, March, April, and May). So we can have a better picture of our table partitions with the additional partitions and data, let’s query the underlying objects:

SELECT OBJECT_NAME(p.object_id) AS ObjectName ,
 i.name AS IndexName ,
 p.index_id AS IndexID ,
 ds.name AS PartitionScheme ,
 p.partition_number AS PartitionNumber ,
 fg.name AS FileGroupName ,
 prv_left.value AS LowerBoundaryValue ,
 prv_right.value AS UpperBoundaryValue ,
 CASE pf.boundary_value_on_right
 WHEN 1 THEN 'RIGHT'
 ELSE 'LEFT'
 END AS PartitionFunctionRange ,
 p.rows AS Rows
FROM sys.partitions AS p
 INNER JOIN sys.indexes AS i ON i.object_id = p.object_id
 AND i.index_id = p.index_id
 INNER JOIN sys.data_spaces AS ds ON ds.data_space_id = i.data_space_id
 INNER JOIN sys.partition_schemes AS ps ON ps.data_space_id = ds.data_space_id
 INNER JOIN sys.partition_functions AS pf ON pf.function_id = ps.function_id
 INNER JOIN sys.destination_data_spaces AS dds ON dds.partition_scheme_id = ps.data_space_id
 AND dds.destination_id = p.partition_number
 INNER JOIN sys.filegroups AS fg ON fg.data_space_id = dds.data_space_id
 LEFT OUTER JOIN sys.partition_range_values AS prv_left ON ps.function_id = prv_left.function_id
 AND prv_left.boundary_id = p.partition_number- 1
 LEFT OUTER JOIN sys.partition_range_values AS prv_right ON ps.function_id = prv_right.function_id
 AND prv_right.boundary_id = p.partition_number
WHERE p.object_id = OBJECT_ID('PartitionedTable');

Which will give us this (click on the picture to enlarge it):

Table Partitions sql server

Dynamic Table Partition Split

So, let’s pretend it’s the month of April 2017 and this is the partition currently populated. Based on the query above, aside from the current partition bucket, we also have another available bucket month for May.

Say we want to maintain 3 available buckets at any given time. The next available bucket is May, so that means we need 2 more partitions to cover for June and July.

To do that, let’s do something like this:

--Variable Declaration
DECLARE @currentpartition INT
 , @chunk INT
 , @partvalue SMALLDATETIME
 , @maintainbucket INT
 , @availablepartition INT
 , @partitiontocreate INT
 , @lastavailablepartid INT
 , @lastavailablepartval INT
 , @sql NVARCHAR(4000)
 ,@sqlmaxdate NVARCHAR(4000)
 ,@maxdate SMALLDATETIME
 , @pscheme VARCHAR(400) ;

SET @pscheme = 'schPartitionedTableDate' ;

SET @chunk = 1 ; --1 Month - Preset Partition Size/Chunk 
SET @maintainbucket = 3 ; --Number of Partitions to maintain any given time

--Prep: Create and Populate Temp Table #Partitions
WITH cte_part
 AS
 (
 SELECT i.index_id
 , prv_left.value AS LowerBoundaryValue
 , prv_right.value AS UpperBoundaryValue

, CASE pf.boundary_value_on_right
 WHEN 1
 THEN 'RIGHT'
 ELSE 'LEFT'
 END AS PartitionFunctionRange
 , p.partition_number AS PartitionNumber
 , p.rows AS Rows
 --In case the table has multiple indexes
 , ROW_NUMBER() OVER (PARTITION BY prv_left.value, prv_right.value, p.rows
 ORDER BY i.index_id,prv_left.value , prv_right.value 
 ) AS row_num
 FROM sys.partitions AS p
 INNER JOIN sys.indexes AS i
 ON i.object_id = p.object_id
 AND i.index_id = p.index_id
 INNER JOIN sys.data_spaces AS ds
 ON ds.data_space_id = i.data_space_id
 INNER JOIN sys.partition_schemes AS ps
 ON ps.data_space_id = ds.data_space_id
 INNER JOIN sys.partition_functions AS pf
 ON pf.function_id = ps.function_id
 LEFT OUTER JOIN sys.partition_range_values AS prv_left
 ON ps.function_id = prv_left.function_id
 AND prv_left.boundary_id = p.partition_number - 1
 LEFT OUTER JOIN sys.partition_range_values AS prv_right
 ON ps.function_id = prv_right.function_id
 AND prv_right.boundary_id = p.partition_number
 WHERE ds.name = @pscheme --Partition Scheme Name
 )
SELECT cte_part.index_id
 , cte_part.LowerBoundaryValue
 , cte_part.UpperBoundaryValue
 , cte_part.PartitionFunctionRange
 , cte_part.PartitionNumber
 , cte_part.Rows
INTO #Partitions
 FROM cte_part
 WHERE cte_part.row_num = 1
 ORDER BY cte_part.LowerBoundaryValue
 , cte_part.UpperBoundaryValue 
 , cte_part.PartitionFunctionRange


SET @sqlmaxdate = 'SELECT @maxdate = MAX(SalesDate) FROM PartitionedTable' ; 

EXEC sp_executesql @sqlmaxdate
 , N'@maxdate SMALLDATETIME OUTPUT'
 , @maxdate = @maxdate OUTPUT ;

SELECT @currentpartition = $PARTITION.[pfPartitionedTableDate](@maxdate) ;

--Count the number of available Partitions
SET @availablepartition = ((SELECT MAX( PartitionNumber ) FROM #Partitions) - @currentpartition) - 1 /* Minus 1 excluding unpartitioned in the count*/ ;
SET @partitiontocreate = @maintainbucket - @availablepartition ;
SET @lastavailablepartid = ((SELECT MAX( PartitionNumber ) FROM #Partitions) - 1) ; /* Minus 1 excluding unpartitioned in the count*/
SET @lastavailablepartval = (SELECT CONVERT(VARCHAR(8),UpperBoundaryValue,112)FROM #Partitions WHERE PartitionNumber = @lastavailablepartid) ;

BEGIN
IF @maintainbucket = @availablepartition OR @maintainbucket < @availablepartition
 BEGIN
 PRINT 'No Action Needed. Available Partition is: ' + CONVERT( VARCHAR(MAX), @availablepartition ) + ' and Number of Partition to Maintain is: ' + CONVERT( VARCHAR(MAX), @maintainbucket ) ;
 END ;
ELSE 
 BEGIN
 SET @partvalue = CAST(CAST(@lastavailablepartval as VARCHAR(8)) AS DATETIME);

 WHILE @partitiontocreate > 0
 BEGIN
 SET @partvalue = DATEADD(MONTH, @chunk, @partvalue) ; -- Plus @chunk = 1 Month
 SET @sql = 'ALTER PARTITION SCHEME schPartitionedTableDate 
 NEXT USED [PRIMARY] 
 
 ALTER PARTITION FUNCTION [pfPartitionedTableDate]() SPLIT RANGE (''' + CONVERT( VARCHAR(10), @partvalue, 112 ) + ''') --Range Partition Value' ;

 EXEC sp_executesql @statement = @sql ;

SET @partitiontocreate = @partitiontocreate - 1 ;
 END ; 
 END 
END

--Cleanup
DROP TABLE #Partitions ;

Now we can see the additional partitions created:

Table Partitions sql server additional table partitions

Run the same code without changing the number of buckets to maintain (3) and you will get this message:

No Action Needed. Available Partition is: 3 and Number of Partition to Maintain is: 3

To make this truly automated, create a job that will run this code on monthly basis.

Enjoy!

Author: Marlon Ribunal

I am SQL Server Database Administrator for a software company catering to supply chain and retail industry.