I'm currently at around 7 billion rows on one table and approaching the max disk space for the entire database.
I need to start saving new data on the one large table to a new disk.
I've looked at partitioning by `datetime`, which seems the obvious choice
This is an interim solution before we get a new server in 6 months, but need a "now" solution.
How would I achieve this? Can I implement manual partitioning so all new data goes to the new disk?
Everything I've seen requires creating something like a monthly option, whereas I just want to put all new stuff on a new disk.
One option is to split the existing data by month, and then manually move the partitions around.
I'm not even sure if this is possible, so would appreciate some feedback on the best solution to achieve the above.
SQL Server 2017 Enterprise Edition.
The normal short-term solution here is to simply [move some indexes and tables to the new filegroup][1] to free up space in the primary filegroup.
>Can I implement manual partitioning so all new data goes to the new disk?
Yes you can. The key idea is to create a new table on a partition scheme that is compatible with the existing table, and partitioned on a clustered index key column. Then switch the table into a partition on that partition scheme, and split its partition function so new rows go to the new filegroup.
I just discovered that you can create a partitioned table with only the "primordial" partition, and then switch a non-partitioned table into the primordial partition **without needing a `CHECK` constraint** on the non-partitioned table.
Here's an end-to-end demo:
use master
go
drop database parttest
go
create database parttest
go
use parttest
go
drop table if exists large
go
create table large(id int identity primary key, a char(1000), b int, c datetime);
create index ix_large_b on large(b);
--load some data
with q as
(
select top 100000 row_number() over (order by (select null)) n
from sys.messages m, sys.objects o
)
insert into large(a,b,c)
select replicate('z',n%989), n%39, getdate()-n%1000
from q;
--add a new filegroup with a single file
alter database current
add filegroup newfg
alter database current
add file (name = 'newfg_data', filename = 'c:\temp\newfg_data.mdf')
to filegroup newfg
--create the new partition function and partition scheme
--but don't specify any boundary points, so the target
--table can be switched in without a check constraint
create partition function pf_large(int)
as range right for values ()
create partition scheme ps_large
as partition pf_large all to ([Primary])
--create the table to switch into
create table large2(id int identity primary key, a char(1000), b int, c datetime)
on ps_large(id);
--create the secondary indexes on the new table
--if you don't create them they won't be switched
create index ix_large2_b on large2(b) ;
alter table large switch to large2 partition 1
-- drop and rename
begin transaction
drop table large
exec sp_rename 'large2','large'
commit
--reset the identity values
dbcc checkident('large')
go
--split the partition function so new rows to to the new filegroup
alter partition scheme ps_large next used newfg
begin transaction
declare @splitPoint int = 1+(select max(id) from large with (tablockx) )
alter partition function pf_large() split range (@splitPoint)
commit
go
--load some more data
with q as
(
select top 100000 row_number() over (order by (select null)) n
from sys.messages m, sys.objects o
)
insert into large(a,b,c)
select replicate('z',n%989), n%39, getdate()-n%1000
from q;
go
--check that it went to the new filegroup
select i.index_id, p.partition_number, ds.name fg, au.data_pages
from sys.tables t
join sys.indexes i
on t.object_id =i.object_id
left join sys.partitions p
on p.object_id = t.object_id
and p.index_id = i.index_id
left join sys.allocation_units au
on (au.type in (1,3) and au.container_id = p.hobt_id)
or(au.type = 2 and au.container_id = p.partition_id)
left join sys.data_spaces ds
on ds.data_space_id = au.data_space_id
where t.object_id = object_id('large')
order by ds.data_space_id, index_id, partition_number
Output:
|index_id|partition_number|fg|data_pages
|-:|-:|-|-:
|1|1|PRIMARY|14286
|2|1|PRIMARY|174
|1|2|newfg|14286
|2|2|newfg|231
You will have to drop and recreate any foreign key constraints referencing this table. And recreating the FK constraints after the switch the switch require table scans during DDL, so can be lengthy offline operations.
To optimize for up time you can recreate the FK constraints with `NOCHECK` and `CHECK` them later. Until then, they will be enforced for DML but not trusted by the query optimizer.
[1]: https://docs.microsoft.com/en-us/sql/relational-databases/indexes/move-an-existing-index-to-a-different-filegroup?view=sql-server-ver15
If your table has more than just a clustered index, you can do the following:
1. create a new filegroup for the database
2. add a new file to the new filegroup, specify location for new file - your new disk
3. move the non-clustered indexes to this new filegroup/disk
For more on how to do this, please see [here][1].
[1]: https://docs.microsoft.com/en-us/sql/relational-databases/indexes/move-an-existing-index-to-a-different-filegroup?view=sql-server-ver15
After you free up some space inside data file by moving nonclustered indexes to separate filegroup/disk, you can shrink data file (DBCC shrinkdatabase)
After this you will have some free space on your drive - and then consider rebuilding indexes - which removes fragmentation and can greatly reduce space consumption as a side effect, even without compression
And finally, while rebuild index, you can consider applying `data_compression = page` to all your indexes of the largest table, to save even more space ( if some further CPU overhead to compress/decompress new data is not an issue for you)
This way, partitioning will not be really needed