Hierarchical Data Deletion
SQL Server 2000: Tested
SQL Server 2005: Tested
SQL Server 2008: Tested
SQL Server 2008R2: Tested
SQL Server 2012: Not Tested
Author: Unknown (multiple)
Often, there is a need to remove data from some or all tables in a relational database, or even to remove tables completely. The challenge here is navigating the complexities of the PK/FK relationships between tables, especially in an unfamiliar system.
A greater challenge can be creating a robust scheduled process (for data archiving or refresh) as the data structures are subject to change.
There are 3 techniques that I have used in the past to achieve this:
Trial and Error - initially time consuming and an ongoing pain in the rear end.
Disable all constraints, delete and enable all constraints
Create dynamic deletion statements with the ordering determined using system metadata
Technique #1 is the quickest an easiest, especially with the ability of Management Studio GUI to continue on error and to resubmit the same list repeatedly until all tables are gone. Not the most elegant or supportable in the long term though and very much a brute force method to be used only when the rioting crowd around your desk demands immediate action
I will concentrate here on the last 2 techniques
NOTE: For ease of typing, I will refer to Deletion, but the technique can be equally applied to Archiving as this just involves an additional step of transferring the data somewhere else before deletion.
Delete Hierarchically - Disable Constraint Method
This is still a somewhat brute force method, as all constraints on the tables are disabled, data is deleted followed by a reinstatement of the constraints. This can be dangerous if only partial data is to be deleted - the deletion code must be coded carefully to ensure that constraints remain viable when enabled again. This effectively means the constraints are duplicated in code and is not very maintainable in the long term.
The code provided generates 3 sets of results for each invocation. They are lists of TSQL commands to:
Disable constraints on all tables
Delete data from all tables (this is where customisation would be necessary for archiving purposes)
Enable constraints on all tables
Delete Hierarcichally - Ordered Delete Method
This is a more challenging method to design and code, but ultimately produces a more robust and maintainable process. Full or partial data deletions are carried out in an order determined by the PK/FK relationships. This ensures that each delete operation only succeeds if the constraint rules are not broken.
The code provided here also generates 3 sets of results, but only one at a time dependent on parameters passed in. The results are also not TSQL commands but are lists of objects to be consumed by custom deletion code. The main point here is that the complex work of producing the hierarchically ordered list is done for you:
List of all PK/FK relationships, with parent and child table names
List of tables in data deletion order (bottom up - from child to parent)
List of tables in data insertion order (top down - from parent to child)
NOTE: Only tables with a PK or FK defined will appear in the lists.
There is a function-based implementation of the ordered table list which also produces a hirearchy level indicator which can also be used to drive custom INSERT/DELETE code.
Delete Hierarchichally - Disable Constraint Method
Delete Hierarchichally - Ordered Delete Method