Data Conversion - Check Impact of Converting from NVarchar to Varchar
Date: 10 Oct 2007
Have you ever needed to convert an entire table from unicode NVARCHAR to non-unicode VARCHAR?
Not something that happens very often, or by choice.
Usage of UNICODE storage has definteily ballooned over the last 5 years, mainly driven by the explosion in international data exchange (especially in the social networking domain).
However, there will always be occasions when a legacy system just cannot be changed, and data import may contain international characters. In these cases, it would be very useful to know if there will be any data loss or corruption.
If the characters fall within the low code page (first 255) , then there will be none, otherwise, these characters may be missed or mis-convereted.
The challenge is how to reliably and quickly check any table and it's data for problems?
Screening the data for each potential character in turn wouldn't be viable or efficient, nor would it necessarily be the most accurate.
There is a handy function called BINARY_CHECKSUM() which creates a checksum hash for one or many columns.
The code below can be used to generate dynamic TSQL that will allow all tables containing UNICODE text to be checked. It creates one command per table per column to check the data and return any rows that will be corrupted.
The code concentrates on NVARCHAR, but could easily be adapted to include NCHAR