Cleaning and Formatting Phone Numbers
Applicability:
SQL Server 2000: Tested
SQL Server 2005: Tested
SQL Server 2008: Tested
SQL Server 2008R2: Tested
SQL Server 2012: Not Tested
Sybase 15 ASE: Tested
Credits:
Author: ChillyDBA
Date: 7 Oct 2011
Description
I recently had the challenge of working with a data feed of Address/Phone Number data originating in Sybase 15 ASE. Not only was the DBMS out of my immediate comfort zone, but the application hosting the data was 15 years old, had no data input validation. Also, the data was still actively being updated and used, was not allowed to be changed by any part of the extract, and had to be extracted anew every time it was needed.
Several key pieces of data were required, including Name, Postal Code and phone/fax number. This tip deals with the steps taken to extract, clean and format phone numbers in an acceptable way for consumption by the new application.
The Sybase challenge was easily met. I remember back in the SQL 6.x days just how keen Microsoft was to move away from the legacy Sybase code into something more flexible and sustainable, and the continuous joy(?) as a DBA with the new features in each new version. each new version.
Imagine my shock to enter the Sybase universe again to be me with something familiar - kind of like visiting your old school 20 years later - change, but not much.
Fortunately, Sybase 15 now has functions, although only the scalar variety, so performing on-the-fly data cleansing and formatting during the extract process was an option.
Included here are the resulting pair of functions that I wrote to perform the task.
Some things to note:
The extract is designed for standard 10-digit North American format phone numbers with optional extension numbers of up to 5 digits
Initially, all non-numeric characters are removed
The resulting number of digits is then used to determine the validity of the number and ultimately output formatting
The new application was still undergoing design at the time this extract work was happening, so the formatting took 2 forms for future flexibility; a masked text string, and an XML fragment
XML is not a data type supported by SYBASE (yet), so the XML fragment is created the hard way.
CTEs are not supported by SYBASE (yet), so the Numbers Table technique cannot be used to clean a string
Code
Phone Number Cleaning Function :
DROP FUNCTION fn_CleanPhoneAndFAX
GO
CREATE FUNCTION fn_CleanPhoneAndFAX(@PhoneFAX VARCHAR(40) )
RETURNS VARCHAR(40)
/**************************************************************************************
Author: ChillyDBA
Create date: 2011-06-01
Description : This will take a single string phone or fax number and strip out all non-digit characters
Input Parameters:
@PhoneFAX: Tne Phone or FAX number to clean
Return Value:
VARCHAR(40)
***************************************************************************************/
AS
BEGIN
DECLARE
@PhoneFAX_Clean VARCHAR(40),
@Counter INT,
@StringLength INT
-- initialise variables
SELECT
@Counter = 1,
@PhoneFAX_Clean = '',
@StringLength = DATALENGTH(@PhoneFAX)
-- for every character in the string, from left to right
WHILE @Counter <= @StringLength
BEGIN
-- add the character to the clean string only if it is in the range 0..9
SELECT @PhoneFAX_Clean = @PhoneFAX_Clean + CASE
WHEN SUBSTRING(@PhoneFAX, @Counter, 1) LIKE '%[0-9]%' THEN SUBSTRING(@PhoneFAX, @Counter, 1)
ELSE ''
END
SELECT @Counter = @Counter + 1
END
-- if the resulting number is only one digit long, assume it is meant to indicate no number and set it to null
-- *** ANY OTHER FAIL CONDITIONS SHOULD BE HANDLED HERE e.g < 10 digits = NULL ***
IF DATALENGTH(@PhoneFAX_Clean) = 1
SELECT @PhoneFAX_Clean = NULL
RETURN @PhoneFAX_Clean
END
GO
SELECT dbo.fn_CleanPhoneAndFAX('613999-----6534' )
-- 6139996534
SELECT dbo.fn_CleanPhoneAndFAX('6139996534 ext 1' )
-- 61399965341
SELECT dbo.fn_CleanPhoneAndFAX('6139996534 (12345)' )
-- 613999653412345
Phone Number Masking Function :
DROP FUNCTION fn_MaskPhoneAndFAX
GO
CREATE FUNCTION fn_MaskPhoneAndFAX (@PhoneFAX VARCHAR(40) , @ReturnType CHAR(3) )
RETURNS VARCHAR(255)
/**************************************************************************************
Author: ChillyDBA
Create date: 2011-06-01
Description : This will take a single string phone or fax number and produce
either a masked output, or component parts
The algorithm used here assumes a phone number string with no non-digit charaters
Input Parameters:
@PhoneFAX: Tne Phone or FAX number to be masked
@ReturnType: Flag determinitiong in which format the result will be returned
TXT = Number will be returned as a masked string eg (613) 555-5555 (12345)
XML = Name will be returned as a well-formed XML fragemnt with AreaCode, PhoneNumber and Ext nodes
Return Value:
VARCHAR(255)
***************************************************************************************/
AS
BEGIN
DECLARE
@PhoneFAX_Masked VARCHAR(255),
@StringLength INT
-- initialise variables
SELECT
@StringLength = DATALENGTH(@PhoneFAX)
IF @ReturnType = 'TXT'
BEGIN
IF @StringLength IN (10, 13, 14, 15)
SELECT @PhoneFAX_Masked =
'('
+ SUBSTRING(@PhoneFAX, 1 ,3)
+ ') '
+ SUBSTRING(@PhoneFAX, 4 ,3)
+ '-'
+ SUBSTRING(@PhoneFAX, 7 ,4)
IF @StringLength IN (13, 14, 15)
SELECT @PhoneFAX_Masked = @PhoneFAX_Masked
+ ' ext '
+ SUBSTRING(@PhoneFAX, 11 ,5)
IF @StringLength NOT IN (10, 13, 14, 15)
SELECT @PhoneFAX_Masked = @PhoneFAX
END
ELSE IF @ReturnType = 'XML'
BEGIN
IF @StringLength IN (10, 13, 14, 15)
SELECT @PhoneFAX_Masked =
'<PhoneDetails>'
+ '<AreaCode>'
+ SUBSTRING(@PhoneFAX, 1 ,3)
+ '</AreaCode>'
+ '<PhoneNumber>'
+ SUBSTRING(@PhoneFAX, 4 ,3)
+ '-'
+ SUBSTRING(@PhoneFAX, 7 ,4)
+ '</PhoneNumber>'
+ '<Extension>'
+ CASE
WHEN @StringLength IN (13, 14, 15) THEN SUBSTRING(@PhoneFAX, 11 ,5)
ELSE ''
END
+ '</Extension>'
+ '</PhoneDetails>'
END
RETURN @PhoneFAX_Masked
END
GO
SELECT dbo.fn_MaskPhoneAndFAX('6139996534', 'TXT')
-- (613) 999-6534
SELECT dbo.fn_MaskPhoneAndFAX('6139996534', 'XML')
--<PhoneDetails>
-- <AreaCode>613</AreaCode>
-- <PhoneNumber>999-6534</PhoneNumber>
-- <Extension></Extension>
--</PhoneDetails>
SELECT dbo.fn_MaskPhoneAndFAX('6139996534123', 'TXT')
--(613) 999-6534 ext 123
SELECT dbo.fn_MaskPhoneAndFAX('6139996534123', 'XML')
--<PhoneDetails>
-- <AreaCode>613</AreaCode>
-- <PhoneNumber>999-6534</PhoneNumber>
-- <Extension>123</Extension>
--</PhoneDetails>
SELECT dbo.fn_MaskPhoneAndFAX('613999653412345', 'TXT')
--(613) 999-6534 ext 12345
SELECT dbo.fn_MaskPhoneAndFAX('613999653412345', 'XML')
--<PhoneDetails>
-- <AreaCode>613</AreaCode>
-- <PhoneNumber>999-6534</PhoneNumber>
-- <Extension>12345</Extension>
--</PhoneDetails>