Counting Trailing Spaces in SQL Server T-SQL

I recently had an issue with some data grouping in a third party application where data that appeared to be the same, was being grouped separately.

E.G.

Example Data:

AB
AB
AB
BC
BC
BC
BC
BC
BC
DC

When grouped and counted was giving:

Data         Count
AB              3
BC              4
BC              2
DC              1

Instead of:

Data         Count
AB              3
BC              6
DC              1

Running the query:

SELECT
DATA,
COUNT(*)
FROM MyTable
GROUP BY Data

Was returning the expected results.

After some thought it occurred to me that it might be an issue with some of the data having trailing spaces, and therefore, being treated as being different.  In the T-SQL query any trailing spaces are ignored, but I didn’t know how the third party application was grouping the data.

So the question was, how do I count the trailing spaces:

The T-SQL LEN command ignores trailing spaces, so what I ended up with was this…

Using the REPLACE to substitute spaces for asterisks and wrapping with LEN gave me the length of the data including the trailing spaces.

E.G.

LEN(REPLACE(Data,’ ‘, ‘*’))

Using

SELECT
Data,
LEN(REPLACE(Data,’ ‘, ‘*’)) ActualLen
FROM MyTable
GROUP BY
Data,
LEN(REPLACE(Data,’ ‘, ‘*’))

Gave me a list of Data and length:

Data         ActualLen
AB                  2
AB                  2
AB                  2
BC                  2
BC                  2
BC                  3
BC                  3
BC                  3
BC                  3
DC                  2

Showing me that some of the BC’s had trailing spaces and some did not, and these turned out to match the data that was being returned.

In reality I had too many records to look through manually so I wrapped this to just show Data values that had ‘matching’ entries with different lengths:

SELECT
Data,
COUNT(Data)
FROM
(
SELECT
Data,
LEN(REPLACE(Data,’ ‘, ‘*’)) ActualLen
FROM MyTable
GROUP BY
Data,
LEN(REPLACE(Data,’ ‘, ‘*’))
) AS A
GROUP BY Data
HAVING COUNT(Data) > 1

If you purely wanted to count trailing spaces (as promised) you can use…

LEN(REPLACE(Data,’ ‘, ‘*’)) – LEN(Data)

 

 

3 Comments

  1. Janette said:

    This is actually a very helpful, good idea. Some suggested DATALENGTH, but that gives you the actual byte size. Thanks.

    September 5, 2011
    Reply

Leave a Reply