SQL Server 2000 - Coalesce lots of values into a single record
Let's say that you have a data set like this:
Account# Comment
1 Something significant
1 Something else significant
...
1 30 records later and still saying significant stuff
2 You'll need this
2 It's important
3 Coalesce is a cool trick
Here is your goal - you want to create output that takes all the comments for each distinct account# and spit them out in a single row - ie. you want only 3 rows returned but each comment is unique and you can't lose any of them. How would you do this? You might do some complex string handling within SQL Server, or worse, you might be pulling the data out in a raw form and doing this work in your programs. In SQL Server 2005 you might write a CLR query that can handle this work for you easily.
Alternatively you could use the built in SQL Server function, COALESCE. Coalesce is a neat little function, but the SQL Server help documentation doesn't fully explain the power of the function. According to the SQL Server help, COALESCE will return the first non-null expression among it's arguments - ie. if you pass it (null, null, 1, null, 3) it will return 1. This works like a case statement where you might be checking a bunch of values for non-null results. That's both cool and useful.
However, there is an undocumented feature of COALESCE that can help you out in my ficticious example - you can use COALESCE to append values into a comma separated list using the following syntax:
Declare @myComments varchar(8000)
select @myComments = Coalesce (@myComments + ',', '') + Comment
from MyFakeTable
where Account# = 1
This will get fill @myComments with a comma separated list of your comments. Notice that I had to specify my Account# value so that I'd get only the comments pertaining to that specific account. Use this with a loop and you can get one row returned per account number. Here is an example you should be able execute against your PUBS database.
-- task: List all the book titles, by publisher, that have sold over 2000 copies to date. One row per publisher!
-- this gets you the results in multiple rows
select * from titles
join publishers
on titles.pub_id = publishers.pub_id
where ytd_sales > 2000
order by pub_name
-- this gets you the results you really want
DECLARE @pub_id INT
DECLARE @AllTitles varchar(3000)
DECLARE @myResults table (Publisher varchar(30), Titles varchar(3000))
DECLARE Pointless_cursor CURSOR FOR
SELECT DISTINCT pub_id
FROM titles
WHERE ytd_sales > 2000
OPEN Pointless_cursor
FETCH NEXT FROM Pointless_cursor
INTO @pub_id
WHILE @@FETCH_STATUS = 0
BEGIN
-- start with a clear string ech time we loop
SELECT @AllTitles = ''
-- here's the magic
SELECT @AllTitles = COALESCE(@AllTitles + ', ', '') + title
FROM titles
WHERE pub_id = @pub_id
-- do something with this
INSERT INTO @myResults
SELECT pub_name, @AllTitles
FROM publishers
WHERE pub_id = @pub_id
FETCH NEXT FROM Pointless_cursor
INTO @pub_id
END
CLOSE Pointless_cursor
DEALLOCATE Pointless_cursor
SELECT * FROM @myResults
Your results look like this:
| New Moon Books |
, You Can Combat Computer Stress!, Is Anger the Enemy?, Life Without Fear, Prolonged Data Deprivation: Four Case Studies, Emotional Security: A New Algorithm |
| Binnet & Hardley |
, Silicon Valley Gastronomic Treats, The Gourmet Microwave, The Psychology of Computer Cooking, Computer Phobic AND Non-Phobic Individuals: Behavior Variations, Onions, Leeks, and Garlic: Cooking Secrets of the Mediterranean, Fifty Years in Buckingham Pala |
| Algodata Infosystems |
, The Busy Executive's Database Guide, Cooking with Computers: Surreptitious Balance Sheets, Straight Talk About Computers, But Is It User Friendly?, Secrets of Silicon Valley, Net Etiquette |
This can be useful when you have to do silly things like reports with no good reporting tools.
-- Matt Ranlett
*Note* - the leading comma in the results is due to the way the cursor loops through and adds a comma to the empty varchar before performing the select. I haven't bothered to fix this b/c I can resolve the problem when I display my results later on.
*Note* - notice that I'm using a table variable to get my results out. This is preferable to temporary tables (#myTempTable and ##myGlobalTempTable) because the cause fewer recompiles and therefore execute faster
*Note* - SQL Server 2000 has a limit of 8060 bytes in a single row - this is due to the way SQL Server manages memory. Be careful if you need to get a varchar(8000) out with COALESCE because you might go beyond 8060 bytes when you start adding other columns to your result set. I usually manage this by chopping off some bytes with the LEFT function.