Last time I promised to start looking at performance and XML indexes. I’ll get there but I realised I needed a decent test data set and couldn’t easily lay my hands on one that matched my needs so had to start building one myself. Part way through the process I figured I may as well squeeze an extra post out of the process and blog the creation process.
In my experience creating XML is a less common task than reading it with one semi-exception (that being the old trick of using FOR XML PATH(”) to concatenate strings) but it can be an important one. The biggest piece of XML I’ve built was for a government mandated return, it was the only XML I built on that job but a sizeable proportion of the entire department’s work culminated in it. The switch from a set of Excel macros which took over eight hours to run to something that could rebuilt in minutes had a huge impact on our efficiency, particularly when there would be multiple rounds of validation failures against initial attempts at submission.
There are a few variants when it comes to converting queries into XML but in my opinion the only one you really need to know is the XML PATH approach, the others being either fiddly or not providing enough control.
Because all the cool kids are using it and I had a copy available from his recent live steamed courses I’m going to use a copy of the Stack Overflow database that Brent Ozar maintains. You can get your own copy here. I’m using the 2013 version but if you want to follow along any variant should work here. I’ve used TOP to limit results returned so that they’re short enough to place in the Blog.
How Do I Build XML In SQL Server?
At the simplest level we just drop FOR XML PATH after a query, we give a value to the path which becomes the element each row is nested in and give a root value that wraps the whole output.
SELECT TOP 10
U.Id AS 'UserId',
U.DisplayName,
U.UpVotes,
U.DownVotes
FROM dbo.Users AS U
FOR XML PATH('User'), ROOT('Users');
Which gets something like this (this is going to be a post where the SQL notebook version and its embedded query results is going to be really helpful, you can find these here):
<Users>
<User>
<UserId>17</UserId>
<DisplayName>Nick Berardi</DisplayName>
<UpVotes>884</UpVotes>
<DownVotes>215</DownVotes>
</User>
<User>
<UserId>19</UserId>
<DisplayName>Mads Kristiansen</DisplayName>
<UpVotes>36</UpVotes>
<DownVotes>12</DownVotes>
</User>
</Users>
How do I Create Attributes In XML With SQL Server?
Adding Attributes is easy, we just give the relevant column a name commencing with an @ symbol.
SELECT TOP 2
U.Id AS '@UserId',
U.DisplayName,
U.UpVotes,
U.DownVotes
FROM dbo.Users AS U
FOR XML PATH('User'), ROOT('Users');
And now the XML looks like this:
<Users>
<User UserId="17">
<DisplayName>Nick Berardi</DisplayName>
<UpVotes>884</UpVotes>
<DownVotes>215</DownVotes>
</User>
<User UserId="19">
<DisplayName>Mads Kristiansen</DisplayName>
<UpVotes>36</UpVotes>
<DownVotes>12</DownVotes>
</User>
</Users>
How Do I Nest Elements In XML With SQL Server?
If we want to start grouping elements under parent tags then we just name the columns with the needed path. If two consecutive fields have a common element to the path they get grouped together. Here we nest up and down votes in a votes element.
SELECT TOP 2
U.Id AS '@UserId',
U.DisplayName,
U.UpVotes AS 'Votes/UpVotes',
U.DownVotes AS 'Votes/DownVotes'
FROM dbo.Users AS U
FOR XML PATH('User'), ROOT('Users');
And now we have this:
<Users>
<User UserId="17">
<DisplayName>Nick Berardi</DisplayName>
<Votes>
<UpVotes>884</UpVotes>
<DownVotes>215</DownVotes>
</Votes>
</User>
<User UserId="19">
<DisplayName>Mads Kristiansen</DisplayName>
<Votes>
<UpVotes>36</UpVotes>
<DownVotes>12</DownVotes>
</Votes>
</User>
</Users>
How Do I Add Repeating Sub Elements to XML With SQL Server
Things get a little messier when it comes to adding repeating elements but the approach is relatively simple. An XML element can contain element’s itself so all we need to do is push in a sub query that generates that XML. We’ve already seen how to do this above.
I’m going to get some details on badges for each user, limited and grouped in this case to keep the results presentable.
Note in the sub query we specify the TYPE directive. This tells the server to keep the results as XML, if we don’t do this then the server assumes you want the results formatted as text, escapes all the special characters and makes a big mess.
SELECT TOP 2
U.Id AS '@UserId',
U.DisplayName,
U.UpVotes AS 'Votes/UpVotes',
U.DownVotes AS 'Votes/DownVotes',
( SELECT TOP 2
B.Name, MIN(B.Date) AS FirstAwarded FROM dbo.Badges AS B
WHERE B.UserId = U.Id
GROUP BY B.Name
FOR XML PATH ('Badge'), TYPE
) AS 'Badges'
FROM dbo.Users AS U
FOR XML PATH('User'), ROOT('Users');
And now we get this:
<Users>
<User UserId="17">
<DisplayName>Nick Berardi</DisplayName>
<Votes>
<UpVotes>884</UpVotes>
<DownVotes>215</DownVotes>
</Votes>
<Badges>
<Badge>
<Name>.net</Name>
<FirstAwarded>2009-02-10T05:11:54.607</FirstAwarded>
</Badge>
<Badge>
<Name>asp.net</Name>
<FirstAwarded>2009-04-15T03:45:30.720</FirstAwarded>
</Badge>
</Badges>
</User>
<User UserId="19">
<DisplayName>Mads Kristiansen</DisplayName>
<Votes>
<UpVotes>36</UpVotes>
<DownVotes>12</DownVotes>
</Votes>
<Badges>
<Badge>
<Name>Beta</Name>
<FirstAwarded>2008-09-16T00:00:00</FirstAwarded>
</Badge>
<Badge>
<Name>Citizen Patrol</Name>
<FirstAwarded>2008-09-15T08:55:05.157</FirstAwarded>
</Badge>
</Badges>
</User>
</Users>
I’m not a fan of sub queries directly in the SELECT statement of a query and possibly a little too keen on using APPLY so I’d probably re-write the above like the following.
The ‘*’ on the field name is a place holder for nothing so we don’t force in another element level. Equally we could exclude the Root of ‘users’ and use that as the field name and get the same result.
SELECT TOP 2
U.Id AS '@UserId',
U.DisplayName,
U.UpVotes AS 'Votes/UpVotes',
U.DownVotes AS 'Votes/DownVotes',
B.x AS '*'
FROM dbo.Users AS U
OUTER APPLY
(
SELECT B.Name, MIN(B.Date) AS FirstAwarded FROM dbo.Badges AS B
WHERE B.UserId = U.Id
GROUP BY B.Name
FOR XML PATH ('Badge'), ROOT('Badges'), TYPE
) B(x)
FOR XML PATH('User'), ROOT('Users')
;
And that should cover 99% of needs when it comes to building XML. You might need to take the sub query concept a little further, potentially needing sub queries in your sub queries but with the above you can put together pretty much anything.
Querying XML In SQL Server Series