Last time round we looked at exists, this time round it’s the query method. Similar to exists there’s rarely anything that you can do that you can’t also handle in SQL but there’s a lot of cases where it makes things either easier, neater or faster.
There is probably as much if not more to say about the query method as there the preceding methods put together so this post will be more of a guide on how to use it and some inspiration rather than comprehensive coverage. I’ll add more posts with examples as and when I’m inspired.
The query method does what you’d expect, it allows you to run a query against a piece of XML. Documentation for the method is here, it takes a string as the input and behind that string is XQuery an entire* new language hidden away in your SQL Server instance. Documentation for the XQuery language as implemented in SQL Server is here. It’s worth a browse, particularly the coded examples which often go a long way to providing models you can adapt to your needs.
You can think of XQuery as a superset of the XPath we’ve been using so far to navigate the levels of our XML, it does that and then a whole lot more on top to manipulate and apply functions to those paths. I’m not intending this to be a full tutorial on XQuery, that’s too big a job for a single post. Consider this more a recipe book and inspiration. I’m also going to focus on getting data out of XML. You can use XQuery to build XML as well but I’m not sure there are a lot of use cases for that inside SQL Server (do share in the comments if you have some).
It returns the XML data type, sometimes that’s actual XML other times it’s just text or a number that we’ll want to convert appropriately.
Let’s start as ever with our sample XML.
CREATE TABLE #MyXML (MyXML XML NOT NULL);
INSERT INTO #MyXML(MyXML)
VALUES
(
'<?xml version="1.0" encoding="UTF-8"?>
<items termsofuse="https://boardgamegeek.com/xmlapi/termsofuse">
<item type="boardgame" id="91514">
<thumbnail>https://cf.geekdo-images.com/R_VAhiLCzl5RXKwSluvEbg__thumb/img/zpzD1TJLfuNEjuTjpPbN1y1mpss=/fit-in/200x150/filters:strip_icc()/pic3271388.jpg</thumbnail>
<image>https://cf.geekdo-images.com/R_VAhiLCzl5RXKwSluvEbg__original/img/hVn3HtD2_5vKNjhEBkL1qFB4FaU=/0x0/filters:format(jpeg)/pic3271388.jpg</image>
<name type="primary" sortindex="1" value="Rhino Hero" />
<name type="alternate" sortindex="1" value="Rino Ercolino" />
<name type="alternate" sortindex="1" value="Super Rhino!" />
<name type="alternate" sortindex="1" value="Super Rino!" />
<description>
Super Rhino! presents players with an incredibly heroic &ndash; and regrettably heavy &ndash; rhinoceros who is eager to climb a tall building and leap other tall buildings in a single bound.
</description>
<yearpublished value="2011" />
<minplayers value="2" />
<maxplayers value="5" />
<poll name="suggested_numplayers" title="User Suggested Number of Players" totalvotes="61">
<results numplayers="1">
<result value="Best" numvotes="0" />
<result value="Recommended" numvotes="7" />
<result value="Not Recommended" numvotes="29" />
</results>
<results numplayers="2">
<result value="Best" numvotes="10" />
<result value="Recommended" numvotes="40" />
<result value="Not Recommended" numvotes="3" />
</results>
<results numplayers="3">
<result value="Best" numvotes="44" />
<result value="Recommended" numvotes="11" />
<result value="Not Recommended" numvotes="0" />
</results>
<results numplayers="4">
<result value="Best" numvotes="18" />
<result value="Recommended" numvotes="30" />
<result value="Not Recommended" numvotes="1" />
</results>
<results numplayers="5">
<result value="Best" numvotes="5" />
<result value="Recommended" numvotes="25" />
<result value="Not Recommended" numvotes="13" />
</results>
<results numplayers="5+">
<result value="Best" numvotes="0" />
<result value="Recommended" numvotes="3" />
<result value="Not Recommended" numvotes="24" />
</results>
</poll>
<playingtime value="15" />
<minplaytime value="5" />
<maxplaytime value="15" />
<minage value="5" />
</item>
</items>'
);
SELECT * FROM #MyXML;
We encountered the simplest form of XQuery earlier when looking at nodes.
The below gives a neat way to view the XML in the current context and up at the parent level. I use this often as a help during development. If the results are going in to a view I’ll probably leave them there as the test team can find them helpful too but it’s not something I’d want filling up disk space.
SELECT
R.ResultXML.query('.') AS ResultLevelXML,
R.ResultXML.value('.','varchar(max)') AS ResultLevelXMLValue,
cast(R.ResultXML.query('.') as varchar(max)) AS ResultLevelXMLCast,
R.ResultXML.query('..') AS PollLevelXML
FROM #MyXML AS X
CROSS APPLY X.MyXML.nodes('/items/item/poll/results') AS R(ResultXML);
Note how in this case the value method gives us nothing but we can cast the query result into text.
Most other uses for the query method fall into similar categories as the exists method these are things you can do in SQL but that would require you to drop down to a level in the structure below the one you want in your final output. Let’s look at a few examples.
How Do I Aggregate XML Values In SQL Server?
We can use Xquery to sum values without the need to break them down at row level and group them up again. This can give us a result in the select statement or could be used in a WHERE clause. The format is simple and looks just like a SQL aggregate with a path inside.
Note the data types, even though maximum and minimum are only handling integers I get a type conversion error if I ask for an integer result.
SELECT
R.ResultXML.value('./@numplayers','varchar(50)') AS NumPlayers,
R.ResultXML.query('sum(./result/@numvotes)') AS SumVotesQuery,
R.ResultXML.value('sum(./result/@numvotes)','int') AS SumVotesValue,
R.ResultXML.value('max(./result/@numvotes)','float') AS MaxVotesValue,
R.ResultXML.value('min(./result/@numvotes)','float') AS MinVotesValue,
R.ResultXML.value('count(./result/@numvotes)','int') AS CountVotesValue,
R.ResultXML.value('avg(./result/@numvotes)','numeric(10,2)') AS SumVotesValue,
R.ResultXML.query('.') AS ResultsXML
FROM #MyXML AS X
CROSS APPLY X.MyXML.nodes('/items/item/poll/results') AS R(ResultXML)
WHERE
R.ResultXML.value('sum(./result/@numvotes)','int') > 50;
If we really want that integer then we need to explicitly convert the @numvotes attribute to an integer.
SELECT
--R.ResultXML.value('min(xs:integer(./result/@numvotes))','int') AS MinVotesValueBroken,
R.ResultXML.value('min(for $v in ./result/@numvotes return xs:integer($v))','int') AS MinVotesValue,
R.ResultXML.query('.') AS ResultsXML
FROM #MyXML AS X
CROSS APPLY X.MyXML.nodes('/items/item/poll/results') AS R(ResultXML);
That’s not simple, the commented out version looks like a viable approach but that’s not asking to convert the values into an integer it’s trying to convert all the elements as one.
Instead we need to iterate through each item converting it. It’s unlikely to be worth the effort in most cases but there’s a small risk of floating point errors if you leave things to the defaults which use the double data type.
The above is a nice introduction to looping through data in XQuery which follows a fairly common pattern in programming languages:
for $MyOneThing in AllTheThings return DoSomethingOneAtATimeTo($MyOneThing)
This sets us up for the next example.
How Do I Concatenate XML Values Into a Single String?
The ability to loop through sub nodes gives us a route to a concatenated string because sometimes (e.g. for report outputs) that’s all we need. Here’s an example concatenating all the alternative names for our example game into one.
SELECT
X.MyXML.query(
'for $name in
/items/item/name
return
concat($name/@type,'': '',$name/@value, '' aka '')'
).value('.','varchar(max)') AS AllTheNames
FROM #MyXML AS X;
The value method doesn’t work directly here but we can stack methods, query gives us the result as XML type and then we apply the value method to get us to text. We could also do this in SQL and might want to as we probably want to remove that stray ‘aka’ at the end. This next example shows STUFF () being used to solve the extra aka issue and also the text conversion running via CAST().
SELECT
STUFF(
CAST(
X.MyXML.query(
'for $name in
/items/item/name
return
concat( '' aka '', $name/@type,'': '',$name/@value)'
)
AS varchar(max))
,1,5,'') AS
AllTheNames
FROM #MyXML AS X;
How to Do Almost Anything Else In SQL Server With XQuery?
The catch all solution applies here… Google it (other search engines are available). You’ll probably find someone has asked how to do the same thing on Stack Overflow (or tech forum of choice) and dropping that solution into the query method has a good chance of solving your problem or getting you most of the way there.
Other good resources are the aforementioned examples in Microsoft’s XQuery documentation and the W3 School’s tutorials both of which I’ve bounced in and out of while writing these blogs and the worktime activities that inspired them.
Observant readers may have noticed there was a somewhat ominous footnote promised earlier on, here it is.
*Why Won’t My Entirely Valid XQuery Work in SQL Server?
Long term users of SQL Server may be wholly unsurprised to discover that XQuery falls into the rather large category of “cool sounding features in SQL Server that haven’t been developed further since they were sufficiently functional to tick all the right boxes on a list of features but no more” (see elsewhere for temporal tables, graph queries and many others). To quote Microsoft on the subject:
This XQuery implementation is aligned with the July 2004 Working Draft of XQuery
Some Person At Microsoft
That’s a draft of version 1.0 and we’re now up to version 3.1, don’t be surprised if features are missing.
One set of features I’ve missed personally are the more advanced abilities to move around nodes in the XML specifying preceding or following siblings in a group to work out relative or absolute positions. Expect to see some alternative solutions to that in a future post.
Querying XML In SQL Server Series