Blog

Querying XML in SQL Server – Part 1 – Introduction

Time to meet our new best friend XML and some of the things we can do with it.

Sooner or later we all encounter XML, much like the urban legend that you’re never more than 6ft away from a rat in London the same is true of data and XML.
It’s understandable, XML is easy to read, easy to put together and unconstrained by the need to mess around with difficult tasks such as defining your data structure up front. JSON may be XMLs cooler younger brother but in my day to day life as a BI developer it’s XML that tends to be sent my way to make sense of and it’s XML I’m asked to pull together for things like the stacks of data returns I fed back to the government in my Health and Social Care days.

I’ve been working with it a lot again lately to either receive messages from company systems or to share customer summaries from the data warehouse back to those systems. As I can never quite remember the correct syntax to use and I’ve learnt a few neat tricks over the last few months I thought I’d get this all in writing. At the very least it puts it all in one place for myself and it may well help someone else.

Today I’m going to keep things light (this is only my third post after all) and give a brief tour of a piece of XML as a way of introducing the key terms.

Below is a copy of the XML api output from BoardGameGeek for family favourite game Rhino Hero. I’ve edited it down a little for brevity, you can see the full output here.

<?xml version="1.0" encoding="UTF-8"?>
<items termsofuse="https://boardgamegeek.com/xmlapi/termsofuse">
   <item type="boardgame" id="91514">
      <thumbnail>https://cf.geekdo-images.com/R_VAhiLCzl5RXKwSluvEbg__thumb/img/zpzD1TJLfuNEjuTjpPbN1y1mpss=/fit-in/200x150/filters:strip_icc()/pic3271388.jpg</thumbnail>
      <image>https://cf.geekdo-images.com/R_VAhiLCzl5RXKwSluvEbg__original/img/hVn3HtD2_5vKNjhEBkL1qFB4FaU=/0x0/filters:format(jpeg)/pic3271388.jpg</image>
      <name type="primary" sortindex="1" value="Rhino Hero" />
      <name type="alternate" sortindex="1" value="Rino Ercolino" />
      <name type="alternate" sortindex="1" value="Super Rhino!" />
      <name type="alternate" sortindex="1" value="Super Rino!" />
      <description>
		Super Rhino! presents players with an incredibly heroic &amp;ndash; and regrettably heavy &amp;ndash; rhinoceros who is eager to climb a tall building and leap other tall buildings in a single bound. 
	  </description>
      <yearpublished value="2011" />
      <minplayers value="2" />
      <maxplayers value="5" />
      <poll name="suggested_numplayers" title="User Suggested Number of Players" totalvotes="61">
         <results numplayers="1">
            <result value="Best" numvotes="0" />
            <result value="Recommended" numvotes="7" />
            <result value="Not Recommended" numvotes="29" />
         </results>
         <results numplayers="2">
            <result value="Best" numvotes="10" />
            <result value="Recommended" numvotes="40" />
            <result value="Not Recommended" numvotes="3" />
         </results>
         <results numplayers="3">
            <result value="Best" numvotes="44" />
            <result value="Recommended" numvotes="11" />
            <result value="Not Recommended" numvotes="0" />
         </results>
         <results numplayers="4">
            <result value="Best" numvotes="18" />
            <result value="Recommended" numvotes="30" />
            <result value="Not Recommended" numvotes="1" />
         </results>
         <results numplayers="5">
            <result value="Best" numvotes="5" />
            <result value="Recommended" numvotes="25" />
            <result value="Not Recommended" numvotes="13" />
         </results>
         <results numplayers="5+">
            <result value="Best" numvotes="0" />
            <result value="Recommended" numvotes="3" />
            <result value="Not Recommended" numvotes="24" />
         </results>
      </poll>
      <playingtime value="15" />
      <minplaytime value="5" />
      <maxplaytime value="15" />
      <minage value="5" />
   </item>
</items>

One of the benefits of XML is that in most cases you can read it by eye, I won’t go as far as describing it as self documenting but you can get a fair idea of how and what any given piece of XML is doing.

So what is XML made up of?

Firstly we have tags like <description> these are like the bones of XML. They either come in pairs like this:

<description> </description>

Or as a single item like this:

<description/>

A pair of tags and everything in between is an element, elements can contain further elements and those can contain further elements. Note this part of the XML above.

         <results numplayers="1">
            <result value="Best" numvotes="0" />
            <result value="Recommended" numvotes="7" />
            <result value="Not Recommended" numvotes="29" />
         </results>

We have multiple <result> tags inside a <results> tag but if you look back up you’ll also note there are multiple <results> nested inside a <poll> tag.

Finally we have attibutes, these are extra pieces of information we can attach to give us further properties of a tag, sometimes this is all the tag needs:

<name type=”alternate” sortindex=”1″ value=”Super Rhino!”/>

Other times we’ll have both attributes and further information or tags inside the element (see the <results> tag example above).

That’s all for our whirlwind tour of XML, next we’ll start looking at how we use T-SQL to pull information out of a piece of XML and into something that will look a little more normal to the average SQL Server user.

Querying XML In SQL Server Series

Cheat Sheet

Posts in SQL Notebook form

I’ve Got 99.0991 Problems But Floating Point Precision ‘Aint One

In which the blogger finds an excuse to use a title he’s been holding on to for some time and we learn why float probably isn’t the best data type to hold your numerical data in.

T-SQL Tuesday Logo

For this week’s T-SQL Tuesday, Brent Ozar (b|t) asked us to share either our most or least favorite SQL Server data type. I can’t confess to having any particularly strong feelings on the subject in either direction even if you take into account my keenness to add the number 2 on the end every time I see a datetime.

I do however struggle to resist the opportunity to use a catchy title and this week’s subject was enough to finally give me the motivation to put the furloughed part of my current four day working week to good use and set up a blog.

The float data type can often look like a tempting, easy choice for numerical data types. Why worry about all that precision and scale nonsense when you can just use a data type that will hold literally* any number!

*do not take this claim literally

Let’s test this out by counting problems. We’ll start in a relaxed state and steadily increase the pressure level until we reach 99 problems (one metric Jay-Z).

DECLARE @Problems float(24) = 0.0;

While NOT(@Problems > 99)
BEGIN
	SET @Problems += 0.1
END;

Print 'I''ve got ' + CAST(@Problems as varchar(10)) + ' problems but floating point precision ''aint one';

Unfortunately we end up a little way over 99 at 99.0991 (henceforth known as an imperial Jay-Z).

Why? Because accuracy isn’t the main aim of a float, it’s a way of storing something that’s roughly the right number in a small space, in a way computers find easy to process quickly, there are values that it can get really close to but never accurately store. That’s great if you’re not that worried about precision (imagine streams of IOT readings with an inherent error to them), it’s less good if you want the exact right number (think applying interest rates to your savings).

You might also be caught out by code like this:

DECLARE @Problems float(24) = 0.0;

While @Problems <> 99
BEGIN
	SET @Problems += 0.1;
	PRINT 'I''ve got ' + CAST(@Problems as varchar(10)) + ' problems but floating point precision ''aint one';
END;

It looks innocent enough but because we never actually hit a value of 99 on the nose the loop never ends, unless you want unending loops running on your server then that’s a Bad Thing™

Is using a floating point data type the end of the world? Probably not, most likely you’ll just sooner or later waste a few hours working out why the percentages on a report don’t total to 100.
Probably the biggest benefit I get as a senior developer from the float datatype is code smell. It’s a nice clear flag that that code possibly hasn’t been thought through as carefully as it should and I probably want to take a closer look.

Thanks for reading and do come back in the future, I’m planning on starting up a series over the next few weeks on another data type I’ve been spending a lot of time with lately at work – XML.

Further Reading

Hello World

In which new friends are made, jokes are told and the bloggers love of all things cheesy is established.

Hello and welcome to my blog! This is very much a test post to figure out how things work. At some point I will hopefully understand enough of this to make the pages nice and pretty, for now you’ll have to settle for data related blogging and the occasional dad joke.

How do you get Pikachu on a bus?
You have to Poke ’em on!
Here’s some cheese, it was very tasty.