Thursday, May 7, 2009

Getting Unique XML Element Values with XSL 1.0

Today I needed to munge some dirty XML data. I haven't taught myself XSL/XPath 2.0 yet, so I was limited to XSL 1.0. The data I had looked like this, only much, much worse.
<Subject>Value 1|Value 2</Subject>
<Subject>Value 1|Value 2</Subject>
<Subject>Value 1|Value 2</Subject>
<Time>Time Value 1</Time>
<Time>Time Value 1</Time>
<Time>Time Value 1</Time>
<Subject>Value 3|Value 4</Subject>
<Subject>Value 3|Value 4</Subject>
<Subject>Value 3|Value 4</Subject>
<Time>Time Value 2</Time>
<Time>Time Value 2</Time>
<Time>Time Value 2</Time>
I wanted two things out of that series of elements: unique strings and the value before the |. As I type this, I realize there may be a bit of a bug here, but I'll have to test it out. Here's what I did for the series of subject elements. Can you spot the bug? ;-)
<xsl:variable name="subjects" select="/fragment/index-only-subjects//Subject[not(text()=preceding-sibling::Subject/text())]/text()"/>
<xsl:for-each select="$subjects">
  <xsl:sort select="." data-type="text"/>
  <xsl:choose>
    <xsl:when test="contains(.,'|')">
      <xsl:value-of select="substring-before(.,'|')"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="."/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:for-each>
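One candidate for the bug hinted at above (a guess, since the post does not say which one was meant): the predicate compares the full text of each Subject, so `Value 1|Value 2` and `Value 1|Value 9` would both count as unique and `Value 1` would be printed twice. A minimal sketch of one way around that in XSL 1.0 follows. It assumes the `/fragment/index-only-subjects` wrapper from the XPath above, plain-text output, and a made-up variable name `head`. It leans on the fact that `substring-before(concat(., '|'), '|')` returns the whole string when there is no `|`, which also removes the need for the choose/when.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumption: plain-text output, one value per line. -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="/fragment/index-only-subjects//Subject">
      <xsl:sort select="substring-before(concat(., '|'), '|')" data-type="text"/>
      <!-- "head" (hypothetical name) is the part before the first '|',
           or the whole value when the Subject contains no '|'. -->
      <xsl:variable name="head" select="substring-before(concat(., '|'), '|')"/>
      <!-- Emit only when no earlier sibling Subject has the same head. -->
      <xsl:if test="not(preceding-sibling::Subject[substring-before(concat(., '|'), '|') = $head])">
        <xsl:value-of select="$head"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Like the original, this only looks at preceding siblings, so duplicates that live under different parent elements would still slip through.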
I'm pretty sure there's a way to do this with xsl:key / key(), but I got this solution working first.
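For reference, the key-based approach is usually called Muenchian grouping. The following is a rough sketch rather than a tested drop-in replacement. It assumes the same `/fragment/index-only-subjects` wrapper, plain-text output, and a made-up key name `subject-by-head`. It groups on the part before the `|`, since that is the value that ends up in the output.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumption: plain-text output, one value per line. -->
  <xsl:output method="text"/>

  <!-- Index each Subject under index-only-subjects by the text before the
       first '|' (the whole text when there is no '|').
       "subject-by-head" is a hypothetical key name. -->
  <xsl:key name="subject-by-head"
           match="index-only-subjects//Subject"
           use="substring-before(concat(., '|'), '|')"/>

  <xsl:template match="/">
    <!-- Keep a Subject only if it is the first node in its key group. -->
    <xsl:for-each select="/fragment/index-only-subjects//Subject[
        generate-id() = generate-id(
          key('subject-by-head', substring-before(concat(., '|'), '|'))[1])]">
      <xsl:sort select="substring-before(concat(., '|'), '|')" data-type="text"/>
      <xsl:value-of select="substring-before(concat(., '|'), '|')"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Unlike the preceding-sibling test, the key spans every matching Subject in the document, not just siblings, and it avoids rescanning earlier siblings for each element, which tends to matter on large inputs.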
