Using Regexes in XSLT Transformations with .NET 3.5


This week I needed to do something relatively simple that should have taken no time but ended up taking the better part of a morning: I needed to write an XSLT template that transformed a node value using a regular expression. I learned that XSLT 2.0 supports regular expressions; however, .NET 3.5 has no support for XSLT 2.0. So, I added an extension function that the XPath engine could call from my template to apply the regular expression substitution. Calling this custom function required me to qualify it with a namespace prefix. This introduced a problem because I was creating the template dynamically at run-time using LINQ to XML. This technology tries to hide namespace-related tasks from you (which is helpful in some applications), so I struggled to find a way to add the namespace and the corresponding prefix to the XDocument that I was building up in memory. Eventually, I ended up with something along these lines:

    1 class Program
    2 {
    3     static void Main()
    4     {
    5         var outputXml = new StringBuilder();
    6         var inputXml = @"<r><phoneNumber>555-555-5555</phoneNumber><o>Foo</o></r>";
    7         var transform = new XslCompiledTransform();
    8         var templates = GetTemplates();
    9         var stylesheet =
   10             new XDocument(new XElement(xsltNs + "stylesheet",
   11                 new XAttribute("version", "1.0"),
   12                 new XAttribute(XNamespace.Xmlns + extNsPrefix, extNs), // declares xmlns:ext="urn:foo"
   13                 templates));

   15         using (var reader = stylesheet.CreateReader())
   16             transform.Load(reader);

   18         using (var reader = XmlReader.Create(new StringReader(inputXml)))
   19         using (var writer = XmlWriter.Create(outputXml))
   20         {
   21             var arguments = new XsltArgumentList();

   23             arguments.AddExtensionObject(extNs.ToString(), new XPathExtensions()); // bind urn:foo to the extension object
   24             transform.Transform(reader, arguments, writer);
   25         }

   27         Trace.WriteLine(outputXml);
   28     }

   30     private static IEnumerable<XElement> GetTemplates()
   31     {
   32         var templates = new List<XElement>();
   33         var pattern = "\\d";
   34         var replacement = "X";
   35         var matchValue = "//phoneNumber/text()";

   37         templates.Add(XElement.Parse( // identity template: copy everything else unchanged
   38             @"<template match='@*|node()' xmlns='http://www.w3.org/1999/XSL/Transform'>
   39               <copy>
   40                 <apply-templates select='@*|node()'/>
   41               </copy>
   42             </template>"));
   43         templates.Add(
   44             new XElement(xsltNs + "template",
   45                 new XAttribute("match", matchValue),
   46                 new XElement(xsltNs + "value-of",
   47                     new XAttribute("select", string.Format("{0}:Replace(., '{1}', '{2}')",
   48                         extNsPrefix, pattern, replacement)))));

   50         return templates;
   51     }

   53     private class XPathExtensions
   54     {
   55         public static string Replace(string input, string pattern, string replacement)
   56         {
   57             return Regex.Replace(input, pattern, replacement);
   58         }
   59     }

   61     private static readonly XNamespace xsltNs = "http://www.w3.org/1999/XSL/Transform";
   62     private static readonly XNamespace extNs = "urn:foo";
   63     private const string extNsPrefix = "ext";
   64 }

This code isn't advanced, but there are a few details that can take a bit of time to figure out. So that you can implement this sort of thing before your morning slips away and you miss your coffee break, I'll highlight them here:

  • The XSLT stylesheet is created dynamically at run-time using LINQ to XML.

  • The nested class, XPathExtensions, has a public, static method called Replace that applies a regex to a given string and returns the result.

  • This method is made available to the XSLT engine by associating the namespace, urn:foo, with an instance of my XPathExtensions nested class in an XsltArgumentList, which is then passed to the XslCompiledTransform object's Transform method on lines 21 - 24.

  • The namespace and the prefix (ext) are explicitly added to the XSLT document that is created with LINQ to XML by adding the XAttribute on line 12.

  • This extension is called in the select attribute on lines 47 - 48 by prefixing the XPath function name with ext and specifying the name of the public, static method on the XPathExtensions object that was passed into the engine via the XsltArgumentList.
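To see the extension-object mechanism on its own, without the LINQ to XML construction, here is a minimal sketch. It is not the code above: the stylesheet is written out as a string literal (roughly what the generated document should look like, with an xsl prefix I added for readability), and the names RegexExtensions, ExtensionObjectSketch, and Mask are made up for this illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical extension object: its public methods become callable from XPath.
public class RegexExtensions
{
    public string Replace(string input, string pattern, string replacement)
    {
        return Regex.Replace(input, pattern, replacement);
    }
}

public static class ExtensionObjectSketch
{
    // Must match the URI bound to the ext prefix in the stylesheet.
    private const string ExtNs = "urn:foo";

    // Hand-written equivalent of the generated stylesheet (an assumption,
    // not captured output). XPath string literals have no escapes, so \d
    // reaches Regex.Replace unchanged.
    private const string Stylesheet = @"
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:ext='urn:foo'>
  <xsl:template match='@*|node()'>
    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>
  </xsl:template>
  <xsl:template match='//phoneNumber/text()'>
    <xsl:value-of select=""ext:Replace(., '\d', 'X')""/>
  </xsl:template>
</xsl:stylesheet>";

    public static string Mask(string inputXml)
    {
        var transform = new XslCompiledTransform();
        using (var reader = XmlReader.Create(new StringReader(Stylesheet)))
            transform.Load(reader);

        // Associate the namespace URI with the object that implements Replace.
        var arguments = new XsltArgumentList();
        arguments.AddExtensionObject(ExtNs, new RegexExtensions());

        var output = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (var reader = XmlReader.Create(new StringReader(inputXml)))
        using (var writer = XmlWriter.Create(output, settings))
            transform.Transform(reader, arguments, writer);

        return output.ToString();
    }

    public static void Main()
    {
        // The phoneNumber text should come back as XXX-XXX-XXXX.
        Console.WriteLine(Mask("<r><phoneNumber>555-555-5555</phoneNumber><o>Foo</o></r>"));
    }
}
```

One thing to watch with the string.Format approach in the original listing: the pattern and replacement are pasted into single-quoted XPath literals, so a pattern that itself contains an apostrophe would produce an invalid expression.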

(Of all these details, the one that took me the longest to figure out was how to explicitly add a namespace with a given prefix when using LINQ to XML.)
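Since that was the sticking point, here is the prefix declaration in isolation. This is a small sketch rather than a complete stylesheet; the class and method names (NamespacePrefixSketch, BuildStylesheetRoot) are invented for the example. The trick is that XNamespace.Xmlns + "ext" yields the attribute name xmlns:ext, and its value is the namespace URI.

```csharp
using System;
using System.Xml.Linq;

public static class NamespacePrefixSketch
{
    public static XElement BuildStylesheetRoot()
    {
        XNamespace xslt = "http://www.w3.org/1999/XSL/Transform";
        XNamespace ext = "urn:foo";

        return new XElement(xslt + "stylesheet",
            new XAttribute("version", "1.0"),
            // An attribute in the xmlns namespace is a namespace declaration:
            // this one binds the prefix "ext" to urn:foo on the root element.
            new XAttribute(XNamespace.Xmlns + "ext", ext.NamespaceName));
    }

    public static void Main()
    {
        var root = BuildStylesheetRoot();

        // Ask LINQ to XML which prefix is in scope for the extension namespace.
        Console.WriteLine(root.GetPrefixOfNamespace("urn:foo"));

        // Print the serialized element to inspect the declarations it carries.
        Console.WriteLine(root);
    }
}
```

Without that attribute, nothing in the document tells the XSLT compiler what ext refers to, so an expression like ext:Replace(...) in a select attribute cannot be resolved.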

I hope this helps you solve your programming problems, and that you enjoy your coffee break!