Wednesday, January 10, 2007

Try/catch in XSLT 2.0: first test cases, first problems

Introduction

In addition to a comment and some advice on Try/catch in XSLT 2.0, Michael Kay gave me four interesting test cases today, in a post on the Saxon mailing list. His comment and advice were:

The design of this from the user perspective looks reasonable: it's better than my own attempt to do it solely using extension functions. I did it that way because I was targetting XQuery, but it's not ideal there either, if only because saxon:try() is not a true function.

The tricky part in doing try/catch properly, however, is the semantics. You need to make sure, as far as possible, that (a) you don't catch errors in expressions that are written outside the try but lazily evaluated within it, and (b) that you do catch errors in expressions that are written inside the try but lazily evaluated outside it. This involves both compile-time work, to suppress rewrites that move expressions into or out of the try block, and run-time work, to suppress lazy evaluation (or to make sure that lazily-evaluated expressions carry their catch block with them)

So I wrote a real test case for each of the cases he suggested, and added a fifth one that is, IMHO, more severe. Three of his four test cases passed and one failed. The fifth one causes an illegal state in the serializer.

Thank you again Mike for your input!
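
To make points (a) and (b) of the quoted advice concrete, here is a minimal sketch in Python rather than XSLT, since the effect does not depend on Saxon. It assumes nothing about Saxon's internals: Thunk is a hypothetical stand-in for a lazily evaluated expression, and the two functions only show where the error surfaces relative to the try block.

```python
class Thunk:
    """Hypothetical stand-in for a lazily evaluated expression."""

    def __init__(self, compute):
        self._compute = compute

    def force(self):
        # The error, if any, is raised here, not where the thunk was written.
        return self._compute()


def case_a():
    """Written outside the try, evaluated inside it: wrongly caught."""
    outside = Thunk(lambda: 1 // 0)     # written before the try block
    try:
        return outside.force()          # ...but only evaluated here
    except ZeroDivisionError:
        return "caught inside try"


def case_b():
    """Written inside the try, evaluated outside it: wrongly escapes."""
    try:
        inside = Thunk(lambda: 1 // 0)  # written inside the try block
    except ZeroDivisionError:
        return "caught inside try"
    try:
        return inside.force()           # evaluated after the try has exited
    except ZeroDivisionError:
        return "escaped to the caller"


print(case_a())  # caught inside try
print(case_b())  # escaped to the caller
```

In both functions the handler that sees the error is chosen by where the expression is evaluated, not by where it is written, which is exactly what the test cases below try to provoke.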

Results

Here is the output of all the test cases first, followed by each test case with a few comments.

(drkm) [168]> saxon --b --add-cp=fgeorges.jar -it main error-safe-01.xsl
<root>
   <div-by-0 i="1"/>
   <div-by-0 i="2"/>
   <div-by-0 i="3"/>
   <div-by-0 i="4"/>
   <div-by-0 i="5"/>
   <div-by-0 i="6"/>
   <div-by-0 i="7"/>
   <div-by-0 i="8"/>
   <div-by-0 i="9"/>
   <div-by-0 i="10"/>
</root>

(drkm) [169]> saxon --b --add-cp=fgeorges.jar -it main error-safe-02.xsl
<root>
   <div-by-0 where="In main."/>
</root>

(drkm) [170]> saxon --b --add-cp=fgeorges.jar -it main error-safe-03.xsl
<root>
   <ERROR what="Div by 0" where="In main!"/>
</root>

(drkm) [171]> saxon --b --add-cp=fgeorges.jar -it main error-safe-04.xsl
<root>
   <div-by-0/>
</root>

(drkm) [172]> saxon --b --add-cp=fgeorges.jar -it main error-safe-05.xsl
java.lang.IllegalStateException: Attempt to end document in serializer when elements are unclosed
        at net.sf.saxon.event.XMLEmitter.endDocument(XMLEmitter.java:110)
        at net.sf.saxon.event.ProxyReceiver.endDocument(ProxyReceiver.java:102)
        at net.sf.saxon.event.ProxyReceiver.endDocument(ProxyReceiver.java:102)
        at net.sf.saxon.event.ProxyReceiver.endDocument(ProxyReceiver.java:102)
        at net.sf.saxon.event.ProxyReceiver.endDocument(ProxyReceiver.java:102)
        at net.sf.saxon.event.ComplexContentOutputter.endDocument(ComplexContentOutputter.java:115)
        at net.sf.saxon.Controller.transformDocument(Controller.java:1654)
        at net.sf.saxon.Controller.transform(Controller.java:1438)
        at net.sf.saxon.Transform.execute(Transform.java:890)
        at net.sf.saxon.Transform.doTransform(Transform.java:491)
        at net.sf.saxon.Transform.main(Transform.java:60)
Fatal error during transformation: Attempt to end document in serializer when elements are unclosed

(drkm) [173]> 

error-safe-01.xsl

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:ex="java:/org.fgeorges.exslt2.saxon.Exslt2InstructionFactory"
                exclude-result-prefixes="xs"
                extension-element-prefixes="ex"
                version="2.0">

  <xsl:output indent="yes" omit-xml-declaration="yes"/>

  <!--
      Test from Michael Kay, see http://sf.net/mailarchive/message.php?msg_id=37863852

      (the danger here is that the 1 div $zero, being independent of
      the loop context, will be evaluated outside the loop. This might
      be OK: In fact, I think it probably will be OK, because although
      Saxon moves the expression outside the loop, it always evaluates
      it lazily to avoid triggering errors if the loop is executed
      zero times. But it needs testing).
  -->

  <xsl:template name="main">
    <ex:error-safe>
      <ex:try>
        <root>
          <xsl:call-template name="test"/>
        </root>
      </ex:try>
      <ex:catch>
        <ERROR what="Div by 0 error not caught!"/>
      </ex:catch>
    </ex:error-safe>
  </xsl:template>

  <xsl:template name="test">
    <xsl:param name="zero" select="0" as="xs:integer"/>
    <xsl:for-each select="1 to 10">
      <ex:error-safe>
        <ex:try>
          <xsl:value-of select="1 div $zero"/>
        </ex:try>
        <ex:catch>
          <div-by-0 i="{ . }"/>
        </ex:catch>
      </ex:error-safe>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
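
The comment in the stylesheet above describes a loop-invariant expression that is moved out of the loop but still evaluated lazily. A rough Python sketch of that idea follows; LazyValue and run_loop are hypothetical names, and this is only an illustration of the principle, not Saxon's actual mechanism.

```python
class LazyValue:
    """Memoized thunk: evaluated only on first use, cached on success."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def get(self):
        if not self._done:
            # A failed evaluation is not cached, so the error is
            # raised again each time the value is requested.
            self._value = self._compute()
            self._done = True
        return self._value


def run_loop(items, zero):
    # "1 div zero" does not depend on the loop item, so it is hoisted
    # out of the loop, but wrapped so nothing is computed yet.
    hoisted = LazyValue(lambda: 1 // zero)
    out = []
    for i in items:
        try:
            out.append(hoisted.get())
        except ZeroDivisionError:
            out.append(f"div-by-0 i={i}")
    return out


print(run_loop(range(1, 4), 0))  # every iteration catches its own error
print(run_loop([], 0))           # zero iterations: nothing is evaluated
```

Because the hoisted value is only forced inside the loop body, the error is raised within each iteration's try, and a loop that runs zero times raises nothing at all.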

error-safe-02.xsl

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:ex="java:/org.fgeorges.exslt2.saxon.Exslt2InstructionFactory"
                exclude-result-prefixes="xs"
                extension-element-prefixes="ex"
                version="2.0">

  <xsl:output indent="yes" omit-xml-declaration="yes"/>

  <!--
      Test from Michael Kay, see http://sf.net/mailarchive/message.php?msg_id=37863852

      Here the error shouldn't be caught, but it might be, because of
      lazy evaluation.
  -->

  <xsl:template name="main">
    <root>
      <ex:error-safe>
        <ex:try>
          <xsl:call-template name="test"/>
        </ex:try>
        <ex:catch>
          <div-by-0 where="In main."/>
        </ex:catch>
      </ex:error-safe>
    </root>
  </xsl:template>

  <xsl:template name="test">
    <xsl:param name="zero" select="0" as="xs:integer"/>
    <xsl:variable name="v" select="for $n in 1 to 10 return $n div $zero"/>
    <xsl:for-each select="1 to 10">
      <ex:error-safe>
        <ex:try>
          <xsl:value-of select="$v[current()]"/>
        </ex:try>
        <ex:catch>
          <ERROR what="Div by 0 error caught!" i="{ . }"/>
        </ex:catch>
      </ex:error-safe>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>

error-safe-03.xsl

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:my="my:error-safe-03.xsl"
                xmlns:saxon="http://saxon.sf.net/"
                xmlns:ex="java:/org.fgeorges.exslt2.saxon.Exslt2InstructionFactory"
                exclude-result-prefixes="my saxon xs"
                extension-element-prefixes="ex"
                version="2.0">

  <xsl:output indent="yes"/>

  <!--
      Test from Michael Kay, see http://sf.net/mailarchive/message.php?msg_id=37863852

      Here the risk is that the evaluation is done lazily after exit
      from the function, outside the range of the try/catch. Again, I
      think this might work OK, but it needs careful checking. (If it
      works, it's because the whole xsl:error-safe expression is being
      lazily evaluated).
  -->

  <xsl:template name="main">
    <root>
      <ex:error-safe>
        <ex:try>
          <xsl:sequence select="my:div(1 to 5, 0)"/>
        </ex:try>
        <ex:catch>
          <ERROR what="Div by 0" where="In main!"/>
        </ex:catch>
      </ex:error-safe>
    </root>
  </xsl:template>

  <xsl:function name="my:div">
    <xsl:param name="n" as="xs:integer*"/>
    <xsl:param name="d" as="xs:integer"/>
    <ex:error-safe saxon:explain="yes">
      <ex:try>
        <xsl:value-of select="for $i in $n return $i div $d"/>
      </ex:try>
      <ex:catch>
        <div-by-0/>
      </ex:catch>
    </ex:error-safe>
  </xsl:function>

</xsl:stylesheet>

This test case failed: the error is caught in the main template instead of within the my:div() function. I looked deeper into Saxon's internals, but I am still lost among all the rewriting, optimizing, simplifying, lazy evaluation, and so on. It is interesting to see the expression trees after optimization:

Optimized expression tree for template at line 23 in error-safe-03.xsl:
    element
      name root
      content
      error-safe
        try
          call my:div
            operator to
              1
              5
            0
        catch
          element
            name ERROR
            content
                attribute
                  name what
                  "Div by 0"
                attribute
                  name where
                  "In main!"
Optimized expression tree for function my:div at line 38 in error-safe-03.xsl:
    error-safe
      try
        value-of
          construct simple content
            for $i as xs:integer in
              $n
            return
              operator div
                $i
                $d
            " "
      catch
        element
          name div-by-0
          content
          ()

I need to investigate further (and, I think, get more help) to see whether this can be solved.
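
To show the shape of the problem, and two directions a fix could take, here is a small Python sketch. It is only an analogy: a Python generator plays the role of the lazily evaluated function result, all function names are hypothetical, and nothing here is taken from Saxon's code.

```python
def lazy_div(numbers, d):
    """Generator: no division happens until the caller iterates."""
    return (n // d for n in numbers)


def div_lazy(numbers, d):
    # Mirrors the failing test: the try exits before anything is evaluated.
    try:
        return lazy_div(numbers, d)
    except ZeroDivisionError:
        return ["div-by-0 (in function)"]


def div_eager(numbers, d):
    # Remedy 1: suppress lazy evaluation by forcing the result inside the try.
    try:
        return list(lazy_div(numbers, d))
    except ZeroDivisionError:
        return ["div-by-0 (in function)"]


def div_guarded(numbers, d):
    # Remedy 2: stay lazy, but let the lazy value carry its catch block.
    def guarded():
        try:
            yield from lazy_div(numbers, d)
        except ZeroDivisionError:
            yield "div-by-0 (in function)"
    return guarded()


def main(div):
    try:
        return list(div(range(1, 6), 0))
    except ZeroDivisionError:
        return ["ERROR in main"]


print(main(div_lazy))     # ['ERROR in main']
print(main(div_eager))    # ['div-by-0 (in function)']
print(main(div_guarded))  # ['div-by-0 (in function)']
```

Note that the second remedy keeps any items produced before the error, which raises the same question about partial results as the fifth test case.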

error-safe-04.xsl

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:ex="java:/org.fgeorges.exslt2.saxon.Exslt2InstructionFactory"
                exclude-result-prefixes="xs"
                extension-element-prefixes="ex"
                version="2.0">

  <xsl:output indent="yes" omit-xml-declaration="yes"/>

  <!--
      Test from Michael Kay, see http://sf.net/mailarchive/message.php?msg_id=37863852

      Here the risk is that the error won't be caught because the
      expression is evaluated early, at compile time (the so-called
      "constant folding" process).
  -->

  <xsl:template name="main">
    <root>
      <ex:error-safe>
        <ex:try>
          <xsl:call-template name="test"/>
        </ex:try>
        <ex:catch>
          <ERROR what="Div by 0" where="In main!"/>
        </ex:catch>
      </ex:error-safe>
    </root>
  </xsl:template>

  <xsl:template name="test">
    <xsl:variable name="zero" select="0" as="xs:integer"/>    
      <ex:error-safe>
        <ex:try>
           <xsl:value-of select="1 div $zero"/>
        </ex:try>
        <ex:catch>
          <div-by-0/>
        </ex:catch>
      </ex:error-safe>
  </xsl:template>

</xsl:stylesheet>
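
The constant folding risk mentioned in the comment above can be sketched with a toy expression tree in Python. The names fold, evaluate and OPS are hypothetical, and this is not a description of what Saxon actually does: it only shows one way an optimizer can pre-compute constants while leaving a failing expression in place, so that the error is raised at run time where an enclosing try can see it.

```python
import operator

OPS = {"add": operator.add, "div": operator.floordiv}


def fold(expr):
    """Fold constant sub-expressions, but never raise at 'compile time'."""
    if not isinstance(expr, tuple):
        return expr                      # integer literal or variable name
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        try:
            return OPS[op](left, right)  # safe to pre-compute
        except ArithmeticError:
            pass                         # keep the node; fail at run time
    return (op, left, right)


def evaluate(expr, env):
    """Run-time evaluation of a (possibly folded) expression."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    return OPS[op](evaluate(left, env), evaluate(right, env))


print(fold(("add", ("add", 1, 2), "x")))  # ('add', 3, 'x')
print(fold(("div", 1, 0)))                # ('div', 1, 0), left unfolded

try:
    evaluate(fold(("div", 1, 0)), {})
except ZeroDivisionError:
    print("caught at run time")
```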

error-safe-05.xsl

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:my="my:error-safe-05.xsl"
                xmlns:ex="java:/org.fgeorges.exslt2.saxon.Exslt2InstructionFactory"
                exclude-result-prefixes="xs"
                extension-element-prefixes="ex"
                version="2.0">

  <xsl:output indent="yes" omit-xml-declaration="yes"/>

  <!--
      The error occurs in the middle of an element construction.  What
      to do?
  -->

  <xsl:template name="main">
    <ex:error-safe>
      <ex:try>
        <root>
          <xsl:sequence select="error(xs:QName('my:ERROR'), 'Error')"/>
        </root>
      </ex:try>
      <ex:catch>
        <error/>
      </ex:catch>
    </ex:error-safe>
  </xsl:template>

</xsl:stylesheet>

This test case shows, in my opinion, the worst problem, one that is both technical and specification-related. The test case looks simple: the sequence constructor of an ex:try element contains a single literal result element, which in turn contains a simple xsl:sequence instruction that throws an error. Nothing unusual; if anything, simpler than usual.

Usually, when an error is thrown, the transformation fails and stops abruptly, and the result is not serialized. With ex:error-safe, however, the error is caught and the transformation continues. But the closing tag of the literal element within which the error was thrown is never emitted. So when the result "tree" is serialized, an error occurs, because the sequence of events no longer represents a well-formed XML fragment.

So I need to define what to do in this case. Discard all the pending results that the sequence constructor has already generated? Close every node that is still open? Keep the items of the sequence except the last one, if it is a node that is not fully built?

From the points of view of both the specification and the implementation, I am not sure what the right thing to do is.
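
To compare the first two options listed above, here is a small Python sketch of an event buffer with a checkpoint. EventBuffer and its methods are hypothetical and much simpler than a real serializer pipeline; the sketch assumes the try body never closes an element that was opened outside it, which holds for a sequence constructor.

```python
class EventBuffer:
    """Hypothetical stand-in for a receiver of start/end element events."""

    def __init__(self):
        self.events = []
        self.open = []  # names of elements started but not yet ended

    def start(self, name):
        self.events.append(("start", name))
        self.open.append(name)

    def end(self):
        self.events.append(("end", self.open.pop()))

    def checkpoint(self):
        return len(self.events), len(self.open)

    def rollback(self, mark):
        """Option 1: discard everything emitted since the checkpoint."""
        n_events, n_open = mark
        del self.events[n_events:]
        del self.open[n_open:]

    def close_pending(self, mark):
        """Option 2: keep partial output, close what is still open."""
        while len(self.open) > mark[1]:
            self.end()


def run(recover):
    out = EventBuffer()
    mark = out.checkpoint()           # taken on entry to the try
    try:
        out.start("root")
        raise ValueError("my:ERROR")  # error in the middle of <root>
    except ValueError:
        recover(out, mark)
        out.start("error")            # content of the catch block
        out.end()
    return out.events


print(run(EventBuffer.rollback))
print(run(EventBuffer.close_pending))
```

With rollback only the error element remains; with close_pending the result holds an empty root element followed by the error element. In both cases the event sequence is balanced again, so a serializer would no longer see unclosed elements.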

To be continued...
