Magpie Engineer


My own thoughts on fun with Data, MarkLogic, XQuery, fp, emacs, opencv, unix and anything shiny that catches my eye ...


Not your Father's XQuery

Last year, I embarrassed myself slightly by losing patience with someone who was blindly slagging off XSLT.

What the (unnamed) individual did not realise was that they were talking about an ancient version of XSLT. We can partially blame browser built in support of v1.0 (almost to the point where I think we should lobby for removal of XSLT 1.0 from all browsers!).

Even with the existance of the excellent SAXON-CE XSLT v2.0 implementation for javascript, its hard to get folks to take another look at XSLT in a browser context (that won’t stop me from using it!).


Observation: If javascript v1.0 was used in the browser would it enjoy the broad usage it has today?


I should soften my stance a little - I have not been as charitable with younger programmers as older programmers were with me when I was first learning the trade. The rate of change in our industry is ridiculous and expecting newcomers to have all the 'context' about any particular programming language is quite impossible.

The onus is upon those with the 'context' to communicate what a programming language is today.


XQuery is undergoing a similar situation where developers are forming opinions based on v1.0 implementations of the language. While most of us have known for a while about XQuery 3.0 precious little indication of this change is apparant for new users (searching google …​ hence this article).

To feel like you are travelling back in time (or to another universe) just take a look at how xquery gets used in any of Oracle or Microsoft’s database offerings.

When I look at many of these XQuery examples, in those environments, my eyes go blurry and I feel light headed.

What is interesting is that there has been recent work in XQuery in some products, for example there is XQuery 3.0 functionality listed in Oracle documentation for 'Oracle XQuery for Hadoop'. How a developer could ever find this out without digging (as I did) I do not know.

There is also the paradox of there being many XQuery v1.0 processors in existance …​ one example, the QT XQuery engine is set in a popular set of c++ extensions, provides good support for XQuery v1.0. As with XSLT I think we need to speak more about the latest version of the language.


Observation: XQuery v1.0 was good but XQuery 3.# makes XQuery a highly capable functional programming language.


What is true, is that anyone using XQuery in those environments would conclude XQuery is:

  • only for querying XML

  • an unwelcome hitchhiker embedded in much simpler SQL

  • not very elegant

  • stale eg. developed in a previous decade

As a daily user of modern XQuery, the most exciting and defining characteristics of XQuery are the advancements its made over the past couple of years.


Observation: We should try to clarify what XQuery is today.


What is XQuery today ?

I will reproduce the current Wikipedia definition of XQuery:

XQuery is a query and functional programming language that queries and transforms collections of structured and unstructured data, usually in the form of XML, text and with vendor-specific extensions for other data formats (JSON, binary, etc.).

Even this definition is a bit out of date, as the latest specification (XQuery v3.1) which is a W3C Candidate Recommendation adds maps, arrays and string constructors to the language making it easy to work with other text based formats like JSON, html5, SPARQL, etc allowing XQuery to used as a meta approach to working across many different data formats.

An excellant source of runnable XQuery examples is provided by eXistDB here:

And many of the historical XML databases have now become general databases able to natively represent JSON and work with a range of data formats (as well as triples,SQL, etc):

What is XQuery 1.0 ?

The XQuery 1.0 spec was released circa 2010. With a long gestation period many vendors had already developed full blown processors before ratification of the specification as a Recommendation.

Some products use early draft versions of XQuery 1.0 - for example Microsoft SQL server decided XQuery 2004 draft was sufficient and never bothered to update.

What is XQuery 2.0 ?

The WG decided to skip this version number to align with XSLT v3.0.

What is XQuery 3.#?

This is the minimum version that you should be trying to use.

The following products contain an XQuery processor with 3.-ou can get download XQuery processor from:

What is XQuery 3.0 ?

The XQuery 3.0 spec was released circa 2014 and includes a raft of fixes and new features to the language focusing on:

  • ease of use

  • advanced grouping and selection

  • functions as first class citizen

  • annotations

  • new functions for math, formatting, string manipulation

These improvements make XQuery 3.0 a full blown functional programming language which works well with text and xml.

The newer XQuery spec do a clear job at listing out 'whats new'.

Some of the new features like:

  • try/catch

  • string concat with ||

  • mapping operator ! for simple for expressions

  • count clause in FLWOR Expressions

  • Switch expressions

  • Computed namespace constructors

  • Output declarations

make the language more complete and formalise many common extensions found out in the wild.

Others enable novel grouping and selection mechanisms:

  • group by clause in FLWOR Expressions

  • tumbling window and sliding window in FLWOR Expressions

The most exciting (for this programmer) is fully embedding the notion of functional programming into the language. For example, inline functions are expressions and can appear anywhere an expression is allowed.

let $sq :=
 function($i as xs:integer) as xs:integer {
 $i * $i
 }

And variables with the function type can be passed around to other functions.

For example, here is a list of built in functions that accept a function.

  • fn:filter($function, $sequence)

  • fn:map($function, $sequence)

  • fn:map-pairs($function, $seq1, $seq2)

  • fn:fold-left($function, $initial, $sequence)

  • fn:fold-right($function, $initial, $sequence)

Implementing the common fold, map and filtering idioms.

Along with some helper functions to introspect information:

  • function-lookup($function)

  • function-name($function)

  • function-arity($function)

Having first class functions in a language allows implementation of dynamic dispatch and provide polymorphism-like features. It also allows developers to avoid the cognitive load of writing functions with explicit recursion which I find a common problem teaching others how to use xquery.

Next up, annotations, we see the addition of annotations eg. the ability to put metadata on variables and functions with annotation. This enables all kinds of fun stuff …​ for example RESTXQ is predicated on the ability to define annotations.

declare
%rest:GET
%rest:path("/stock/widget/{$id}")
function local:widget($id as xs:int) {
  fn:collection("/db/widgets")/widget[@id eq $id]
};

Lastly, there is a bunch of other new functions:

  • format-date(), format-number(), generateid(),unparsed-text() etc

  • trig/math functions: sin(), cos(), sqrt() etc

  • analyze-string()

with some functions being adopted from latest XSLT:

  • head(), tail(), path()

  • environment-variable(), uri-collection()

  • parse(), serialize()

  • Function assertions in function tests.

What is XQuery 3.1 ?

XQuery 3.1 is in Candidate recomendation status, which means its not yet a W3C recommendation but very close.

The main enhancements revolve around:

  • maps

  • arrays

  • string constructors

all which make it much easier to work with other data formats (like json, etc).

Summary

If you got this far then you already know the summary.

use XQuery 3.0, not XQuery 1.0 !

And tell your friends, family and fellow programmers about this excellent functional programming language.