Discussion:
[xquery-talk] Fold-right
W.S. Hager
2017-01-12 12:21:25 UTC
Permalink
Hello,

Not sure if anyone already noticed, but there's an error in the
XPath/XQuery F&O docs, where the example implementation of array:fold will
always return an array...

More importantly, it seems to me that a textbook implementation of
fold-right was used, without the notice that this will only work in a lazy
evaluation context. If you inspect the code as eagerly evaluated, you'll
see that the array is looped over multiple times (the result is the same of
course). This makes for a very inefficient loop AFAICT, but I may be
mistaken.

I just discovered that Clojure doesn't even have fold-right, because it
isn't lazily evaluated (since Clojure isn't interpreted it's probably too
complicated on the JVM).

I guess for XQuery an interpreter could defer evaluation when it encounters
specific calls, but a superficial web search on this subject related to
e.g. Saxon doesn't provide specific answers, only that (understandably) the
number of performance improvements has grown over time, and perhaps
deferring evaluation is one of them. However, as I'm re-evaluating (no pun
intended) my investment in XQuery, I'm eager to hear if there are some more
theoretical grounds for providing said example, or if the committee intends
to one day provide them.

Is there any assumption about the laziness of an XQuery implementation? Are
there any a priori lazy implementations out there?

Thanks,
Wouter
Liam R. E. Quin
2017-01-12 13:02:54 UTC
Permalink
Post by W.S. Hager
Hello,
Not sure if anyone already noticed, but there's an error in the
XPath/XQuery F&O docs, where the example implementation of array:fold
will always return an array...
I don't believe so. Please open a bug report (see the status section of
the document for details on how to do that).
Post by W.S. Hager
More importantly, it seems to me that a textbook implementation of
fold-right was used, without the notice that this will only work in a
lazy evaluation context.
XQuery is designed so that a hybrid lazy evaluation strategy is
encouraged. Queries are rewritten extensively by many implementations.
Having said that, we don't provide e.g. infinite sequences, so it's
also possible to implement an entirely eager XQuery.

The example implementations here and there in the spec are not
necessarily efficient but they should (unless otherwise noted in the
text) be correct in all cases.

Liam
--
Liam R. E. Quin <***@w3.org>
The World Wide Web Consortium (W3C)
_______________________________________________
***@x-query.com
http://x-query.com/mailman/listinfo/talk
Michael Kay
2017-01-12 13:09:22 UTC
Permalink
Post by W.S. Hager
Hello,
Not sure if anyone already noticed, but there's an error in the XPath/XQuery F&O docs, where the example implementation of array:fold will always return an array...
If they have noticed, they haven't reported it. Logged here as a bug:

https://www.w3.org/Bugs/Public/show_bug.cgi?id=30045

If you spot such things in future, please report them using the feedback mechanisms described in the status section of the spec.
Post by W.S. Hager
More importantly,
No, I think the subsequent points are far less important. The code used to specify the effect of the function is intended to be a specification, not an efficient implementation.
Post by W.S. Hager
it seems to me that a textbook implementation of fold-right was used, without the notice that this will only work in a lazy evaluation context. If you inspect the code as eagerly evaluated, you'll see that the array is looped over multiple times (the result is the same of course). This makes for a very inefficient loop AFAICT, but I may be mistaken.
I don't think it's true that it only works with lazy evaluation. It may well be true that it only works efficiently with lazy evaluation, but that's irrelevant: this is a specfication of the effect of the function, not a proposed implementation.
Post by W.S. Hager
Is there any assumption about the laziness of an XQuery implementation? Are there any a priori lazy implementations out there?
I think an implementation without lazy evaluation would be incredibly inefficient, and I would expect that most serious implementations use lazy evaluation to a lesser or greater extent. Certainly Saxon does so. I have seen open source XPath implementations that materialize all intermediate results, and frankly I don't think such an implementation is fit for purpose.

As it happens, this very morning I've been working on a problem where excessive use of lazy evaluation causes stack overflow problems when fold-left is used to process a very long sequence: you can find it here if you're interested:

https://saxonica.plan.io/issues/3106

Michael Kay
Saxonica


_______________________________________________
***@x-query.com
http://x-query.com/mailman/listinfo/talk
W.S. Hager
2017-01-12 15:17:39 UTC
Permalink
Hello Michael,
Post by Michael Kay
https://www.w3.org/Bugs/Public/show_bug.cgi?id=30045
Thanks, will do it next time.

No, I think the subsequent points are far less important. The code used to
Post by Michael Kay
specify the effect of the function is intended to be a specification, not
an efficient implementation.
Agreed, but that wasn't my point. You may have the opinion that it wasn't
important, but I'm curious to know where anything tangible on laziness is
mentioned. As you say, not having any won't be very efficient, so you may
as well be explicit about it, right? I don't really understand why it's
preferable to have a syntax without an implementation, and I simply pointed
out that in the case of the fold-right example that becomes slightly odd...
Michael Kay
2017-01-12 16:27:37 UTC
Permalink
Agreed, but that wasn't my point. You may have the opinion that it wasn't important, but I'm curious to know where anything tangible on laziness is mentioned.
It isn't - deliberately. We leave "quality of implementation" issues entirely to the implementor. There are many implementation techniques available, including ones that may not have been invented yet, and there are different trade-offs between time and memory, and the spec quite deliberately doesn't get involved in such matters. The spec tells you what result to expect, it doesn't tell you when to expect it.

Michael Kay
Saxonica
As you say, not having any won't be very efficient, so you may as well be explicit about it, right? I don't really understand why it's preferable to have a syntax without an implementation, and I simply pointed out that in the case of the fold-right example that becomes slightly odd...
_______________________________________________
***@x-query.com
http://x-query.com/mailman/listinfo/talk
W.S. Hager
2017-01-12 17:10:14 UTC
Permalink
Can we please agree to add a footnote to the fold-right example that says
something along the lines of "this example algorithm may be optimized using
lazy evaluation in the interpreter"? I don't see how that is biasing the
spec toward a specific implementation...
Post by W.S. Hager
Agreed, but that wasn't my point. You may have the opinion that it
wasn't important, but I'm curious to know where anything tangible on
laziness is mentioned.
It isn't - deliberately. We leave "quality of implementation" issues
entirely to the implementor. There are many implementation techniques
available, including ones that may not have been invented yet, and there
are different trade-offs between time and memory, and the spec quite
deliberately doesn't get involved in such matters. The spec tells you what
result to expect, it doesn't tell you when to expect it.
Michael Kay
Saxonica
Post by W.S. Hager
As you say, not having any won't be very efficient, so you may as well
be explicit about it, right? I don't really understand why it's preferable
to have a syntax without an implementation, and I simply pointed out that
in the case of the fold-right example that becomes slightly odd...
--
W.S. Hager
Lagua Web Solutions
http://lagua.nl
W.S. Hager
2017-01-12 17:21:56 UTC
Permalink
Better still, remove fold-right entirely, since laziness isn't part of the
spec (specifically infinite lists).
Post by W.S. Hager
Can we please agree to add a footnote to the fold-right example that says
something along the lines of "this example algorithm may be optimized using
lazy evaluation in the interpreter"? I don't see how that is biasing the
spec toward a specific implementation...
Post by W.S. Hager
Agreed, but that wasn't my point. You may have the opinion that it
wasn't important, but I'm curious to know where anything tangible on
laziness is mentioned.
It isn't - deliberately. We leave "quality of implementation" issues
entirely to the implementor. There are many implementation techniques
available, including ones that may not have been invented yet, and there
are different trade-offs between time and memory, and the spec quite
deliberately doesn't get involved in such matters. The spec tells you what
result to expect, it doesn't tell you when to expect it.
Michael Kay
Saxonica
Post by W.S. Hager
As you say, not having any won't be very efficient, so you may as well
be explicit about it, right? I don't really understand why it's preferable
to have a syntax without an implementation, and I simply pointed out that
in the case of the fold-right example that becomes slightly odd...
--
W.S. Hager
Lagua Web Solutions
http://lagua.nl
--
W.S. Hager
Lagua Web Solutions
http://lagua.nl
Michael Kay
2017-01-12 18:44:35 UTC
Permalink
There's nothing specific to fold-right here. It's a general property that when we specify functions using algorithms like this, the algorithms are not designed to be efficient. And in turn, that's a property of specifications in general, not just this one. It's the way standards work.

Michael Kay
Saxonica
Better still, remove fold-right entirely, since laziness isn't part of the spec (specifically infinite lists).
Can we please agree to add a footnote to the fold-right example that says something along the lines of "this example algorithm may be optimized using lazy evaluation in the interpreter"? I don't see how that is biasing the spec toward a specific implementation...
Agreed, but that wasn't my point. You may have the opinion that it wasn't important, but I'm curious to know where anything tangible on laziness is mentioned.
It isn't - deliberately. We leave "quality of implementation" issues entirely to the implementor. There are many implementation techniques available, including ones that may not have been invented yet, and there are different trade-offs between time and memory, and the spec quite deliberately doesn't get involved in such matters. The spec tells you what result to expect, it doesn't tell you when to expect it.
Michael Kay
Saxonica
As you say, not having any won't be very efficient, so you may as well be explicit about it, right? I don't really understand why it's preferable to have a syntax without an implementation, and I simply pointed out that in the case of the fold-right example that becomes slightly odd...
--
W.S. Hager
Lagua Web Solutions
http://lagua.nl <http://lagua.nl/>
--
W.S. Hager
Lagua Web Solutions
http://lagua.nl <http://lagua.nl/>
Loading...