[xquery-talk] arrow operator

Discussion:

Michael Kay

2017-08-02 15:44:32 UTC

To put things in perspective, => was introduced in version 3.1, together with map and array support. As far as I understand it and looking again at the specs, one of the use cases is for => to be used with maps and arrays, in an object-oriented style.

In particular, arrays don't have a filter operator in the way that sequences do, neither do they have a mapping operator, so to select into an array of maps you need to do things like

$A => a:filter( function($m) {$m?name='Mike'}) => a:for-each( function($m) { $m?age })

which gets terribly unwieldy if you try to write it with conventional nested function calls.
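
For comparison, here is a minimal sketch of both styles side by side. It assumes an XQuery 3.1 processor, binds the prefix a: to the standard array namespace, and uses hypothetical sample data:

```xquery
xquery version "3.1";
declare namespace a = "http://www.w3.org/2005/xpath-functions/array";

(: Hypothetical sample data: an array of maps. :)
let $A := [
  map { "name": "Mike", "age": 42 },
  map { "name": "Anna", "age": 37 }
]
return (
  (: Arrow style: reads left to right. :)
  $A => a:filter(function($m) { $m?name = 'Mike' })
     => a:for-each(function($m) { $m?age }),

  (: The same computation as nested calls: reads inside out. :)
  a:for-each(
    a:filter($A, function($m) { $m?name = 'Mike' }),
    function($m) { $m?age }
  )
)
```

With this sample data both expressions should yield the array [42].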

I wish we had done a concise anonymous function declaration as well:

$A => a:filter( {?name='Mike'} ) => a:for-each( {?age} )

Here {EXPR} is shorthand for "function($x){ $x ! EXPR }"

but that would have been a bridge too far for some people.
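
The {EXPR} shorthand is only a wish and is not valid XQuery 3.1. As a sketch, the expansion it stands for can be written out by hand today; the a: prefix binding and the sample data below are assumptions made for illustration:

```xquery
xquery version "3.1";
declare namespace a = "http://www.w3.org/2005/xpath-functions/array";

(: Hypothetical sample data. :)
let $A := [
  map { "name": "Mike", "age": 42 },
  map { "name": "Anna", "age": 37 }
]
return
  (: {?name='Mike'} written out as function($x) { $x ! EXPR }.
     Inside "$x ! ...", the unary lookup ?name applies to the
     context item, which is the map bound to $x. :)
  $A => a:filter(function($x) { $x ! (?name = 'Mike') })
     => a:for-each(function($x) { $x ! ?age })
```

With this sample data the result should again be [42].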

Michael Kay
Saxonica

_______________________________________________
***@x-query.com
http://x-query.com/mailman/listinfo/talk

Christian Grün

2017-08-02 15:01:45 UTC

Why can't the context contain a sequence?

Please have a look at the specification for more information on the
context item:

https://www.w3.org/TR/xquery-31/#dt-context-item

Dear Wouter,
I see your point and it makes sense. The tricky part here, following up on
your suggestion to make the choice of the first position less arbitrary, is
that the context item must be an item: it cannot be a sequence of several
items. This eliminates any design based on communicating the value on the
left through the context item mechanism. The closest to the intent of =>
would be using a let variable binding.
The convention of binding the first argument, of course, implies that the
(or some) functions are designed having in mind that the first parameter is
“special”, that is, if the function is intended to be used with =>. In Java
or C++, the self/this implied parameter is special, too (but it is not
considered to be at any position).
Kind regards,
Ghislain
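
A minimal sketch of the let-binding approach mentioned above, together with one alternative that stays within XQuery 3.1: an inline function on the right-hand side of =>, which lets the left-hand value land in any argument position. The variable names and values are hypothetical:

```xquery
xquery version "3.1";

let $array := ["a", "b", "c"]
let $position := 2
return (
  (: A let binding names the value, so it can be used in any argument slot. :)
  let $p := $position
  return array:remove($array, $p),

  (: An inline function after => does the same; unlike the context item,
     its parameter could also be bound to a sequence of several items. :)
  $position => (function($p) { array:remove($array, $p) })()
)
```

Both expressions should return the array ["a", "c"].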


Christian Grün

2017-08-02 14:30:14 UTC

This I understand, but the only thing that bothers me is that the implicit
binding of the first argument prevents the use of other arguments, whereas
an explicit context would've allowed for the more flexible option of
$position => array:remove($array, .)

A similar proposal was discussed in the W3 Bugzilla tracker some time ago:

https://www.w3.org/Bugs/Public/show_bug.cgi?id=26889

The majority of the W3 members voted for the initial proposal.

Here is yet another related idea that ended up in the 3.2 backlog:

https://www.w3.org/Bugs/Public/show_bug.cgi?id=29393

Dear Wouter,
Indeed, in some cases, whether to use => or ! can be a matter of taste.
To put things in perspective, => was introduced in version 3.1, together
with map and array support. As far as I understand it and looking again at
the specs, one of the use cases is for => to be used with maps and arrays,
in an object-oriented style. The XPath and XQuery Functions and Operators
["a", "b", "c"] => array:get(2)
$array => array:remove($position) => array:insert-before($position, $member)
Kind regards,
Ghislain
[1] https://www.w3.org/TR/xpath-functions-31/
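
The two array expressions above can be run as a self-contained sketch. The values bound to $array, $position and $member are hypothetical:

```xquery
xquery version "3.1";

let $array := ["a", "b", "c"]
let $position := 2
let $member := "B"
return (
  (: Same as array:get(["a", "b", "c"], 2). :)
  ["a", "b", "c"] => array:get(2),

  (: Replace the member at $position: remove it, then insert the new one. :)
  $array => array:remove($position)
         => array:insert-before($position, $member)
)
```

This should return the string "b" followed by the array ["a", "B", "c"].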

That's why, for cases where it's possible, I preferred to write the
simple mapping operator, as it's easier to read IMO.
Hi Ghislain,
Alright, I forgot about fn:replace#4, but the implicit binding is
sometimes hard to detect.
Thanks.
is no context item involved, I wrote too fast. :-)
@address ! string(.) (: explicitly passed context item :)
@address ! string() (: context item passed implicitly to string#0, which is context-dependent :)
@address => string() (: @address passed implicitly to string#1 as the first parameter via the => operator, but string#1 is context-independent :)
@address => string(.) (: error: string#2 does not exist :)
Kind regards,
Ghislain
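
As a runnable sketch of the three valid forms, assuming a hypothetical element that carries the address attribute:

```xquery
xquery version "3.1";

(: Hypothetical element with an address attribute. :)
let $e := <person address="12 High Street"/>
return (
  $e/@address ! string(.),   (: context item passed explicitly to string#1 :)
  $e/@address ! string(),    (: string#0 reads the context item set by "!" :)
  $e/@address => string()    (: left-hand side becomes the argument of string#1 :)
)
```

All three should return "12 High Street". The fourth form, with => string(.), is left out because it would look up the non-existent string#2.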

Dear Wouter,
There is one more important difference on the syntactic level.
With the arrow operator, the left-hand-side is implicitly bound to the
first parameter of the function.
@address => replace(@postcode, "", "q")
is the same as
@address ! replace(., @postcode, "", "q")
With the simple map operator, the context item must be passed explicitly.
What may create confusion is that some functions have several
signatures, some of which implicitly refer to the context item. But this is
a very different mechanism.
@address ! string(.) (: explicitly passed context item :)
@address ! string() (: context item passed implicitly to string#0, which is context-dependent :)
@address => string() (: context item passed implicitly to string#1 via the => operator, but string#1 is context-independent :)
I hope I got it right!
Kind regards,
Ghislain

Hi Michael,
The way you used the arrow operator in the example would be the way I
expected it to work, namely by explicitly addressing the context, but it
seems that it doesn't. It's actually implicitly binding the first argument
of the function on the right to the value on the left. Or is there an
exception I don't know about?
Thanks.
In the case of singletons there's very little difference, but (as I
now see Christian has pointed out), with sequences the effect is quite
different.
Also, of course, "!" changes the context item, so
@address => replace(@postcode, "", "q") works, while
@address ! replace(@postcode, "", "q") doesn't.
Michael Kay
Saxonica

Hi,
Is there any advantage to using the 3.1 arrow operator over the
simple map operator?
$string => upper-case() => normalize-unicode() => tokenize("\s+")
versus
$string ! upper-case(.) ! normalize-unicode(.) ! tokenize(.,"\s+")
Thanks,
Wouter

Ghislain Fourny

2017-08-02 13:56:29 UTC

Dear Christian,

I agree. Actually, as far as I recollect, in the initial draft for the ! operator, I had spontaneously given it the same precedence as / in the grammar. We later decided that it made sense to keep them strictly separate, as it makes the semantics much clearer, especially regarding sorting and duplicate elimination.

Kind regards,
Ghislain

On 2 Aug 2017, at 13:10, Christian Grün wrote:

As Ghislain indicated, the semantics of the two operators differs
pretty much (although there are surely some cases in which they can
serve as equivalent alternatives). Actually, the mapping operator is
much closer to XPath steps. There are even good reasons for replacing
/ with !, e.g. if you do not want to have duplicate-free and ordered
results.

Christian

Graydon Saunders

2017-08-02 12:32:27 UTC

The thing that convinced me I cared about the arrow operator was

(//@someFlag) => distinct-values()

since there's no other way to avoid having to make whatever lump of logic
is really there for //@someFlag the explicit operand of the
distinct-values() and I often find I want to test that the lump of logic
returns the kind and number of thing I expect before reducing the sequence
to its distinct values. Using the arrow operator results in something
that's much nicer to read.
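
A small sketch of that workflow, using a hypothetical document: first inspect what the path returns, then append => distinct-values() without rewrapping the expression:

```xquery
xquery version "3.1";

(: Hypothetical input; the path stands in for any larger lump of logic. :)
let $doc :=
  <root>
    <e someFlag="x"/>
    <e someFlag="y"/>
    <e someFlag="x"/>
  </root>
return (
  (: Step 1: check how many attributes the path selects. :)
  count($doc//@someFlag),

  (: Step 2: append the reduction on the right; the path itself is untouched. :)
  ($doc//@someFlag) => distinct-values() => count()
)
```

With this input the two counts should be 3 and 2.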

Michael Kay

2017-08-02 11:31:37 UTC

There are even good reasons for replacing / with !, e.g. if you do not want to have duplicate-free and ordered results.

Yes, I'm advising people to do that, especially after a variable reference as in

for ... return $x/item

which hardly ever needs a sort into document order so it's better to write $x!item.

I had a case reported to me recently where $var was a sequence of three parentless element nodes and $var / item essentially randomized the order because the relative order of nodes in different trees is undefined. Much better to write $var ! item which (a) makes the result predictable, and (b) avoids the cost of an unwanted sort.
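
The difference is easy to see in a sketch where the sequence is deliberately out of document order and contains a duplicate. The data is hypothetical and all nodes belong to one tree, so the order produced by / is well defined here:

```xquery
xquery version "3.1";

let $doc := <list><item>a</item><item>b</item></list>
(: Second item, first item, second item again. :)
let $x := ($doc/item[2], $doc/item[1], $doc/item[2])
return (
  (: "/" removes duplicate nodes and sorts into document order. :)
  string-join($x/text(), ""),

  (: "!" keeps the order and multiplicity of $x. :)
  string-join($x ! text(), "")
)
```

This should return "ab" for the path expression and "bab" for the simple map.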

Michael Kay
Saxonica

Ghislain Fourny

2017-08-02 10:20:59 UTC

Dear Wouter,

Yes, the first version of Mike's example (=>) uses fn:replace#4, the second version (!) uses fn:replace#3.

I am not sure what you mean by "detect": unless I am missing something, it is always 100% clear from the syntax itself which function is used, given the EQName and the number of arguments in the function invocation. When => is used, one always needs to add 1 to the arity when looking up the function. With =>, the implicit binding always occurs, and it is statically known that fn:replace#4 must be used in this case; it is a bit like a rewrite. Once the function has been looked up, it is invoked according to the semantics of a function call.
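
The arity rule can be illustrated with a sketch that uses literal strings instead of the attributes from the thread; the input value is hypothetical:

```xquery
xquery version "3.1";

(
  (: Three arguments after =>, plus the left-hand side: fn:replace#4. :)
  "a.b.c" => replace(".", "-", "q"),

  (: The same call written conventionally. :)
  replace("a.b.c", ".", "-", "q"),

  (: With "!", all four arguments are written out and "." is explicit. :)
  "a.b.c" ! replace(., ".", "-", "q")
)
```

All three should return "a-b-c"; the "q" flag makes the pattern "." match a literal dot.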

Kind regards,
Ghislain

Hi Ghislain,
Alright, I forgot about fn:replace#4, but the implicit binding is sometimes hard to detect.
Thanks.

Christian Grün

2017-08-01 12:41:17 UTC

Hi Wouter,

The arrow operator takes all items of the left side as first argument,
whereas the simple map operator processes all items one by one:

(1,2,3) => count() → 3
(1,2,3) ! count(.) → 1 1 1
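
The same contrast shows up with any function that accepts a sequence as its first argument. A minimal sketch with string-join, using hypothetical values:

```xquery
xquery version "3.1";

let $words := ("a", "b", "c")
return (
  (: One call: string-join(("a", "b", "c"), "-"). :)
  $words => string-join("-"),

  (: Three calls, one per item; each joins a single string. :)
  $words ! string-join(., "-")
)
```

The arrow form should return the single string "a-b-c", while the simple map returns the three strings "a", "b", "c".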

Cheers,
Christian

Hi,
Is there any advantage to using the 3.1 arrow operator over the simple map
operator?
$string => upper-case() => normalize-unicode() => tokenize("\s+")
versus
$string ! upper-case(.) ! normalize-unicode(.) ! tokenize(.,"\s+")
Thanks,
Wouter
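
For a single string, both spellings in the question amount to the same nested call. A sketch with a hypothetical input value:

```xquery
xquery version "3.1";

let $string := "hello   world"
return (
  (: Arrow chain: each result becomes the first argument of the next call. :)
  $string => upper-case() => normalize-unicode() => tokenize("\s+"),

  (: Simple map chain: each result becomes the context item. :)
  $string ! upper-case(.) ! normalize-unicode(.) ! tokenize(., "\s+"),

  (: The conventional nested form that both stand for. :)
  tokenize(normalize-unicode(upper-case($string)), "\s+")
)
```

Each of the three expressions should return "HELLO" and "WORLD". The forms only diverge once the left-hand side holds more than one item.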

W.S. Hager

2017-08-03 19:46:23 UTC

Hi Michael,

Anonymous functions would've been quite nice, but could your example only
be used for a single argument?

Michael Kay

2017-08-03 21:01:21 UTC

Anonymous functions would've been quite nice, but could your example only be used for a single argument?

Yes. I think that's such a common case that it's worth having special syntax for, particularly as we already have "." as a symbol representing an anonymous variable, so we get a nice combination of concepts.

(We do of course have syntax for the more general case already: it's just a bit verbose.)

But of course, in the WG we would spend months discussing alternatives and might well come up with something better...

Michael Kay
Saxonica

W.S. Hager

2017-08-04 07:52:59 UTC

I see, thanks.
