[xquery-talk] Adaptive serialization of an empty sequence
Joe Wicentowski
2018-04-14 18:14:50 UTC
Hi all,

Many thanks, as always, for the very helpful feedback here.

I have noticed that Saxon, eXist, and BaseX all serialize the empty
sequence `()` not as `()` but instead as the empty string ``. Sample code:

serialize((), map { "method": "adaptive" })

I was expecting to see `()` because when serializing a map entry, the empty
sequence is serialized as `()`:

serialize(map { "test": () }, map { "method": "adaptive" })

This returns `map{"test":()}`.

Can anyone enlighten me on why the empty sequence is serialized as `()` in
the latter context and the empty string in the former?

Thanks,
Joe
Michael Kay
2018-04-14 18:58:40 UTC
As always with "why?" questions, it's difficult to know what kind of answer you want, between

(a) Where does the spec say this should happen?

(b) Why does the spec say this should happen?

and (b) breaks down to either

(b1) Why would this be a reasonable choice for the spec-writers to make?

(b2) As a matter of historical record, who proposed that it should be like this and what justification did they put forward?

Regarding (a), the spec says:

<quote>
Each item in the supplied sequence is serialized individually as follows, with an occurrence of the chosen item-separator between successive items.
</quote>

I think it's reasonable to read that as adaptive(S) == string-join(S!adaptive(.), item-separator), which leads to the conclusion that the serialization of () is "".
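
That reading can be sketched in XQuery 3.1. This is only an illustration under stated assumptions: `local:adaptive-by-item` is a hypothetical helper, not something the spec defines, and a newline is passed explicitly as the item-separator so that both sides of the comparison use the same value.

```xquery
xquery version "3.1";

(: Hypothetical helper: models adaptive serialization of a sequence as
   "serialize each item on its own, then join with the item separator". :)
declare function local:adaptive-by-item(
  $seq as item()*,
  $separator as xs:string
) as xs:string {
  string-join(
    $seq ! serialize(., map { "method": "adaptive" }),
    $separator
  )
};

(: Assumption: a newline stands in for the chosen item-separator. :)
let $nl := "&#10;"
return (
  (: No items means nothing to join, so the result is the zero-length string. :)
  local:adaptive-by-item((), $nl) eq "",

  (: Under this reading the item-by-item result should agree with serializing
     the whole sequence in one call; worth checking on your own processor. :)
  local:adaptive-by-item((1, "a", <b/>), $nl)
    eq serialize((1, "a", <b/>),
                 map { "method": "adaptive", "item-separator": $nl })
)
```

The first comparison depends only on `string-join` returning "" for an empty input, which is why `()` at the top level produces no output at all, while `()` inside a map is written out as part of that map's own serialization.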

Regarding (b2), my main recollection of relevant discussions concerns streamability: specifically, it should be possible to serialize each item independently without knowing what follows. If someone had proposed serializing () as "()", I don't think that could really have been opposed on streamability grounds, but I don't think anyone did propose it.

Regarding (b1), the main clue about the WG's thinking is the sentence

<quote>
The intention of this is to allow any valid XDM instance to be serialized without raising a serialization error.
</quote>

So you find that the adaptive method focuses on how to serialize cases that otherwise would fail. Serializing the empty sequence wouldn't otherwise fail, so I guess it didn't receive much attention. Whether a proposal to serialize () as "()" would have been accepted is anyone's guess.
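
The contrast between "cases that otherwise would fail" and the empty sequence can be tried out with a short XQuery 3.1 sketch. It assumes a processor that supports try/catch and the map form of the `serialize` parameters. The `err` prefix is declared explicitly because it may not be predeclared everywhere, and the exact error code reported for the map is left for the processor to show rather than asserted here.

```xquery
xquery version "3.1";

declare namespace err = "http://www.w3.org/2005/xqt-errors";

let $value := map { "test": () }
return (
  (: A map is a function item, which the xml method is not expected to
     accept; the error is caught and its code reported instead. :)
  try {
    serialize($value, map { "method": "xml" })
  } catch * {
    "error: " || string($err:code)
  },

  (: The adaptive method serializes the same value without an error. :)
  serialize($value, map { "method": "adaptive" }),

  (: The empty sequence is not an error case under either method. :)
  serialize((), map { "method": "xml" }),
  serialize((), map { "method": "adaptive" })
)
```

If the first expression reports an error while the other three return strings, that matches the point above: the map needed special treatment from the adaptive method, whereas the empty sequence did not.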

Michael Kay
Saxonica
_______________________________________________
http://x-query.com/mailman/listinfo/talk
Joe Wicentowski
2018-04-14 19:47:15 UTC
Thank you, Mike. That explanation is perfectly reasonable; this handling
certainly meets the stated intention.

Joe