[xquery-talk] Find All Nodes Between Root Node and Descendant Nodes of Some Type
Eliot Kimber
2015-07-19 14:33:16 UTC
Given this starting document:

let $doc :=
<root>
<a id="a1">
<b>b1</b>
<b>b2</b>
<c>
<b>b6</b>
</c>
<a id="a2">
<b>b3</b>
<b>b4</b>
<a id="a3">
<b>b5</b>
</a>
</a>
</a>
</root>

I want to find all the <b> elements descending from <a id="a1"> but not
within nested <a> elements:

<b>b1</b>
<b>b2</b>
<b>b6</b>


This gives me the correct answer:


let $a1 := $doc/a
let $bsInA1 := $a1//b[not(./ancestor::a = ($a1//a))]


My question: With XPath 3.1, is there a better way to express this query?
I looked at the new outermost() and innermost() functions but I didn't see
a way to apply them to this problem.

Thanks,

Eliot

----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com



_______________________________________________
***@x-query.com
http://x-query.com/mailman/listinfo/talk
G. Ken Holman
2015-07-19 15:09:37 UTC
How about:

$a1//b except $a1//a//b

. . . . . . Ken
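
For anyone who wants to try the expression, a self-contained sketch against the sample document from the original post could look like the following. The second item, which uses outermost(), is not from the thread; it is included only as one possible way to apply that function here, on the assumption that every nested <a> lies inside some outermost nested <a>.

```xquery
xquery version "3.1";

(: Sample document from the original post. :)
let $doc :=
  <root>
    <a id="a1">
      <b>b1</b>
      <b>b2</b>
      <c>
        <b>b6</b>
      </c>
      <a id="a2">
        <b>b3</b>
        <b>b4</b>
        <a id="a3">
          <b>b5</b>
        </a>
      </a>
    </a>
  </root>
let $a1 := $doc/a
return (
  (: All b descendants of a1, minus those that sit inside a nested a. :)
  string-join(($a1//b except $a1//a//b) ! string(), " "),
  (: Variant, not from the thread: subtract only the b's below the
     outermost nested a elements. :)
  string-join(($a1//b except outermost($a1//a)//b) ! string(), " ")
)
(: Both items should be "b1 b2 b6". :)
```

Both operands of except are node sequences, so the comparison is by node identity and the result comes back in document order.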
--
Check our site for free XML, XSLT, XSL-FO and UBL developer resources |
Free 5-hour lecture: http://www.CraneSoftwrights.com/links/video.htm |
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/q/ |
G. Ken Holman mailto:***@CraneSoftwrights.com |
Google+ profile: http://plus.google.com/+GKenHolman-Crane/about |
Legal business disclaimers: http://www.CraneSoftwrights.com/legal |


Eliot Kimber
2015-07-19 15:12:15 UTC
That seems too easy, Ken.

In terms of processing optimization, is there any reason to prefer one
formulation over the other (meaning, is it possible to predict how XPath
processors will be able to optimize this type of expression)?

Thanks,

E.
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
Michael Kay
2015-07-19 15:31:24 UTC
Post by Eliot Kimber
That seems too easy, Ken.
In terms of processing optimization, is there any reason to prefer one
formulation over the other (meaning, is it possible to predict how XPath
processors will be able to optimize this type of expression)?

No, it’s not really possible to predict. Both are amenable to optimization, but there’s a law of diminishing returns in what it’s worth attempting. Saxon will do both pretty much as written.

Saxon did at one time attempt to rewrite a/b/c/X except a/b/c/Y as a/b/c/(X except Y) but I found it was unsound, which taught me a lesson. (I forget the actual case that demonstrates this.) I still find it very hard to know how to prove which rewrites are sound and which aren’t.

Michael Kay
Saxonica
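
The caution about that rewrite can be illustrated with the query from this thread. The sketch below is not necessarily the case Michael had in mind; it only shows that factoring the common prefix `$a1//` out of both operands changes the result on the sample document, because the subtraction is then done per context node rather than over the whole result.

```xquery
xquery version "3.1";

(: Sample document from the original post. :)
let $doc :=
  <root>
    <a id="a1">
      <b>b1</b>
      <b>b2</b>
      <c>
        <b>b6</b>
      </c>
      <a id="a2">
        <b>b3</b>
        <b>b4</b>
        <a id="a3">
          <b>b5</b>
        </a>
      </a>
    </a>
  </root>
let $a1 := $doc/a
return (
  (: As written: subtract over the complete node sets. :)
  string-join(($a1//b except $a1//a//b) ! string(), " "),
  (: Factored form: for each descendant-or-self node, child b's minus
     the b's below that node's child a's. A child b is never below a
     sibling a, so nothing is ever subtracted. :)
  string-join($a1//(b except a//b) ! string(), " ")
)
(: First item should be "b1 b2 b6", second "b1 b2 b6 b3 b4 b5". :)
```

The two forms differ whenever a node selected by X from one context node is selected by Y only from a different context node.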


Christian Grün
2015-07-19 15:42:25 UTC
Hi Eliot,
Post by Eliot Kimber
In terms of processing optimization, is there any reason to prefer one
formulation over the other (meaning, is it possible to predict how XPath
processors will be able to optimize this type of expression)?

I like Ken's solution, but mostly because it's more concise. You will
never know what a specific implementation does, or what it will do in
a future version. It also depends on your data: If you have deep
document structures, the ancestor step may get more expensive. Just
create some gigabytes of test data and do some simple testing with the
processors of your choice.

However, your original query may be evaluated faster by some
processors if you move the path expression out of the predicate and
bind it to an additional variable:

let $a1 := $doc/a
let $a2 := $a1//a
let $bsInA1 := $a1//b[not(ancestor::a = $a2)]

Cheers,
Christian
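
One point to keep in mind when comparing the formulations: the `=` in the predicate is a general comparison, so it atomizes the nodes and compares their values rather than their identities. On the sample data in this thread that makes no difference, but it can on other data. The sketch below uses a small hypothetical document (not from the thread) in which the outer and nested <a> happen to have the same string value, and compares the predicate form with an identity-based predicate using intersect and with the except form.

```xquery
xquery version "3.1";

(: Hypothetical document: a1 and a2 both have the string value "x". :)
let $doc :=
  <root><a id="a1"><b/><a id="a2"><b>x</b></a></a></root>
let $a1 := $doc/a
let $a2 := $a1//a
return (
  (: Value comparison: a1 = a2 is true because "x" = "x",
     so even the direct child b is filtered out. :)
  count($a1//b[not(ancestor::a = $a2)]),
  (: Identity comparison: keep b's with no ancestor among the nested a's. :)
  count($a1//b[empty(ancestor::a intersect $a2)]),
  (: Ken's formulation, also identity-based. :)
  count($a1//b except $a1//a//b)
)
(: Should return 0, 1, 1. :)
```

If the predicate style is preferred, the intersect form keeps the shape of the original query while comparing by node identity.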
G. Ken Holman
2015-07-20 00:01:25 UTC
Post by Eliot Kimber
That seems too easy, Ken.
...
...
Post by G. Ken Holman
$a1//b except $a1//a//b

Perhaps, but you were the one that put that expression in words and I
Post by Eliot Kimber
I want to find all the <b> elements descending from <a id="a1"> but not
within nested <a> elements

I see others have commented on the performance of this. I think the
maintenance of what I've written is easier than trying to grok the
application of the ancestor axis.

. . . . . . . Ken

