Discussion:
[xquery-talk] parsing xml fragments
Leo Studer
2015-06-21 16:15:01 UTC
Hello

I tried the following:

declare variable $feed := doc("http://www.meteotest.ch/meteotest-extras/rss/rss-sio.xml")//item/description/string();

<html>
<body>{
parse-xml-fragment($feed)/*}
</body>
</html>

to use the weather information from this feed.
Unfortunately, parse-xml-fragment($feed) does not work because & is used in a URL. Is there an easy way to do this?

Thanks in advance
Leo
William Candillon
2015-06-21 16:22:09 UTC
Have you considered parsing the description as html?
Here's an example with Zorba:
http://try-zorba.28.io/queries/xquery/KZc4M47%2BQ948k2%2FR8GneS%2BtP%2Fgs%3D
William Candillon
2015-06-21 16:22:48 UTC
And the result serialized as html:
http://tryzorba.28.io/query.jq?id=KZc4M47%2BQ948k2%2FR8GneS%2BtP%2Fgs%3D&format=html
Leo Studer
2015-06-21 18:33:00 UTC
Yes, this would be the expected result; however, the proposed solution does not work here.
Dannes Wessels
2015-06-21 17:10:46 UTC
Does the doc()//item/... expression return anything? I expect a namespace issue here...

Cheers

Dannes


--
www.exist-db.org
Leo Studer
2015-06-21 18:22:42 UTC
Yes, $feed contains the expected info; there is no namespace issue here.

Try it out.
BaseX does not report an error, but the code does not work, and Saxon reports:

Engine name: Saxon-PE XQuery 9.6.0.5
Severity: fatal
Description: FODC0006: First argument to parse-xml-fragment() is not a well-formed and namespace-well-formed XML fragment. XML parser reported: org.xml.sax.SAXParseException: The reference to entity "service" must end with the ';' delimiter.
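
For anyone reproducing this without the live feed, here is a minimal sketch that should trigger the same kind of failure on a processor with XQuery 3.0 try/catch support. The sample string and the <error> wrapper element are assumptions made for illustration, not part of the original report. Note that '&amp;' inside an XQuery string literal denotes a single & character, so the string handed to the parser contains a bare ampersand.

```xquery
xquery version "3.0";

(: Bind the standard error namespace explicitly; some processors predeclare it. :)
declare namespace err = "http://www.w3.org/2005/xqt-errors";

(: Hypothetical sample: the string value is <img src="aaa?&location=sio"/> :)
let $fragment := '<img src="aaa?&amp;location=sio"/>'
return
  try {
    parse-xml-fragment($fragment)
  } catch err:FODC0006 {
    (: Report the error code and message instead of aborting the query. :)
    <error code="{ $err:code }">{ $err:description }</error>
  }
```

A processor that is lenient about the bare ampersand may return a parsed fragment instead of the <error> element.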
Leo Studer
2015-06-21 20:15:51 UTC
I think this is a bug in parse-xml-fragment(), since the string passed in $feed is a well-formed, escaped XML fragment.
The XQuery processor is trying to replace &amp; with &, which makes no sense and thus produces an error ;-(

What do the experts say?

Blessings
Leo
Leo Studer
2015-06-21 20:33:39 UTC
Hi George

Thanks for your thoughts.
I thought the same at first. But when you look at the source of $feed, it is escaped there, and parse-xml-fragment makes the mess…

Always
Leo
Post by George Cristian Bina
Hi Leo,
<img src="https://mdx.meteotest.ch/api_v1?key=A8FEFD4159D9934E4808641EE063254F&service=prod2data...
and thus the content is not well-formed XML: the parser thinks that you have a service entity because it sees &service=, and thus it expects a ; after service.
You should escape & as &amp;.
Best Regards,
George
--
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
David Carlisle
2015-06-21 20:47:44 UTC
It looks to me as if the string is not well formed.

You have essentially

<![CDATA[<img src="aaa?&location=sio"/>]]>

So the input XML document is well formed,
but the string you are passing to the XML parser is


<img src="aaa?&location=sio"/>

which is not well formed.

The input should be


<![CDATA[<img src="aaa?&amp;location=sio"/>]]>

If you know that none of your ampersands are escaped, and you can't fix the
source documents, you could use a regex replace to escape them before
parsing:

parse-xml-fragment(replace($feed,'&amp;','&amp;amp;'))
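
If some ampersands in the feed might already be escaped, the blanket replace above would double-escape them. A slightly more defensive variant, offered only as a sketch: escape everything first, then undo the escaping for sequences that were already valid entity or character references. The function name local:escape-bare-ampersands and the sample string are hypothetical, chosen for illustration. XPath regular expressions have no lookahead, hence the two passes.

```xquery
xquery version "3.0";

(: Hypothetical helper, not from the thread: escapes only the ampersands
   that do not already start a predefined entity or character reference. :)
declare function local:escape-bare-ampersands($text as xs:string) as xs:string {
  (: Pass 1: escape every ampersand, exactly as the one-liner above does. :)
  let $all-escaped := replace($text, '&amp;', '&amp;amp;')
  (: Pass 2: restore references that were already valid before pass 1. :)
  return replace(
    $all-escaped,
    '&amp;amp;(amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);',
    '&amp;$1;'
  )
};

(: Sample string standing in for one feed description: the src attribute
   holds a bare ampersand, the alt attribute an already escaped one. :)
let $description := '<img src="aaa?key=1&amp;location=sio" alt="a &amp;amp; b"/>'
return parse-xml-fragment(local:escape-bare-ampersands($description))
```

This still treats a query parameter that happens to look like a reference (for example a parameter named lt followed by a semicolon) as already escaped, so it is a heuristic rather than a guarantee.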

David

Leo Studer
2015-06-21 21:32:16 UTC
Hi David

Thanks for your input, which solves my problem in an easy (if strange) way ;-)

Always
Leo