Ihe Onwuka
2014-03-31 01:02:00 UTC
This is a follow-up to
http://en.wikibooks.org/wiki/XQuery/Freebase
which originated from a problem Michael Westbay assisted me with.
Again, it illustrates how to obtain information from Freebase via its
MQL query language (which predates SPARQL).
The previous query was taken from
https://developers.google.com/freebase/v1/mql-overview
and it limits the number of results returned by the call to the
Freebase API. You can see the limit parameter set to 3 in the API call
below.
https://www.googleapis.com/freebase/v1/mqlread?query=[{"type":"/music/album","name":null,"artist":{"id":"/en/bob_dylan"},"limit":3}]&cursor
If you do not specify a limit with your API call, Freebase will impose
a limit of 100 records on your query. This message addresses the
question of how to get everything.
The key to doing this is dangling at the end of the above API call:
it is the cursor parameter, and its usage is discussed with an example
here
https://developers.google.com/freebase/v1/mql-overview#querying-with-cursor-paging-results
To summarise, you ask for a cursor to be returned with your query
results (see the example API call above for the form of the initial
request). The cursor acts as a link to the next set of query results:
you obtain that next set by supplying the value of the cursor returned
from the previous invocation. Along with that next set you get another
cursor that points to the set after that. When the final set of
results is retrieved, the cursor comes back with a string value of
false (the Freebase overview has this in upper case but my code tested
for lower case 'false' and that works).
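To make the request forms concrete, here is a minimal sketch of how
the two kinds of URL might be built in XQuery. The function name
local:mqlread-url and the cursor value "abc123" are my own inventions
for illustration; only the base URL and the query and cursor
parameters come from the API call shown earlier. An empty cursor
produces the bare "cursor" parameter of the initial request.

```xquery
xquery version "3.0";

(: Hypothetical helper, not part of any library: builds an mqlread URL.
   An empty $cursor yields the bare "cursor" parameter used on the
   first request; otherwise the cursor value is appended. :)
declare function local:mqlread-url(
  $query as xs:string,
  $cursor as xs:string
) as xs:string {
  concat(
    "https://www.googleapis.com/freebase/v1/mqlread?query=",
    encode-for-uri($query),
    "&amp;cursor",
    if ($cursor eq "") then "" else concat("=", encode-for-uri($cursor))
  )
};

let $query := '[{"type":"/film/film","name":null,"netflix_id":[]}]'
return (
  (: initial request: asks for a cursor :)
  local:mqlread-url($query, ""),
  (: follow-up request: "abc123" is a made-up cursor value :)
  local:mqlread-url($query, "abc123")
)
```

Percent-encoding the query with encode-for-uri is a choice of this
sketch; the URL shown earlier is unencoded and relies on the browser
to do that.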
The overview has sample Python code which I have not tried or parsed
in anger but which I believe invokes libraries that take care of all
the cursor handling for you.
https://developers.google.com/freebase/v1/mql-overview#looping-through-cursor-results
However, the same thing can easily be achieved from XQuery with a
little bit of tail recursion.
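As a rough illustration of the shape of that recursion (a sketch under
my own assumptions, not necessarily the code used against Freebase),
the example below replaces the HTTP call with a stand-in function that
serves three fake pages. The page and result element names, the cursor
values c1 and c2, and the function names are all invented for this
sketch. The recursive function fetches a page, appends its results to
an accumulator, and stops when the returned cursor is the string
false, compared case-insensitively.

```xquery
xquery version "3.0";

(: Stand-in for the real HTTP call: three fake pages keyed by cursor.
   Element names and cursor values are invented for this sketch. :)
declare function local:fake-fetch($cursor as xs:string) as element(page) {
  if ($cursor eq "") then
    <page cursor="c1"><result>A</result><result>B</result></page>
  else if ($cursor eq "c1") then
    <page cursor="c2"><result>C</result></page>
  else
    <page cursor="false"><result>D</result></page>
};

(: Tail-recursive loop: fetch one page, add its results to the
   accumulator, and recurse with the new cursor until it is "false". :)
declare function local:fetch-all(
  $fetch as function(xs:string) as element(page),
  $cursor as xs:string,
  $acc as element(result)*
) as element(result)* {
  let $page := $fetch($cursor)
  let $results := ($acc, $page/result)
  return
    if (lower-case($page/@cursor) eq "false") then $results
    else local:fetch-all($fetch, string($page/@cursor), $results)
};

(: Start with an empty cursor and an empty accumulator. :)
local:fetch-all(local:fake-fetch#1, "", ())
```

Passing the fetch function as a parameter keeps the recursion
independent of how pages are actually retrieved, so the stand-in could
be swapped for a function that makes the real API call.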
We will use as an example an MQL query that returns all films with
their netflix_id values.
[{
  "type": "/film/film",
  "name": null,
  "netflix_id": []
}]
A few brief comments about MQL. You ask for something by giving the
field name and a value of null. Null gets replaced by the actual value.
However, if the field can have multiple values, MQL will return an
array and cause your null query to fail with an error. This may happen
even when you are expecting a single value, so you can avoid the
problem by using the symbol for an empty array, [], instead of null,
as in the query above.
You can paste the query above into
http://www.freebase.com/query
to see the results (we will take care of the cursor in the code example).
Now to the code, which assumes XQuery 3.0.
xquery version "3.0";
import module namespace xqjson="http://xqilla.sourceforge.net/lib/xqjson";
Freebase returns JSON, but we want to store this in an XML database,
so we use the above module for JSON-to-XML conversion.