[xquery-talk] contains and tokenize
Cindy Girard
2006-10-23 14:53:39 UTC
Hi,

I have the following where clause:

where contains(upper-case($text), upper-case($keyword))

It works fine, except that it returns partial word matching. To only
match whole words, I'm pretty sure I need to use tokenize(), but I'm
not sure how to put it all together.

Any help would be appreciated.

Thanks,

-----
- Cindy

Cynthia M. Girard
IATH, University of Virginia
***@virginia.edu

"Danger? I laugh in the face of danger!
...and then I hide until it goes away."
David Sewell
2006-10-23 15:07:34 UTC
Post by Cindy Girard
> Hi,
> where contains(upper-case($text), upper-case($keyword))
> It works fine, except that it returns partial word matching. To only
> match whole words, I'm pretty sure I need to use tokenize(), but I'm
> not sure how to put it all together.

You could use an appropriate regular expression with matches(). For
example:

where matches($text, concat('\W', $keyword, '\W'), 'i')

I.e., match a string that contains non-word character + keyword +
non-word character, and do this case-insensitively.
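
One edge case with the '\W' + keyword + '\W' form: it needs an actual character on each side, so a keyword at the very start or end of $text is not matched. A minimal sketch of one way around that, assuming it is acceptable to treat the string boundaries as word boundaries too (the sample strings and the 'danger' keyword are made up for illustration):

```xquery
(: Hypothetical sample values; $text and $keyword stand in for the
   variables used in the original where clause. :)
let $keyword := 'danger'
for $text in ('Danger ahead', 'I laugh at danger.', 'endangered species')
(: (^|\W) accepts start-of-string or a non-word character;
   (\W|$) accepts a non-word character or end-of-string. :)
where matches($text, concat('(^|\W)', $keyword, '(\W|$)'), 'i')
return $text
```

This should return the first two strings and skip 'endangered species', where the keyword sits inside a longer word. It also assumes $keyword contains no regex metacharacters.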
--
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 400318, Charlottesville, VA 22904-4318 USA
Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
Email: ***@virginia.edu Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/
David Carlisle
2006-10-23 15:10:29 UTC
You could use matches instead of contains and add some word boundaries
to your regex. Schema/XPath regexes don't have built-in word boundary
expressions, but depending on your language rules something like

matches($text,concat('(^|[ ,\.;])',$keyword,'($|[ ,\.;])'),'i')

gives a case-insensitive match against the keyword bounded on each side
by either the beginning or end of the string or one of space, comma,
period, or semicolon.
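
To see how the choice of separator characters plays out, here is a small sketch that builds the same kind of regex from a separator set and reports the result for a few strings. The $boundary and $regex variable names and the sample strings are made up for illustration:

```xquery
(: Hypothetical sample values for illustration only. :)
let $keyword := 'danger'
(: The separator set: space, comma, period, semicolon. Extend as needed. :)
let $boundary := '[ ,\.;]'
let $regex := concat('(^|', $boundary, ')', $keyword, '($|', $boundary, ')')
for $text in ('danger; keep out', 'endangered', 'Danger!')
return concat($text, ' -> ', matches($text, $regex, 'i'))
```

The first string should report true and the second false. The third also reports false, because '!' is not in the separator set, which is the "depending on your language rules" part: any punctuation you expect next to a word has to be listed.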

If you want to use tokenize then you want = not contains, something like

upper-case($keyword) = tokenize(upper-case($text),'(\s|[\.,;])+')

which returns the sequence of words and then tests if any of them is
equal to the keyword.
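
Put together in a where clause, the tokenize approach might look like the sketch below. The sample strings and the 'cat' keyword are made up for illustration:

```xquery
(: Hypothetical sample values standing in for $text and $keyword. :)
let $keyword := 'cat'
for $text in ('The cat sat.', 'Concatenate; then stop', 'CAT, dog')
(: "=" is a general comparison here: it is true if any one of the
   tokens on the right equals the upper-cased keyword. :)
where upper-case($keyword) = tokenize(upper-case($text), '(\s|[\.,;])+')
return $text
```

This should return the first and third strings. 'Concatenate' is a single token, so it is not equal to 'CAT' even though it contains it.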

(There are some differences: a case-insensitive match, for example, isn't
the same thing as upper-casing and then comparing; see
http://www.w3.org/TR/xpath-functions/#flags
)

David
Priscilla Walmsley
2006-11-02 23:55:23 UTC
Hi all,

Just wanted to let you know that I will be giving a day-long tutorial at XML
2006 in Boston [1] this year. The title is "XQuery 1.0, XPath 2.0, and XSLT
2.0 Explained". I'll spend approximately a third of the day on XPath 2.0
(including the data model and functions and operators), a third on XQuery,
and a third on new features in XSLT 2.0.

The material will be based on my upcoming XQuery book from O'Reilly [2],
which is now available in its entirety as a PDF in "Rough Cuts" format.

The tutorial date is December 4, and I'd be delighted if you joined me. In
my experience, the XML conference is a great learning experience for
beginners and old-timers alike, and a lot of fun too.

Thanks,
Priscilla

[1] http://2006.xmlconference.org/

[2] http://www.oreilly.com/catalog/xquery/
Priscilla Walmsley
2007-04-25 20:46:40 UTC
Hi all,

Just wanted to let you know that I will be giving a day-long tutorial at
XTech in Paris [1] next month. The title is "XQuery 1.0, XPath 2.0, and
XSLT 2.0 Explained". I'll spend approximately a third of the day on XPath
2.0 (including the data model and functions and operators), a third on
XQuery, and a third on new features in XSLT 2.0.

They're offering a 50% discount on my XQuery book [2] to attendees, and I'll
be around for the entire conference to answer any questions you may have.

The tutorial date is May 15, and I'd be delighted if you joined me. In my
experience, the XTech conference is a great way to stay up on cutting edge
technologies.

Thanks!
Priscilla

[1] http://2007.xtech.org/

[2] http://www.oreilly.com/catalog/9780596006341/index.html
