Discussion:
[xquery-talk] Does XQuery fit anywhere in this landscape.
Ihe Onwuka
2015-06-23 07:15:54 UTC
Permalink


By implication it puts the kibosh on SQL as the basis of a solution for
the future.
daniela florescu
2015-06-23 15:51:54 UTC
Permalink
Ihe,


I had discussions with Michael Stonebraker for 20 years about whether
XML “exists” or not. With Jim Gray too, before he disappeared. They were both extremely
supportive of me, yet both thought I was crazy to waste my research career on XML.

Stonebraker’s opinion: he doesn’t believe that XML “exists” in industry.

So he will not mention it, because it doesn’t exist :-)

But you have to remember that Stonebraker is a database person. He probably does not
understand the facet of XML which is “XML as documents”. It took me and the other database
people involved in XQuery years before we swallowed it. (Don Chamberlin of SQL fame
famously once said “who in the world would care about such a corner case as mixed content!?”)

Don’t blame the database people that they don’t “get” XML. For one thing, it has never been explained
to them properly.

And again, Stonebraker, being a database person, will look at the “XML as data” aspect of the story.
And this today is INDEED nonexistent in industry, or almost. Or, when it is there, it is mostly for log analysis.

============

JSON will completely change the landscape, in surprising ways that none of us can predict.

And no, I trust that Michael Stonebraker is too smart to believe that SQL is a solution for processing JSON.

But time will tell.

Best regards
Dana
http://youtu.be/9K0SWs1mOD0
By implication it puts the kibosh on SQL as the basis of a solution for the future.
_______________________________________________
http://x-query.com/mailman/listinfo/talk
Ihe Onwuka
2015-06-23 16:14:50 UTC
Permalink
Well, he didn't comment on SQL for JSON per se, but saying that RDBMSs are
sub-optimal for everything is a tacit repudiation of SQL, is it not?

He buys into the notion that there will be swarms of data scientists doing
clever things with data, which will need a different language. I am
continually surprised that people this smart believe that there is such a
pool of people to draw from.

He is right that statistical packages suck at data management, but that
isn't going to deter the R community.

Do you see XQuery fitting anywhere in this vision? It has potential as a
pipeline technology, as does SQL for that matter. I think it will always be
problematic to do analytics on the source data because it is too dirty.
Pavel Velikhov
2015-06-23 16:28:56 UTC
Permalink
I see a big use case for JSONiq every day: data integration, cleaning, sanity checking, publishing.
More and more people are building data-driven products, i.e. data that is productised in APIs and then used in simple web apps.
They start with dirty data that fits nicely into the JSON paradigm, and it then goes through lots of stages before it’s exported by an API, this time definitely in JSON format.
There are many steps to collect, clean, refine, transform, merge, etc., and all of them need to operate on the structure of the data, not just the fields.
So schemas are a must, and all sorts of schema operations are extremely useful (computing statistics on what the common values for such and such fields are, how many JSON documents contain a given field).

Right now there are no good tools for doing this, so actually I’m trying to start such a project (no fancy JSONiq processing, just a basic interpreter, but with schema operations).
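
For concreteness, the two schema operations mentioned above (how many documents contain a field, and what its common values are) could be sketched roughly as follows. This is an illustrative Python sketch, not JSONiq and not the project described; the function name `field_statistics` and the sample records are made up for the example, and only top-level fields are considered.

```python
import json
from collections import Counter, defaultdict


def field_statistics(documents):
    """For each top-level field, count how many documents contain it
    and how often each value occurs."""
    presence = Counter()
    values = defaultdict(Counter)
    for doc in documents:
        for field, value in doc.items():
            presence[field] += 1
            # Nested values (lists, objects) are not hashable, so they are
            # serialised to a canonical JSON string before being counted.
            if isinstance(value, (str, int, float, bool, type(None))):
                key = value
            else:
                key = json.dumps(value, sort_keys=True)
            values[field][key] += 1
    return presence, values


# Hypothetical sample data, standing in for "dirty" input records.
records = [
    {"name": "Ann", "country": "US", "age": 31},
    {"name": "Bob", "country": "US"},
    {"name": "Cy", "country": "DE", "tags": ["x"]},
]

presence, values = field_statistics(records)
print(presence["country"], presence["age"])   # documents containing each field
print(values["country"].most_common(1))       # most common value of "country"
```

A real tool would also need to descend into nested objects and arrays; this sketch stops at the top level to keep the idea visible.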
Best regards,
Pavel Velikhov
daniela florescu
2015-06-23 17:43:30 UTC
Permalink
Me and you both, Pavel.

But remember, we have both worked in XML data processing for at least 15 years.

Our understanding of semi-structured data processing is very different from that of the
“normal” data processing people.

But maybe this (XQuery) community can pool its efforts toward some common result?

Best regards
Dana
daniela florescu
2015-06-23 16:52:44 UTC
Permalink
Well he didn't comment on SQL for JSON per se but saying that RDBMS are sub-optimal for everything is a tacit repudiation of SQL is it not?
No, because he said explicitly that the *internals* of a database will be different (columnar, main memory, streaming, etc.)... the
programming language will STILL be SQL. Or at least for all those databases for which the data model is STILL relational.
He buys into the notion that there will be swarms of data scientists doing clever things with data which will need a different language.
Yes. SQL clearly doesn’t solve the R use cases. So yes, R is on the “acceptable OTHER languages” list.

But it’s not clear that what we (aka the XML community) see as “normal” data processing use cases will be considered necessary use cases
for the JSON/NoSQL community.

E.g. scanning the data and automatically extracting a schema. Is this an acceptable use case for JSON? Or not?

If yes, then XQuery has a chance, because XQuery can do that and SQL cannot.

If no, people will stick to what they know : SQL.
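
To make the use case concrete: "scanning the data and automatically extracting a schema" could look something like the sketch below. It is written in Python purely for illustration (it is not XQuery or JSONiq, and says nothing about how either would express it); the path notation (`$`, `.key`, `[]`) and the function names are assumptions made up for this example.

```python
from collections import defaultdict


def type_name(value):
    """Map a parsed JSON value to the name of its JSON type."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    if isinstance(value, list):
        return "array"
    return "object"


def collect(value, path, schema):
    """Record the type seen at `path`, then recurse into children."""
    schema[path].add(type_name(value))
    if isinstance(value, dict):
        for key, child in value.items():
            collect(child, f"{path}.{key}", schema)
    elif isinstance(value, list):
        for child in value:
            collect(child, f"{path}[]", schema)


def extract_schema(documents):
    """Scan all documents and return, per path, the sorted list of types observed."""
    schema = defaultdict(set)
    for doc in documents:
        collect(doc, "$", schema)
    return {path: sorted(types) for path, types in schema.items()}


# Hypothetical heterogeneous input: "id" is a number in one document and a
# string in another, and "owner" is sometimes null.
docs = [
    {"id": 1, "tags": ["a", "b"], "owner": {"name": "Ann"}},
    {"id": "2", "tags": [], "owner": None},
]

schema = extract_schema(docs)
for path, types in schema.items():
    print(path, types)
```

The interesting part is that the output describes structure nobody declared up front, including the places where the documents disagree with each other.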
He is right that statistical packages suck at data management but that isn't going to deter the R community.
Yes, the R implementations (I looked at them in detail about 2 years ago) have NO IDEA how to deal with large volumes
of data, so probably a mix between data technologies and database technologies is necessary.

However, don’t underestimate companies like Oracle. They are not dummies, and they know what the market wants.
R is supported natively inside the Oracle database now.

I think that Stonebraker exaggerates when he says that relational databases will disappear in 10 years. Well... I don’t think
this will happen so quickly.
Do you see XQuery fitting anywhere in this vision? It has potential as a pipeline technology, as does SQL for that matter. I think it will always be problematic to do analytics on the source data because it is too dirty.
XQuery COULD be a very good “glue” language between data in various formats (CSV, Excel, PDF, HTML, XML, JSON, relational, whatever).

But I say “COULD” not “CAN”.

It needs many extensions to be good at that: scripting, support for JSON, modules to support a variety of data formats and data processing services.


Best regards
Dana



P.S.
I am continually surprised that people this smart believe that there is such a pool of data scientists to draw from.
Me too. I fell off my chair when I saw the article saying that the US needs 5 million data scientists in the next 2 years, aka about 5% of the
US working population. Not sure if this is for laughing or for crying.

[[ aka, we will not have cashiers at Safeway anymore ‘cause they are all data scientists... ]]

Someone up there doing the math in this article doesn’t understand jack about numbers and statistics...

And all this while:
http://www.nature.com/news/irreproducible-biology-research-costs-put-at-28-billion-per-year-1.17711?utm_content=buffer95bfb&utm_medium=social&utm_source=linkedin.com&utm_campaign=buffer

God knows how many medicines are wrongly given to sick people, because nobody knows how to do a proper case study...

REALLY scary... but that’s another discussion.

Again the same discussion comes up: DON’T look for 5 million data scientists. Just make do with a smaller number of smart ones, but GIVE
THEM BETTER TOOLS and AUTOMATE THE PROCESS.

But hey, how can you stop such worldwide enthusiasm for “data scientists”!?? Logic doesn’t do it...
Ihe Onwuka
2015-06-27 13:51:25 UTC
Permalink
Below is a blog post from Norm Matloff, the author of The Art of R Programming (the book I
referred to in another post) and a statistician who lives in a CS department at UC Davis:

http://blog.revolutionanalytics.com/2014/08/statistics-losing-ground-to-cs-losing-image-among-students.html

The article is interesting for a number of reasons, not least the
parallel in its core theme, the image problem of a discipline that is
perceived to be unfashionable. The reference to a CS usurpation problem is
ironic because the reverse argument could also be made: really bad CS (e.g.
data management) being entrenched as standard by statisticians (and the
like) in the name of data science. It just goes to show that there are
probably two very valid sides to that coin.

Enough preamble. I am wondering if a perusal of some of the comments
reveals an opportunity.

Tom (26/8/14 @ 23.25) surmises that CS students are turned off Stats
classes by the use of R, which "*as a programming language it is
horrible and needs to die in a fire*". I was hoping to see a reasoned
rebuttal of a viewpoint I share, but Matloff really didn't deal with it well.

There is another comment by Jaipelai (27/8/14 @ 09:42) that I could almost
have written myself.

The point being that Stonebraker identifies that people will want to do analytics
with their query languages, but all the analytics tools suck at data
management. That market is all the rage now, but the R and Python
communities are probably lost causes.

So rather than bring analytics to a query language, suppose instead one
looked at baking a best-of-breed query capability into an analytics
language that is fashionable, functional and comprehension friendly:
Julia.
Ihe Onwuka
2015-06-27 17:33:19 UTC
Permalink
This looks like low hanging fruit.

*"For one thing, Mode will start letting people analyze data using
languages other than SQL, like Python, Steer said. Not that there’s
anything wrong with prioritizing SQL, but the support of more languages
should make Mode compelling for more data analysts."*

http://venturebeat.com/2014/06/17/mode-unveils-its-app-for-analyzing-your-data-in-the-cloud-and-2m-too/
https://gigaom.com/2015/02/05/data-collaboration-platform-mode-wants-to-move-sql-to-the-cloud/
https://modeanalytics.com/
Michael Kay
2015-06-23 17:04:14 UTC
Permalink
Don’t blame the database people that they don’t “get” XML. On one hand, it has never been explained
to them properly.
Long before XML (about 1990, I think) I submitted a VLDB paper about a database/repository holding software engineering artifacts whose structure/schema was defined in BNF. Two of the reviewers gave it 9/10, two of them gave it 2/10. So yes: the traditional database research community, of whom Stonebraker was for many years the high priest, has problems understanding data outside their traditional customer/orders/suppliers domain.

Michael Kay
Saxonica


daniela florescu
2015-06-23 17:16:34 UTC
Permalink
Yes, Michael, you are right. I was for many years one of “them”.

It wasn’t easy for me to understand the “other” side of the story
(and with me, many other database people like Don Chamberlin, etc.).

This mixing of knowledge, and of communities with strong beliefs,
is not easy to do AT ALL.

That’s why I don’t expect miracles and I don’t expect masses of database people to suddenly see the “light”
and suddenly jump on XQuery or JSONiq.

What WILL convince them though is a strong use case, a vertical where tons of money can be made.

Think AI/machine learning. For how many decades were those guys working in their corner, ignored and
humored... until SUDDENLY Google realized that they could make LOTS OF MONEY with their work!?

What XQuery or JSONiq need right now are strong use cases. Stuff that can be done with XQuery or JSONiq, but cannot
be done otherwise.

Best regards
Dana
Ihe Onwuka
2015-06-24 15:55:46 UTC
Permalink
Post by daniela florescu
What WILL convince them though is a strong use case, a vertical where tons
of money can be made.
Think AI/machine learning. For how many decades those guys were working in
their corner, ignored and humored... until SUDDENLY, Google realized that they
can make LOTS OF MONEY with their work !?
What XQuery or JSONiq need right now are strong use cases. Stuff that can
be done with XQuery or JSONiq, but cannot
be done otherwise.
I think what helps with machine learning is that it is not dragged down by
being something everybody thinks they can or should do, so progress is not
inhibited.

The use case for XQuery and JSONiq is there but it is being blocked by a
belief that SQL++ or N1QL or whatever will satisfice.

Look to history. How did relational databases supplant CODASYL? Firstly, the
biggest fish in the pond, IBM, had a product and backed the technology to the
hilt. Then there was a sustained and successful campaign, initiated by Codd
and Date, that gathered momentum, to differentiate between a proper relational
database and a database that just had a relational veneer.

The former is out of the hands of the XQuery community; the latter isn't.
Half the problem is that there isn't a consensus on what a query language
should be able to do. So formulate the equivalent of Codd's 12 rules for a
modern-day query language, such that each one is aligned to a discernible
benefit (i.e. this is what you are missing if your language doesn't do this)
and can be relatively easily verified.

We don't at this point know whether these SQL variants will turn out to be
the modern-day equivalent of object-oriented Cobol, because right now
their users aren't given reason to question their capabilities, and the sound
of those who would is drowned out by the propaganda deluge.

Daniela, you've already started this with contributions to the list, but it
needs to be fleshed out somewhat, e.g. composability: what are you missing in
practical terms if your language is not composable, and how would you
demonstrate this property/benefit in a manner that can be critically
assessed?
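
As a rough illustration of what a verifiable composability criterion could
look like (a sketch in Python rather than XQuery or JSONiq; the `where` and
`select` helpers and the sample data are hypothetical, not taken from any
product): every query step takes a sequence and returns a sequence, so a
complete query can be nested wherever a value is expected.

```python
from typing import Any, Callable, Iterable, Iterator

Doc = dict[str, Any]


def where(docs: Iterable[Doc], pred: Callable[[Doc], bool]) -> Iterator[Doc]:
    """Keep only the documents that satisfy the predicate."""
    return (d for d in docs if pred(d))


def select(docs: Iterable[Doc], fn: Callable[[Doc], Any]) -> Iterator[Any]:
    """Map each document to a new value."""
    return (fn(d) for d in docs)


# Hypothetical sample data.
customers = [{"name": "ann"}, {"name": "bob"}]
orders = [
    {"customer": "ann", "total": 40},
    {"customer": "bob", "total": 15},
    {"customer": "ann", "total": 25},
]


def orders_of(name: str) -> Iterator[Doc]:
    """A query in its own right, reusable inside other queries."""
    return where(orders, lambda o: o["customer"] == name)


# The test: a full sub-query (filter + projection) sits inside the
# projection of the outer query, without any special-case syntax.
report = list(
    select(
        customers,
        lambda c: {
            "name": c["name"],
            "big_orders": list(
                select(
                    where(orders_of(c["name"]), lambda o: o["total"] > 20),
                    lambda o: o["total"],
                )
            ),
        },
    )
)
```

A language that cannot express the nested part in place (and needs a
temporary table, a second round trip, or host-language glue instead) would
fail this particular check.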

If you can get vendors to assess their offerings by the criteria defined
then half the battle is won.
Michael Kay
2015-06-24 16:03:28 UTC
Permalink
Look to history. How did relational databases supplant CODASYL. Firstly the biggest fish in the pond IBM had a product and backed the technology to the hilt. Then there was a sustained and successful campaign initiated by Codd and Date but grew momentum to differentiate between a proper relational database and a database that just had a relational veneer.
But the biggest factor was probably that the move to minicomputer architecture created a discontinuity that forced people to consider change. You need to do two things: convince people that the new technology is better (or at least, is cool), and give them a big kick up the backside to get them out of their comfort zone.

Michael Kay
Saxonica



Pavel Velikhov
2015-06-24 16:30:57 UTC
Permalink
There is a big trend now to build Web applications using APIs. This seems like a much better idea than building HTML from XML with XQuery.
These APIs are usually REST APIs (i.e. the URL encodes the query parameters) with JSON as output.
So the data for these APIs can come from a JSON database, and REST calls can be translated into JSONiq.
And the data needs to be prepared; sometimes it needs to go through a number of stages. Each of those stages could also be done with JSONiq.
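
A minimal sketch of the REST-to-JSONiq translation idea, written in Python for illustration. The function name `rest_to_jsoniq`, the example URL, and the convention that the last path segment names the collection are all assumptions; it also assumes the target engine exposes data through `collection()` and accepts JSON-style string literals, which should be checked against the engine actually used.

```python
import json
from urllib.parse import parse_qsl, urlparse


def rest_to_jsoniq(url: str) -> str:
    """Turn a REST-style URL into the text of a JSONiq FLWOR query.

    Assumption: the last path segment is the collection name and every
    query parameter is an equality filter on a top-level field.
    """
    parsed = urlparse(url)
    collection = parsed.path.strip("/").split("/")[-1]

    conditions = []
    for key, value in parse_qsl(parsed.query):
        # Only plain identifiers are accepted as field names, so that
        # nothing from the URL is pasted into the query unchecked.
        if not key.isidentifier():
            raise ValueError(f"unsupported field name: {key!r}")
        # json.dumps renders a quoted, escaped string literal.
        # All values are compared as strings in this sketch.
        conditions.append(f"$doc.{key} eq {json.dumps(value)}")

    query = f"for $doc in collection({json.dumps(collection)})"
    if conditions:
        query += " where " + " and ".join(conditions)
    return query + " return $doc"


example = rest_to_jsoniq("http://example.org/api/products?category=books&lang=en")
```

Here `example` holds the string `for $doc in collection("products") where $doc.category eq "books" and $doc.lang eq "en" return $doc`. A real service would more likely bind the values as external variables than splice them into the query text.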

The application is then a heavy client, built in some .js framework; it runs queries against the API and is responsible only for the presentation layer.

A lot of people are content with MongoDB to store the JSONs. So a killer use-case needs to look beyond dumb storage of JSONs. Maybe focus on the
preparation/transformation/cleaning/merging stuff.
Post by Michael Kay
Look to history. How did relational databases supplant CODASYL. Firstly the biggest fish in the pond IBM had a product and backed the technology to the hilt. Then there was a sustained and successful campaign initiated by Codd and Date but grew momentum to differentiate between a proper relational database and a database that just had a relational veneer.
But the biggest factor was probably that the move to minicomputer architecture created a discontinuity that forced people to consider change. You need to do two things: convince people that the new technology is better (or at least, is cool), and give them a big kick up the backside to get them out of their comfort zone.
Michael Kay
Saxonica
Best regards,
Pavel Velikhov
***@gmail.com


Ihe Onwuka
2015-06-25 14:10:25 UTC
Permalink
Post by Pavel Velikhov
A lot of people are content with MongoDB to store the JSONs. So a killer
use-case needs to look beyond dumb storage of JSONs. Maybe focus on the
preparation/transformation/cleaning/merging stuff.
Post by Michael Kay
But the biggest factor was probably that the move to minicomputer
architecture created a discontinuity that forced people to consider change.
You need to do two things: convince people that the new technology is
better (or at least, is cool), and give them a big kick up the backside to
get them out of their comfort zone.
Post by Michael Kay
Michael Kay
Saxonica
The data prep/transformation/cleaning/merging stuff is currently the
domain of R and Python.

R because that's what the statisticians like and (as you will see if you
watch the R Good Bad and Ugly presentation I posted) they are not going to
change. Unfortunately they are being followed, sheep-like, by
non-statisticians. The non-statisticians who could change this - the
software people - are for the most part saying: I don't care if R sucks for
data management and I don't care that I am not a statistician, working
with R will help me get a sexy data science job. QED.

With Python you have the same issue, but with the additional twist that it
is revered for being a Swiss Army knife for devs and data scientists. This is
another one of those situations where the industry inverts common sense and
transforms what should ordinarily be a handicap into a virtue.

Ok so you go to the restaurant, place your order and they bring your food.
How many of you are now going to reach into your pocket and eat it with
this?

http://gadgether.walyou.netdna-cdn.com/wp-content/uploads/2009/11/swissarmius-main-01.jpg

So there is a very challenging people issue to overcome.

Technically there would need to be a streaming capability so that
XQuery/JSONiq is not the part of the pipeline that barfs when fed a large
dataset.
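
To make the streaming requirement concrete, here is a small sketch in Python
(not an XQuery/JSONiq implementation; the function names and the
one-JSON-object-per-line input format are assumptions): records are parsed
and consumed one at a time, so memory use does not grow with the size of
the dataset.

```python
import io
import json
from typing import Any, Iterable, Iterator


def stream_records(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line, lazily."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def running_total(records: Iterable[dict[str, Any]], field: str) -> int:
    """Aggregate over the stream without materialising it as a list."""
    total = 0
    for record in records:
        total += record.get(field, 0)
    return total


# A small in-memory stand-in for a large file; with a real dataset the
# open file object would be passed in the same way.
sample = io.StringIO('{"amount": 3}\n{"amount": 4}\n\n{"amount": 5}\n')
total = running_total(stream_records(sample), "amount")
```

Filters, projections and aggregates of this shape can be streamed; a step
such as a full sort would still need the whole input, which is where a
query engine has to be careful.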
Pavel Velikhov
2015-06-25 14:49:57 UTC
Permalink
Post by Pavel Velikhov
A lot of people are content with MongoDB to store the JSONs. So a killer use-case needs to look beyond dumb storage of JSONs. Maybe focus on the
preparation/transformation/cleaning/merging stuff.
Post by Michael Kay
But the biggest factor was probably that the move to minicomputer architecture created a discontinuity that forced people to consider change. You need to do two things: convince people that the new technology is better (or at least, is cool), and give them a big kick up the backside to get them out of their comfort zone.
Michael Kay
Saxonica
The data prep/transformation/cleaning/merging stuff is currently the domain of R and Python.
You must be talking about “data science” that is used internally in the organization. I’m talking more about data-driven Web sites, that have a big data component in their products.
In this case folks would never use R, they use all sorts of other stuff, including Python.
Post by Pavel Velikhov
R because thats what the statisticians like and (if you will see if you watch the R Good Bad and Ugly presentation I posted) they are not going to change. Unfortunately they are being sheepishly followed by non-statisticians. The non-statisticians who could change this - the software people - are for the most part saying I don't care if R sucks for data management and I don't care that I am not a statistician, working with R will help me get a sexy data science job. QED.
With Python you have the same issue but with the additional twist that it is revered for being Swiss Army knife for devs and data scientists. This is another one of those situations where the industry inverts common sense and transforms what should ordinarily be a handicap into a virtue.
Ok so you go to the restaurant, place your order and they bring your food. How many of you are now going to reach into your pocket and eat it with this.
http://gadgether.walyou.netdna-cdn.com/wp-content/uploads/2009/11/swissarmius-main-01.jpg
So there is a very challenging people issue to overcome
Technically there would need to be a streaming capability so that XQuery/JSONiq is not the part of the pipeline that barfs when fed a large dataset.
We’re thinking about building a JSONiq component in Scala, so it could be plugged into Spark.
Best regards,
Pavel Velikhov
***@gmail.com
Ihe Onwuka
2015-06-25 15:01:25 UTC
Permalink
Post by Ihe Onwuka
Post by Pavel Velikhov
A lot of people are content with MongoDB to store the JSONs. So a killer
use-case needs to look beyond dumb storage of JSONs. Maybe focus on the
preparation/transformation/cleaning/merging stuff.
Post by Michael Kay
But the biggest factor was probably that the move to minicomputer
architecture created a discontinuity that forced people to consider change.
You need to do two things: convince people that the new technology is
better (or at least, is cool), and give them a big kick up the backside to
get them out of their comfort zone.
Post by Michael Kay
Michael Kay
Saxonica
The data prep/transformation/cleaning/merging stuff is currently the
domain of R and Python.
You must be talking about “data science” that is used internally in the
organization. I’m talking more about data-driven Web sites, that have a big
data component in their products.
In this case folks would never use R, they use all sorts of other stuff, including Python.
I'm talking generally, to be honest, with data science as a prime example,
but there are several others, e.g. web scraping. There is a marked preference
for other tools to do what XQuery etc. are good at - hence there is a
people issue.
Pavel Velikhov
2015-06-25 20:08:37 UTC
Permalink
I still don’t know if we’ll have the resources yet, so this is hypothetical right now.
But yup, we definitely need a schema language and we can fix up some issues if it all works out!
Just JSONiq, I don’t have much hope for XQuery (because of XML underneath).
Post by Pavel Velikhov
Post by Pavel Velikhov
A lot of people are content with MongoDB to store the JSONs. So a killer use-case needs to look beyond dumb storage of JSONs. Maybe focus on the
preparation/transformation/cleaning/merging stuff.
Post by Michael Kay
But the biggest factor was probably that the move to minicomputer architecture created a discontinuity that forced people to consider change. You need to do two things: convince people that the new technology is better (or at least, is cool), and give them a big kick up the backside to get them out of their comfort zone.
Michael Kay
Saxonica
The data prep/transformation/cleaning/merging stuff is currently the domain of R and Python.
You must be talking about “data science” that is used internally in the organization. I’m talking more about data-driven Web sites, that have a big data component in their products.
In this case folks would never use R, they use all sorts of other stuff, including Python.
I'm talking generally to be honest but with data science as a prime example but there are several others e.g web scraping. There is a marked preference for other tools to do what XQuery etc are good at - hence there is a people issue.
Best regards,
Pavel Velikhov
***@gmail.com
daniela florescu
2015-06-25 18:21:16 UTC
Permalink
Post by Pavel Velikhov
We’re thinking about building a JSONiq component in Scala, so it could be plugged into Spark.
That’s a brilliant idea, Pavel. That’s the kind of thing that will make or break XQuery and/or JSONiq.

Let me know if I can help with something.

BTW, you are implementing only JSONiq--, right !? (I guess so)

Given that you have the freedom, could you please do me a favor, and “fix” some of the
grammatical issues with XQuery, hm !?

Like:
- make FROM a synonym for FOR and SELECT for RETURN
- eliminate the need for white spaces around the “-” sign


Stuff like that.

Unless you plan to expand this later on to JSONiq++ (but even then, you can still do the FOR/FROM, etc).

Do you plan to implement JSOUND — the JSON schema too ?

Best
Dana
daniela florescu
2015-06-23 17:38:00 UTC
Permalink
Post by Michael Kay
Long before XML (about 1990, I think) I submitted a VLDB paper about a database/repository holding software engineering artifacts whose structure/schema was defined in BNF. Two of the reviewers gave it 9/10, two of them gave it 2/10. So yes: the traditional database research community, of whom Stonebraker was for many years the high priest, has problems understanding data outside their traditional customer/orders/suppliers domain.
There is something else, Michael.

Unfortunately, your ideas were way TOO EARLY for people to understand them.

As they say in business: “being too early is the same as being wrong”.

In business timing is everything.

(and research in CS is often a comic parody of the industry, not a real scientific field..)

The questions are:

1. if you take the exact same paper and submit it today to VLDB, will they understand it now ?
(my bet is STILL not)
2. if not VLDB, is there a community who does understand it ?
(my bet is JSON community is still too young to get it, unfortunately)

If not those, who else ? Which community, or which vertical WOULD get that paper now ?

Dana