William Candillon
2015-06-22 12:23:30 UTC
https://github.com/28msec/cellstore
Written in 15k lines of JSONiq code (and 20k for the test suite), the
CellStore is an implementation of a modern data warehousing paradigm
designed by Ghislain Fourny (http://arxiv.org/pdf/1410.0600.pdf).
Cell stores are an extremely efficient solution for performing analytical
tasks on top of XML and JSON data lakes. Too often we see XML users
trying to fit their data back into traditional OLAP solutions and it
always breaks our hearts. We wanted to build OLAP commodities that
directly work on top of data lakes. And the CellStore is what we came
up with.
The relational world has SQL for queries and OLAP for analytics. We
have JSONiq/XQuery for queries and the CellStore for analytics.
The cell store model defines the notion of cell as an atom of data. A
cell contains a value and a set of dimensional coordinates that are
string-value pairs. Cells can be stored on any type of physical
storage. Whereas traditional analytical processing tools can only
support hundreds of fixed dimensions and thus need to ETL the data
before analyzing it, cell stores support an unbounded number of dimensions. There
is no need for fixed hypercubes to be designed up front. Hypercubes
are not the schema. Hypercubes are the queries.
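
To make the model concrete, here is a minimal sketch in Python. It is not
the JSONiq implementation from the repository; the dimension names, the
sample values and the query_hypercube helper are all made up for
illustration. It also assumes that a cell matches a hypercube when it
carries every dimension the hypercube names, and that extra dimensions on
the cell are ignored.

```python
# Illustrative sketch only: names and data are hypothetical, not CellStore's API.

# A cell is one value plus a set of dimensional coordinates
# (dimension name -> coordinate value). Cells need not share dimensions.
cells = [
    {"value": 100, "dims": {"concept": "Revenue", "entity": "ACME", "period": "2014"}},
    {"value": 40, "dims": {"concept": "Revenue", "entity": "ACME", "period": "2013",
                           "segment": "EU"}},
    {"value": 70, "dims": {"concept": "Assets", "entity": "ACME", "period": "2014"}},
    {"value": 55, "dims": {"concept": "Revenue", "entity": "Globex", "period": "2014"}},
]


def query_hypercube(cells, hypercube):
    """Return the cells whose coordinates fall inside the hypercube.

    `hypercube` maps a dimension name to a set of allowed values,
    or to None to accept any value for that dimension.
    """
    matches = []
    for cell in cells:
        dims = cell["dims"]
        inside = all(
            name in dims and (allowed is None or dims[name] in allowed)
            for name, allowed in hypercube.items()
        )
        if inside:
            matches.append(cell)
    return matches


# The hypercube is written at query time; nothing was declared up front.
revenue_2014 = query_hypercube(
    cells, {"concept": {"Revenue"}, "period": {"2014"}, "entity": None}
)
total = sum(cell["value"] for cell in revenue_2014)
```

Note that the "segment" dimension on the second cell required no schema
change: the cell simply carries one more coordinate, and a later query can
name that dimension or leave it out.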
We're focused on financial use cases at the moment but the technology
can be used for any vertical.
Kind regards,
William
_______________________________________________
***@x-query.com
http://x-query.com/mailman/listinfo/talk