Google Cloud has introduced support for JSON documents in its distributed RDBMS Spanner – a move the Chocolate Factory says offers advantages in adding new data attributes without changing the schema.

Database experts said that while there are benefits to the support, Google is playing catch-up with other RDBMSes which have supported JSON for some time. Meanwhile, for specialist document-store applications, users would still need a specialist database such as Couchbase or MongoDB.

In a blog published earlier this month, Justin Makeig, product manager, Cloud Spanner, said support for JSON – JavaScript Object Notation – would let Spanner users add new attributes to data without changing the database schema. Such attributes are essentially parked in a JSON store and extracted as required.

“With Spanner, you now have the option to store this long tail of other attributes as JSON. Unlike a strongly typed column, JSON values don’t need to pre-define anything about their structure or values. Thus it’s easy to add new attributes without changing the relational schema,” he said.

The post explained how developers can use SQL to query a table with JSON data while Spanner also provides a set of SQL functions that allow them to use JSONPath to traverse JSON values.

“The above query uses the relational model to do the heavy lifting of filtering, while still providing the flexibility to project out of the JSON column for the filtered set,” Makeig said.

Carl Olofson, IDC research vice president, said the approach was similar to other relational databases, which support JSON in a large text block. Then Google provides built-in support for JSON notation so users can search based on tag value, for example.

“If your need is to embed JSON documents in what is otherwise a largely relational database and have that all travel together, sure, that’s fine,” Olofson said.

Fellow RDBMS PostreSQL has supported JSON since 2013, and continues to develop its approach to the document model.

Olofson pointed out that Spanner’s approach is similar to that of PostgreSQL, which “does the same thing: they treat JSON as a data type, creating a column that is essentially a text block and it’s formatted as JSON.”

Oracle also allows users to store JSON as a native storage format “similar to Couchbase or MongoDB” so “it’s faster and they can do more things to optimize retrieval and performance,” but outside of any schema, Olofson said.

Nonetheless, Google’s support for JSON is useful if users want “to maintain relationships between the JSON data and some relational data in a single database,” but was not a threat to specialist NoSQL databases per se.

“I don’t see it as in any way representing a threat to the native JSON guys, you know MongoDB and Couchbase,” Olofson said.

Andy Pavlo, associate professor of databaseology at Carnegie Mellon University, pointed out that Spanner’s implementation of JSON was “incomplete.”

By its own admission, “Spanner doesn’t (yet) index data in JSON columns.”

However, Pavlo pointed out that with Spanner’s approach, users get “strong serializable, external consistency guarantees of Spanner on JSON data.”

“That differentiates it from CouchDB and the other NoSQL systems. FaunaDB is also a serializable transactional NoSQL DBMS that natively supports JSON, but they don’t support SQL. So the major competitor to Spanner is [distributed RDBMS] CockroachDB and they’ve supported JSON since 2018,” he said.

To sum up, Merv Adrian, Gartner research VP, data and analytics, said: “Google Spanner now sports a JSON datatype, expanding the capabilities of its flagship transactional engine and closing the gap with its competitors. It does not yet support indexing, but this is a step in the right direction, with a solid set of functional operations to leverage the frequently used tools associated with JSON.” ®

Source link