Document Databases are not Relational

I got several similar questions regarding my post about modeling data for document databases:

…how would you handle a situation where you need (or want) to store some information in a relational database. For example, user accounts.
Would you duplicate the user accounts in the document db? If not, how would you relate posts to users and preserve some kind of integrity?

The most common mistake people make when designing a data model on top of a document database is to model it the same way they would on top of a relational database. A document database is a non-relational data store, and trying to hammer a relational model on top of it will produce suboptimal results. But you can get fantastic results by taking advantage of the document-oriented nature of Raven.

Documents, unlike rows in an RDBMS, are not flat; you are not limited to storing just keys and values. Instead, you can store complex object graphs as a single document. That includes arrays, dictionaries and trees. In a relational database, a row can only contain simple values, and more complex data structures need to be stored as relations. With Raven, in turn, you don't need to work hard to get your data into the database.

Let us take the following page as an example:

In a relational database, we would have to touch no fewer than four tables to show the data in this single page (Posts, Comments, Tags, RelatedPosts).

Using a document database, we can store all the data that we need to work with as a single document with the following format:

This format allows us to get everything that we need to display the page shown above in a single request.
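The original illustration of the document is not reproduced here. As a stand-in, this is a minimal Python sketch of what such a document might look like. Every field name and value is an assumption made for illustration, not Raven's actual format; the point is only that one document carries everything the page renders.

```python
import json

# Hypothetical blog post document. All field names and values are
# assumptions for illustration, not the format from the original post.
post_document = {
    "id": "posts/1",
    "title": "Modeling data for document databases",
    "body": "...",
    "tags": ["raven", "docdb", "modelling"],
    "comments": [
        {"author": "ales", "text": "How would you secure this?"},
    ],
    "related_posts": [
        # The title is copied in, so no second lookup is needed to render it.
        {"id": "posts/2", "title": "That NoSQL Thing"},
    ],
}


def render_page(doc):
    """Build everything the page shows from the single document."""
    return {
        "heading": doc["title"],
        "tag_line": ", ".join(doc["tags"]),
        "comment_count": len(doc["comments"]),
        "related_titles": [p["title"] for p in doc["related_posts"]],
    }


print(json.dumps(render_page(post_document), indent=2))
```

Posts, comments, tags and related posts all come out of one lookup here, where the relational version would touch four tables.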

Documents are expected to be meaningful on their own. You can certainly store references to other documents, but if you need to refer to another document to understand what the current document means, you are probably using the document database incorrectly.

With a document database, you are encouraged to include in your documents all the information they need. Take a look at the post example above. In a relational database, we would have a link table for RelatedPosts, which would contain just the ids of the linked posts. If we wanted to get the titles of the related posts, we would need to join to the Posts table again. You can do that in a document database, but it isn't the recommended approach. Instead, as shown in the example above, you should include all the details that you need inside the document. Using this approach, you can display the page with just a single request, leading to much better overall performance.

Nitpicker corner: Yes, it does mean that you need to update related posts if you edit the title of a post.
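To make the cost of that trade-off concrete, here is a rough Python sketch of a title change rippling out to the embedded copies. The in-memory `store`, the `build_reference_index` helper and the document shapes are all hypothetical stand-ins, not Raven's API; a real document database would maintain such an index for you.

```python
from collections import defaultdict

# In-memory stand-in for a document store: id -> document (hypothetical data).
store = {
    "posts/1": {"title": "Intro", "related_posts": [{"id": "posts/2", "title": "Old title"}]},
    "posts/2": {"title": "Old title", "related_posts": [{"id": "posts/1", "title": "Intro"}]},
    "posts/3": {"title": "Other", "related_posts": [{"id": "posts/2", "title": "Old title"}]},
}


def build_reference_index(docs):
    """Map each post id to the ids of the documents that embed a reference to it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for ref in doc["related_posts"]:
            index[ref["id"]].add(doc_id)
    return index


def rename_post(docs, index, post_id, new_title):
    """Change a title and refresh every embedded copy; return the ids touched."""
    docs[post_id]["title"] = new_title
    touched = sorted(index.get(post_id, ()))
    for doc_id in touched:
        for ref in docs[doc_id]["related_posts"]:
            if ref["id"] == post_id:
                ref["title"] = new_title
    return touched


index = build_reference_index(store)
print(rename_post(store, index, "posts/2", "New title"))
```

Only the documents found through the index are rewritten, so the work grows with the number of references to the renamed post rather than with the total number of posts.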

Now that we have established this context, we can try answering the actual question.

Assuming that we store users in a relational database, the question now becomes: what would we gain by replicating the user information to a document database?

If we were using a relational database, that would have given us the ability to join against the users. But a document database doesn’t support joins. Moreover, if we consider the apparent aim of the question “maintain some integrity”, we can see that it doesn’t really matter where we store the users’ data. A document database doesn’t support things like referential integrity in the first place, so putting the users inside the document database gives you no benefit.

Now, you may want to put the users in the document database anyway, to benefit from the features that it brings to the table, but integrity isn't one of those features.

Comments

04 May 2010
08:19 AM

ales

I wrote a comment, but it may have been lost, so I will try again.

How do you implement data that is (or, from my point of view, has to be) relational? Say, security data such as user names, passwords, privileges on document data, and so on.

I read your post about the blog app vs. Raven.

How would you implement secure data in that sample?

Or do you have to store some data in a relational database and then "link" it with the document database?

Sorry for the beginner's question, but I only know about document DBs from your blog.

04 May 2010
08:21 AM

Mike

Hey thanks for writing this post!

I guess I am still firmly in the nitpicker corner, thinking about how you are going to have to update potentially a huge number of documents if the title of a related post would change. Or if a post would be deleted, all posts should be checked to see that they don't reference the deleted post anymore.

If that sort of thing can be done in a transaction that solves part of the problem I guess, but it still feels like an antipattern. But that's just coming from someone with 0 experience working with document databases.

04 May 2010
08:33 AM

Ayende Rahien

Ales,

I am not following why you think that secured data has to be in a relational database. Can you expand on that?

Securing data with Raven is pretty easy, and completely customizable.

04 May 2010
08:34 AM

Ayende Rahien

Mike,

You don't check all posts, you check the index you created for related posts, and get all the related posts to update.

It is pretty easy, and damn fast.

And yes, it can be done in a transaction.

04 May 2010
08:36 AM

ales

Maybe I'm wrong, but how can I say that some user does or does not have access to some document? That is a relation. Or that some user can or cannot add a comment to some document?

04 May 2010
09:12 AM

Ayende Rahien

Ales,

Take a look here: groups.google.com/.../docs-server-triggers
We have support for read triggers as well, but it is not documented yet.

04 May 2010
09:46 AM

ales

Ok, thank you a lot.

I have another "stupid" question. Does Raven have any pages where I can read more about it, or try it out?

04 May 2010
10:37 AM

Ayende Rahien

Ales,

Yes, it has, but I don't want to make things public yet. It is not released.

04 May 2010
10:48 AM

ales

Ohh, ok, I will keep an eye on that and hopefully try it when it becomes possible :-)

Thank you a lot.

04 May 2010
10:55 AM

tobi

For me the biggest question mark is how to automatically update all locations where a post title is stored when it changes. That seems to be a lot of work and error-prone.

However your post is an excellent example of the potential performance benefits of a document oriented storage model.

04 May 2010
12:01 PM

Michael L Perry

Updating names of posts in two places should not be marginalized by relegating it to the nitpicker's corner. This is a significant difference between document and relational. It informs your decision of which technology to use.

Consider how many places you will need to do this. Every time you need a new reference to a post, you will need to go back and change the update name code to update that place, too. If this happens a lot, maybe the normalization benefit of a relational database is what your problem calls for.

In this example I don't think that's the case. I think that the number of places where you reference posts will be small enough to manage. But don't marginalize it. It is an important design consideration.

04 May 2010
13:01 PM

Colin Goudie

Thinking out loud here. If you come from a DDDD approach with CQRS at the architecture level, does it make sense to have, say, a relational database handling the Commands and a document database for Query responsibilities? Or is that defeating the purpose of a document database?

04 May 2010
13:31 PM

Andrew

Michael,

"Every time you need a new reference to a post, you will need to go back and change the update name code to update that place, too. If this happens a lot, maybe the normalization benefit of a relational database is what your problem calls for."

I'd have to ask, how many times does something like a blog title change? Even if we throw out that example, it's rare for the "header" data (i.e. blog titles, user names, company names, etc.) of a document to change, even if it's common for the details to be modified.

That is unless I'm completely missing something, could you think of a realistic scenario where you'd be concerned with a document "header" changing on a regular basis?

04 May 2010
13:54 PM

Ayende Rahien

Colin,

Persistent store for view models is certainly one place I see Raven being used.

I think that there is a good story around using Raven for persisting commands, but that is another use case.

04 May 2010
13:55 PM

Ayende Rahien

Michael,

Such decisions are based on understanding the model you are working in. As you noted, updating a blog post title is something that is done very rarely. It makes a lot more sense to replicate that value and gain the huge perf benefit than to have to refer to it all the time.

04 May 2010
17:23 PM

Jason Slocomb

Oren,

I'm not sure how posts relate in your model.

Assuming they just share tags, why not just index posts and their tags, and query the index when displaying the related posts? It seems strange to store related posts and incur the overhead updating Posts every time a new one is added.

I appreciate the academic aspect that you can do this in RavenDB. VERY cool. I just fail to see the benefit of the pattern in this case.

Am I missing the point?

04 May 2010
17:29 PM

Justin

A relational database can have hierarchical data types that would essentially make it a document db.

CREATE TABLE dbo.Documents
(
    [id] uniqueidentifier NOT NULL,
    [document] xml NOT NULL
)

Wow I just created a document db with indexes and atomic updates in MSSQL. Except I can make foreign keys to the document id if I wish.

Document databases have the same impedance mismatch relational databases do when using objects.

Objects are cyclic graphs, documents are directed acyclic graphs, so some object models can be stored easily in a document db, just like some object models easily translate to relational, but not all by a long shot.

So what is the advantage of a document db again? Can't we just get a decent OO db with Linq query support, that can store object graphs directly? DB4O?

04 May 2010
17:46 PM

Ayende Rahien

Jason,

Related posts may be something like my "That NoSQL Thing" series.

A better example might be Author, where you store the author id and the author name, though.

04 May 2010
17:49 PM

Ayende Rahien

Justin,

"A relational database can have hierarchical data types that would essentially make it a document db."

No, not even close.

Oh, you can build a doc db on top of that, sure, but the relational database would be storage for a document database, not a document database.

"I can make foreign keys to the document id"

How can you make a FK from an element value inside the xml document to the id?

"Objects are cyclic graphs, documents are directed acyclic graphs,"

This is just a serialization detail. It is fairly easy to deal with that using object references in the serialization format, although it does make the documents much harder to read for humans.

04 May 2010
20:38 PM

jdn

Dumb question, but where do you keep the lookup data in this scenario?

For instance, where do you store the tags ("raven", "docdb", "modelling")? Are they just embedded in code enum-style? I know you don't tend to do this but on occasion I add a category and then go back and add it to various posts.

I'm almost less concerned about how you go back and update old posts as in where the lookup data is stored.

04 May 2010
20:48 PM

Rob Jennings

So, how long after Raven's release can we expect a Raven tekpub series with you and Rob C.? ;)

04 May 2010
21:13 PM

Justin

"Oh, you can build a doc db on top of that, sure, but the relational database would be storage for a document database, not a document database."

Perhaps you could be a little more descriptive, instead of just saying "No, not the same". In my MSSQL table with the xml data type, I can atomically update a single document. The entire XML doc can be indexed for fast retrieval. I'm using SQL and XML with TDS instead of REST/JSON over HTTP.

How about I just say NO Raven is just storage for a document DB it's not a document database.

"how can you make a FK from an element value inside the xml document to the id?"

The same way you can make a foreign key in Raven (you can't), but at least the document can be keyed out in my design.

"This is just a serialization detail. It is fairly easy to deal with that using object references in the serialization format, although it does make the documents much harder to read for humans."

And objects to tables is just a serialization detail, lol, talk about over simplification. Objects aren't documents, just like they aren't tables, so when is your ODM coming out?

04 May 2010
21:26 PM

Ayende Rahien

Jdn,

I store it in the post, and create an index to get to it.

04 May 2010
22:00 PM

Ayende Rahien

Justin,

Storage for Doc DB can be anything, flat files, SQL Server, etc.

Hell, you can say that the file system is a doc db according to your logic.

A doc db provides more than just binary persistence. It provides querying capabilities, transactions, aggregation (usually in the form of map/reduce), conflict detection & resolution, etc.

Sure, you can build on top of existing SQL Server features to provide all of those, but it doesn't really offer much of the things that you would like, and you still have to do two hops to handle most non full crud operations, which really hurts performance in the long run.

Not sure that I understand what you mean by "at least the document can be keyed", can you explain a bit more?

As for full object graphs (including circular references), that is a pretty simple problem to solve. Hell, it is a first year problem when you use JSON or XML. The model supports it easily.

All you need to do is something like:

{
  "Name": "Parent",
  "__object_id": "parents/1",
  "Children": [
    {"Name": "Child", "Parent": {"__is_object_ref": true, "__object_id": "parents/1"}}
  ]
}

Now all you need to do is support it in the serializer (use a hash table).
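As an illustration only (this is not code from the thread), the serializer side of that idea might look like the following Python sketch. The `Node` class and the `children/1` id are invented for the example; a dict keyed by `id(obj)` plays the role of the hash table, so an object already written out is replaced by a reference instead of being walked again.

```python
import json


class Node:
    """Hypothetical object that may point back at its parent (a cycle)."""

    def __init__(self, object_id, name):
        self.object_id = object_id
        self.name = name
        self.parent = None
        self.children = []


def serialize(node, seen=None):
    """Emit a node in full the first time it is met, and as a reference afterwards."""
    seen = {} if seen is None else seen  # the hash table: id(obj) -> object id
    if id(node) in seen:
        return {"__is_object_ref": True, "__object_id": seen[id(node)]}
    seen[id(node)] = node.object_id
    out = {"Name": node.name, "__object_id": node.object_id}
    if node.parent is not None:
        out["Parent"] = serialize(node.parent, seen)
    if node.children:
        out["Children"] = [serialize(child, seen) for child in node.children]
    return out


parent = Node("parents/1", "Parent")
child = Node("children/1", "Child")
child.parent = parent
parent.children.append(child)

document = serialize(parent)
print(json.dumps(document, indent=2))
```

Only the writing half is shown. A matching deserializer would keep the reverse table (object id to instance) and patch each reference once the target object exists.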

04 May 2010
22:50 PM

jdn

Ayende:

Sorry, let me rephrase the question.

I know that you store the tags that apply to a particular post in the post itself, but where do you store the list of potentia tal categories?

So, in SubText, there's a table that stores them (I think, haven't looked in a while), where if you decide to add a 'NoSQL' category (as you did when I asked you about it), a row gets entered there, and you have a foreign key constraint to it.

If you were building a RavenDB version of SubText, where do you keep that lookup data? Would it be a document of its own?

04 May 2010
23:00 PM

jdn

'potential tag categories' I mean.

05 May 2010
00:58 AM

Harry Steinhilber

@jdn

I think you would just need to define an index that pulls out all the tags applied to all the posts in the doc db. I believe you could also do it with a simple map/reduce and get how frequently each tag gets used as well. Something like:

// Map
from post in docs
from tag in post.tags
select new { tag, count = 1 };

// Reduce
from pair in results
group pair by pair.tag into g
select new { tag = g.Key, count = g.Sum(x => x.count) };
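For readers who do not use LINQ, the same map/reduce idea can be sketched in plain Python. The sample `docs` list is made up, and this only illustrates the shape of the computation, not how Raven evaluates indexes.

```python
from collections import defaultdict

# Hypothetical post documents; only the "tags" field matters here.
docs = [
    {"id": "posts/1", "tags": ["raven", "docdb"]},
    {"id": "posts/2", "tags": ["raven", "modelling"]},
    {"id": "posts/3", "tags": ["docdb", "raven"]},
]


def map_tags(documents):
    """Map step: emit one {tag, count: 1} entry per tag on every post."""
    for post in documents:
        for tag in post["tags"]:
            yield {"tag": tag, "count": 1}


def reduce_tags(results):
    """Reduce step: group entries by tag and sum their counts."""
    totals = defaultdict(int)
    for pair in results:
        totals[pair["tag"]] += pair["count"]
    return [{"tag": tag, "count": count} for tag, count in sorted(totals.items())]


tag_counts = reduce_tags(map_tags(docs))
print(tag_counts)
```

The reduce output has the same shape as the map output, so it can be fed back into the reduce step without changing the totals. That property is what lets partial results be combined incrementally.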

05 May 2010
04:44 AM

Ayende Rahien

Jdn,

Oh, now it makes sense, I don't store it.

I query that information from an index sitting on top of the posts.Tags collection.

05 May 2010
12:01 PM

Ray

Document DB's are certainly interesting but honestly I still fail to see that many cases when it will have an edge over RDB's...

For example, each post will have an author and each author will have some statistics (badges, title, etc.) that can be changed at any time, and certain data that will be changed every time he posts something (number of posts). How do we handle that without the hassle of going through all posts, selecting only posts with this particular author (kk, maybe fast through the index but still it will load the server), updating all of them and saving them back into the DB...

Sorry if my question sounds ignorant.. I really need to dig more into this subject when I get more time.

05 May 2010
12:10 PM

Ray

Additionally: if certain author is awarded with a new badge, how can we be sure that ALL his posts will be updated? i.e. how to ensure data consistency effectively?

05 May 2010
14:27 PM

noname

the links to pictures are broken

05 May 2010
15:25 PM

Ayende Rahien

Ray,

You missed the point.

You won't embed the entire user information inside post.

Indeed, that makes no sense. In the case of a Post and a User, we will embed the user's id, and we might embed the user's name, so we won't need to look it up in the user document just for display purposes.

But we would certainly not replicate the entire user's document. If I really needed that much of the user's data, I would simply refer to the user's document.
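A small Python sketch of that split, purely as an illustration: the post embeds only the user's id and name, while badges and counters stay in the user document. The dictionaries and field names here are invented, not a Raven schema.

```python
# Hypothetical user and post documents kept in two in-memory "collections".
users = {
    "users/1": {"name": "Ray", "badges": ["early-adopter"], "post_count": 2},
}
posts = {
    "posts/1": {"title": "First", "author": {"id": "users/1", "name": "Ray"}},
    "posts/2": {"title": "Second", "author": {"id": "users/1", "name": "Ray"}},
}


def byline(post):
    """Display needs only the embedded name, so no user lookup happens."""
    return f'{post["title"]} by {post["author"]["name"]}'


def author_profile(post, user_docs):
    """Anything beyond the name is fetched from the user document by id."""
    return user_docs[post["author"]["id"]]


# Awarding a badge touches one user document and zero posts.
users["users/1"]["badges"].append("top-poster")

print(byline(posts["posts/1"]))
print(author_profile(posts["posts/2"], users)["badges"])
```

Frequently changing data such as badges lives in exactly one place, so there is nothing to keep consistent across posts. Only a rename would need the embedded copies refreshed.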

05 May 2010
16:12 PM

Jason Slocomb

I think some of you are failing to realize that the cow is already out of the barn with regards to NOSQL. RDBMS's are not always well suited for the kinds of challenges Amazon, Facebook, etc. are dealing with. The big boys are already rolling their own, e.g. Voldemort, Bigtable, etc. While the example Oren is using (blog) can be done well with a traditional RDB, it's also an example everyone can wrap their mind around and discuss. That said, this contrived example should not be used as a measuring stick for whether or not the DocDB paradigm has any validity.

05 May 2010
16:27 PM

Justin

"Storage for Doc DB can be anything, flat files, SQL Server, etc.

Hell, you can say that the file system is a doc db according to your logic."

"A doc db provide more than just binary persistence. It provide querying capabilities, transactions, aggregation (usually in the form of map/reduce), conflict detection & resolution, etc."

You can do all those things with the XML datatype in MSSQL.

What does Raven store its documents in, a file, right? What is the difference between storage for documents and a document db? All a DB is, is an abstraction layer over data that provides useful tools for update/retrieval, right? Raven meets this definition, and so does MSSQL using an xml datatype.

A filesystem on most OS's is a hierarchical db; some OS's even give you a way to index and query the filesystem. Some OS's file systems are closer to a full DBMS. Ironic that we build most of our DB's (relational and document) on top of a hierarchical db. Now let's build an OO database on top of a document database on top of a hierarchical DB, heck, we've been doing that with ORM's for years...

"Not sure that I understand what you mean by "at least the document can be keyed", can you explain a bit more?"

I have the option of using the document id as a key. I can make relational tables in my document db with foreign keys to the documents if I need to; you don't have that option in a pure document db. My point is that a full featured relational database with hierarchical data types can do what a document database can, plus all the relational features if you need them.

"As for full object graph (including circular references), that is a pretty simple problem to solve. Hell, it is a first year problem when you use JSON or XML. "

So why use foreign keys or joins in a relation db with an ORM? Just store object ID's and let the application layer enforce integrity right?

What happens when I delete an object referenced by id in the document? Oh, now my app code has to go make sure I can delete. Oops, I forgot one place I stored the id, and now we have a reference to nothing that the app code has to handle.

Why store objects in a document db at all, just store them as blobs in a file system. Hell, it is a first year problem when you use Files for serialization, right?

05 May 2010
16:34 PM

Ayende Rahien

Justin,

"You can do all those things with the XML datatype in MSSQL."

Of course, as I said, you need to build that on top of MSSQL, turning your code into a doc db that uses MSSQL as a storage.

Raven stores its information in an ISAM DB internally. Storage for docs is just that, simple storage. A database tends to do more than just store bits.

"just store them as blobs in a file system"

Most file systems tend to fall over and die when you store millions of small files on them (oh, they work, but veeeeerrrrryyyy sloooooooooowwwly).

But I think that we had better cut the argument right now. You seem very firm in your opinions. I suggest taking a look at my post series on NoSQL to see why you might want to use such a database.

05 May 2010
16:46 PM

Justin

"I think some of you are failing to realize that cow is already out of the barn with regards to NOSQL. RDBMS's are not always well suited for the kinds of challenges Amazon, Facebook, etc. are dealing with. The big boys are already rolling their own, ie; voldemort, Bigtable, etc."

Except the largest databases in the world are running relational.

Google AdWords runs on MySQL, not Bigtable.

Yahoo did have the record for the largest db in the world(2 petabytes) running on Postgres.

Document db's are not new; they predate relational. Everyone just seems to have forgotten what happened a few decades ago. The industry is cyclic. The same things were said when OO db's became popular, and then XML db's, and now we have NoSQL (which is just a rehash of IMS from the 1960's).

05 May 2010
17:30 PM

Justin

"Of course, as I said, you need to build that on top of MSSQL, turning your code into a doc db that uses MSSQL as a storage."

No, I am using SQL extended with XQuery; it is part of the DBMS, not something I am building on top of it. That would be like saying storing rows in MSSQL using SQL is turning my code into a relational db that uses MSSQL as storage.

"Raven store its information in a ISAM DB internally. Storage for docs is just that, simple storage. A database tend to do more than just store bits."

Storage isn't very useful without a way to retrieve the data later, right? So it's not just storing bits; not even the filesystem is just storing bits. Every datastore has various helpers for retrieving those bits later and making sure they are consistent.

"Most file systems tend to fall over and die when you store millions of small files on them (oh, they work, but veeeeerrrrryyyy sloooooooooowwwly"

Depends on the filesystem and how you store the data in it, your statement is a gross generalization. I bet I can store data in Raven in such a way to make it fall over and die as well. You write your serializer to match the capability of the data store, the more tools the data store provide the less you write in your serializer yourself.

"You seem very firm in your opinions."

I am very open to new DB technologies; relational has started to stagnate, and that is somewhat related to mindshare going to NoSQL. It is simply annoying to see the same arguments rehashed every 10 years with the "Next Big Thing". It seems like most of the arguments for NoSQL forget why relational won out before.

OK I'll shut up now.

05 May 2010
18:27 PM

Ayende Rahien

"Except the largest databases in the world are running relational."

What on Earth gave you that impression?

Google ads run on MySQL, but there is no information on whether it is running on MySQL in relational mode, or as a record store or even a key/value store.

Frankly, I am leaning toward the latter, rather than the former. At any rate, I think that I can safely guess that the data is heavily sharded.

As for Yahoo, they are running on Postgres, but not as a RDBMS, they are running a modified version that runs as a column database.

Quite a different beast.

05 May 2010
19:50 PM

Justin

Both MySQL and PostgreSQL are SQL databases; they store data in sets called tables and retrieve sets using SQL. How exactly do you run MySQL in non-relational mode? You may be able to forgo some ACID compliance in MySQL (just like using read uncommitted in MSSQL) by using one of the various storage engines, but you are always dealing with it relationally, as in tables and joins.

The modified Postgres Yahoo uses, changed the storage engine to be column based from row based, this however doesn't change the fact that they are still using SQL(with joins, etc.) and dealing with sets, it is hidden under the covers, like all DBMS's storage engines.

They specifically noted that the reason they chose to modify Postgres instead of using a NoSQL solution was that existing tools and applications could run with little change.

05 May 2010
19:58 PM

Ayende Rahien

Justin,

Just using INSERT or SELECT statement doesn't mean that I am working in a relational form.

MySQL is actually heavily utilized as a key/value store.

I suggest you read this, it might help explain the basis of the issue:

www.25hoursaday.com/.../...LargeScaleWebsites.aspx

05 May 2010
22:28 PM

Jason Slocomb

Justin you are trolling. Just because Google or Yahoo hasn't migrated an existing production database consisting of 2 Petabytes of data supporting hundreds of thousands of users to a DocumentDB is not an indictment against the NOSQL movement or Raven.

05 May 2010
22:33 PM

Fizz

Do you know what BCNF is?

Do you understand the "only store once" principle of a relational DB?

Do you know what a semantic (o,p,v) store is?

Beyond storing non-related property bags, document DBs are useless.

06 May 2010
01:55 AM

Rob Jennings

"Do you understand the 'only store once' principle of a relational DB?"

Do you understand that in many high usage scenarios the "only store once" principle leads to horrible performance?

06 May 2010
02:17 AM

Fizz

The process is called "Denormalisation" i.e. you work out how to optimally store the data then make tradeoffs, not start with a tradeoff

In reverse, storing data denormalised leads to storage bloat and data inconsistencies?

On performance then, try removing every post and comment by a user

Beyond storing non-related property bags, document DBs are useless?

The Google search engine is not ACID compliant; it will never be consistent. The same query at the same point in time for 2 users is not guaranteed to return the same result.

06 May 2010
11:26 AM

Ray

I'm on this with Fizz. Highly doubt that document DB's will be used extensively any time soon. They only greatly benefit from map-reduce pattern, and map-reduce is pointless if you don't have a cluster (map - split job, reduce - flatten results into one) and use cases when you need to do a lot of such processing on your data store... Good for Amazon, not so good for most of enterprise applications.

06 May 2010
17:18 PM

Justin

"Just using INSERT or SELECT statement doesn't mean that I am working in a relational form.

MySQL is actually heavily utilized as a key/value store."

Using insert and select against tables IS the definition of the Relational Model.

The core definition:"Its central idea is to describe a database as a collection of predicates over a finite set of predicate variables, describing constraints on the possible values and combinations of values."

How SQL the language and a DB that uses SQL is the the relational model:" A table in an SQL database schema corresponds to a predicate variable; the contents of a table to a relation; key constraints, other constraints, and SQL queries correspond to predicates."

I would suggest you read some Codd(The guy who defined the relational model): www.seas.upenn.edu/~zives/03f/cis550/codd.pdf

You can build a key-value store on top of the relational model in MySQL, and yes many have done so. Why, because MySQL performed well even compared to native key-value implementations. Just like I showed how to make a doc db using MSSQL's xml data type, MSSQL is still relational, but I am hiding it's relational model behind a document based application layer.

An OO database can also be built on a relational database; you should know, since NHibernate does just that. Is MSSQL in non-relational mode when I use NHibernate against it?

As far as I know, Google has never said they are using MySQL in any way other than straight relational with some sharding techniques, so you're just grasping when you say they use it in a key-value store manner anyway.

Jason Slocomb,

"Justin you are trolling. Just because Google or Yahoo hasn't migrated an existing production database consisting of 2 Petabytes of data supporting hundreds of thousands of users to a DocumentDB is not an indictment against the NOSQL movement or Raven."

Except Yahoo's db wasn't a production db they migrated TO the SQL db from NoSQL, here straight from Waqar Hasan, vice president of engineering in Yahoo's data group:

"Hasan joined Yahoo more than three years ago. At the time, Yahoo already had huge non-SQL databases storing hundreds of terabytes of data. Problem was, the data was in the form of large collections of compressed files that could be accessed only by writing programs in a language such as C++, rather than more easily and quickly via SQL commands, he said."

and: "The top layer remains PostgreSQL, however, so that Yahoo can use the many off-the-shelf tools available for it."

So if getting the facts straight on the largest, most heavily used DB's in the world being SQL and not NoSQL is trolling, I am truly sorry.

06 May 2010
23:46 PM

Ayende Rahien

"Using insert and select against tables IS the definition of the Relational Model."

No, it isn't. Insert and Select are just a readable way to do INSERT("ayende","ayende@ayende.com", 38840) and SELECT("ayende")

The relational model includes things like predicate logic, joins, fk, relations, etc.

When I am talking about using something like MySql as a key/value store, I am referring to only issuing queries by id, a very common approach. That isn't relational.

As a good example of that, see: bret.appspot.com/entry/how-friendfeed-uses-mysql
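To make "only issuing queries by id" concrete, here is a hedged Python sketch that uses SQLite in place of MySQL. The `entities` table, the `put`/`get` helpers and the field names are invented for illustration and are not taken from the linked article; the SQL engine is used purely as a keyed blob store, with no joins, foreign keys or predicates beyond the primary key.

```python
import json
import sqlite3

# SQLite stands in for MySQL here; the table layout is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, body TEXT NOT NULL)")


def put(key, value):
    """Store an opaque JSON blob under a key, overwriting any previous value."""
    conn.execute(
        "INSERT OR REPLACE INTO entities (id, body) VALUES (?, ?)",
        (key, json.dumps(value)),
    )


def get(key):
    """Fetch by primary key only; return None when the key is absent."""
    row = conn.execute("SELECT body FROM entities WHERE id = ?", (key,)).fetchone()
    return None if row is None else json.loads(row[0])


# The meaning of 38840 is not given in the thread, so it is kept as a bare value.
put("ayende", {"email": "ayende@ayende.com", "value": 38840})
print(get("ayende"))
print(get("missing"))
```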

On a general note, I would really appreciate if you won't assume I am not aware of what I am talking about.

07 May 2010
03:27 AM

Fizz

INSERT("ayende","ayende@ayende.com", 38840)

Is this a document database command? It is not in the form K -> V, as it has three parameters.

Can you please tell me, beyond storing non-related property bags, document DBs are useless?

07 May 2010
09:58 AM

Ayende Rahien

Fizz,

That insert command is "store a tuple, keyed by the first item".

As for what Doc DBs are good for, read my NoSQL category, it is all there.

07 May 2010
14:26 PM

Ray

Oren, on paper it sounds very good but the question is did you actually have experience with No SQL in a real production environment? No offense intended, I'm just curious.

07 May 2010
14:44 PM

Ayende Rahien

Ray,

Yes.

Commercially speaking, I built a fairly large system on top of a DHT, in another project we had external indexes for most of the queries.

07 May 2010
19:08 PM

Justin

I thought you did know what the relational model is, but when you say nonsense like "Just using INSERT or SELECT statement doesn't mean that I am working in a relational form." then I start to wonder.

The SQL language including inserts and selects IS First order predicate logic, table contents IS a relation, and yet here you are saying it's not!

"No, it isn't. Insert and Select and just readable way to do INSERT("ayende","ayende@ayende.com", 38840) and SELECT("ayende")"

That is the definition of Codd's 7th rule, did you forget this too? Do you need a refresher on how to tell if something is a relational db: http://en.wikipedia.org/wiki/Codd's_12_rules ?

FriendFeed built a key-value system ON TOP of a set of tables; they update and retrieve data with SQL. They even describe their table structure complete with primary and foreign keys! Are you seriously going to tell me FriendFeed is somehow using MySQL in "non-relational mode"? FriendFeed's MySQL db still meets all of Codd's 12 rules.

If I run Windows in a VM on my Mac, I am still running a Mac. If I build a Key-Value store on top of a relational db I am still using a relational db!

07 May 2010
19:19 PM

Andrew

Justin,

At the end of the day, you're being very disingenuous, and quite frankly I have no idea why you are so angry about some guy's blog, just because by your definition every database that can be used in a "relational" manner is automatically considered "relational".

You can quote Codd up the yin-yang, but it really doesn't change the fact that when 99.9% of the planet talks about "relational" DBs they aren't referring to INSERT and SELECT statements, they are talking normalization, they're talking about Foreign Keys, etc.

Think of it this way, I have 4 wheel drive on my SUV, it has "4WD" right on the decal, and in the sales literature it says that it has that capabilities. But I choose to use it in 2WD mode when I drive to work. You could argue that my Jeep is 4WD (and it is), but I don't use it that way (most of the time anyways).

Not a perfect analogy, but good enough. Chill out and relax just a bit, you're freaking out over what is basically syntax, and it's making you look bad.

09 May 2010
15:33 PM

Mahmud Khaled

@Justin:

Chill, man!! The very word "relational" in the "relational data model" is the key - it refers to the relations among various entities using foreign keys etc. Use of SELECT or INSERT has got nothing to do with being relational. Having/creating a bunch of tables in a SQL/Oracle/MySQL database doesn't make it a relational database.

09 May 2010
16:37 PM

Demis Bellot

@Justin

I am really not sure what you are arguing about anymore.

It does seem rooted in the idea that, because you can make MSSQL/RDBMS act like a document database, there is no reason to go with a NoSQL solution specifically catered to the task. Sure, as a persistence layer you can potentially map any data structure to a relational model, but that doesn't make it a good idea. The best solution is the one that 'works the best', not 'any that works with what I know'. For some insight into the decision-making process that leads towards adoption of a NoSQL solution over an RDBMS one, check out this post from Digg's architect on the subject. Here are a couple of choice quotes:

stu.mp/.../...l-vs-rdbms-let-the-flames-begin.html
"Do you honestly think that the PhDs at Google, Amazon, Twitter, Digg, and Facebook created Cassandra, BigTable, Dynamo, etc. when they could have just used a RDBMS instead?"

"NoSQL solutions allow us to serve absurd amounts of data for a really, really low price. I’m happy to put my $/write, $/read, and $/GB numbers for my NoSQL setup against anyone’s RDBMS numbers.

We’re not nearly as dumb as everyone thinks we are; I promise."

10 May 2010
14:34 PM

Justin

Perhaps the problem that I am so upset about IS the fact that a large percentage of the programming population thinks "relational" is something that it is not, and is now rehashing old technologies like key-value stores and doc db's like they invented something new.

"The very word "relational" in the "relational data model" is the key - it refers to the relations among various entities using foreign keys etc. Use of SELECT or INSERT has got nothing to do with being relational."

See, here is my point: please read some Codd and understand what a relation is before saying something like this.

You wonder what I am getting upset about when I get called disingenuous for simply wanting people to know the relational model before they bash it.

"It does seem rooted in the idea that, because you can make MSSQL/RDBMS act like a document database, there is no reason to go with a NoSQL solution specifically catered to the task."

I never said there is no reason to use a NoSQL db. I just wanted to give some perspective on the past, because most of these arguments were hashed out 30 years ago, and all of a sudden no one can remember what the relational model is.

Comments have been closed on this topic.

Oren Eini

CEO of RavenDB