SQL Azure, Sharding and NHibernate: A call for volunteers


I was quite surprised to hear that SQL Azure has a 10 GB limit for each database. That drastically reduces the amount of effort that I would guess goes into SQL Azure. At a guess, I would say it is simply replicated instances of databases instead of real SQL on the cloud.

One of the nice premises of working on the cloud is that you get transparent scaling. A 10 GB limit is not transparent. The answer from Microsoft seems to be that you need to implement sharding. That is, you spread your logical database over several physical databases.
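To make the idea concrete, here is a minimal sketch of how an application might decide which physical database a given row lives in. It is only an illustration and not how Hibernate Shards or SQL Azure does it: the shard names, the `shard_for` and `partition` helpers, and the choice of a hash-based strategy are all assumptions made for the example.

```python
import hashlib

# Hypothetical shard names; each one stands for a separate physical database.
SHARDS = ["shard_0", "shard_1", "shard_2", "shard_3"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Pick a shard deterministically from a stable hash of the key."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def partition(keys, shards=SHARDS):
    """Group keys by the shard that would store them."""
    buckets = {name: [] for name in shards}
    for key in keys:
        buckets[shard_for(key, shards)].append(key)
    return buckets
```

The important property is that the same key always maps to the same shard, so reads can find what writes stored. A plain modulo like this makes adding shards later painful, since most keys would move, which is one reason real sharding libraries let you plug in your own strategy.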

Usually this is done across physical database instances to speed up the application, because you can parallelize the queries. In this case, you would need it because each database is pretty small.
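The parallel query part can be sketched as a simple fan-out: send the same query to every shard at once, then merge the partial results. This is an assumption-laden toy, using in-memory SQLite databases as stand-ins for the physical shards and a made-up `orders` table; the helper names (`make_shard`, `query_shard`, `query_all`) are hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_shard(rows):
    """Create an in-memory database standing in for one physical shard."""
    # check_same_thread=False lets a worker thread use this connection;
    # in this sketch each connection is only used by one thread at a time.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()
    return conn


def query_shard(conn, min_total):
    """Run the query against a single shard."""
    cur = conn.execute(
        "SELECT id, total FROM orders WHERE total >= ?", (min_total,)
    )
    return cur.fetchall()


def query_all(shards, min_total):
    """Run the same query on every shard in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda conn: query_shard(conn, min_total), shards)
        merged = [row for part in partials for row in part]
    return sorted(merged)


shards = [
    make_shard([(1, 10.0), (2, 250.0)]),
    make_shard([(3, 99.5), (4, 300.0)]),
]
```

Calling `query_all(shards, 100.0)` asks both shards concurrently and merges whatever each one returns. Anything that needs a global view, such as ordering, paging, or aggregates, has to be redone on the merged set, and that is the kind of work a sharding layer is supposed to hide from you.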

Sharding is a term that was invented by Google, and a few years ago several Google engineers decided that they wanted to use sharding with Hibernate. Thus, the Hibernate Shards project was born, bringing transparent sharding support to Hibernate.

The equivalent project for NHibernate was started, but the port was never completed. This is a call for volunteers to help continue the port of Hibernate Shards to NHibernate. You now have a very clear reason for wanting it.

Having NHibernate Shards fully functional would mean that you get transparent scaling on SQL Azure. The fun part is that there isn't a lot of thinking or design involved. The road has already been traveled; the only effort is the port itself.

And, to give some incentive, I am willing to donate an NH Prof license to each of the major contributors who finish the Hibernate Shards port.


Tags:

NHibernate

Comments

06 Sep 2009
01:44 AM

Eyston

Never really understood why move to Azure and keep the relational model. It seems if I'm going to move to Azure it is for benefit of building a scalable application which would mean Azure Storage (tables). If I just want to take an existing app and 'cloud it up' EC2 is a better fit.

SQL Azure just seems like a compromised solution. They started with a different goal but it didn't work (at the initial stages it was hard to differentiate between tables, which led to confusion, and the later stages never came), so now it's just 'SQL in the cloud'.

06 Sep 2009
03:08 AM

Adam D.

Sounds great. I'll tweet you directly.

06 Sep 2009
08:07 AM

Joannes Vermorel

why move to Azure and keep the relational model.

In many apps, you have a few tables that hold 99% of your raw data. Those are obvious candidates for Blob Storage or Table Storage under Azure. Then, the other 50 tables just aren't worth the migration because they don't hold that much data. This is more or less our situation at Lokad.com and the approach we have adopted while migrating to Azure.

06 Sep 2009
08:48 AM

Chitty

That would be much better than my proposal :)

(see comments in ayende.com/.../...nate-on-the-cloud-sql-azure.aspx)

06 Sep 2009
09:18 AM

Sean

would be happy to volunteer if i can contribute, how can i help?

06 Sep 2009
09:23 AM

Steve Strong

I'm pretty busy with the Linq to NHibernate work at the moment, but if there's still a need for contributors once I've got that nailed, then I'd be up for it. Already ported a lot of Hibernate Java code over to NH, so it's something I'm pretty used to :)

06 Sep 2009
16:18 PM

Santos Ray Victorero, II

I could use this!

How can I help?

06 Sep 2009
20:06 PM

Jason

Would be happy to help!

07 Sep 2009
02:27 AM

Joel Garcia Martinez

I am not very experienced, but I am eager to learn and would love to contribute, where do I sign up ?

07 Sep 2009
07:33 AM

Simeon

I would have thought Shard or Sharding came from MMOGs like Ultima Online, where you logged onto different shards. These were deployed to manage player load and locality to the server (ping time), basically the same scaling problem, and around the same time, 1996/1997.

07 Sep 2009
13:15 PM

Ivan

I'd like to help make NHibernate Shards alive!

07 Sep 2009
13:20 PM

Dario Quintana

Hi Ayende, glad to read this post !

I started NH.Shards a long time ago and haven't been able to finish it yet.

I moved some commits to this svn because a friend was helping me with some code...

http://code.google.com/p/nhshards

But I will commit those changes to the trunk/ on NH.Contrib if you like then somebody can continue, and I will be happy to help ;-)

07 Sep 2009
16:45 PM

Tim

Dario, I glanced at the nhshards repo and it seems that quite a lot of the base work is done. IMO it would be good to sync those changes to the contrib.

07 Sep 2009
22:00 PM

Mike Brown

I'm definitely game, where do I sign up?

@eyston There's a lot of info regarding SQL Azure. Long story short, they tried a schema-less approach and got a very vocal response from those who wanted "SQL in the cloud". Reading between the lines of the announcement of the transition, Microsoft said that SQL Azure will support TDS (the protocol of SQL Server). I imagine that they have an engine in front of the schema-less data storage that handles the translation for you.

It's not a compromise, it's the best of both worlds. My hope is that the schema-less API will be re-enabled at some point in the future.

08 Sep 2009
07:20 AM

Billy Stack

Would definitely be interested in contributing...

How/Where do I get started?

Anything to help the NHibernate community. Have used NHibernate extensively over the last few years - it would be nice to give something back!

08 Sep 2009
13:37 PM

Alberto

I'm indebted to NHibernate: it saved me many work hours.

If I can contribute in any way, please tell me where I can sign up.

08 Sep 2009
13:40 PM

NicoPaez

Hi, I have worked with NHibernate for the last 3 years and I would really like to contribute. Where do we start?

08 Sep 2009
13:54 PM

Ayende Rahien

The place to discuss this is the nhibernate contrib mailing list.

http://groups.google.com/group/nhcdevs

08 Sep 2009
13:55 PM

Ayende Rahien

Dario,

I think this would be great, and it seems like a lot of people are willing to help

08 Sep 2009
13:59 PM

Ayende Rahien

Mike,

The place to discuss this is the nhibernate contrib mailing list.

http://groups.google.com/group/nhcdevs

As for what SQL Azure is, it is most definitely not sitting in front of the schemaless storage. There is just no way it could work.

The 10GB limit makes me think that they simply put some version of SQL Server and handle replication on the fly. That way they can still benefit from what SQL Server can do.

08 Sep 2009
15:45 PM

Eyston

@Mike Brown

No, there is no translation, it is just SQL installed on their virtual instances.

The original SSDS (SDS) was schema less and built to scale. When presented at PDC it wasn't fleshed out, didn't really do joins the way people wanted, but it scaled "automatically" (as long as you embraced the new way of thinking). The fact that it was schema less and the relational bits weren't really there made people confused with the differentiation between Azure Table Storage and SQL Data Services.

It looks like SDS wasn't going to be mature enough to meet the Azure schedule and people weren't really interested in learning the different paradigm so we get SQL Azure which has no 'seamless scaling' capabilities that every other Azure service has. It kind of sticks out like a sore thumb to me. I'm sure more details / roadmap will be unveiled at PDC, because I don't think SQL Azure is the long term goal. Complete speculation though.

18 Sep 2009
11:12 AM

Rob

It should be noted that Amazon's SimpleDB (a cloud schema-less DB) also has a 10 GB limit per domain (database). You reach that limit pretty fast in a schema-less world. Sharding your fact data is a good idea for both scale and performance.

26 Sep 2009
12:08 PM

Jonesie

Any chance you could finish Shards this week ?? :) We need this, like now! We are creating the ultimate Azure killer app - no really! Lots of database partitions in the cloud - for performance reasons mostly - need to sell lots of stuff in a very short period. I need to create an automatic partitioning and de-partition mechanism so we can move slices of the database in and out of the cloud as demand increases or decreases. I think sharding might help with this.

Cheers

26 Sep 2009
12:29 PM

Ayende Rahien

Jonesie,

If you are serious, we can talk about sponsoring the Shards development.

Putting money into this is the way to make sure that it will happen fast.


Oren Eini

CEO of RavenDB