Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,546
|
Comments: 51,161
Privacy Policy · Terms
filter by tags archive
time to read 10 min | 1880 words

In my previous post, we dealt with how to model Auctions and Products, this time, we are going to look at how to model bids.

Before we can do that, we need to figure out how we are going to use them. As I mentioned, I am going to use Ebay as the source for “application mockups”.  So I went to Ebay and took a couple of screen shots.

Here is the actual auction page:

image

And here is the actual bids page.

image

This tells us several things:

  • Bids aren’t really accessed for the main page.
  • There is a strong likelihood that the number of bids is going to be small for most items (less than a thousand).
  • Even for items with a lot of bids, we only care about the most recent ones for the most part.

This is the Auction document as we have last seen it:

{
   "Quantity":15,
   "Product":{
      "Name":"Flying Monkey Doll",
      "Colors":[
         "Blue & Green"
      ],
      "Price":29,
      "Weight":0.23
   },
   "StartsAt":"2011-09-01",
   "EndsAt":"2011-09-15"
}

The question is where are we putting the Bids? One easy option would be to put all the bids inside the Auction document, like so:

{
   "Quantity":15,
   "Product":{
      "Name":"Flying Monkey Doll",
      "Colors":[
         "Blue & Green"
      ],
      "Price":29,
      "Weight":0.23
   },
   "StartsAt":"2011-09-01",
   "EndsAt":"2011-09-15",
   "Bids": [
     {"Bidder": "bidders/123", "Amount": 0.1, "At": "2011-09-08T12:20" }
   ]
}

The problem with such an approach is that we are now forced to load the Bids whenever we want to load the Auction, but the main scenario is that we just need the Auction details, not all of the Bids details. In fact, we only need the count of Bids and the Winning Bid, it will also fail to handle properly the scenario of High Interest Auction, one that has a lot of Bids.

That leave us with few options. One of those indicate that we don’t really care about Bids and Auction as a time sensitive matter. As long as we are accepting Bids, we don’t really need to give you immediate feedback. Indeed, this is how most Auction sites work. They give you a cached view of the data, refreshing it every 30 seconds or so. The idea is to reduce the cost of actually accepting a new Bids to the minimum necessary. Once the Auction is closed, we can figure out who actually won and notify them.

A good design for this scenario would be a separate Bid document for each Bid, and a map/reduce index to get the Winning Bid Amount and Big Count. Something like this:

     {"Bidder": "bidders/123", "Amount": 0.1, "At": "2011-09-08T12:20", "Auction": "auctions/1234"}
     {"Bidder": "bidders/234", "Amount": 0.15, "At": "2011-09-08T12:21", "Auction": "auctions/1234" }
     {"Bidder": "bidders/123", "Amount": 0.2, "At": "2011-09-08T12:22", "Auction": "auctions/1234" }

And the index:

from bids in docs.Bids
select new { Count = 1, bid.Amount, big.Auction }

select result from results
group result by result.Auction into g
select new 
{
   Count = g.Sum(x=>x.Count),
   Amount = g.Max(x=>x.Amount),
   Auction = g.Key
}

As you can imagine, due to the nature of RavenDB’s indexes, we can cheaply insert new Bids, without having to wait for the indexing to work. And we can always display the last calculated value of the Auction, including what time it is stable for.

That is one model for an Auction site, but another one would be a much stringer scenario, where you can’t just accept any Bid. It might be a system where you are charged per bid, so accepting a known invalid bid is not allowed (if you were outbid in the meantime). How would we build such a system? We can still use the previous design, and just defer the actual billing for a later stage, but let us assume that this is a strong constraint on the system.

In this case, we can’t rely on the indexes, because we need immediately consistent information, and we need it to be cheap. With RavenDB, we have the document store, which is ACIDly consistent. So we can do the following, store all of the Bids for an Auction in a single document:

{
   "Auction": "auctions/1234",
   "Bids": [
     {"Bidder": "bidders/123", "Amount": 0.1, "At": "2011-09-08T12:20", "Auction": "auctions/1234"}
     {"Bidder": "bidders/234", "Amount": 0.15, "At": "2011-09-08T12:21", "Auction": "auctions/1234" }
     {"Bidder": "bidders/123", "Amount": 0.2, "At": "2011-09-08T12:22", "Auction": "auctions/1234" }
    ]
}

And we modify the Auction document to be:

{
   "Quantity":15,
   "Product":{
      "Name":"Flying Monkey Doll",
      "Colors":[
         "Blue & Green"
      ],
      "Price":29,
      "Weight":0.23
   },
   "StartsAt":"2011-09-01",
   "EndsAt":"2011-09-15",
   "WinningBidAmount": 0.2,
   "BidsCount" 3
}

Adding the BidsCount and WinningBidAmount to the Auction means that we can very cheaply show them to the users. Because RavenDB is transactional, we can actually do it like this:

using(var session = store.OpenSession())
{
  session.Advanced.OptimisticConcurrency = true;
  
  var auction = session.Load<Auction>("auctions/1234")
  var bids = session.Load<Bids>("auctions/1234/bids");
  
  bids.AddNewBid(bidder, amount);
  
  auction.UpdateStatsFrom(bids);
  
  session.SaveChanges();
}

We are now guaranteed that this will either succeed completely (and we have a new winning bid), or it will fail utterly, leaving no trace. Note that AddNewBid will reject a bid that isn’t the higher (throw an exception), and if we have two concurrent modifications, RavenDB will throw on that. Both the Auction and its Bids are treated as a single transactional unit, just the way it should.

The final question is how to handle High Interest Auction, one that gather a lot of bids. We didn’t worry about it in the previous model, because that was left for RavenDB to handle. In this case, since we are using a single document for the Bids, we need to take care of that ourselves. There are a few things that we need to consider here:

  • Bids that lost are usually of little interest.
  • We probably need to keep them around, just in case, nevertheless.

Therefor, we will implement splitting for the Bids document. What does this means?

Whenever the number of Bids in the Bids document reaches 500 Bids, we split the document. We take the oldest 250 Bids and move them to Historical Bids document, and then we save.

That way, we have a set of historical documents with 250 Bids each that no one is ever likely to read, but we need to keep, and we have the main Bids document, which contains the most recent (and relevant Bids. A High Interest Auction might end up looking like:

  • auctions/1234 <- Auction document
  • auctions/1234/bids <- Bids document
  • auctions/1234/bids/1 <- historical bids #1
  • auctions/1234/bids/2 <- historical bids #2

And that is enough for now I think, this post went on a little longer than I intended, but hopefully I was able to explain to you both the final design decisions and the process used to reach them.

Thoughts?

time to read 4 min | 791 words

We recently had a client with a similar model, and I thought that this would be a good topic for a series of blog posts. For the purpose of this blog posts series, I am going to be using EBay as the example. Just to be clear, this is merely because it is a well known site that most people are likely to be familiar with.

In our model, we have the following Concepts:

  • Categories
  • Products
  • Auctions
  • Bids
  • Sellers
  • Buyers
  • Orders

We will start from what is likely the most confusing aspect. Products and Auctions.

A Product is something that can be sold. It can be either a unique item “a mint condition Spiderman comics” or it can be something that we have lots of “a flying monkey doll”. There is a big distinction between the two, because of the way the business rules are structured.

Once an Auction has been started, it cannot be changed. This include everything about this auction, pricing, shipping information, product information, etc. But a Product may change at any time (maybe I now have the flying monkey in red and green, instead of the original red and yellow). That leads us to conclude that we actually have two instances of the Product in our domain. We have the Product Template (mutable, can change at any time) and with have the Auctioned Product (immutable).

That realization leads me to the following model for products and auctions:

{
   "Name":"Flying Monkey Doll",
   "Colors":[
      "Red & Yellow",
      "Blue & Green"
   ],
   "Price":29,
   "Weight":0.23
}
{
   "Quantity":15,
   "Product":{
      "Name":"Flying Monkey Doll",
      "Colors":[
         "Blue & Green"
      ],
      "Price":29,
      "Weight":0.23
   },
   "StartsAt":"2011-09-01",
   "EndsAt":"2011-09-15"
}
products/1 auctions/1

As you can see, the Auction is going to wholly own the product. Any change made to the product will not be reflected in the Auctioned Product. This has the advantage that we need only a single document load in order to show the auction page.

Another option would be to use the versioning bundle to do that, so we would have this:

{
   "Name":"Flying Monkey Doll",
   "Colors":[
      "Red & Yellow",
      "Blue & Green"
   ],
   "Price":29,
   "Weight":0.23
}
{
   "Name":"Flying Monkey Doll",
   "Colors":[
       "Blue & Green"
   ],
   "Price":29,
   "Weight":0.23
}
{
   "Quantity":15,
   "Product": "products/1/versions/1",
   "StartsAt":"2011-09-01",
   "EndsAt":"2011-09-15"
}
products/1 products/1/versions/1 auctions/1

The versioning bundle ensures that we can get immutable views of our documents, so we can safely reference the product by its id and version.

That is it for now, on the next post, we will deal with how to work with bids.

FUTURE POSTS

  1. Partial writes, IO_Uring and safety - about one day from now
  2. Configuration values & Escape hatches - 4 days from now
  3. What happens when a sparse file allocation fails? - 6 days from now
  4. NTFS has an emergency stash of disk space - 8 days from now
  5. Challenge: Giving file system developer ulcer - 11 days from now

And 4 more posts are pending...

There are posts all the way to Feb 17, 2025

RECENT SERIES

  1. Challenge (77):
    20 Jan 2025 - What does this code do?
  2. Answer (13):
    22 Jan 2025 - What does this code do?
  3. Production post-mortem (2):
    17 Jan 2025 - Inspecting ourselves to death
  4. Performance discovery (2):
    10 Jan 2025 - IOPS vs. IOPS
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats
}