Reviewing OSS Project: Whiteboard Chat–Unbounded Result Sets and Denial of Service Attacks
Originally posted at 3/8/2011
As a reminder, I am going over the problems that I found while reviewing the Whiteboard Chat project during one of my NHibernate courses. Here is the method:
[Transaction]
[Authorize]
[HttpPost]
public ActionResult GetLatestPost(int boardId, string lastPost)
{
    DateTime lastPostDateTime = DateTime.Parse(lastPost);
    IList<Post> posts = _postRepository
        .FindAll(new GetPostLastestForBoardById(lastPostDateTime, boardId))
        .OrderBy(x => x.Id).ToList();

    //update the latest known post
    string lastKnownPost = posts.Count > 0
        ? posts.Max(x => x.Time).ToString()
        : lastPost; //no updates

    Mapper.CreateMap<Post, PostViewModel>()
        .ForMember(dest => dest.Time, opt => opt.MapFrom(src => src.Time.ToString()))
        .ForMember(dest => dest.Owner, opt => opt.MapFrom(src => src.Owner.Name));

    UpdatePostViewModel update = new UpdatePostViewModel();
    update.Time = lastKnownPost;
    Mapper.Map(posts, update.Posts);
    return Json(update);
}
In this post, I want to focus on a very common issue that I see over & over again. The problem is that people usually don't notice this sort of issue at all.
The problem is quite simple: there is no limit on the amount of information that we can request from this method. What this means is that we can send it 1900-01-01 as the date and force the application to load all the posts on the board.
Even on a moderately busy board, we are talking about tens or hundreds of thousands of posts being loaded. That is going to put a lot of pressure on the database, on the server's memory, and on the amount of money that you'll pay at the end of the month for network bandwidth.
There is a reason why I strongly recommend always using a limit, especially in cases like this, where the query is practically shouting at you that the number of items can be very big.
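To make that concrete, here is a minimal sketch of capping the query at the database level. This is not the project's actual repository/specification API: it assumes direct access to an NHibernate ISession, assumes that Post exposes Board and Time properties, and uses a hypothetical MaxPosts constant. In the real code the limit would belong inside the GetPostLastestForBoardById query object.

using System;
using System.Collections.Generic;
using System.Linq;
using NHibernate;
using NHibernate.Linq;

public class PostQueries
{
    // Hypothetical hard cap; pick whatever number makes sense for the UI.
    private const int MaxPosts = 200;

    public IList<Post> GetLatestPosts(ISession session, int boardId, DateTime lastPostDateTime)
    {
        return session.Query<Post>()
            .Where(p => p.Board.Id == boardId && p.Time > lastPostDateTime)
            .OrderBy(p => p.Id)
            .Take(MaxPosts) // becomes part of the generated SQL, not an in-memory filter
            .ToList();
    }
}

Because Take() is part of the query that NHibernate sends to the database, the server never hands back more than MaxPosts rows, no matter what date the client decides to send.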
In the next post, we will analyze the SELECT N+1 issue that I found in this method (so far, I have a 100% record of finding this type of issue in every application that I have reviewed)…
More posts in "Reviewing OSS Project" series:
- (15 Mar 2011) Whiteboard Chat–The Select N+1 issue
- (14 Mar 2011) Whiteboard Chat–Unbounded Result Sets and Denial of Service Attacks
- (11 Mar 2011) Whiteboard Chat–setup belongs in the initializer
- (10 Mar 2011) Whiteboard Chat–overall design
- (09 Mar 2011) Whiteboard Chat
Comments
Is the Owner property lazy loaded?
Dmitry,
Yes
"There is a reason why I strongly recommend to always use a limit"
And, once again, this is a bad recommendation. Getting all the posts in the board (or the equivalent in other apps) is quite often a legitimate use case.
Use limits when they make sense, sure. Blanket recommendations (or crippling a query engine silently) like this are bad.
@jdn The point is not that the app shouldn't be able to show all posts. It is that the service call shouldn't be returning them all back in a single response.
If you do have a legitimate need for all of the board's posts (and let's just assume that you do for the sake of argument) then there are better back end implementation details for getting that data to the client than a single response with all of the data.
@jdn
Look at Twitter and other really big services. They provide a "more" button, which can easily be clicked several times, rather than a "kill my DB by querying for everything" button. I hope you aren't considering the export/import scenario here ;)
jdn - When would it be a legitimate use case?
Can you provide an example?
It would only be a legitimate case when you know the size of the collection up front. The number of countries in the world or the number of employees in a company is not likely to increase dramatically one day.
@karg
What if I want them all in one call?
@kooletz
I will grant you that Twitter is probably not a legitimate use case. That's rather the extreme case though.
@daniel
If I want all of the open orders for a trade group that I support, that might return 15 records, or it might return 100,000+ (that's an accurate range, btw, not making it up). Regardless, when I query for all open orders, I want ALL open orders. And not paged either (if I want them paged, I'll page them explicitly).
Ayende said 'always' and I think he means it, that's why he made Raven DB safe/crippled by default (he calls it one, I call it the other, I'll let you guess which...LOL). And, he's been open about the fact that he did it for marketing reasons as well as technical ones.
And he's still wrong. YMMV.
Jdn,
What are you going to do with 100,000+ orders?
I am going to process them and send them to a 3rd party vended application that expects them in one shot (I've never really thought about it, but I don't think they even have paging in their API).
@jdn
What if I want to make a blocking network call on my UI thread so I lock up my UI? That makes me a bad developer for wanting to do that.
You seem to be stuck on the idea that since you have a desire to show a large amount of data to the user at once that it must be returned in a single service call.
Why is it that you specifically want it returned in one service call?
Jdn,
I can guarantee that the 3rd party vendors would REALLY like it if you didn't do this.
See this:
msdn.microsoft.com/.../cc663023.aspx#id0090070
Udi describes a very similar scenario and what happens when you throw that at a system.
Ayende:
I can guarantee you that the 3rd party vendor expects it in one shot.
I am familiar with Udi's article and scenario, and it is irrelevant.
Which is part of the point. You don't know my scenario or the vendor, I do.
Now, if you want to ask me whether I think the vendor is doing things correctly, we may come to a different conclusion.
Karg:
Who said anything about showing a large amount of data?
It is true that Ayende's specific example is an ActionResult, but again, he said "always" and believes it (and built/crippled RavenDB around the concept).
Karg:
Sorry, missed the question.
To use my specific example, the end client wants everything to be sent to them in one shot. Since our systems are perfectly capable of handling 100,000 open orders in one shot, there is no reason not to get it in one call.
It is an interesting fact that even good developers appear not to consider the possibility of 'unbounded' result sets. I've discussed it with Ayende, and I don't dispute that even good developers write bad code.
It still isn't right to make code un-self-documenting.
If I write:
myCollection.Skip(472).Take(14393)
I mean, "skip 472 records and then take the next 14393". I do not mean, "skip 472 records and then take whatever Ayende thinks is the correct silent formerly poorly documented number of records that he thinks you should take because he doesn't want bad performance to reflect badly on RavenDB."
@jdn
1) he never said it had to be a silent limit. There's nothing really wrong with letting the caller specify the limit, as long as it's kept reasonable.
2) if you have service A that calls service B to get data, then send it to service C (where service C needs it in one shot), why can't service B have limits? Service A can call it multiple times, aggregate, then send to C. When you write service D that presents service B results to screen, it's already paginated.
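To illustrate point 2, here is a rough sketch of the aggregate-then-forward idea, using entirely hypothetical interfaces (none of these types come from Whiteboard Chat or the systems discussed in this thread):

using System.Collections.Generic;

// "Service B": only ever hands out bounded pages.
public interface IOrderSource
{
    IList<Order> GetOpenOrders(int page, int pageSize);
}

// "Service C": the vendor-style endpoint that wants everything in one shot.
public interface IOrderSink
{
    void SendOpenOrders(IList<Order> allOrders);
}

// "Service A": the one place that knowingly aggregates the pages.
public class OrderForwarder
{
    private const int PageSize = 1000;

    public void Forward(IOrderSource source, IOrderSink sink)
    {
        var all = new List<Order>();
        int page = 0;
        IList<Order> batch;
        do
        {
            batch = source.GetOpenOrders(page++, PageSize);
            all.AddRange(batch);
        } while (batch.Count == PageSize); // a short page means we've read everything

        sink.SendOpenOrders(all); // C still gets its single call
    }
}

public class Order { /* placeholder for whatever the real order type is */ }

The point of the split is that only the forwarder ever holds the full set in memory; the query service itself stays bounded.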
I am with jdn on this one. If the 3rd party vendor requires the whole set of records anyway, how does splitting the request into multiple paginated calls (as a few people have suggested above) make it any cheaper? You're still querying for the same set of data; splitting it into multiple streams does nothing but make it a lot more costly. The laws of physics ensure that.
This is a perfect example of enforcing blanket guidance (i.e. pagination) just for the sake of it, backed by no legitimate reason, and in doing so causing the exact problem the guidance is meant to overcome.
Hendry,
It means that Service B doesn't need to worry about Out Of Memory exceptions, for one.
Since in Service A, you are explicitly doing something out of the ordinary, you take care of that only in that place, and you don't have to worry about this in multiple systems.
I thought we had streaming of large data in (web) services? And similarly in NH. An unbounded query might not necessarily be all bad, but holding unbounded data in memory definitely is.
@jdn
So you're saying that because you happen to have an (allegedly) legitimate case where you need ALL of the data, that it's a bad idea in general to limit it by default?
I don't know about you, but I would much rather have a system that, by default, caters to the 99% case (i.e., when requesting all of the data at once is a mistake) and requires some minor tweaking to make that 1% case work when I'm really sure I need to shoot myself in the foot.
In RavenDB you can override the max page size on the server. Problem solved.
In the case of this post, we're talking about a UI. I don't care what business function your app performs, displaying 100,000 records all at once to a user without paging is a bad idea for a whole host of reasons.
@Nick
No one is arguing that you should pull 100,000 records to display in a UI.
I completely disagree that it is a 99% case that it is a mistake to request all data at once. My (allegedly) legitimate case is a very legitimate case, and I can come up with many more.
When I write code that pulls data, it is up to me to decide how much data is going to be pulled and whether it needs to be paged. I absolutely don't want some silent global variable deciding that for me.
It is unfortunate that too many developers are too lazy to figure out the performance impact of the code that they write. Treating the symptom by crippling by default is an anti-practice, not a best practice.