Challenges when building NH Prof
Right now, we have one challenge with building NH Prof: making the UI scale to what people are actually using.
The problem that we face is:
- A profiled application generates a lot of events.
- Many of those events are relevant to the profiler and so result in UI Model changes.
- UI Model changes require UI refresh.
There is a limit to how often you can refresh the UI. It is not just a matter of the UI being incredibly slow (try it; a UI update is about as slow as a remote system call!). That problem we already solved with batching and aggregation.
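The batching-and-aggregation idea mentioned above can be sketched roughly like this. This is only an illustration in Python, not NH Prof's actual (.NET) implementation; the `flush_to_ui` callback and the threshold values are hypothetical:

```python
import time

class EventBatcher:
    """Collects profiler events and flushes them to the UI in batches,
    instead of triggering a refresh for every single event."""

    def __init__(self, flush_to_ui, max_batch=100, max_delay=0.25):
        self.flush_to_ui = flush_to_ui   # callback that performs the UI refresh
        self.max_batch = max_batch       # flush when this many events pile up
        self.max_delay = max_delay       # ...or when this many seconds have passed
        self.pending = []
        self.last_flush = time.monotonic()

    def on_event(self, event):
        self.pending.append(event)
        overdue = time.monotonic() - self.last_flush >= self.max_delay
        if len(self.pending) >= self.max_batch or overdue:
            self.flush()

    def flush(self):
        if self.pending:
            self.flush_to_ui(self.pending)  # one refresh for the whole batch
            self.pending = []
        self.last_flush = time.monotonic()
```

Batching alone caps the number of refreshes, but as the post goes on to say, it does not solve the deeper problem of the content changing faster than anyone can read it.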
The problem is deeper than that: how do you keep a good UI experience when you have so many changes streaming into the UI? If my recent statements list is updated 5,000 times in the space of one minute, there is no value in actually showing anything there, because you never get to see it.
We are thinking about various ways of handling that, but I would appreciate additional input. How do you think we should deal with this issue?
Comments
Option to group similar statements together with a count/sec?
Graphs. Not lists of things, but graphs showing average # of queries, flushes, commits per second. Stuff like that. Then you can zoom in and zoom out of the graph, and as you zoom into certain sections of it you can start to show the actual queries that were run in another part of the UI for the section you zoomed in on.
Take a look at Quest Software's Spotlight product; at least for MySQL it had a very nice (if not very detailed) way to show exactly what is happening in real time (when you have more than 300 statements per second).
Probably the answer is to find out what users are interested in seeing. Certain statements? Then the answer is a filter. How often some statements/warnings have occurred? Then you have statistics.
There is no need to see what happens in real time. That doesn't add any value.
What you could do is what most profilers do:
Start profiling
Display a funny waiting indicator, maybe an animated timeline which shows the profiling progress and some global stats in real time (like statements/sec, etc.).
Optionally, let the user define tags during profiling
Stop profiling
Let the user explore the profiling session result by selecting ranges in the timeline. Display the details accordingly: statements, warnings, etc.
RedGate's ANTS Profiler is my main source of inspiration here. But that implies a lot of changes from what you already have in NH Prof. Maybe it's worth it.
Record all output to a data structure. Have the user click a refresh button to update the display, or tick a checkbox to auto-refresh, with a spin-edit control for specifying how many seconds to wait between refreshes.
A human can't read that much data that quickly so there is no point in updating it that often.
I agree with Romain...
There is no reason to have real-time data flowing over the UI.
One of the risks of a non-auto-refreshing UI is the end user not understanding that the UI needs to be refreshed.
In other words... there could be data or information that the user misses because they did not do a final refresh.
I would recommend (or suggest) that you allow the user to start a profiling session.
If the user wants to see some data while the profiler is running... allow them to take a "snap-shot" to see how things are going.
Once the user stops the session... create another "snap-shot" which contains all of the data.
Another option would be to have a "snap-shot interval" which automatically takes a "snap-shot".
With the manual "snap-shot" idea... users can run through normal scenarios, take a "snap-shot", run through stress scenarios, take another "snap-shot", and finally compare the two.
Just an idea.
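The snap-shot-and-compare idea could look something like this. A minimal sketch, assuming the profiler keeps per-statement counters in a plain mapping; the function names here are made up for illustration:

```python
def take_snapshot(stats):
    """Freeze a copy of the current per-statement counters."""
    return dict(stats)

def diff_snapshots(before, after):
    """Return what changed between two snapshots: statement -> count delta.
    Entries whose count did not change are dropped from the result."""
    keys = set(before) | set(after)
    return {k: after.get(k, 0) - before.get(k, 0)
            for k in keys
            if after.get(k, 0) != before.get(k, 0)}
```

Comparing a "normal scenario" snapshot against a "stress scenario" snapshot then reduces to one `diff_snapshots` call, which fits the workflow described in the comment.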
Have a look at any load-testing tool for inspiration, e.g. the Visual Studio Load Tester. Possible ideas off the top of my head:
Display status such as Max, Min, Average, Requests per sec, etc.
Display graphs of the events and of the above aggregates
Filters - allow the user to set filters so they only see rows which match a predefined specification (e.g table type, entity, query time thresholds, etc)
Alerts - allow the user to set alerts, so that when a certain counter reaches a certain level, they'll be notified
Scripting - perhaps you could even allow users to write custom scripts (DSL anyone?) on the events coming in, so when they come in they can be examined and actioned!
Cheers
Neil
Start Capture, Pause Capture, Stop Capture
Filter is your friend. How often do people profiling care about absolutely everything? I would guess pretty much never. You're looking for key occurrences that indicate a need to fix something. Allow the user to specify filters to isolate the changing data to just what they care about. Then they have a usable application. This also helps improve the performance issues related to the UI (provided they use the filter appropriately).
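The filtering suggestion is essentially a user-composed predicate over incoming statements. A minimal sketch, assuming statements are simple records with hypothetical `entity`, `duration_ms`, and `sql` fields:

```python
def make_filter(entity=None, min_duration_ms=None, text_contains=None):
    """Build a predicate that keeps only the statements the user cares about.
    All criteria are optional; a statement must satisfy every one given."""
    def matches(stmt):
        if entity is not None and stmt.get("entity") != entity:
            return False
        if min_duration_ms is not None and stmt.get("duration_ms", 0) < min_duration_ms:
            return False
        if text_contains is not None and text_contains not in stmt.get("sql", ""):
            return False
        return True
    return matches
```

Applying such a predicate before events ever reach the view model would also relieve the UI-update pressure the post describes, since filtered-out statements never trigger a refresh.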
A quick question (sorry, I don't have time to google it ;-p ): when will NH Prof be out of beta?
thanks
Sounds like soon we'll have Rhino.UI layer living on top of the DirectX10/Win32...
This sounds like the experience you get with SQL Server Profiler. There wasn't a way to 'fix' it. You usually run it for a while and then stop it and review the results.
Peter,
That ignores another common scenario, where a user wants to debug through the system and wants to be able to immediately see what is going on.
Neil,
That isn't what I am aiming NH Prof for. I am trying to give you the actual information about what NH is doing.
That means, at the end of the day, the SQL.
Well, a lot of the SQL will be identical surely?
So perhaps you could group queries with the same SQL and then show the various aggregates (count, sum, avg, max, min, standard deviation etc) against the groups? You'd obviously need to provide the ability to "drill-down" into individual queries.
cowagR
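The group-by-SQL-with-aggregates suggestion is straightforward to sketch. Again a hypothetical illustration in Python, assuming each statement record carries its SQL text and a `duration_ms` measurement:

```python
from collections import defaultdict

def group_by_sql(statements):
    """Group statements by identical SQL text and compute simple aggregates
    (count, total, average, max, min) over their durations."""
    groups = defaultdict(list)
    for stmt in statements:
        groups[stmt["sql"]].append(stmt["duration_ms"])
    return {
        sql: {
            "count": len(durations),
            "total": sum(durations),
            "avg": sum(durations) / len(durations),
            "max": max(durations),
            "min": min(durations),
        }
        for sql, durations in groups.items()
    }
```

The "drill-down" part would then just mean keeping the individual statements alongside each group instead of only the aggregate numbers.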
Right now it is in feature freeze (almost).
When we have a good story for all the scenarios that we have to deal with.
Right now, it is just a matter of hammering the bugs, mostly.
Since it is usable right now, it is not something that I want to commit to a date for.
John Chapman,
Imagine debugging; imagine just wanting to go and see, on the fly, what is going on.
My two cents:
If the situation you're describing - a myriad of changes per minute - is always the case, then you have no reason to show anything in real time.
Either show a snapshot on request or at regular, humanly-sane intervals, or aggregate the data and allow zoom-in on demand.
If you also want to support a case where data doesn't change much and the user wants real-time stats (like debugging, if I understand you correctly), then I suggest you use an escalation policy:
Start out in real-time mode, but once updates reach a certain level of intensity, switch to snapshot/aggregate mode and give the user a clear visual feedback that you're holding new data back in order to allow them to read the old data.
What to aggregate and how -- that's a question for people actually using NHibernate. I think you'd be a prime candidate for this :)
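The escalation policy described above can be sketched as a small state machine. This is an illustration only; the rate threshold and the hysteresis band are made-up numbers:

```python
class EscalationPolicy:
    """Stay in real-time mode while the update rate is low; once updates
    exceed a threshold, switch to aggregate mode (and tell the user that
    new data is being held back)."""

    def __init__(self, max_updates_per_sec=20):
        self.max_rate = max_updates_per_sec
        self.mode = "realtime"

    def observe(self, updates_last_second):
        if updates_last_second > self.max_rate:
            self.mode = "aggregate"   # hold new rows back, show summaries
        elif updates_last_second <= self.max_rate // 2:
            self.mode = "realtime"    # hysteresis: only drop back when quiet
        return self.mode
```

The hysteresis (switching back only when the rate falls well below the threshold) matters: without it, a workload hovering around the threshold would make the UI flip modes constantly.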
Sounds like there are at least 2 discrete scenarios you are aiming at:
- run it, do some stuff, paw through the results and notifications, and resolve problems (no need to see exactly what is happening right now)
- a running-system view where you want to see what is happening in near real time (but without needing all of the additional information / warnings, etc.)
Maybe build the UI around these types of scenarios? I.e., for the first one your current UI works well - maybe focus on the warnings etc. more; for the second one you need more of a scrollable viewport over what is happening, one that allows you to see what is going on in a linear manner in near real time.
I suck at UI design but I think the key is to focus on usage scenarios / user goals which should indicate the information the user is after. Once you understand that, you can build the UI to transform the data into what they need to see.
I just did a simple test in a console app. Writing out a line every 250 milliseconds is more than adequate for debugging.
I'd suggest that every time you need to update the GUI you record the time. If the time since the last update is less than 250 ms, then start a timer which will expire 250 ms after the last update, and update from that instead.
This way you will get small amounts of data through immediately, and large amounts of data through in blocks of acceptable chunks.
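The throttling scheme described in this comment - immediate updates when sparse, a deferred timer when bursty - could be sketched like this. A hypothetical illustration in Python (the real UI would use its framework's dispatcher timer rather than a raw thread):

```python
import threading
import time

class ThrottledRefresher:
    """Refresh immediately when updates are sparse; coalesce bursts so the
    UI never repaints more often than once per `interval` seconds."""

    def __init__(self, refresh, interval=0.25):
        self.refresh = refresh
        self.interval = interval
        self.last_refresh = 0.0
        self.timer = None
        self.lock = threading.Lock()

    def request_refresh(self):
        with self.lock:
            now = time.monotonic()
            if now - self.last_refresh >= self.interval:
                self.last_refresh = now
                self.refresh()           # sparse updates go through immediately
            elif self.timer is None:
                delay = self.interval - (now - self.last_refresh)
                self.timer = threading.Timer(delay, self._deferred)
                self.timer.start()       # burst: one refresh at the interval edge

    def _deferred(self):
        with self.lock:
            self.timer = None
            self.last_refresh = time.monotonic()
        self.refresh()
```

This gives exactly the behavior the comment describes: a single event is shown at once, while a flood of events collapses into one repaint per interval.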
What about splitting the GUI up so that you have one screen per use case.
The GUI only need refresh the active screen.
Each screen is task-specific, leaving more room for UX and no unnecessary debris.
A single dashboard can highlight problems across any of the analysis use cases, allowing the user to drill down to problems.
Personally I've never really liked start/stop profiling much, I'd prefer real-time view with the ability to filter and split data on multiple dimensions.
My 2 pennies worth :)
@Peter Morris
Like that idea, could you call that "buffering"?
You could batch updates to the UI, and do the batch updates at small time intervals. You could use a timer or something that triggered a look at the batch update storage to see if there was anything in there.
Tobin, I suppose you could call it that if you like :-)
neil,
And you want to be able to see them, as well.
You want to visually spot the SELECT N+1
Not to mention that even identical SQL (with different params) is important.
Peter,
Except that this is more complex than that if you have a large number of updates.
I'll have a separate post about that.
Sure it's more complex, more work is always more complex.
In your example there are 83 statements per second. No human can read that many messages that quickly. Here are the things I would consider doing.
01: Buffer the statements so that updates don't happen more than 4 times per second. Even if there were only one line per update, a human couldn't read that, so more than 4 times per second is pointless.
02: Group similar messages. When the statement is the same except for parameters show a [+] next to it and allow the user to expand.
03: Don't show all of the data at once. The GUI update is going to be much slower if you try to do this. Instead, show a page at a time and have page up/down buttons. Showing only a small window at a time will speed up the GUI quite significantly.
Are you still on Windows Forms and GDI?
Then I would advise using DataGridView with its virtual mode option,
or, if that's not possible, building your own control with GDI drawing and virtualization behind the scenes.
Of course, you can still take account of events that have a visible impact on the UI; most of the time it will be a minor change to the scrollbar scale, because the added info might not be visible.
We are working on exactly that for the next NDepend version, and you can expect mixed tree and grid views with virtually millions of items refreshed in real time (OK, this is actually already the case in the CQL query result on very large code bases, but what we're working on goes much further).
No, I am using WPF.
And I am using this in virtual mode.
The problem isn't the actual painting time, it is the number of updates that are happening, because the view model is the basis of everything in the UI.
You need a querying language for your query profiling results. :-p LINQPad wouldn't be the worst option: http://www.linqpad.net/