The NuGet Problem
NuGet is a wonderful system, and I am very happy to be able to use and participate in it.
Unfortunately, it has a problem that I don’t know how to solve. In a word, it is a matter of granularity. With RavenDB, we currently have the following packages:
- RavenDB
- RavenDB.Embedded
- RavenDB.Client
The problem is that we have some features that use F#, some features that use MVC, a debug visualizer for VS, we have… Well, I think you get the point. If we split things too granularly, we end up with something like:
- RavenDB.Client.FSharp
- RavenDB.MvcIntegration
- RavenDB.DebugSupport
- RavenDB
- RavenDB.Core
- RavenDB.Embedded
- RavenDB.Client
- RavenDB.Sharding
- RavenDB.NServiceBus
- RavenDB.WebApiIntegration
- RavenDB.Etl
- RavenDB.Replication
- RavenDB.IndexReplication
- RavenDB.Expiration
- RavenDB.MoreLikeThis
- RavenDB.Analyzers
- RavenDB.Versioning
- RavenDB.Authorization
- RavenDB.OAuth
- RavenDB.CascadeDelete
And there are probably more.
It gets more complex because we don't really have a good way to make decisions about which types of assemblies we add to which projects.
As I said, I don’t have an answer, but I would sure appreciate suggestions.
Comments
Firstly, I don't think it's that complex to have multiple fine-grained packages; there are plenty of projects that take this approach (Ninject, ServiceStack, NServiceBus).
I think it can actually be simpler and cleaner for consumers to be able to pick and choose precisely what they need.
The complexity occurs on the producer side (and maybe the consumer) if these packages are independently versioned. Currently RavenDB is a single solution, so new package releases are concurrent, and thus package versions will be lock-stepped.
There is also some complexity with the sub-package dependencies, regardless of the fine grained vs coarse grained strategy.
I've posted a more complete argument on my blog: http://dhickey.ie/post/2012/03/12/NuGet-Packages-Coarse-grained-vs-fine-grained-and-sub-dependencies.aspx
IMHO it's totally OK to have fine-grained packages.
Well, one thing to point out is that there is a debug assembly among those, and when I publish the project, that assembly is deployed as well. The interesting thing is that it has a reference to one of VS's assemblies, which is in the GAC. But if you do not have VS on the server, you get a weird error message.
I had to go through all the assemblies to see which one had a dependency on that assembly.
That assembly's Copy Local property needs to be set to False just after it is installed.
Maybe the problem is that feature and package should not be the same. A package should maybe be able to contain more than one feature that can be turned on or off individually, even if the delivery vector is a coarser package.
You could do a super package with everything, and just install the features the user wants via PowerShell or MVC recipes.
I don't think granularity is a problem, as long as there is a decent package discovery story and good documentation on each package.
I find ServiceStack's packages not granular enough, to be honest.
I've been toying with the idea for my CQRS library of having "recipe" nuget packages. So, while there is practically one package per dll, one would rarely reference them directly (with the exception of the "core" one that you would reference to just have the interfaces etc).
Instead, I'll have packages that do little more than combine the fine-grained together in various combinations or "recipes". That way if you want RavenDB event storage, you "install-package Regalo.EventSourcing.Raven" and it'll suck in the correct package dependencies to make that happen. If you want SQL Server event storage, well then "install-package Regalo.EventSourcing.SqlServer".
Consuming libraries that are split across multiple NuGet packages can be confusing, especially because the descriptions on most packages lack any real detail.
I also often wonder about the need for multiple packages. How often would/should the size of the download really be a concern?
In the case of RavenDB I think 3 packages are probably enough:
.Client .Embedded .Bundles
I think NuGet needs a new concept, "child packages". I'd call them "droplets", and if I were to dream up an API for accessing them, it would be something like:
Using profiles:
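```powershell
# Hypothetical syntax — no -Profile switch exists in NuGet today:
Install-Package RavenDB -Profile Client
```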
Then adding specific sub packages (droplets) to the main package:
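```powershell
# Also hypothetical — pulling individual droplets into the installed parent package:
Install-Package RavenDB -Droplet Replication
Install-Package RavenDB -Droplet Versioning
```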
And some technicalities: a child package can only have as a dependency its parent NuGet package, or other sibling packages (droplets) from the same parent. This would restrict using sub-packages as regular packages, and prevent cross dependencies that should be solved at the package level.
At the same time, regular packages should be allowed to take dependencies on child packages (droplets) of another package, but this would mean pulling in the parent package with the default profile, plus the referenced child package (droplet).
This would allow fine-grained control of packages without polluting the repository with dozens of packages which are simply extensions of existing packages, and shouldn't be thought of as separate packages.
Just an idea, but maybe this is worth pursuing...
It's simple; it's been used forever in software land. It's called an installer. One downloads an installer, runs it, picks the cherries from the feature list offered in the installer, and the installer places the files in the folders they need to go.
Solved!
NuGet is great for getting the latest version of an assembly you're referencing or want to reference in your code. It sucks in every other situation, simply because it's not meant for that. Using NuGet for RavenDB, or any other tool/system that's not a single assembly, is therefore more or less using it in a situation it's not designed for.
I'm working on a new thing now which has a bunch of optional components that are just dependencies on external Nuget packages (such as JsonFx) with a very lightweight interface wrapper around them. It's at the point where there'd be an assembly with a dozen lines of code in some cases, which doesn't feel right.
What I'm thinking is, Nuget supports source-code in packages and will add the type to the referencing project, so I could just distribute the interface wrappers that way.
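Something like this minimal sketch could work, assuming the standard content-file transform convention (the file, type, and package names are all made up for illustration):
```powershell
# Sketch: ship a thin interface wrapper as source instead of as an assembly.
# Any *.cs.pp file under content\ is added to the referencing project on
# install, with $rootnamespace$ replaced by the consumer's root namespace.
New-Item -ItemType Directory -Force content | Out-Null

# Single-quoted here-string so PowerShell leaves the $rootnamespace$ token alone.
Set-Content content\JsonSerializerWrapper.cs.pp @'
namespace $rootnamespace$
{
    // Illustrative thin wrapper around an external serializer such as JsonFx.
    public interface IJsonSerializer
    {
        string Serialize(object value);
        T Deserialize<T>(string json);
    }
}
'@

# The .nuspec (not shown) would list the content file and a dependency on JsonFx.
nuget pack MyWrappers.nuspec
```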
What would be completely awesome would be a way for Nuget to do something like the jQuery UI site, where you can package a bunch of self-contained ingredients and the user can choose which ones to combine into a nupkg for themselves.
I like Mark's idea about adding the ability to combine packages into one to the NuGet system.
Ayende, I think the problem is less with NuGet and more that you just have a lot of stuff.
+1 for Mark Rendle
Dependencies help too: you pull the one high-level feature package you need and it pulls the low-level dependencies it requires and nothing more.
I would prefer as few choices as possible; it keeps things simple, even though I might download some unnecessary assemblies.
First of all, I don't have any practical experience with NuGet, but I do have experience with OpenWrap (and Maven in the Java world). I don't think this is that relevant for your problem, because whether to go for fine-grained or coarse-grained should not depend on the technical implementation of your dependency manager.
That being said, I would definitely go for the fine-grained approach. It puts a higher burden on the developer (you), but it also forces you to think more about the different parts in your product/system/sources, and the internal dependencies between those parts. And that in itself is already a good thing.
An important reason for going fine-grained is that most likely each of those different parts has its own different dependencies. If you only make one big package, everyone who uses your product automatically brings in a bunch of dependencies that they do not need and do not want. Related to that, you also increase the chance of bringing in a dependency that conflicts with a dependency that the user already has in his project.
If you go fine-grained, you can still offer convenience packages for your users. These convenience packages are basically empty packages that only contain dependencies on other packages (this is called meta packages in OpenWrap). So, it's entirely possible to create a RavenDB.Full that is itself a dummy package, but pulls in all the other separate small packages.
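As a rough sketch of what that could look like for RavenDB (the ids, versions, and ranges are illustrative only), such a meta package is little more than a .nuspec with a dependencies block:
```powershell
# Illustrative dependencies-only .nuspec for a convenience/meta package.
Set-Content RavenDB.Full.nuspec @'
<?xml version="1.0"?>
<package>
  <metadata>
    <id>RavenDB.Full</id>
    <version>1.0.0</version>
    <authors>Hibernating Rhinos</authors>
    <description>Convenience package that pulls in the full RavenDB suite.</description>
    <dependencies>
      <!-- [1.0,2.0) means at least 1.0 and less than 2.0 (NuGet version ranges) -->
      <dependency id="RavenDB.Client" version="[1.0,2.0)" />
      <dependency id="RavenDB.Embedded" version="[1.0,2.0)" />
      <dependency id="RavenDB.MvcIntegration" version="[1.0,2.0)" />
    </dependencies>
  </metadata>
</package>
'@

# Packing a package with no files is allowed as long as it declares dependencies.
nuget pack RavenDB.Full.nuspec
```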
Another advantage is that each of those packages can evolve separately from the others. If you use semantic versioning (and also use version ranges to specify versions of dependencies), it's really easy to make bugfix releases, without having to build and publish the whole suite.
You can also turn around the last argument: every part of your system that you want to let evolve and release independently of the rest has to be put in a separate package. For example, do you want to be able to release an improved version of RavenDB.NServiceBus integration as soon as that part is ready, or do you want to wait till you release a complete new RavenDB package?
The fine-grained approach is more work for you, but it's gonna make things easier for your users.
Scared of some extra bandwidth? I would not be, unless it starts to reach 100 MBs. A database server is not something I would pull in using NuGet anyway. The client, yes. Not the server.
Fine grained is absolutely the way to go! The downloading and discovery mechanisms in NuGet are just vehicles for its true purpose, dependency management. It can certainly be improved a lot in this area, but for me and many others, that's what it's all about. I don't care about large downloads, but I certainly don't want to introduce unnecessary dependencies in my projects. Dependencies which can have other dependencies which I need even less, and so on.
Nuget needs more dependency smartness and perhaps a way to hide "Core" packages (no value on their own) from the main discovery lists.
@Joshua Lewis
I couldn't agree more about ServiceStack. The packages are extremely inconveniently set up. Convenient for the publisher, perhaps, but not for the consumer.
We're facing a similar situation with our core library at work. I've gone down the path of "many small assemblies" ILMerged into one big assembly. Conceptually I prefer many assemblies, because I can reason more easily about the network of dependencies in our system (does assembly A really need a reference to assembly B?). But in practice using many assemblies is a pain (not to mention the awful compile times).
So, we're very anal about our core library, but then we package it up as one big assembly and nuget package which is then used by our application developers. So far, this seems to be working very well.
I think it's okay to break things down as required but keep a smaller number of package options bundling things together. Making too many packages will ruin the ease of reference we have now.
Have to differ with Frans on this; I would run as far away from installers as possible! xcopy deploy is a good thing...
@Pop Catalin: You can simulate that in a way, by hiding all the versions of your package so they don't show up in NuGet. I'm not sure though what happens with packages that have no public releases, whether they show up at all or not, but it might serve your purpose. Consider it a poor man's child packages :)
First off, I think there are a lot of good points in this post, specifically Ruben's point about granularity enforcing clarity for your developers. As a dev, it's my responsibility to know what's going on in my solution. Coarse-grained packages seem to only work when you are using one platform or framework, but I've run into DLL hell a lot, even with NuGet, where two packages required different, incompatible versions of the same dependency. Huge packages mean larger dependency trees, and that means a greater chance of conflict.
That being said, it's quite common in package managers (they're not a new concept) to have "meta packages" that bundle up a bunch of common tools to make setup easier. The best example of this I can think of is "apt-get install lamp-server^", which on Ubuntu sets up Apache, MySQL, and PHP and configures them all to work together.
See the section titled "Automatically Running PowerShell Scripts During Package Installation and Removal" in the NuGet docs at http://docs.nuget.org/docs/creating-packages/creating-and-publishing-a-package.
With the right PowerShell script, you could display a menu letting users select the options they want to affect their current project. The rest should be relatively straightforward.
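For instance, here is a minimal install.ps1 sketch, assuming the console host supports PromptForChoice (the feature names and actions are placeholders):
```powershell
# install.ps1 — NuGet runs this when the package is installed into a project.
param($installPath, $toolsPath, $package, $project)

$features = [System.Management.Automation.Host.ChoiceDescription[]](
    (New-Object System.Management.Automation.Host.ChoiceDescription("&Replication", "Configure the replication bundle")),
    (New-Object System.Management.Automation.Host.ChoiceDescription("&Versioning", "Configure the versioning bundle")),
    (New-Object System.Management.Automation.Host.ChoiceDescription("&None", "Core package only"))
)

$choice = $host.UI.PromptForChoice("RavenDB features",
    "Which optional feature do you want to set up for this project?", $features, 2)

switch ($choice) {
    0 { Write-Host "Configuring replication... (placeholder)" }
    1 { Write-Host "Configuring versioning... (placeholder)" }
    default { Write-Host "Installing core only." }
}
```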
Personally, I don't think that the granularity of packages should be driven by your dependencies or your personal beliefs. This is a matter of architecture and release/stability. Hence, the packages should be constructed following an analytic process, using tools such as NDepend and metrics such as Ce (efferent coupling) and Ca (afferent coupling).