Not all bytes weigh exactly 8 bits
Or, pay attention to how you write to the disk. Here is a simple example:
using System;
using System.Diagnostics;
using System.IO;

class Program
{
    static void Main(string[] args)
    {
        var count = 10000000;

        // 1. BinaryWriter: one Write(int) call per value.
        Stopwatch stopwatch = Stopwatch.StartNew();
        using (var stream = CreateWriter())
        using (var bw = new BinaryWriter(stream))
        {
            for (var i = 0; i < count; i++)
            {
                bw.Write(i);
            }
            bw.Flush();
        }
        stopwatch.Stop();
        Console.WriteLine("Binary Writer: " + stopwatch.ElapsedMilliseconds);

        // 2. BitConverter: allocate a 4-byte array per value, write it to the stream.
        stopwatch = Stopwatch.StartNew();
        using (var stream = CreateWriter())
        {
            for (var i = 0; i < count; i++)
            {
                var bytes = BitConverter.GetBytes(i);
                stream.Write(bytes, 0, 4);
            }
            stream.Flush();
        }
        stopwatch.Stop();
        Console.WriteLine("BitConverter: " + stopwatch.ElapsedMilliseconds);

        // 3. MemoryStream: collect everything in memory, then do one large write.
        stopwatch = Stopwatch.StartNew();
        using (var stream = CreateWriter())
        using (var ms = new MemoryStream())
        {
            for (var i = 0; i < count; i++)
            {
                var bytes = BitConverter.GetBytes(i);
                ms.Write(bytes, 0, 4);
            }
            var array = ms.ToArray();
            stream.Write(array, 0, array.Length);
            stream.Flush();
        }
        stopwatch.Stop();
        Console.WriteLine("Memory stream: " + stopwatch.ElapsedMilliseconds);

        // 4. Single buffer: preallocate the exact size, fill it by hand
        //    (little-endian byte order), then do one large write.
        stopwatch = Stopwatch.StartNew();
        using (var stream = CreateWriter())
        {
            byte[] buffer = new byte[sizeof(int) * count];
            int index = 0;
            for (var i = 0; i < count; i++)
            {
                buffer[index++] = (byte)i;
                buffer[index++] = (byte)(i >> 8);
                buffer[index++] = (byte)(i >> 16);
                buffer[index++] = (byte)(i >> 24);
            }
            stream.Write(buffer, 0, buffer.Length);
            stream.Flush();
        }
        stopwatch.Stop();
        Console.WriteLine("Single buffer: " + stopwatch.ElapsedMilliseconds);
    }

    // Opens a fresh temp file with a 64 KB (0x10000) FileStream buffer.
    private static FileStream CreateWriter()
    {
        return new FileStream(Path.GetTempFileName(), FileMode.Create, FileAccess.Write,
            FileShare.Read, 0x10000, FileOptions.SequentialScan | FileOptions.WriteThrough);
    }
}
And the results (elapsed milliseconds):
Binary Writer: 1877
BitConverter: 1985
Memory stream: 1702
Single buffer: 1022
Comments
Is there a line missing in the code when you wrote it to the blog? There's no call to StartNew() in the last chunk.
BIl,
Yeah, that makes sense. You found a bug :-)
I updated the post accordingly.
Similar results here, though MemoryStream and 'Single buffer' seem proportionally faster for some reason (tried many iterations, same results):
Binary writer: 1581
BitConverter: 1608
MemoryStream: 1016
Single buffer: 709
With the target stream being a MemoryStream rather than FileStream:
Binary writer: 362
BitConverter: 479
MemoryStream: 683
Single buffer: 349
I find it rather logical that when you build your own buffering with knowledge of the data size, it will be faster than the default buffering in a framework (which aims for good average performance).
I looked at the source of the FileStream class, and it does indeed hold an internal buffer of 4096 bytes. When Write is called, the data is copied into that buffer, and when the buffer is full, it is flushed to the actual file handle.
So with the BinaryWriter and BitConverter approaches you get 10000000 copies into an internal buffer and about 9766 separate flushes (40,000,000 bytes / 4096).
The single buffer approach, by contrast, bypasses the buffering of the FileStream class: the data isn't copied into the internal buffer but is written directly to the file handle.
I suspect the memory stream uses a different buffering mechanism, but that's for someone else to look at?
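To make the copy-versus-bypass behaviour described above concrete, here is a minimal sketch of a buffered writer that only counts operations. It is a simplified model, not the actual FileStream source; the class name `ModelBufferedWriter` and its members are made up for illustration, and the tiny 16-byte buffer is chosen only to keep the numbers readable.

```csharp
using System;

// Simplified model of a buffered writer. NOT the real FileStream code:
// it tracks how many bytes are buffered and counts what would happen.
class ModelBufferedWriter
{
    private readonly int _bufferSize;
    private int _used; // bytes currently sitting in the internal buffer

    public int Copies { get; private set; }       // copies into the buffer
    public int HandleWrites { get; private set; } // writes reaching the file handle

    public ModelBufferedWriter(int bufferSize)
    {
        _bufferSize = bufferSize;
    }

    public void Write(int count)
    {
        if (_used > 0)
        {
            // Top up the partially filled buffer first.
            int n = Math.Min(_bufferSize - _used, count);
            if (n > 0) { _used += n; Copies++; }
            count -= n;
            if (count == 0) return;
            Flush(); // buffer is full and there is more data to write
        }
        if (count >= _bufferSize)
        {
            // Large write with an empty buffer: skip the copy entirely.
            HandleWrites++;
            return;
        }
        if (count == 0) return;
        _used = count;
        Copies++;
    }

    public void Flush()
    {
        if (_used > 0) { HandleWrites++; _used = 0; }
    }
}

class Program
{
    static void Main()
    {
        // Ten 4-byte writes through a 16-byte buffer.
        var small = new ModelBufferedWriter(16);
        for (var i = 0; i < 10; i++) small.Write(4);
        small.Flush();
        Console.WriteLine("small writes: copies=" + small.Copies + ", handle writes=" + small.HandleWrites);

        // One 40-byte write through the same buffer size.
        var big = new ModelBufferedWriter(16);
        big.Write(40);
        big.Flush();
        Console.WriteLine("one big write: copies=" + big.Copies + ", handle writes=" + big.HandleWrites);
    }
}
```

In this model the small writes cost one copy each plus a handle write every time the buffer fills, while the single large write goes straight through with no copy at all.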
It is not related, but a byte is not always 8 bits. There are (mostly were, e.g. the PDP-10) many architectures where a byte has a different size.
Since the FileStream buffering doesn't slow us down with the single buffer method, maybe we could convert the int array to a byte array faster...
I created a faster variant, but it's ugly (unmanaged code) and I wouldn't use it unless this part were really a bottleneck.
stopwatch = Stopwatch.StartNew();
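The rest of that variant isn't shown here. As a separate illustration of the idea (and not the commenter's code), here is a sketch that stays in managed code and uses `Buffer.BlockCopy` to reinterpret an `int[]` as bytes in one call. The names `IntPacking` and `ToBytes` are hypothetical, and no timing claim is made for it, since it first needs all the ints in an array.

```csharp
using System;

static class IntPacking
{
    // Copies the raw bytes of an int[] into a new byte[] in a single call.
    // The bytes come out in the machine's native order, so the result matches
    // the manual shift loop from the post only on little-endian hardware.
    public static byte[] ToBytes(int[] values)
    {
        var bytes = new byte[values.Length * sizeof(int)];
        Buffer.BlockCopy(values, 0, bytes, 0, bytes.Length);
        return bytes;
    }
}

class Program
{
    static void Main()
    {
        var bytes = IntPacking.ToBytes(new[] { 1, 0x01020304 });
        Console.WriteLine("Little-endian machine: " + BitConverter.IsLittleEndian);
        Console.WriteLine(BitConverter.ToString(bytes));
    }
}
```

The trade-off is memory: the ints and the bytes both exist at once, so for 10000000 values this holds roughly 80 MB instead of 40 MB.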
Correction..
So with the BinaryWriter or BitConverter solution you get 10000000 copies into the internal buffer of the FileStream and about 611 separate flushes to the real file (40,000,000 bytes / 65,536).
(I overlooked the buffer size parameter.)
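As a rough sanity check on flush-count estimates like these, the arithmetic can be written out. This sketch assumes 4 bytes per int and one flush each time the buffer fills, plus a final flush for any remainder; `FlushEstimate` is a made-up helper name. It counts bytes rather than Write calls, so it may not agree with a figure derived from the number of calls.

```csharp
using System;

static class FlushEstimate
{
    // Ceiling division: how many buffer-sized chunks are needed for totalBytes.
    public static long Flushes(long totalBytes, long bufferSize)
    {
        return (totalBytes + bufferSize - 1) / bufferSize;
    }
}

class Program
{
    static void Main()
    {
        long totalBytes = 10000000L * sizeof(int); // 40,000,000 bytes

        // Default-sized 4096-byte buffer.
        Console.WriteLine(FlushEstimate.Flushes(totalBytes, 4096));    // 9766

        // The 0x10000 (65,536-byte) buffer passed in CreateWriter().
        Console.WriteLine(FlushEstimate.Flushes(totalBytes, 0x10000)); // 611
    }
}
```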
It seems you really like the "var" thing.
I'm really a stupid guy, and I like to be able to figure out the meaning of code by reading first, then debugging. But "var" does a good job of preventing this.
tcmaster,
var is always initialized, just look at what the value is.