Finding Chrome bugs
That one was annoying to figure out. Take a look at the following code:
static void Main(string[] args)
{
    var listener = new HttpListener();
    listener.Prefixes.Add("http://+:8080/");
    listener.Start();
    Console.WriteLine("Started");
    while (true)
    {
        var context = listener.GetContext();
        context.Response.Headers["Content-Encoding"] = "deflate";
        context.Response.ContentType = "application/json";
        using (var gzip = new DeflateStream(context.Response.OutputStream, CompressionMode.Compress))
        using (var writer = new StreamWriter(gzip, Encoding.UTF8))
        {
            writer.Write("{\"CountOfIndexes\":1,\"ApproximateTaskCount\":0,\"CountOfDocuments\":0}");
            writer.Flush();
            gzip.Flush();
        }
        context.Response.Close();
    }
}
Firefox and IE have no trouble using this. But here is how it looks on Chrome:
To make matters worse, pay attention to the conditions of the bug:
- If I use Gzip instead of deflate, it works.
- If I use "text/plain" instead of "application/json", it works.
- If I tunnel this through Fiddler, it works.
I hate stupid bugs like that.
Comments
I had something like this happen when I saved a batch file using Notepad2, which defaulted to "UTF-8 with Signature". What you're seeing is the BOM (byte order mark)...
Seems like a UTF-8 BOM; clean up the files that introduce this and you'll be good to go.
I agree with anton. You'd be better off with a UTF8Encoding that doesn't emit the BOM.
I haven't checked the relevant RFCs, but as others have said, looks like a BOM where there shouldn't be one. As far as I am aware, BOMs are for files only.
Oh and this blog software still doesn't remember me properly. And no it's not a bug in my browser.
@Rik, BOM identifies the encoding used for a stream of text. It is good to have whenever you are fetching a textual stream - from FS or not.
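To make that concrete, here is a throwaway console sketch (not from the post) showing the preamble that Encoding.UTF8 advertises and that StreamWriter writes at the start of a fresh, non-seekable stream like the DeflateStream in the sample:

using System;
using System.Text;

class BomPreamble
{
    static void Main()
    {
        // Encoding.UTF8 advertises a three-byte preamble: the UTF-8 BOM.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetPreamble())); // EF-BB-BF

        // new UTF8Encoding(false) advertises no preamble, so StreamWriter writes no BOM.
        Console.WriteLine(new UTF8Encoding(false).GetPreamble().Length);       // 0
    }
}

Those three bytes, decoded as Windows-1252, are the stray characters at the top of the Chrome rendering above.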
@Ayende, try adding a charset header. Apparently all the other browsers detect the BOM even when the charset isn't provided, although Chrome is perfectly within its rights not to do so.
Not providing a BOM is possible, but you may hit walls later on when this code is used with other encodings (UTF16/32 for CJK for example).
Like everyone said, it's the BOM. Chrome shows everything for content types that don't have specific rules about hiding it the way text/plain does; application/json is meant for applications, not for displaying text. Why is this a problem? Does the JSON not get parsed properly? A charset header should fix it - Chrome is probably using the wrong charset here.
Itamar is right, content type should include the charset:
context.Response.ContentType = "application/json; charset=utf-8";
See http://www.w3.org/International/O-HTTP-charset
Ok, I am the 10th person to confirm: it is the BOM ^^ Such bugs make me believe that it would be very beneficial for most standards to have a reference implementation. That way the standards body could detect mistakes itself, and implementers would hopefully get even such details right.
Tobi, do you remember that nice article www.joelonsoftware.com/items/2008/03/17.html :-)
Maybe I am the real Joel Spolsky in disguise of a nickname... You will never know for sure ;-)
The best part is the title: how to shift responsibility for your own lameness onto someone else (Chrome, in this case). The more posts I read from this person, the more I see a disguised lamer. Not only was there no charset in the declaration (which is a violation of the standard), there was also no mention of the (lame but working) solution of a StreamWriter constructor that explicitly specifies no BOM. I think there was also no clue what a BOM is...
My goodness, what a bunch of crap commentary directed at Ayende.
Have a look at the RFC 4627 standard, section 3, which covers encoding.
http://www.faqs.org/rfcs/rfc4627.html
"JSON text SHALL be encoded in Unicode. The default encoding is UTF-8.
Since the first two characters of a JSON text will always be ASCII characters [RFC0020], it is possible to determine whether an octet stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking at the pattern of nulls in the first four octets."
In other words, there is no charset you need to specify in the headers. The BOM will specify the encoding.
Hmmm, I'll have to correct myself about the BOM. The browser needs to check it based upon the null characters.
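For what it's worth, the check the RFC describes is mechanical enough to sketch. This is only a rough illustration (the class and method names are made up), and it assumes at least four octets of BOM-less JSON text:

using System;
using System.Text;

class JsonEncodingSniffer
{
    // Guess the Unicode flavour of a JSON octet stream from the pattern of
    // null octets among its first four octets (RFC 4627, section 3).
    static string Detect(byte[] b)
    {
        if (b[0] == 0 && b[1] == 0 && b[2] == 0 && b[3] != 0) return "UTF-32BE";
        if (b[0] == 0 && b[1] != 0 && b[2] == 0 && b[3] != 0) return "UTF-16BE";
        if (b[0] != 0 && b[1] == 0 && b[2] == 0 && b[3] == 0) return "UTF-32LE";
        if (b[0] != 0 && b[1] == 0 && b[2] != 0 && b[3] == 0) return "UTF-16LE";
        return "UTF-8";
    }

    static void Main()
    {
        Console.WriteLine(Detect(Encoding.UTF8.GetBytes("{\"a\":1}")));             // UTF-8
        Console.WriteLine(Detect(Encoding.BigEndianUnicode.GetBytes("{\"a\":1}"))); // UTF-16BE
    }
}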
I'd like to shake it up a bit and go with the argument that this is a Chrome bug (or unacceptably dumb behaviour).
Software (text editors) that can't handle the BOM are usually referred to as "Older Software" - from Wikipedia: "Older text editors may display the BOM as "" at the start of the document, even if the UTF-8 file contains only ASCII and would otherwise display correctly".
So in this case, the document would display correctly if Chrome were simply able to recognise the BOM, ignore it and read the remaining text. That doesn't sound like much to expect from software written sometime after 2000...?
So, I would ask: is there any real excuse for modern software to fail to interpret the BOM and therefore leave the page in the state shown above (ie completely broken)? Is it not an "obvious" requirement to be able to interpret BOM and no BOM in UTF-8?
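A rough sketch of the tolerant behaviour being argued for here (the names are hypothetical, and it only covers the UTF-8 case): skip a leading EF BB BF before decoding.

using System;
using System.Text;

class BomTolerantDecode
{
    // Decode a UTF-8 payload, ignoring a leading BOM (EF BB BF) if one is present.
    static string Decode(byte[] payload)
    {
        bool hasBom = payload.Length >= 3 &&
                      payload[0] == 0xEF && payload[1] == 0xBB && payload[2] == 0xBF;
        int offset = hasBom ? 3 : 0;
        return Encoding.UTF8.GetString(payload, offset, payload.Length - offset);
    }

    static void Main()
    {
        byte[] withBom = { 0xEF, 0xBB, 0xBF, (byte)'{', (byte)'}' };
        Console.WriteLine(Decode(withBom)); // {}
    }
}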
I experienced nasty bugs with Chrome in the past too.
I guess Chrome could do with more if statements ;)
I love how no one has actually run the code.
Guys, if I use charset=utf-8, there is still a problem.
So yes, it is a bug.
1) I ran the code, and did this with a fresh install of Chrome... by default the page encoding was set to Unicode (UTF-8). Choosing auto-detect and re-running removed the BOM
2) You can force the removal of the BOM by changing the code to this:
using (StreamWriter writer = new StreamWriter(gzip, new UTF8Encoding(false)))
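Putting the two suggestions from the thread together, the response-writing part of the original sample might end up looking something like this. It is only a sketch: WriteJson is a made-up helper, and it assumes the same using directives as the original code (System, System.IO, System.IO.Compression, System.Net, System.Text).

static void WriteJson(HttpListenerContext context)
{
    context.Response.Headers["Content-Encoding"] = "deflate";
    // Declare the charset explicitly, as suggested in the comments above.
    context.Response.ContentType = "application/json; charset=utf-8";
    using (var gzip = new DeflateStream(context.Response.OutputStream, CompressionMode.Compress))
    // UTF8Encoding(false) has no preamble, so StreamWriter writes no BOM into the payload.
    using (var writer = new StreamWriter(gzip, new UTF8Encoding(false)))
    {
        writer.Write("{\"CountOfIndexes\":1,\"ApproximateTaskCount\":0,\"CountOfDocuments\":0}");
        writer.Flush();
        gzip.Flush();
    }
    context.Response.Close();
}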