Strange production errors
The following code caused a really strange error in production:
new MailAddress("test@gmail.com");
The specified string is not in the form required for an e-mail address.
Huh?!
Obviously it is!
After leaping straight to the conclusion that .NET is crap and that I should immediately start writing my own virtual machine, I decided to dig a little deeper:
Character | Code |
---|---|
t | 116 |
e | 101 |
s | 115 |
t | 116 |
@ | 64 |
g | 103 |
m | 109 |
a | 97 |
i | 105 |
l | 108 |
. | 46 |
? | 8203 |
c | 99 |
o | 111 |
m | 109 |
8203 (decimal) is U+200B, the zero-width space.
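A character/code listing like the one above takes only a few lines of C# to produce. This is a minimal sketch, not necessarily how the original table was generated; the names `CharDump` and `Describe` are made up for illustration, and the input string with an embedded `\u200B` is an assumed reconstruction of the offending address.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CharDump
{
    // Hypothetical helper: returns one "char | code" row per UTF-16 code unit.
    public static List<string> Describe(string input)
    {
        var rows = new List<string>();
        foreach (char c in input)
        {
            // Show control and format characters (such as U+200B) as '?'
            // so that every row stays readable.
            bool invisible = char.IsControl(c)
                || char.GetUnicodeCategory(c) == UnicodeCategory.Format;
            rows.Add($"{(invisible ? '?' : c)} | {(int)c}");
        }
        return rows;
    }

    public static void Main()
    {
        // The \u200B escape makes the hidden character explicit in source.
        foreach (string row in Describe("test@gmail.\u200Bcom"))
            Console.WriteLine(row);
    }
}
```

Iterating by `char` is enough here because U+200B fits in a single UTF-16 code unit; characters outside the Basic Multilingual Plane would show up as two surrogate rows.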
I guess that someone with a software testing background decided to get medieval on one of our systems.
Comments
Holy crap!
I just debugged the exact same issue on my client's system.
We were all similarly scratching our heads until I resorted to viewing the source.
My solution:
// Remove HTML characters
email = Regex.Replace(email, "&#[0-9]+;", "");
(A bit hacky)
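A variant that does not depend on the character arriving as an HTML entity is to strip invisible characters from the string itself before handing it to `MailAddress`. This is a rough sketch, assuming the input has already been HTML-decoded and that silently dropping format and control characters is acceptable for your data; `EmailCleaner` and `StripInvisible` are hypothetical names.

```csharp
using System;
using System.Globalization;
using System.Net.Mail;
using System.Text;

public static class EmailCleaner
{
    // Hypothetical helper: drops Unicode format (Cf) and control (Cc)
    // characters, then trims surrounding whitespace.
    public static string StripInvisible(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            UnicodeCategory cat = char.GetUnicodeCategory(c);
            if (cat == UnicodeCategory.Format || cat == UnicodeCategory.Control)
                continue; // e.g. U+200B zero-width space, stray \r or \n
            sb.Append(c);
        }
        return sb.ToString().Trim();
    }

    public static void Main()
    {
        // Assumed input: the address from the post with a zero-width space in it.
        string raw = "test@gmail.\u200Bcom";
        var address = new MailAddress(StripInvisible(raw));
        Console.WriteLine(address.Address);
    }
}
```

Note that this also removes characters such as the zero-width non-joiner (U+200C), which is harmless in an e-mail address but would damage ordinary text in languages that rely on it.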
This usually happens when you copy-paste from Word. That guy isn't too sophisticated, he is just lazy...
We've just been dealing with something similar.
select id, catnum from table;
1 ABCD-1234
2 ABCD-1234
select id, '[' + catnum + ']' from table;
1 [ABCD-1234]
2 [ABCD-1234
(catnum is meant to be unique, too!)
Got some Unicode nonsense going on in there somewhere... I suspect a newline, but we still can't find it.
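The bracket trick shows that something is hiding in the value, but not what it is. Once the value has been read into a string on the client, escaping every non-visible character can pin it down. This is a sketch under assumptions: the `InvisibleChars.Escape` helper is hypothetical, and the sample value with a trailing newline is only a guess based on the suspicion above.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class InvisibleChars
{
    // Hypothetical helper: replaces control, format and line/paragraph
    // separator characters with a visible \uXXXX escape.
    public static string Escape(string input)
    {
        var sb = new StringBuilder();
        foreach (char c in input)
        {
            UnicodeCategory cat = char.GetUnicodeCategory(c);
            bool hidden = cat == UnicodeCategory.Control
                || cat == UnicodeCategory.Format
                || cat == UnicodeCategory.LineSeparator
                || cat == UnicodeCategory.ParagraphSeparator;
            if (hidden)
                sb.Append("\\u").Append(((int)c).ToString("X4"));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Assumed sample: a catalogue number with a trailing newline.
        Console.WriteLine("[" + Escape("ABCD-1234\n") + "]"); // prints [ABCD-1234\u000A]
    }
}
```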
Another problem to watch out for when using the MailAddress constructor:
http://social.msdn.microsoft.com/forums/en-US/netfxnetcom/thread/2217c413-968f-4dcf-8035-45eaf2a3c609
I get this quite a lot in our databases. The source is usually legacy processes that rely on Excel spreadsheets/VBA for data loading (yuck).
So when is RavenVM coming out?
A closely related character, the Zero-Width Non-Joiner (ZWNJ, U+200C), is quite valid and common in some languages, such as Persian. It holds the different parts of a single word together so that they do not get separated when word wrapping happens. E.g., the following word contains a ZWNJ: میروم
Since it is a very common character in some languages, it can easily happen that somebody accidentally switches the keyboard language and enters it unintentionally.