
Character length check is wrong for Unicode strings

As an example, here is a tweet I posted recently (note the curly quotes rather than straight quotes):

“Paying as much attention to appreciation as u do w achievement. Achievement w/o reflection & gratitude for that is worthless.” @tferriss

This counts as 137 characters, but Tweetminer refused to post it. I suspect that is because my input is received as UTF-8 and the length check counts bytes: each curly quote takes three bytes in UTF-8, two more than a straight quote, which brings the total to 141.
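To make the numbers concrete, here is a quick Python check of the two counts for the tweet above. It is only an illustration; I don't know what language or length function Tweetminer actually uses.

```python
# The tweet from this report, with its curly quotes (U+201C and U+201D).
tweet = (
    "\u201cPaying as much attention to appreciation as u do w achievement. "
    "Achievement w/o reflection & gratitude for that is worthless.\u201d @tferriss"
)

# Number of Unicode code points: what a user thinks of as "characters".
char_count = len(tweet)

# Number of bytes once the text is encoded as UTF-8.
byte_count = len(tweet.encode("utf-8"))

print(char_count, byte_count)
```

The two counts differ by four because each curly quote needs three bytes in UTF-8 instead of one.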

I haven't tested it myself, but if that's the case, I'm pretty sure it would also cause problems with Chinese and Japanese tweets (where most characters take 3 bytes each in UTF-8).

**Expected behaviour:**
The character count should be based on the number of Unicode characters, not the number of bytes.
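A minimal sketch of the check I have in mind, in Python. The function names and the 140-character limit are my own assumptions for illustration, not Tweetminer's actual code.

```python
TWEET_LIMIT = 140  # assumed limit, for illustration only


def fits_in_tweet(text: str, limit: int = TWEET_LIMIT) -> bool:
    """Hypothetical check: count Unicode code points, not bytes."""
    return len(text) <= limit


def fits_by_bytes(text: str, limit: int = TWEET_LIMIT) -> bool:
    """The behaviour I suspect is happening: count UTF-8 bytes."""
    return len(text.encode("utf-8")) <= limit


# 120 Japanese characters, each taking 3 bytes in UTF-8 (360 bytes total).
cjk = "\u65e5\u672c\u8a9e" * 40

print(fits_in_tweet(cjk))  # counted by characters
print(fits_by_bytes(cjk))  # counted by bytes
```

With a byte-based check, a Japanese tweet would be rejected at roughly a third of the allowed length. Note that `len` on a Python string counts code points, so combining sequences may still count as more than one character.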