Tweets to SMS cut in 3 parts
When I receive an update by SMS, it is always cut in several parts. That means I receive 3 sms for every update.
I'm in France on the SFR network.
The best solution from everyone
-
Here is the answer I received from support (Crystal) :
---------------
Sometimes updates are broken into 3 parts when the character type causes the update to exceed 140 characters. SMS is limited to 140 bytes; for English, that equates to 140 characters. However, for extended European alphabets, that's not quite true; each character requires up to 4 bytes, so:
?????? is 6 characters, but consumes 18 bytes.
Voile est l'art du contrôle d'un bateau à voile. is 48 characters, but consumes 50 bytes.
What's more problematic is that when sending "international" (Unicode) messages, we're limited to 70 characters, since the SMS carriers convert ALL characters into 2 bytes, doubling the size of the message. We're working on creative ways to get around this but for now, we must comply with the carriers. Sorry about that!
--------------
3 people say
this solves the problem
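The byte counts quoted in the support answer can be checked with a few lines of Python. This is only a sketch: it assumes the counts refer to UTF-8 (the answer does not say so), and `sms_cost` is a made-up helper name. The French example sentence is written out in full with escapes for its two accented letters.

```python
def sms_cost(text):
    """Return (characters, UTF-8 bytes, UCS-2 bytes) for text.

    Assumption: the support answer counts UTF-8 bytes, and the
    carrier's "2 bytes per character" mode is UCS-2 (UTF-16 without
    surrogate pairs for ordinary European text).
    """
    chars = len(text)
    utf8_bytes = len(text.encode("utf-8"))
    ucs2_bytes = len(text.encode("utf-16-be"))
    return chars, utf8_bytes, ucs2_bytes


# \u00f4 is o-circumflex, \u00e0 is a-grave; each takes 2 bytes in UTF-8.
sentence = "Voile est l'art du contr\u00f4le d'un bateau \u00e0 voile."
chars, utf8_bytes, ucs2_bytes = sms_cost(sentence)
print(chars, utf8_bytes, ucs2_bytes)  # 48 50 96
```

Under these assumptions the sentence is 48 characters and 50 UTF-8 bytes, matching the figures in the answer, and 96 bytes once every character is widened to 2 bytes.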
-
Same problem in France on the Orange and Bouygues networks. Any solution? Thanks!
I’m frustrated
-
To the Twitter staff: there's a simple fix, at least for European languages: convert or strip accented characters when sending to SMS, so that "ã" becomes "a", "ç" becomes "c", and so on. This would have minimal impact on the comprehension of the message, and it would be much less annoying than receiving two or three SMS messages for each tweet.
I’m confident!
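A rough sketch of what such stripping could look like in Python, using the standard `unicodedata` module. The function name `strip_accents` and the `GERMAN` override table are invented for illustration; nothing here reflects how Twitter actually handles outgoing SMS.

```python
import unicodedata

# Hypothetical overrides for letters where simply dropping the mark
# is not the conventional replacement (German umlauts and sharp s).
GERMAN = {
    "\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue", "\u00df": "ss",
    "\u00c4": "Ae", "\u00d6": "Oe", "\u00dc": "Ue",
}


def strip_accents(text, overrides=None):
    """Replace accented letters with unaccented ones.

    Characters listed in overrides are replaced first; everything else
    is decomposed (NFD) and its combining marks are dropped.
    """
    overrides = overrides or {}
    out = []
    for ch in unicodedata.normalize("NFC", text):
        if ch in overrides:
            out.append(overrides[ch])
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed
                           if not unicodedata.combining(c)))
    return "".join(out)


print(strip_accents("\u00e3 \u00e7 \u00f1"))   # a c n
print(strip_accents("F\u00fcr", GERMAN))        # Fuer
```

Letters with no decomposition, such as ß, pass through unchanged unless an override is supplied, so a real implementation would need a per-language table.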
-
I would agree to this if it's a temporary solution. I don't care about accents in Spanish, but ñ and (in Catalan) ç are slightly different. BTW, I can send accents, ñ, etc. in my normal SMS (160 chars) and the message is not split. Why this difference?
-
When I tweet in German (within Germany!), my entire message often apparently goes into Unicode and is split into 3 SMS (2 * 70 characters + a short third SMS). But why _three_ SMS when 2 * 70 = 140 characters should be sufficient?
Now that I'm aware of this, I can replace ä ö ü ß with ae oe ue ss (thus messing up my T9 predictive text!), so I know how many bytes I'm actually generating and don't go into Unicode (unless I miss an umlaut or two).
However, there are also some standard (non-international) characters which generate more than 1 byte. So I've sometimes been surprised by tweets I thought were max. 140 characters being accepted - but then displayed in truncated form. So far I'm aware of
" ' < > becoming \" \' &lt; &gt; respectively
but there will be more.
So can someone please point me to the standard 1-byte character set? This will clearly include
0..9 a..z A..Z & ; \
and a few common punctuation marks.
This information would be useful in FAQ form
Development suggestion:
~~~~~~~~~~~~~~~~~~~~
Add a "Bytes available" count to the count of available characters shown above the type-in box.
I’m expectant
-
Eridanus, that's such a great suggestion! Maybe we can start compiling the ones that we've encountered so far here.
For what it's worth, I've noticed that the dash creates problems. Here's an example of a tweet that got cut into three sms messages because of the dash.
1/3: biz: just finished with the @jetblue call and all booked for OAK>
2/3: BOS—she told me bad weather somewhere=more online traffic=more er
3/3: rors online
I think the UK pound sign (£) also causes problems, as demonstrated by this other tweet
1/3: benfu: hi america. this is how you design money: http://www.royal
2/3: mint.com/newdesigns/designsRevealed.aspx (missing the £2 coin tho
3/3: ugh?)
-
Okay. Here's what I found out:
SMS can be encoded in three ways: 7-bit, 8-bit and 16-bit. 8-bit SMS are only used for binary data transfer like logos and ringtones, so forget about them.
7-bit SMS are the usual encoding for western languages. They include many accented characters (at the code position where ASCII has its control characters), and even contain a few "two-character" (14 bit) combinations for "extended" characters like "€" and "{". See the complete 7-bit SMS alphabet at Wikipedia: http://en.wikipedia.org/wiki/SMS#GSM
16-bit SMS use UCS-2 encoding (a predecessor of UTF-16) and have only 70 characters available. As far as I can see, that's what Twitter uses as soon as a non-ASCII character appears in a tweet. A 140-character tweet in this encoding obviously leads to _three_ messages: 2 x 70 plus the sender name followed by a colon and a space (this answers the question in Eridanus' first paragraph).
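The arithmetic can be sketched in Python. The "1/3: " numbering and the 70-character part size are inferred from the split tweets quoted earlier in this thread, not from any Twitter documentation, and `split_for_sms` is an invented name.

```python
def split_for_sms(text, part_limit=70):
    """Split text into numbered parts of at most part_limit characters.

    Assumption: each part starts with a 5-character "i/n: " prefix,
    as in the examples posted in this thread, so only part_limit - 5
    characters of the message fit in each part.
    """
    if len(text) <= part_limit:
        return [text]
    payload = part_limit - len("1/3: ")
    count = -(-len(text) // payload)  # ceiling division
    if count > 9:
        raise ValueError("prefix would need more than 5 characters")
    return [
        f"{i + 1}/{count}: " + text[i * payload:(i + 1) * payload]
        for i in range(count)
    ]


# The first example tweet from this thread, sender name included.
# \u2014 is the long dash that pushed it into UCS-2.
tweet = (
    "biz: just finished with the @jetblue call and all booked for OAK>"
    "BOS\u2014she told me bad weather somewhere=more online traffic=more "
    "errors online"
)
for part in split_for_sms(tweet):
    print(len(part), part)
```

With these assumptions the 141-character message comes out as three parts of 70, 70 and 16 characters, the same cut points as the SMS quoted above. The numbering prefix alone is enough to push a full 140-character tweet into a third part, even before the sender name is counted.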
However, I don't quite understand why Twitter isn't using 7-bit encoding whenever all characters in the tweet are available in that encoding. Maybe the SMS gateway they're using only allows either ASCII or UCS-2, but I rather suspect they are not aware that lots of non-ASCII characters are available in the 7-bit encoding. Or they were just too lazy to implement a checking mechanism. This leads to an answer to mdy's observations: neither the dash nor the pound sign is in ASCII, so both make Twitter use UCS-2 encoding. Note that the pound sign is available in 7-bit encoding, so the second tweet could have been sent in a single SMS. The dash, however, would force the SMS to be UCS-2.
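Such a checking mechanism might look roughly like this in Python. The character table is the GSM 03.38 default alphabet plus its extension table; the function names and the returned tuple are made up for this sketch, and it ignores characters outside the Basic Multilingual Plane.

```python
# GSM 03.38 default alphabet (one septet each), without the ESC code.
GSM7_BASIC = frozenset(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: each of these costs two septets (ESC + code).
GSM7_EXTENDED = frozenset("^{}\\[~]|€\f")


def gsm7_septets(text):
    """Return the septet count for text, or None if not GSM-7 encodable."""
    total = 0
    for ch in text:
        if ch in GSM7_BASIC:
            total += 1
        elif ch in GSM7_EXTENDED:
            total += 2
        else:
            return None
    return total


def pick_encoding(text):
    """Return (encoding name, units used, units per single SMS)."""
    septets = gsm7_septets(text)
    if septets is not None:
        return "GSM-7", septets, 160
    return "UCS-2", len(text), 70


print(pick_encoding("F\u00fcr"))                      # umlaut: still GSM-7
print(pick_encoding("missing the \u00a32 coin"))      # pound sign: GSM-7
print(pick_encoding("BOS\u2014she"))                  # long dash: UCS-2
```

A gateway that only falls back to UCS-2 when `gsm7_septets` returns `None` would keep German umlauts and the pound sign within the 160-character limit.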
Well, Twitter staff, does this help you somehow to rescue most Europeans from receiving three instead of one SMS? Or are we not important enough?
I’m feeling wiser than Twitter
-
Thanks, Skytale, for your excellent and informative post.
The standard default GSM 03.38 7-bit alphabet table (your Wikipedia link) includes all the German Umlaut characters, so - as you mention - it's certainly odd that even a single Umlaut near the beginning of a message (first word was Für) triggers the wasteful UCS-2 encoding. But if the SMS carrier is doing this, then Twitter can't correct it.
However, I'm still puzzled why some characters are being converted to HTML (had to expand with "-" to prevent this page obligingly reconverting the HTML to the original characters!):
< becomes &-l-t-;
> becomes &-g-t-;
Further, " and ' arrive as \" and \' respectively - (e.g. don't becomes don\'t) - but only if I post from the cellityTweeter Java app. on my cellphone. So maybe this one is a cellity issue.
I’m somewhat wiser than before.
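To see how this kind of escaping eats into the 140-character budget, here is a small Python sketch. It only imitates the two transformations described above (HTML entities for < and >, backslashes before quotes); whether Twitter or cellity apply them in this way or in this order is an assumption, and `expand` is an invented name.

```python
import html


def expand(text):
    """Imitate the escaping seen in this thread and return the result.

    Assumed order: HTML-escape &, < and > first, then put a backslash
    in front of single and double quotes.
    """
    escaped = html.escape(text, quote=False)
    return escaped.replace('"', '\\"').replace("'", "\\'")


original = "don't <3"
expanded = expand(original)
print(len(original), original)   # 8 don't <3
print(len(expanded), expanded)   # 12 don\'t &lt;3
```

Each < or > costs three extra characters and each quote one, so a tweet typed at exactly 140 characters can end up well over the limit after escaping.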
-
Hey, employees!
Is it possible to contribute to Twitter? I can try to fix the problem... (Because I really want it to be fixed.)
-
stroncium:
I'm confident that Twitter employees are interested in our ideas and suggestions, even though they won't implement everything we suggest!
Whenever you're willing to share an idea publicly with other Twitter users, you can post it here on Get Satisfaction. Once a few other users are interested in your idea, you could submit it to Twitter employees via http://twitter.com/help/ (choose idea in the dropdown menu) and include a link to the Get Satisfaction post where your idea is described and discussed.
If you initially prefer to share your idea with Twitter employees only, then you can submit it directly to http://twitter.com/help/ .
I’m always happy when other active users join us on Get Satisfaction.