Thursday, October 30, 2008

Vietnamese Unicode and the BlackBerry

NOTE: The issue described below occurred on BlackBerry OS versions 4.0 and 4.1. The newer BlackBerry OS 5.0 does not exhibit it.

Apparently the BlackBerry does a good job of rendering decomposed Unicode characters (a base letter followed by separate combining marks) as readable text. But the BlackBerry does not appear to be able to render all precomposed characters (single code points that already include their marks), which is what most pages on the internet use.

So to make a long story short, if you copy, let's say, some Vietnamese from a web page, and paste it into an email ... that email will not render very well on a BlackBerry.

Well, here is a hack that may help someone out. Decompose the characters, and send those decomposed characters in the email. If you do that, the email will likely render "fine" in a normal email client as well as in the BlackBerry email client. I call this a hack because the W3C generally recommends exchanging text in NFC, but most BlackBerry email clients will not render all precomposed (NFC) characters.

For example, if you are NOT using a Windows box, try cutting and pasting the following into an email:
Subject: NFC (like most of the web)

Chúa yêu em lòng em vui thay
Kia Kinh Thánh đã tỏ cho hay
Các con thơ thuộc Jê-sus đây
Chúng yếu nhưng Ngài khỏe mạnh hoài

Jê-sus yêu em lắm
Phải em được Chúa yêu
Jê-sus yêu em lắm
Chính trong lời Chúa dạy nhiều
You'll find that emails sent with these characters render fine in a normal desktop email client, or even a web email client like Gmail, but they do not render correctly in the BlackBerry email client.

Now, try the same with this "decomposed" text:
Subject: NFKD (a decomposed form)

Chúa yêu em lòng em vui thay
Kia Kinh Thánh đã tỏ cho hay
Các con thơ thuộc Jê-sus đây
Chúng yếu nhưng Ngài khỏe mạnh hoài

Jê-sus yêu em lắm
Phải em được Chúa yêu
Jê-sus yêu em lắm
Chính trong lời Chúa dạy nhiều
While the two texts may look similar on this web page, they are different, trust me. And you'll find that these characters render well in Gmail, Outlook, and Evolution, and also render well on the BlackBerry.
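If you would rather not just trust me, here is a minimal sketch for checking the difference yourself. It prints the code points of one precomposed Vietnamese letter before and after decomposition. The class name CompareForms and the helper codePoints are made up for this illustration; only java.text.Normalizer comes from the JDK.

```java
import java.text.Normalizer;

public class CompareForms {
    // Hypothetical helper: lists the code points of s as "U+XXXX" values.
    public static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            sb.append(String.format("U+%04X ", cp));
            i += Character.charCount(cp);
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Precomposed "a with breve and acute", as in the word "lắm" above.
        String nfc = "\u1EAF";
        // Decomposed: base letter "a" followed by two combining marks.
        String nfkd = Normalizer.normalize(nfc, Normalizer.Form.NFKD);

        System.out.println("NFC : " + codePoints(nfc));   // U+1EAF
        System.out.println("NFKD: " + codePoints(nfkd));  // U+0061 U+0306 U+0301
        // The decomposed string is no longer in NFC.
        System.out.println(Normalizer.isNormalized(nfkd, Normalizer.Form.NFC)); // false
    }
}
```

One code point becomes three, yet both strings display as the same letter in a client that handles combining marks.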

I'm not sure why I feel like including a small Java program I wrote to help folks create emails that render better on the BlackBerry, but here it is:
import java.util.Scanner;
import java.text.Normalizer;
import java.text.Normalizer.Form;

public class d {
    public static void main(String[] args) {
        // Read standard input as UTF-8 text.
        Scanner sc = new Scanner(System.in, "UTF-8");
        // Use a delimiter that should never occur, so next() returns the whole input.
        sc.useDelimiter("Yes, my Java is terrible ...");
        String foo = sc.hasNext() ? sc.next() : "";
        // NFKD = compatibility decomposition.
        Form nf = Form.NFKD;
        System.out.println(nf + " Compatibility Decomposed:\n" + Normalizer.normalize(foo, nf));
    }
}
This program would be used as follows from a command line:
$ cat myFileWithPrecomposedCharacters | java d
So you could paste NFC characters from, say, a Vietnamese web page into a file, and then run the file through the program to generate NFKD. Paste that output into the email you're sending, and the email should render in a readable way in a desktop email client, a web email client, or on a BlackBerry.

I've only tested this methodology with Vietnamese, and because the incident at the tower of Babel was so confusing, all bets are off with other languages.

By the way, it looks to me like neither NFC nor NFKD render correctly in AndroidMail as of today's build, so we should end up seeing complaints about Vietnamese not rendering well on the G1, unless the developers get it fixed soon. Maybe we will have a follow up post with more on that subject.

UPDATE: A great write-up on Unicode and the BlackBerry is here on the LogicMail website. LogicMail is a J2ME email client that supports IMAP and POP and is designed to run on RIM BlackBerry handheld devices.


1) For more help with the definitions of the normalization forms mentioned above try here. It is a good document to be familiar with if you are planning to do i18n or l10n.
- i18n stands for internationalization
- l10n stands for localization

2) For those of you who just want a quick overview ...
- These terms are roughly equivalent for this discussion:
   compatibility decomposition (NFKD)
   canonically decomposed characters
   composite unicode
   composite characters
   the "separated" diacritical marks and letters used in Vietnamese without combining

- These terms are also roughly equivalent, and should not be confused with those just above:
   compatibility composition (NFKC)
   NFC - normalization form canonical composition
   precompound unicode
   unicode dựng sẵn
   composed characters
   recomposed characters (by canonical equivalence)
   precomposed characters
   decomposable characters
   pre-composite characters
   the set of completed characters (including all markings)

If you think there is a problem with the terms or equivalencies drawn above, please let's discuss it via email. If there are things that need to be corrected, I am open to that, just let me know.
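To show why I say "roughly" equivalent, here is a small sketch comparing canonical decomposition (NFD) with compatibility decomposition (NFKD). For plain Vietnamese text the two should give the same result, but they differ for characters that only have a compatibility mapping, such as the "fi" ligature. The class name DecompositionForms and the method sameDecomposition are hypothetical names used only for this example.

```java
import java.text.Normalizer;
import java.text.Normalizer.Form;

public class DecompositionForms {
    // Hypothetical helper: true when NFD and NFKD produce identical output for s.
    public static boolean sameDecomposition(String s) {
        String nfd = Normalizer.normalize(s, Form.NFD);
        String nfkd = Normalizer.normalize(s, Form.NFKD);
        return nfd.equals(nfkd);
    }

    public static void main(String[] args) {
        String vietnamese = "Ch\u00FAa y\u00EAu em"; // "Chúa yêu em", precomposed (NFC)
        String ligature = "\uFB01";                  // LATIN SMALL LIGATURE FI

        // Vietnamese letters only have canonical decompositions.
        System.out.println(sameDecomposition(vietnamese)); // true
        // NFD keeps the ligature, NFKD replaces it with "f" + "i".
        System.out.println(sameDecomposition(ligature));   // false

        // Decomposing and then recomposing returns the original NFC text.
        String roundTrip = Normalizer.normalize(
                Normalizer.normalize(vietnamese, Form.NFD), Form.NFC);
        System.out.println(roundTrip.equals(vietnamese));  // true
    }
}
```

So for the Vietnamese case discussed here either decomposed form should do, while NFKD can additionally rewrite characters like ligatures.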

Thursday, October 23, 2008

Android Email application

First of all ... Android is an amazing platform. So, for those of us who want details about what we are getting before we make that purchase decision ... Let's just take a quick look at one application, the Email application that Google released the source to a couple of days ago.

I compiled AndroidMail and deployed it to the Android emulator, and here are screenshots (click to enlarge them):

Here is the opening screenshot of my Gmail inbox showing an email that I just sent to myself using the emulator running the AndroidMail application ...

Here is that same email fully rendered with the subject "Droid Emulator" in the AndroidMail application.

Here is the account setup screen. If you click "Next" with the information given, it sets up a Gmail inbox flawlessly. If you click on "Manual Setup" you get the screenshot below ...

It supports IMAP, Yay! Although, it does not appear to support IMAP idle ... yet.

Here we are prompted for the type of security and port numbers and such for incoming mail on the account ... I just clicked next, even though the information did not look correct yet ...

Here is what happened when I clicked on the "Next" button with my Gmail address given in the previous screen ... I thought Gmail supported IMAP already. Actually I know it does, because I access my Gmail using another email app for the BlackBerry called LogicMail. LogicMail supports IMAP, and can access my Gmail via IMAP just fine.

It says ...

Sorry! The application Email (process has stopped unexpectedly. Please try again.

If you can't tell why I'm showing an error screen here, well ... you can learn quite a bit about an application by how it handles errors. And AndroidMail has nerd-love error screens. Yes, we are strange ... we have to see how it responds to failure. And that may explain our social problems too.

Anyway, it is great to see a phone running Linux AND a clean Java app. Overall ... Good job, Android team!


1) mydroid took 32 minutes to build on this Intel Core2 Duo 3GHz fc8 box.

2) Yes, the Android SDK and the Eclipse ADT work great on Fedora ... let me know if you'd like a rundown on the packages needed to get the Android SDK, and/or the entire Android platform, building and working on your Fedora box.

3) Screenshots made with Android Emulator, imglib, and the (super)+drawbox capture functionality provided by compiz-fusion.x86_64

4) UPDATE: I have found that I have to run the Eclipse ADT (Ganymede) with a maximum Java heap of at least 1.5GB (./eclipse -vmargs -Xms512m -Xmx1536m) in order to get Android Email to build completely and run in the emulator from Eclipse. Otherwise I get errors like "Could not find AndroidMail.apk!" because the builds run out of memory before they produce the AndroidMail.apk file.

Wednesday, October 01, 2008

Sweet peas

Here we go ...

What's in the box?

Just get on with it!

Sweet Peas ...

Green squeeze

Four sweet peas