characters with the highest bit set as HIGHBIT. We need to expand
this to support the UTF-8 character set properly. However, this
solves the problem that the character 0x80 (which is common in UTF-8)
gets masked to 0x00.
Patch submitted by "Huang Yuzhen" <huangyuzhen@bj.tom.com>