This is a list of software that doesn't support [[Unicode|]] properly, or at all. Please note that we do not consider [[UTF-16|]] or [[UTF-32|]] support adequate (or [[CESU-8|]], for that matter). There is also [[another page|Software/BMP-Only]] for software that doesn't support characters outside the [[Basic Multilingual Plane|Software/BMP]]. [[Please file bug reports|]] whenever possible, and be sure to let us know about them. 
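
As a rough illustration of why CESU-8 is grouped with UTF-16 above, the sketch below encodes one character outside the Basic Multilingual Plane both ways. `cesu8_encode` is a hypothetical helper written for this example (Python has no built-in CESU-8 codec); treat it as a minimal sketch, not a validated encoder.

```python
def cesu8_encode(text: str) -> bytes:
    """Encode text as CESU-8 (hypothetical helper, for illustration only).

    Characters in the BMP are encoded exactly as in UTF-8. Characters
    above U+FFFF are first split into a UTF-16 surrogate pair, and each
    surrogate is then written as its own 3-byte sequence.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            out += ch.encode("utf-8", "surrogatepass")
        else:
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)    # leading surrogate
            low = 0xDC00 + (cp & 0x3FF)   # trailing surrogate
            out += chr(high).encode("utf-8", "surrogatepass")
            out += chr(low).encode("utf-8", "surrogatepass")
    return bytes(out)


clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the BMP
print(clef.encode("utf-8").hex(" "))   # f0 9d 84 9e
print(cesu8_encode(clef).hex(" "))     # ed a0 b4 ed b4 9e
```

The two byte sequences differ, so software that emits or expects the six-byte form will not interoperate with a strict UTF-8 decoder.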

<!-- Note to editors: Please keep this list in alphabetical order. Thank you. --> 

                     * [[Aterm|]]: Appears to [[fail unfortunately|]] when attempting to read [[Markus Kuhn's|]] [[UTF-8 demonstration document|]]. 
                     * [[Emacs|]]: The [[CVS version|]] is [[unable to uppercase|]] lowercase characters that map to multiple uppercase characters. 
                     * [[Evolution|]]: [[Pango|]] is used improperly in GtkHtml, so [[right-to-left text is displayed incorrectly|]]. 
                     * [[Flex|]]: Unicode support is very basic. There is no support for dealing with `wchar_t` strings, and the regular expression matching is limited to [[US-ASCII|]]. 
                     * [[gnome-print|]]: [[Pango|]] is [[not used for printing|]]. 
                     * [[GNU Arch|]]: `tla` does not accept anything other than simple letters, numbers, and basic punctuation in `my-id`. It encounters problems with [[umlauts|]], [[underscores|]], and other [[“funky characters”|]]. 
                     * [[GPG|]]: The configuration file has a setting that specifies the desired character set. A notice near this setting claims that UTF-8 will become the default in the next version. However, in CVS revision HEAD, the default remains ISO-8859-1. 
                     * [[grep|]]: [[Markus Kuhn|]] [[noticed|]] that grep 2.5 is very slow in UTF-8 locales; [[Mika Fischer posted a patch|]] that you may want to try. 
                     * [[Grip|]]: Has problems with UTF-8 in [[ID3|]] tags. See [[#854558|]] and [[#852783|]]. 
                     * [[ID3v2|]] (an MP3 tagging tool): Doesn't set the encoding flag of the text fields in ID3v2 tags to UTF-8, so those fields are not shown correctly in most players and music organizers. See [[this|]] bug report for details. 
                     * [[joe|]]: [[Lon Hohberger|]] said, “I looked at it briefly, but I didn't get too far before more important things came up. On a side note, it's probably much easier to write a `joe.elisp` or `joe.vim` :)” 
                     * [[Linux kernel|]]: The console can display UTF-8 characters once it has been configured with the [[kbd|]] package, and it can show 256 or 512 different characters at the same time. Distributions such as Fedora and Mandrake already support this. Unicode input is problematic for composing (using a dead key to add accents to characters) because diacritics [[must be 8-bits|]], which allows ISO 8859 but not UTF-8. You can input UTF-8 characters by assigning one key to one Unicode character. The diacritics issue does not appear to have an easy solution. 
                     * [[mc|]]: The Red Hat/[[Fedora|]] Linux packages contain patches that fix the main interface, but not the viewer or editor. [[Grab the source RPM|]] for the patches. There are also patches available from Suse: [[Suse Patches|]]. 
                     * man-pages: Manual pages for character sets like ISO-8859-1 [[are not encoded in UTF-8|]]. 
                     * newt: Has problems with multibyte characters, including entering UTF-8 characters. 
                     * strings (part of [[binutils|]]): Does not support UTF-8. 
                     * [[tcsh|]]: This shell has issues: “Unicode (UTF-8) doesn't seem to work”. This is on their [[wish list|]], though. Setting the variable `dspmbyte` to `utf8` seems to solve the problem; however, [[you have to set it explicitly|]]. There are also bugs when editing a command line that contains UTF-8 characters. 
                     * [[TWiki|]]: See [[|]] for the latest information. 
                     * [[WordPress|]]: There's no current plan to support UTF-8, and there were no responses to a [[request|]] to use anything other than ISO-8859-1. *Update* (2004-03): This appears to be mostly resolved in their CVS. 
                     * [[Zsh|]]: The Z Shell has partial support for UTF-8. For example, pasted UTF-8-encoded Latin characters display OK, but other characters, such as the 3-byte UTF-8 single quote (’) or many Asian characters, do not. Zsh also has trouble moving the cursor around multi-byte characters (for example, when using backspace). Tab completion will match files but probably not display them correctly. A `\u` escape for generating Unicode characters is supported. In the current unstable (4.3) branch, a lot of progress has been made in adding Unicode support to the line editor, and the developers would much appreciate help in testing it. It is a big job, in part because zsh's useful ability to handle null characters is being preserved. 
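
Two of the failure modes in the list can be reproduced in a few lines: the one-to-many uppercase mappings from the Emacs entry and the 3-byte quote character from the Zsh entry. The following is a minimal Python sketch. `uppercase_expands` and `utf8_length` are hypothetical helper names, and the sample characters other than ’ are chosen for illustration, not taken from the bug reports.

```python
def uppercase_expands(ch: str) -> bool:
    """True when a single character uppercases to more than one character."""
    return len(ch.upper()) > 1


def utf8_length(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# U+00DF (sharp s) and U+FB01 (fi ligature) have multi-character
# uppercase forms; U+2019 is the right single quotation mark.
for ch in ("\u00df", "\ufb01", "\u2019", "\u00e9"):
    print(
        f"U+{ord(ch):04X}",
        f"upper={ch.upper()!r}",
        f"expands={uppercase_expands(ch)}",
        f"utf8_bytes={utf8_length(ch)}",
    )
```

A tool that assumes one character in, one character out when changing case, or one byte per screen column when moving the cursor, is likely to mishandle these inputs.
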
<!-- Note to editors: Please leave only three change log entries here. Thank you. --> 

<br>-- Main.[[NickLamb|NickLamb]] - 26 Jul 2006 <br>-- Main.[[AlexanderWinston|AlexanderWinston]] - 09 Jul 2004 <br>-- Main.[[AlexanderWinston|AlexanderWinston]] - 14 Jun 2004