Khmer Software Initiative

 

Fonts

 

 
 

Computer fonts for the Khmer language have existed now for almost 15 years. They were not developed using Unicode, as Khmer Unicode did not exist at the time. These fonts are now being widely used for word processing as well as for newspaper and magazine typesetting. Keyboards for these fonts exist and are used.

The Cambodian government considers the Unicode encoding of Khmer finalized (except for possible bugs) as of version 4.0 of Unicode; Khmer was already included in versions 3 and 3.1. Unicode not only establishes which code is assigned to each letter of the Khmer alphabet, but also specifies the order in which letters are typed, which differs from the prior systems. In handwriting and with non-standard fonts, Khmer is written or typed from left to right, in the same order in which characters are printed. With Unicode, Khmer is typed in the order in which letters are pronounced, which is not always the printed order and therefore requires "intelligent behavior" from the system or word processor.
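The difference between typing order and printed order can be illustrated with a short Python sketch. It is only an approximation written for this page, not code from the project: the function name `visual_order` is hypothetical, and it handles only the three vowel signs that are drawn entirely to the left of the consonant (SRA E, SRA AE, SRA AI). Split vowels and the subscript RO, which also have left-hand parts, are ignored.

```python
import unicodedata

# Vowel signs that are stored AFTER the consonant in Unicode order
# but drawn to its LEFT on screen or paper (simplified list).
PRE_BASE_VOWELS = {"\u17C1", "\u17C2", "\u17C3"}  # SRA E, SRA AE, SRA AI
COENG = "\u17D2"  # marks the following consonant as a subscript


def is_consonant(ch):
    """True for the Khmer consonants KA..QA (U+1780..U+17A2)."""
    return "\u1780" <= ch <= "\u17A2"


def visual_order(text):
    """Approximate the printed (left-to-right) order of a Unicode-order string."""
    out = []
    cluster_start = 0  # index in `out` where the current consonant cluster began
    prev = ""
    for ch in text:
        if ch in PRE_BASE_VOWELS:
            # Printed before the whole cluster, although stored after it.
            out.insert(cluster_start, ch)
        else:
            if is_consonant(ch) and prev != COENG:
                cluster_start = len(out)  # a new base consonant starts a cluster
            out.append(ch)
        prev = ch
    return "".join(out)


# The word "Khmer": KHA, COENG, MO, SRA AE, RO in Unicode (pronunciation) order.
word = "\u1781\u17D2\u1798\u17C2\u179A"
for ch in word:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
print([f"U+{ord(c):04X}" for c in visual_order(word)])
```

In the stored text the vowel SRA AE follows the cluster KHA + COENG + MO, while in print it sits to the left of KHA. Bridging that gap is the "intelligent behavior" expected from the font or the system.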

New font formats (OpenType, AAT) permit the coding of this "intelligent behavior" inside the fonts, allowing them to be used in standard systems without having to develop much specific code for them.

A standard for Khmer OpenType (developed by Microsoft) and some OpenType fonts for Khmer already exist, codifying all the necessary information. Most of these fonts are proprietary. The KhmerOS initiative has already produced and placed in the public domain a full Unicode OpenType font family, in order to allow for the development of open source software that interprets Khmer fonts. The project is presently working on improving this font by hinting it, so that it can be used as a system font that remains legible at very small sizes. Please see the status and download pages for news on fonts produced by the KhmerOS initiative or other sources.

A full set of fonts has to be bought or developed and put in the public domain. It has to include basic (very clear) fonts to be used in menus and other parts of the user interface, as well as other fonts in which the artistic side is more important, to be used in documents and design. The set of fonts has to be an improvement on the non-standard sets now used for Khmer text processing on English-language based computers, so that fonts are not a barrier to the deployment and acceptance of the system (reducing the cost of change for users who already type Khmer in other systems).

Fonts are a particularly important part of this project, as they can add value that other systems may not have. The richness in fonts of the Macintosh system turned Apple into an industry reference for publishing in the 1990s; it owes much of that merit to its enormous variety of fonts.

The sets of fonts developed will have to cover very different needs. Some of these fonts will be bought, some will be developed and some will be adapted to Unicode and OpenType, re-using efforts made earlier by font developers, computer manufacturers and scholars:

  • Fonts for computer screens. Very clear fonts that are easy to read on a computer screen, even at a very small size. These fonts will be used to display translated system and application messages. A full family of fonts is necessary (normal, bold, cursive, cursive-bold). The fonts need to be well “hinted” for use in small text. We are actively working on this font at the moment.

  • Fonts for word processing. These fonts must include traditional Khmer fonts, old style fonts (ancient scripts), fonts for signs (Mul), square fonts (Chrun), handwriting fonts and newer (fancier) typefaces.

  • Fonts for designers. Present fonts are not well suited for design, aesthetically or technically. New designer fonts must be developed, ensuring that they can be used and “bent” in the ways designers need. Styles from western and far-eastern countries should be considered when designing these fonts.

The full list of fonts should include:

  • Representation of historical texts in their original style.

o        Brahmi script (1st to 5th centuries) (1 model).

o        Khmer ancient pre-Angkor script (6th to 9th century) (1 model).

o        Khmer ancient Angkor script (10th to 13th century) (3 models).

o        Post-Angkor Khmer Script (14th to 19th century) (many models).

  • System fonts (to be used mostly on computer screens).

o        A full family: normal, bold, cursive, cursive-bold.

  • Letters, forms, etc. (modern Khmer fonts).

o        Fonts for Signs (Mul).

o        Square fonts (Chrun).

o        Handwritten script.

  • Design

o        Fancy typefaces based on Khmer, European and Chinese/Japanese calligraphy.

The fonts for this project must conform to industry standards and use advanced typographical formats that are well prepared for Indic scripts such as Khmer. Both Microsoft and Apple have such formats (OpenType for Microsoft and AAT for Apple). The fonts must also use the Unicode standard.

A number of these fonts (15) are already available in OpenType format, but they are copyrighted. Part of this project would be to try to buy them and place them in the public domain.

In total the initiative should buy or produce at least 50 fonts or variants for the public domain.

Om Mony, Cambodia's pioneer typographer, also makes Unicode fonts available through his www.camboday.com website.

As of August 2004, we consider that enough Unicode fonts exist and, unless specific funding becomes available, it is no longer the business of the project to try to bring more fonts into the public domain. Fonts produced by Danh Hong, Om Mony, the Khek brothers and other sources are considered sufficient. Nevertheless, we do want to be careful that fonts comply with the standard. Fonts that include specific ligatures instead of characters (such as using the word AUI instead of the independent vowel QUUV2) have been detected. The use of these fonts produces incorrect Khmer Unicode texts and should be strongly discouraged.

 

Translation of text in legacy fonts to Unicode

 

It is quite reasonable to think that, with the use of Unicode, all old fonts will be phased out.

In order to reduce the cost of changing from an old system to this one, it is important that files and documents written using old (pre-Unicode) Khmer fonts can be translated easily to Khmer Unicode fonts. An application to translate old documents needs to be developed.

There already exists a utility, developed by Jean Yves Fusil and maintained by OpenForum, to translate texts written in one type of old (legacy or private encoding) Khmer computer font to another old type of Khmer font. This program, which is able to translate among 23 codifications of Khmer fonts, does not handle Unicode at present.

Lin Chear developed a first version of an ABC to Unicode converter, taking a plain text file and creating an escaped Unicode file. We then decided to take it further and created a more complete converter that transfers a plain text ABC file (taking into account Zero Space) into a Unicode UTF-8 file. A second version of the program, which starts directly from a Limon font, has also been made available. Both programs are available from our download page. Their characteristics are:

  • Differentiating between SRA U and the lower forms of TRAISAP and MUSIKATOAN.

  • Recognizing all of the cases in which PO + SRA A are used instead of the letter NYO.

  • Getting most of the cases of Coeng TA and Coeng DA correctly.

  • Reordering Register Shifters and diacritics when they are located in the wrong place.

  • Reordering Vowels, Signs and Coeng consonants when they are in the wrong order.
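The two reordering steps in the list above can be sketched in Python as follows. This is not the code of the converters on the download page. It is a minimal illustration that assumes the legacy text has already been mapped to Unicode code points, and that the target order inside a cluster is base consonant, register shifter, Coeng consonants, vowel, sign. That order is our reading of the Unicode description of Khmer and should be checked against the standard. The names `units`, `rank` and `reorder` are hypothetical.

```python
COENG = "\u17D2"
SHIFTERS = {"\u17C9", "\u17CA"}  # MUUSIKATOAN, TRIISAP


def is_vowel(ch):
    """Dependent vowel signs SRA AA..SRA AU (U+17B6..U+17C5)."""
    return "\u17B6" <= ch <= "\u17C5"


def is_sign(ch):
    """Other combining signs (U+17C6..U+17D1, U+17D3), excluding shifters."""
    return ("\u17C6" <= ch <= "\u17D1" or ch == "\u17D3") and ch not in SHIFTERS


def units(text):
    """Split text into units, keeping COENG + consonant together as one unit."""
    i = 0
    while i < len(text):
        if text[i] == COENG and i + 1 < len(text):
            yield text[i:i + 2]
            i += 2
        else:
            yield text[i]
            i += 1


def rank(unit):
    """Assumed position of a dependent unit inside a cluster (None = not dependent)."""
    ch = unit[0]
    if ch in SHIFTERS:
        return 1
    if ch == COENG:
        return 2
    if is_vowel(ch):
        return 3
    if is_sign(ch):
        return 4
    return None


def reorder(text):
    """Sort the dependent units that follow each base character by rank."""
    out, marks = [], []

    def flush():
        out.extend(sorted(marks, key=rank))  # sorted() is stable
        marks.clear()

    for u in units(text):
        if rank(u) is None:  # base consonant, space, Latin text, ...
            flush()
            out.append(u)
        else:
            marks.append(u)
    flush()
    return "".join(out)
```

For example, a vowel typed before a subscript consonant (KHA, SRA AE, COENG MO) is rewritten as KHA, COENG MO, SRA AE, while text that is already in the assumed order passes through unchanged. The other characteristics in the list (telling SRA U from the lower forms of the shifters, the NYO cases, Coeng TA versus Coeng DA) depend on knowledge of each legacy font and are not covered by this sketch.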

As for document file formats (moving from MS Windows to an open source environment), no translation program is necessary. Even though the Microsoft format for documents (UCS-2) is different from the format used by Linux (UTF-8), the applications do this translation automatically when they open a document created with Microsoft programs (such as MS Word or MS Excel).
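The following Python sketch shows why no separate tool is needed: both formats carry the same Unicode code points and differ only in how those code points are stored as bytes. The helper name `ucs2_to_utf8` is hypothetical, and Python's `utf-16-le` codec is used here as a stand-in for UCS-2, with which it coincides for Khmer characters.

```python
def ucs2_to_utf8(data: bytes) -> bytes:
    """Re-encode little-endian 16-bit Unicode text as UTF-8."""
    return data.decode("utf-16-le").encode("utf-8")


# The word "Khmer" (KHA, COENG, MO, SRA AE, RO) as five code points.
word = "\u1781\u17D2\u1798\u17C2\u179A"

as_ucs2 = word.encode("utf-16-le")  # 2 bytes per Khmer character
as_utf8 = ucs2_to_utf8(as_ucs2)     # 3 bytes per Khmer character

print(len(as_ucs2), len(as_utf8))
print(as_utf8.decode("utf-8") == word)  # the text itself is unchanged
```

Only the byte representation changes; the order and identity of the Khmer characters are preserved, which is why an application can perform the conversion transparently when it opens a file.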

Check our status and download pages for updates.

 

Page Last Updated: Friday, 22 October 2004
