Unicode typefaces
From Wikipedia, the free encyclopedia
Unicode typefaces (also known as UCS fonts and Unicode fonts) are typefaces containing a wide range of characters, letters, digits, glyphs, symbols, ideograms, logograms, etc., drawn from many different languages and scripts from around the world and collectively mapped into the standard Universal Character Set (UCS). Unlike most conventional computer fonts, which are specific to a particular language or legacy character set and contain only a small subset of the UCS characters, these fonts attempt to include many thousands of glyphs so that a single typeface can be used across multilingual documents.
The Unicode standard does not specify the typeface (a collection of graphical shapes called glyphs) itself. Instead, it defines each abstract character as a specific number (known as a code point) and also defines the required changes of shape depending on the context in which the glyph is used (e.g., combining characters, precomposed characters and letter-diacritic combinations). The choice of font, which governs how the abstract UCS characters are converted into a bitmap or vector output that can be viewed on a screen or printed, is left up to the user. If the chosen font does not contain a glyph for a code point used in the document, a question mark ("?"), a box, or some other substitute character is typically displayed.
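The distinction between code points and their rendered shapes can be illustrated with a short sketch using Python's standard `unicodedata` module. The helper name `code_points` is an assumption made for this illustration. The example shows that "é" can be stored either as one precomposed code point or as a base letter followed by a combining character; the font and rendering engine decide how either sequence is drawn.

```python
import unicodedata

# "é" as one precomposed code point, and as "e" followed by a combining accent.
precomposed = "\u00e9"
decomposed = "e\u0301"


def code_points(text: str) -> list[str]:
    """Return the code points of `text` in U+XXXX notation (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points(precomposed))   # ['U+00E9']
print(code_points(decomposed))    # ['U+0065', 'U+0301']

# The two strings are different code point sequences...
print(precomposed == decomposed)  # False

# ...but normalization to NFC composes the second into the first.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True

# A non-zero combining class marks U+0301 as a combining character.
print(unicodedata.name("\u0301"), unicodedata.combining("\u0301"))
```

Nothing in this sketch involves a font: whether the accent appears correctly placed over the letter depends on the glyphs and positioning data of whichever typeface is used for display.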
As of July 2006, no single "Unicode font" includes all the characters defined in the current revision of the ISO 10646 (Unicode) standard. Many fonts are continually updated to incorporate characters which were previously omitted or which were added in a newer version of the standard. Additionally, fonts may be updated to correct errors in past versions.
The UCS has over 1.1 million code points, but only the first 65,536 (Plane 0, the Basic Multilingual Plane, or BMP) had entered into common use before 2000. (See the Mapping of Unicode characters article for more information on the other planes, including Plane 1 (SMP), Plane 2 (SIP), Plane 14 (SSP), and Planes 15 and 16, which are reserved for private use (PUA).)
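The arithmetic behind these figures can be sketched as follows. The function name `plane_of` and the constants are assumptions chosen for this illustration: the code space is divided into 17 planes of 65,536 code points each, so the plane number is the code point divided by 65,536.

```python
PLANE_SIZE = 0x10000        # 65,536 code points per plane
MAX_CODE_POINT = 0x10FFFF   # last code point of Plane 16


def plane_of(code_point: int) -> int:
    """Return the plane number (0-16) that contains `code_point`."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a valid code point: {code_point:#x}")
    return code_point // PLANE_SIZE


total = MAX_CODE_POINT + 1
print(total)                # 1114112, i.e. 17 planes of 65,536 code points

print(plane_of(0x9AA8))     # 0  (BMP)
print(plane_of(0x10330))    # 1  (SMP)
print(plane_of(0x20000))    # 2  (SIP)
print(plane_of(0xE0001))    # 14 (SSP)
print(plane_of(0x10FFFF))   # 16 (private use)
```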
The first Unicode font (with a very large character set and support for many Unicode blocks) was Lucida Sans Unicode, developed by Charles Bigelow and Kris Holmes and released in March 1993 (shipped with Windows NT 3.1). The second was the Unihan font, developed by Ross Paterson in 1993. The third was the Everson Mono Unicode font, developed by Michael Everson and released in 1995.
Issues
There are typographical ambiguities in Unicode: some of the unified Chinese characters are drawn differently in different regions. For example, the glyph for code point U+9AA8 (骨) differs between simplified Chinese and traditional Chinese typography. This has implications for the idea that a single typeface can satisfy the needs of all locales.[1]
Application of Unicode typefaces
Despite these issues, Unicode is now the base character set for many new standards and protocols, and is built into the architecture of operating systems (Microsoft Windows, Apple Mac OS X, and many versions of Unix), programming languages (Ada, Perl, Python, Java, Common Lisp, APL), libraries (IBM International Components for Unicode (ICU), along with the Pango, Graphite, Scribe, Uniscribe, and ATSUI rendering engines), font formats (TrueType and OpenType), and so on. Many other standards are also being upgraded for Unicode compliance.
Utility software
Utility software can be used to see exactly which characters are included inside a font file:
- Character Map applet included with Windows 2000/XP
- Font Book application included with Mac OS X
- gucharmap for GNOME
- kcharmap for KDE
- MainType (by HighLogic, commercial)
- BabelMap (by Andrew West, free, donation-ware)
- Unicode Font Viewer (by Mike Lischke, freeware)
- Quick Key (by Nathanael Jones, open source, free)
List of Unicode fonts
Of the many Unicode fonts available, the few listed below are those most commonly used on mainstream computing platforms. More Unicode fonts can be found in the "Unicode fonts" section of the List of typefaces article.
| Font | Char(s) | Glyphs | Kernpairs | Version | Font Family | Font style | Font type | Serif style | License | Notes |
|---|---|---|---|---|---|---|---|---|---|---|
| Arial | 1,419 | 1,674 | 909 | 3.00 | Arial | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Windows. |
| Arial Unicode MS | 38,917 | 50,377 | 0 | 1.00 | Arial | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Office. |
| Bitstream Cyberbit | 32,910 | 29,934 | 935 | 2.0 beta | Bitstream Cyberbit | Roman | TTF | Cove | Freeware | For non-commercial use only. |
| Cardo | 2,879 | 2,882 | 216 | 0.098 (2004) | Cardo | Regular | TTF | Cove | Freeware | For non-commercial and non-profit uses only. |
| Caslon Roman | 3,684 | 3,686 | 0 | 001.000 16-12-2001 | Caslon | Roman | TTF | | BSD-like license | |
| Code2000 | 51,239 | 61,864 | 115 | 1.16 | Code2000 | Regular | TTF | Any | Shareware | Register after "reasonable" period (author's words). |
| Charis SIL | 1,958 | 3,084 | 0 | 4.002 | Charis SIL | Regular | TTF | Any | OFL | |
| Chryſanþi Unicode (Chrysanthi Unicode) | 4,818 | 4,383 | 0 | 3.1 | Chrysanthi | Regular | TTF | Cove | Freeware | |
| ClearlyU | - | 9,538 | 0 | 1.9 | - | - | - | - | Freeware | |
| DejaVu Sans | 5,223 | 5,427 | 2,558 | 2.18 | DejaVu | Book | OTF+TTO | Normal Sans | Bitstream Vera license and public domain for additions | |
| Doulos SIL | 1,958 | 3,083 | 0 | 4.014 | Doulos SIL | Regular | TTF | Any | OFL | |
| Everson Mono Unicode | 4,893 | 4,899 | 0 | 3.2b4 | Everson Mono | Regular | TTF | Any | Shareware | Monospaced width. |
| FreeSerif | 3,914 | 5,257 | 0 | 1.52 | FreeSerif | Medium | TTF | Cove | GPL | Sans serif (FreeSans) and monospaced (FreeMono) variants. |
| Gentium Regular | 1,469 | 1,699 | 2,857 | 1.0.2 (2005) | Gentium | Regular | TTF | Any | OFL | |
| GNU Unifont | 33,580 | 33,583 | 0 | 001.000 | Unifont | Medium | Bitmap | Any | GPL | |
| Junicode | 2,235 | 2,256 | 0 | 0.6.12 | Junicode | Regular | TTF | Any | GPL | |
| Linux Libertine | 1,982 | 1,985 | 0 | 2.2.0 | Linux Libertine | Regular | OTF+TTO | Any | GPL, OFL | |
| Lucida Grande | 2,245 | 2,826 | 0 | 5.0d8e1 (Revision 1.002) | Lucida Grande | Regular | OTF | Normal Sans | Proprietary | Included with Mac OS X. Any proportion. |
| Lucida Sans Unicode | 1,765 | 1,776 | 0 | 2.00 | Lucida Sans | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Windows. |
| Microsoft Sans Serif | 2,301 | 2,257 | 0 | 1.41 | Microsoft Sans Serif | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Windows. |
| New Gulim | 46,567 | 49,284 | 0 | 3.10 | New Gulim | Regular | TTF | Obtuse Cove | Proprietary | Included with Microsoft Office 2000. Any Proportion. |
| Tahoma | 1,912 | 2,034 | 674 | 3.14 | Tahoma | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Windows. |
| Times New Roman | 2,790 | 3,380 | 867 | 5.01 | Times New Roman | Regular | OTF+TTO | Cove | Proprietary | Included with Microsoft Windows Vista. |
| TITUS Cyberbit Basic | 9,341 | 10,044 | 0 | 3.0 (2000) (Revision 4.00) | TITUS Cyberbit | Regular | TTF | Cove | Freeware | |
| Y.OzFontN | 21,360 | 59,678 | 0 | 9.41 | Y.OzFontN | Regular | TTF | Any | Freeware | Sans-serif (for Japanese) and Monospace (for Latin). |
- Note
- † OTF+TTO: OpenType font with TrueType outlines.
- ‡ OpenType fonts sometimes contain no pair-by-pair kerning table but instead a class-based kerning table, in which groups of similarly shaped characters are treated as one kerning class. For example, V and W have nearly the same left and right geometry. A kern-pair count of "0" therefore does not mean that the font has no kerning.
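The difference between pair-by-pair and class-based kerning can be sketched with a small lookup. This is only an illustration of the idea, not the actual OpenType table layout; the class names, the adjustment value and the function `kerning` are all hypothetical.

```python
# Hypothetical data, in font units. None of these values come from a real font.
PAIR_KERN: dict[tuple[str, str], int] = {}           # explicit pairs: none

LEFT_CLASS = {"V": "diagonal", "W": "diagonal"}      # class of the left glyph
RIGHT_CLASS = {"A": "apex"}                          # class of the right glyph
CLASS_KERN = {("diagonal", "apex"): -80}             # one rule covers VA and WA


def kerning(left: str, right: str) -> int:
    """Return the kerning adjustment for a glyph pair (0 if none applies)."""
    # An explicit pair entry, if present, takes precedence in this sketch.
    if (left, right) in PAIR_KERN:
        return PAIR_KERN[(left, right)]
    # Otherwise fall back to the class-based rule.
    key = (LEFT_CLASS.get(left), RIGHT_CLASS.get(right))
    return CLASS_KERN.get(key, 0)


print(len(PAIR_KERN))       # 0 kern pairs would be reported for this font
print(kerning("V", "A"))    # -80, supplied by the class rule
print(kerning("W", "A"))    # -80, same class as V
print(kerning("A", "B"))    # 0, no rule applies
```

A tool that counts only entries in the pair table would report zero for this data even though two glyph combinations are kerned.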
Comparison of fonts
The number of characters included in the above versions of the fonts is listed below for each Unicode block (or range).
0000-077F
- N (a number) = This many characters in that range are included in the font.
- ✓ = Some or most of the characters in that range are present in the font.
- X = No characters are included in the font for that range or Unicode block.
- - = Data not available.
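A per-block count of the kind described in this legend could be produced along the following lines. This is a minimal sketch: the function `coverage` and the sample set of code points are hypothetical, and in practice the set would come from a font's character map, read with a font-parsing library that is not shown here. The block ranges are those of Old Italic, Gothic and Deseret.

```python
# Block ranges (first and last code point, inclusive).
BLOCKS = {
    "Old Italic": (0x10300, 0x1032F),
    "Gothic": (0x10330, 0x1034F),
    "Deseret": (0x10400, 0x1044F),
}


def coverage(font_code_points: set[int],
             blocks: dict[str, tuple[int, int]] = BLOCKS) -> dict[str, str]:
    """Map each block name to a character count, or "X" if none are covered."""
    result = {}
    for name, (first, last) in blocks.items():
        count = sum(1 for cp in font_code_points if first <= cp <= last)
        result[name] = str(count) if count else "X"
    return result


# Hypothetical font: 27 Gothic code points plus a single Old Italic one.
sample_font = set(range(0x10330, 0x1034B)) | {0x10300}
print(coverage(sample_font))
# {'Old Italic': '1', 'Gothic': '27', 'Deseret': 'X'}
```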
0780-139F
13A0-1DBF
1DC0-257F
2580-2DDF
2E00-4DBF
4DC0-FE2F
FE30-FFFF
10000-1D7FF
SMP
| Range | Arial | Arial Unicode MS | Bitstream Cyberbit | Cardo | Caslon Roman | Code2000 | Charis SIL | Chrysanthi Unicode | ClearlyU | DejaVu Sans | Doulos SIL | Everson Mono | FreeSerif | Gentium Regular | GNU Unifont | Junicode | Linux Libertine | Lucida Grande | Lucida Sans Unicode | Microsoft Sans Serif | New Gulim | Tahoma | Times New Roman | TITUS Cyberbit Basic | Y.OzFontN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Linear B Syllabary (10000–1007F) | X | X | X | 0 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Linear B Ideograms (10080–100FF) | X | X | X | 2 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Aegean Numbers (10100–1013F) | X | X | X | 2 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Ancient Greek Numbers (10140–1018F) | X | X | X | 75 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Old Italic (10300–1032F) | X | X | X | 35 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Gothic (10330–1034F) | X | X | X | 27 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Ugaritic (10380–1039F) | X | X | X | 0 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Old Persian (103A0–103DF) | X | X | X | 0 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Deseret (10400–1044F) | X | X | X | 0 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Shavian (10450–1047F) | X | X | X | 0 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Osmanya (10480–104AF) | X | X | X | 0 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Cypriot syllabary (10800–1083F) | X | X | X | 0 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Phoenician (10900–1091F) | X | X | X | 0 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Kharoshthi (10A00–10A5F) | X | X | X | 0 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Sumero-Akkadian Cuneiform (12000–1236E and 12400–12473) | X | X | X | 0 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Byzantine Musical Symbols (1D000–1D0FF) | X | X | X | 0 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Musical Symbols (1D100–1D1FF) | X | X | X | 0 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Ancient Greek Musical Notation (1D200–1D24F) | X | X | X | 70 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Tai Xuan Jing Symbols (1D300–1D35F) | X | X | X | X | X | X | X | X | X | 87 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Mathematical Alphanumeric Symbols (1D400–1D7FF) (994) | X | X | X | 13 | X | X | 2 | X | X | 45 | 2 | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| Supplementary Private Use Area-A | X | X | X | 462 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
| CJK Unified Ideographs Extension B | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | 339 |
| CJK Compatibility Ideographs Supplement | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | 15 |
See also
References
- [1] Ken Lunde, CJKV Information Processing, O'Reilly, 1999, p. 128, "CJKV character form differences".
External links
- ISO/IEC JTC1/SC2/WG2, the working group in charge of ISO 10646
- Fonts and Keyboards at Unicode.org
- Unicode Font Guide For Free/Libre Open Source Operating Systems - a huge index of high quality free fonts
- Alan Wood's Unicode Resources
- Character sets - Ken Fowles, Microsoft, 1997. - Enable Unicode for applications.
- Arial Unicode Font at AscenderCorp.com

