[World] Unicode support gets better

Daniel Phillips phillips at phunq.net
Mon May 21 14:58:21 PDT 2012


Unicode string handling

Thanks to Hirofumi for some good advice on proper Unicode support. Executive 
summary: iconv(3) is a decent library for reading Unicode characters from 
UTF-8 strings, among other things. It is part of libc and therefore does not 
introduce a new dependency. (Each avoided dependency makes the tarball that 
much easier to build.) It is also reported to be relatively efficient, unlike 
mbrtowc(3), also part of libc, which apparently slows down text scanning by 
orders of magnitude.

For now, I actually use this crazily tight hack, lovingly hand-rolled by 
Bjoern Hoehrmann into about 500 bytes of blazingly fast code:

    http://bjoern.hoehrmann.de/utf-8/decoder/dfa/
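
As a rough illustration of what the decoding step has to do, here is a minimal 
Python sketch that turns UTF-8 bytes into Unicode code points. It is a plain 
branch-per-lead-byte decoder, not Bjoern's table-driven DFA and not the code 
the demo uses; the name decode_utf8 is made up for this example.

```python
def decode_utf8(data):
    """Decode UTF-8 bytes into a list of Unicode code points.

    Raises ValueError on malformed input: bad lead byte, truncated or
    overlong sequence, surrogate, or value above U+10FFFF.
    """
    points = []
    i = 0
    while i < len(data):
        lead = data[i]
        # Pick the number of continuation bytes, the payload bits of the
        # lead byte, and the smallest code point this length may encode.
        if lead < 0x80:                 # 0xxxxxxx: plain ASCII
            need, cp, lowest = 0, lead, 0
        elif 0xC2 <= lead <= 0xDF:      # 110xxxxx: 2-byte sequence
            need, cp, lowest = 1, lead & 0x1F, 0x80
        elif 0xE0 <= lead <= 0xEF:      # 1110xxxx: 3-byte sequence
            need, cp, lowest = 2, lead & 0x0F, 0x800
        elif 0xF0 <= lead <= 0xF4:      # 11110xxx: 4-byte sequence
            need, cp, lowest = 3, lead & 0x07, 0x10000
        else:
            raise ValueError("invalid lead byte at offset %d" % i)
        if i + need >= len(data):
            raise ValueError("truncated sequence at offset %d" % i)
        for k in range(1, need + 1):
            byte = data[i + k]
            if (byte & 0xC0) != 0x80:   # continuation bytes are 10xxxxxx
                raise ValueError("bad continuation byte at offset %d" % (i + k))
            cp = (cp << 6) | (byte & 0x3F)
        if cp < lowest or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError("overlong or out-of-range sequence at offset %d" % i)
        points.append(cp)
        i += need + 1
    return points
```

Each code point that comes out of a loop like this is what gets looked up in 
the font file; the DFA linked above does the same job with a lookup table 
instead of a chain of comparisons.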

See shiny89.png, below. I think these three kanji spell "happy". You can now 
give the freetype demo a text string containing any Unicode characters, plain 
ASCII included, though only characters actually present in the specified font 
file are rendered. Text is always rendered left to right for now.

Just as with Roman text, this subdivision rendering is pretty, but it is not 
"correct". Hirofumi can read it and doesn't immediately complain, which I 
consider a great success. The strokes do not look much like brush strokes as 
they are supposed to, but I have seen CJK text created by graphic designers in 
much the same spirit, 2D of course.

Now a simple render cache has been added to the freetype demo, allowing it to 
load each glyph when first encountered in a Unicode string instead of loading 
all the glyphs from a font file on program start. Otherwise, there would be an 
annoying pause while the tens of thousands of ideographs of a CJK or Unicode 
typeface are loaded and converted to outlines.
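
The load-on-first-use idea can be sketched in a few lines of Python. This is 
only an illustration under assumptions: GlyphCache, load_glyph and 
fake_outline are hypothetical names, and the loader here is a stand-in that 
returns a string where the demo would load a glyph from the font and convert 
it to an outline.

```python
class GlyphCache:
    """Load each glyph the first time it is requested, then reuse it."""

    def __init__(self, load_glyph):
        self._load_glyph = load_glyph   # callable: code point -> glyph data
        self._glyphs = {}               # code point -> cached glyph data
        self.loads = 0                  # count of real loads, for inspection

    def get(self, codepoint):
        # Membership test rather than a None check, so a glyph that is
        # missing from the font (loader returns None) is cached as well.
        if codepoint not in self._glyphs:
            self._glyphs[codepoint] = self._load_glyph(codepoint)
            self.loads += 1
        return self._glyphs[codepoint]

    def glyphs_for(self, text):
        """Return glyph data for every character of text, in order."""
        return [self.get(ord(ch)) for ch in text]


def fake_outline(codepoint):
    # Stand-in for the expensive step of loading and converting a glyph.
    return "outline-for-U+%04X" % codepoint
```

With this arrangement, rendering "hello" costs four loads rather than one per 
glyph in the font, and a second string only pays for the characters it adds.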

Generic font names etcetera

Right now you need to tell the freetype demo the exact name of the TrueType 
font to render. This is problematic because each Linux distribution likes to 
name font files differently and put them in its own favorite place. As Hirofumi 
pointed out, the proper solution is ftc, the FreeType font cache. This not 
only caches font files as they are accessed but also provides an abstract font 
naming scheme that hopefully works across all the different Linux flavors and 
other operating systems.

I will also continue to support the ability to name a font file explicitly, to 
avoid the need to install a font before using it. For example, a game might 
provide its own font files and might not want to install them, or somebody 
designing a font might want to avoid the heavyweight installation step just to 
have a look at it.
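
To make the two lookup modes concrete, here is a small Python sketch: an 
explicit file path is used as given, and anything else is treated as a name 
to search for. It is not ftc and not the demo's code; resolve_font and the 
search_dirs argument are hypothetical, and matching on the file stem is an 
assumption made purely for illustration.

```python
import os


def resolve_font(name_or_path, search_dirs):
    """Return the path of a font file, or None if nothing matches.

    An existing file path wins, so a font can be used without installing
    it. Otherwise every directory in search_dirs is walked for a .ttf or
    .otf file whose stem equals the requested name, ignoring case.
    """
    if os.path.isfile(name_or_path):
        return name_or_path
    wanted = name_or_path.lower()
    for top in search_dirs:
        for dirpath, _dirnames, filenames in os.walk(top):
            for filename in sorted(filenames):
                stem, ext = os.path.splitext(filename)
                if ext.lower() in (".ttf", ".otf") and stem.lower() == wanted:
                    return os.path.join(dirpath, filename)
    return None
```

A directory walk like this breaks down as soon as two distributions name the 
same font file differently, which is the problem an abstract naming scheme is 
meant to solve.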

Bitmap text

We are getting closer to rendering high quality rasterized text without help 
from the GLC library.

This task is far from dead simple. Here is a quick tour of the bits and pieces 
that need to be taken care of:

   * Scan-convert glyphs using FreeType's rasterizer as an alternative to 
extracting and rendering outlines.

   * Pack glyphs into texture albums. Algorithms can get complex in order to 
avoid wasting precious GPU texture memory. For example, if we give every 
character the same size texture box, most characters will waste more than half 
of it, so we really want variable sized glyph boxes, which greatly complicates 
texture album layout. For the same reason as with the render cache above, we 
want to operate the texture album as a cache and rasterize glyphs on demand as 
they are encountered in text strings. Messy! Not only do we need to worry 
about allocating UV space in the texture and possibly flushing out glyphs that 
haven't been used for many frames, but any changed parts of the texture have 
to be re-uploaded.

   * Handle multiple different point sizes. We probably want a separate glyph 
cache for each different point size in use. We need some way of knowing what is 
currently "in use", which point sizes will be used, and which point size is 
supposed to be rendered currently. We need a way of purging point sizes that 
are no longer in use, or haven't been used for some time.

   * Render text from texture albums. Whatever scheme we come up with above, 
render it to the screen, mapping each glyph to some exact number of screen 
pixels. Need to handle text layout here, including variable width characters, 
variable rendering direction (later!) and automatic kerning. Need to map 
glyphs to proper position depending on projection and viewpoint (including the 
important orthogonal case). Need custom clipping to avoid strange behavior for 
text outside or overlapping the frustum.

   * Sort out exactly how FreeType hinting works so small point size fonts are 
hinted to actual pixel positions and not some other nasty result.

   * If text is to be scaled according to the current screen size, we must 
invalidate and rebuild all glyph images to align properly to pixel positions 
on each screen resize. Note that the screen may be resized many times in quick 
succession by dragging, so trickery is needed to avoid stuttering. 
Alternatively, we may always render rasterized text at the same size 
regardless of screen size, which is just what GLC does. In many cases that is 
just what you want anyway, so it is a good place to start.
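
To give a feel for the texture album bookkeeping, here is a minimal Python 
sketch of variable sized glyph boxes allocated with simple "shelf" packing: 
glyphs fill a row from left to right, and a new row starts when the current 
one is full. Shelf packing is just one easy choice and not necessarily what 
will end up in the code. ShelfAtlas, allocate and uv are hypothetical names, 
and keying slots by (character, pixel size) is an assumption of this sketch, 
whereas the list above leans toward a separate cache per point size. Eviction 
and texture re-upload are left out.

```python
class ShelfAtlas:
    """Allocate variable sized glyph boxes in a fixed size texture."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.shelf_x = 0    # next free x position on the current shelf
        self.shelf_y = 0    # top edge of the current shelf
        self.shelf_h = 0    # height of the tallest box on the current shelf
        self.slots = {}     # key -> (x, y, w, h) in texels

    def allocate(self, key, w, h):
        """Return the box for key, or None if the atlas has no room."""
        if key in self.slots:               # already rasterized: cache hit
            return self.slots[key]
        if w > self.width or h > self.height:
            return None
        x, y, shelf_h = self.shelf_x, self.shelf_y, self.shelf_h
        if x + w > self.width:              # row is full: open a new shelf
            x, y, shelf_h = 0, y + shelf_h, 0
        if y + h > self.height:
            return None                     # caller must evict or grow
        self.shelf_x, self.shelf_y = x + w, y
        self.shelf_h = max(shelf_h, h)
        self.slots[key] = (x, y, w, h)
        return self.slots[key]

    def uv(self, key):
        """Return the (u0, v0, u1, v1) texture coordinates of a stored box."""
        x, y, w, h = self.slots[key]
        return (x / self.width, y / self.height,
                (x + w) / self.width, (y + h) / self.height)
```

A None result is the point where the messy part begins: the caller has to 
flush glyphs that have not been used for a while, or start another album, and 
then re-upload whatever region of the texture changed.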

As you can see, I would need to be really, really anxious to get rid of GLC to 
put myself through all this work, or even some minimal fraction of it. And 
indeed I am. While I certainly appreciate the fact that GLC was able to 
provide usable OpenGL text rendering after only a few hours of development, 
now the time has come to put in some serious work to have really high 
performance and beautiful text rendering. It's just one of those things like 
antialiased lines: not necessary, but awfully nice to have. Anyway, here is a 
big shout-out to the QuesoGLC devs for giving me what I needed, when I needed 
it, for the usual price: free.

Regards,

Daniel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shiny89.png
Type: image/png
Size: 136311 bytes
Desc: not available
URL: <http://phunq.net/pipermail/world/attachments/20120521/8e77e1fc/attachment-0001.png>

