[World] Unicode support gets better
Daniel Phillips
phillips at phunq.net
Mon May 21 14:58:21 PDT 2012
Unicode string handling
Thanks to Hirofumi for some good advice on proper Unicode support. Executive
summary: iconv(3) is a decent library for reading Unicode characters from
UTF-8 strings, among other things. It is part of libc and therefore does not
introduce a new dependency. (Each avoided dependency makes the tarball that
much easier to build.) And it is reported to be relatively efficient, unlike
mbrtowc(3), also part of libc, which apparently slows down text scanning by
orders of magnitude.
For now, I actually use this crazily tight hack, lovingly hand rolled by
Bjoern Hoehrmann into about 500 bytes of blazingly fast code:
http://bjoern.hoehrmann.de/utf-8/decoder/dfa/
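
To show what "reading Unicode characters from UTF-8 strings" involves, here is a minimal sketch in Python. It is not Bjoern Hoehrmann's table-driven DFA and not what the demo uses; it is a plain branch-per-lead-byte decoder, and the name decode_utf8 and the choice to emit U+FFFD for every bad byte are assumptions made for illustration.

```python
def decode_utf8(data: bytes):
    """Yield Unicode code points from UTF-8 bytes.

    Illustrative sketch only: each malformed byte yields U+FFFD and
    decoding resumes at the next byte.
    """
    REPLACEMENT = 0xFFFD
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:                      # single-byte (ASCII) character
            yield b
            i += 1
            continue
        # Lead byte: number of continuation bytes, payload bits, and the
        # smallest code point allowed for that length (rejects overlongs).
        if 0xC2 <= b <= 0xDF:
            need, cp, lowest = 1, b & 0x1F, 0x80
        elif 0xE0 <= b <= 0xEF:
            need, cp, lowest = 2, b & 0x0F, 0x800
        elif 0xF0 <= b <= 0xF4:
            need, cp, lowest = 3, b & 0x07, 0x10000
        else:                             # stray continuation or invalid lead
            yield REPLACEMENT
            i += 1
            continue
        tail = data[i + 1:i + 1 + need]
        if len(tail) < need or any(t & 0xC0 != 0x80 for t in tail):
            yield REPLACEMENT             # truncated or malformed sequence
            i += 1
            continue
        for t in tail:                    # append 6 payload bits per byte
            cp = (cp << 6) | (t & 0x3F)
        if cp < lowest or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            yield REPLACEMENT             # overlong, out of range, surrogate
            i += 1
            continue
        yield cp
        i += 1 + need
```

Each code point that comes out of such a decoder is what gets looked up in the font file; characters with no glyph in the font are simply skipped by the renderer.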
See shiny89.png, below. I think these three kanji spell "happy". Now you can
give the freetype demo a text string containing any Unicode characters, ASCII
or otherwise, though only those characters present in the specified font file
are rendered. Text is always rendered left to right for now.
Just as with Roman text, this subdivision rendering is pretty, but it is not
"correct". Hirofumi can read it and doesn't immediately complain, which I
consider a great success. The strokes do not look much like brush strokes as
they are supposed to, but I have seen CJK text created by graphic designers in
much the same spirit, 2D of course.
Now a simple render cache has been added to the freetype demo, allowing it to
load each glyph when first encountered in a Unicode string instead of loading
all the glyphs from a font file on program start. Otherwise, there would be an
annoying pause while the tens of thousands of ideographs of a CJK or Unicode
typeface are loaded and converted to outlines.
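
The load-on-first-use idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not the demo's actual code: GlyphCache, outlines_for, and the load_glyph callable (code point in, outline or None out) are hypothetical names, and the real loader would call into Freetype.

```python
class GlyphCache:
    """Load each glyph the first time it appears in a string."""

    def __init__(self, load_glyph):
        # load_glyph is a hypothetical callable: code point -> outline,
        # or None when the font has no glyph for that code point.
        self._load = load_glyph
        self._glyphs = {}
        self.loads = 0          # how many times the slow loader ran

    def get(self, codepoint):
        if codepoint not in self._glyphs:
            # Missing glyphs are cached as None too, so the font is not
            # queried again for a character it does not contain.
            self._glyphs[codepoint] = self._load(codepoint)
            self.loads += 1
        return self._glyphs[codepoint]

    def outlines_for(self, text):
        """Return outlines for the renderable characters of text, in order."""
        found = (self.get(ord(ch)) for ch in text)
        return [glyph for glyph in found if glyph is not None]
```

With a stand-in font such as a dict, GlyphCache(font.get).outlines_for("abba") runs the loader twice, once for "a" and once for "b", no matter how often those characters repeat later.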
Generic font names etcetera
Right now you need to tell the freetype demo the exact name of the TrueType
font to render. This is problematic because each Linux distribution likes to
name font files differently and put them in its own favorite place. As Hirofumi
pointed out, the proper solution is ftc, the Freetype font cache. This not
only caches font files as they are accessed but provides an abstract font
naming scheme that hopefully works across all the different Linux flavors and
other operating systems.
I will also continue to support the ability to name a font file explicitly, to
avoid the need to install a font before using it. For example, a game might
provide its own font files and might not want to install them, or somebody
designing a font might want to avoid the heavyweight installation step just to
have a look at it.
Bitmap text
We are getting closer to rendering high quality rasterized text without help
from the GLC library.
This task is far from dead simple. Here is a quick tour of the bits and pieces
that need to be taken care of:
* Scan convert glyphs using Freetype's rasterizer as an alternative to
extracting and rendering outlines.
* Pack glyphs into texture albums. Algorithms can get complex in order to
avoid wasting precious GPU texture memory. For example, if we give every
character the same size texture box, most characters will waste more than half
of it, so we really want variable sized glyph boxes, which greatly complicates
texture album layout. As with the render cache above, we want to operate the
texture album as a cache and rasterize glyphs on demand as encountered in text
strings. Messy! Not only do we need to worry about allocating UV space in the
texture and possibly flushing out glyphs that haven't been used for many
frames, but any changed parts of the texture have to be re-uploaded.
* Handle multiple different point sizes. We probably want a separate glyph
cache for each different point size in use. We need some way of knowing what is
currently "in use", which point sizes will be used, and which point size is
supposed to be rendered currently. We need a way of purging point sizes that
are no longer in use, or haven't been used for some time.
* Render text from texture albums. Whatever scheme we came up with above,
render it to the screen, mapping each glyph to some exact number of screen
pixels. Need to handle text layout here, including variable width characters,
variable rendering direction (later!) and automatic kerning. Need to map
glyphs to proper position depending on projection and viewpoint (including the
important orthogonal case.) Need custom clipping to avoid strange behavior for
text outside or overlapping the frustum.
* Sort out exactly how Freetype hinting works so small point size fonts are
hinted to actual pixel positions and not some other nasty result.
* If text is to be scaled according to the current screen size, we must
invalidate and rebuild all glyph images to align properly to pixel positions
on each screen resize. Note that the screen may be resized continuously
by dragging, so trickery is needed to avoid stuttering. Alternatively, we may
always render rasterized text at the same size regardless of screen size,
which is just what GLC does. In many cases that is just what you want anyway,
so it is a good place to start.
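
As a rough illustration of the texture album items above, here is a minimal Python sketch of one simple layout strategy, a "shelf" packer with variable sized glyph boxes. Everything in it is an assumption for illustration: the names GlyphAtlas, lookup and flush_dirty are hypothetical, a single album is keyed by (point size, code point) instead of keeping one cache per point size, and eviction of stale glyphs is left out entirely.

```python
class GlyphAtlas:
    """Shelf packer: glyph boxes fill rows ("shelves") left to right."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.shelf_x = 0    # next free x on the current shelf
        self.shelf_y = 0    # top edge of the current shelf
        self.shelf_h = 0    # tallest box placed on the current shelf
        self.slots = {}     # (point_size, codepoint) -> (x, y, w, h)
        self.dirty = []     # rectangles not yet uploaded to the GPU

    def lookup(self, point_size, codepoint, w, h):
        """Return the glyph's box, allocating one on first use.

        Returns None when the album is full; a real cache would then
        evict glyphs unused for many frames, which this sketch omits.
        """
        key = (point_size, codepoint)
        if key in self.slots:
            return self.slots[key]
        if w > self.width:
            return None
        if self.shelf_x + w > self.width:     # row is full: open a new shelf
            self.shelf_y += self.shelf_h
            self.shelf_x = 0
            self.shelf_h = 0
        if self.shelf_y + h > self.height:
            return None
        rect = (self.shelf_x, self.shelf_y, w, h)
        self.shelf_x += w
        self.shelf_h = max(self.shelf_h, h)
        self.slots[key] = rect
        self.dirty.append(rect)               # must be re-uploaded
        return rect

    def flush_dirty(self):
        """Hand back the changed rectangles and mark the album clean."""
        rects, self.dirty = self.dirty, []
        return rects
```

The renderer would rasterize a glyph into the returned box only when lookup allocates a new one, then upload just the rectangles from flush_dirty once per frame. Shelf packing still wastes the space above short glyphs on a tall shelf, which is part of why real album layouts get complicated.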
As you can see, I would need to be really, really anxious to get rid of GLC to
put myself through all this work, or even some minimal fraction of it. And
indeed I am. While I certainly appreciate the fact that GLC was able to
provide usable OpenGL text rendering after only a few hours of development, now
the time has come to put in some serious work to have really high performance
and beautiful text rendering. It's just one of those things like antialiased
lines: not necessary, but awfully nice to have. Anyway, here is a big shout-
out to the QuesoGLC devs for giving me what I needed, when I needed it, for
the usual price: free.
Regards,
Daniel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shiny89.png
Type: image/png
Size: 136311 bytes
Desc: not available
URL: <http://phunq.net/pipermail/world/attachments/20120521/8e77e1fc/attachment-0001.png>