Is there already a function to turn these HTML character references into
proper characters? "&#1071;" must come out as the Cyrillic "Я", etc.
On the command line, recode can do the trick:
  echo "&#1071;" | recode html..utf-8
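
For comparison, here is a minimal sketch of the same conversion in Python,
assuming the standard library's html.unescape is an acceptable stand-in
for recode. The helper name decode_refs is hypothetical and only serves
the illustration:

```python
import html


def decode_refs(text: str) -> str:
    """Replace HTML character references (numeric or named) with the
    characters they denote; all other text passes through unchanged."""
    return html.unescape(text)


# Decimal and hexadecimal references to U+042F, the Cyrillic capital "Ya".
print(decode_refs("&#1071;"))
print(decode_refs("&#x42F;"))
```

Both calls should print the same Cyrillic letter, since 1071 decimal and
42F hexadecimal name the same code point.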
--
| ,__o
http://www.gnu.franken.de/ke/ | _-\_<,
ke@suse.de (work) / keichwa@gmx.net (home) | (*)/'(*)