Joe Corneli writes:
> Adapted from w3m-filter.el:
>
>   (while (re-search-forward "&#\\([0-9]+\\);" nil t)
>     (setq ucs (string-to-number (match-string 1)))
>     (delete-region (match-beginning 0) (match-end 0))
>     (insert-char ucs 1))
>
> This would appear to work if the characters themselves were recognized...
>
> But when I run this expression on a buffer containing the string
> "&#29572;&#22872;" what I get is an error, like this:

Is that really what w3m does?  I'm not sure how the above could
possibly work in any normal version of Emacs -- the argument to
`insert-char' is an Emacs character, not a Unicode code point.

So you need to translate from the Unicode code point to the Emacs
character encoding.  One method might be to turn the code point into a
UTF-16 string (should be trivial, I guess), and then use
`decode-coding-string' to translate that into Emacs' internal encoding;
e.g.:

  (while (re-search-forward "&#\\([0-9]+\\);" nil t)
    (let* ((ucs (string-to-number (match-string 1)))
           ;; Low byte first, then high byte: UTF-16LE.  This only
           ;; covers code points that fit in 16 bits.
           (ucs-string (string (logand ucs #xFF)
                               (logand (ash ucs -8) #xFF)))
           (decoded-string
            (decode-coding-string ucs-string 'mule-utf-16le)))
      (delete-region (match-beginning 0) (match-end 0))
      (insert decoded-string)))

For me, this does the right thing on your example, and on the text of
that Wikipedia page:

   The fictional character Xuanzang (