unofficial mirror of emacs-devel@gnu.org 
* decode-coding-string gone awry?
@ 2005-02-13  3:50 David Kastrup
  2005-02-14  1:50 ` Kenichi Handa
  2005-02-14 13:37 ` Stefan Monnier
  0 siblings, 2 replies; 32+ messages in thread
From: David Kastrup @ 2005-02-13  3:50 UTC



Hi,

preview-latex has a function that assembles UTF-8 strings from single
characters.  When I call it manually, it mostly works.  When it is
called from a process sentinel, it fails rather consistently with a
current CVS Emacs.  I include the whole function since I don't know
which part is involved: regexp-quote, substring, char-to-string, etc.
The input string is taken from a buffer containing only ASCII
(inserted by a process with coding-system 'raw-text).

The output is shown below the code.


(defun preview-error-quote (string)
  "Turn STRING with potential ^^ sequences into a regexp.
To preserve sanity, additional ^ prefixes are matched literally,
so the character represented by ^^^ preceding extended characters
will not get matched, usually."
  (let (output case-fold-search)
    (while (string-match "\\^\\{2,\\}\\(\\([@-_?]\\)\\|[8-9a-f][0-9a-f]\\)"
			 string)
      (setq output
	    (concat output
		    (regexp-quote (substring string
					     0
					     (- (match-beginning 1) 2)))
		    (if (match-beginning 2)
			(concat
			 "\\(?:" (regexp-quote
				  (substring string
					     (- (match-beginning 1) 2)
					     (match-end 0)))
			 "\\|"
			 (char-to-string
			  (logxor (aref string (match-beginning 2)) 64))
			 "\\)")
		      (char-to-string
		       (string-to-number (match-string 1 string) 16))))
	    string (substring string (match-end 0))))
    (setq output (concat output (regexp-quote string)))
    (if (featurep 'mule)
	(prog2
	    (message "%S %S " output buffer-file-coding-system)
	    (setq output (decode-coding-string output buffer-file-coding-system))
	  (message "%S\n" output))
      output)))

The prog2 is only there for debugging.  What we get is something
like

"r Weise \\$f\\$ um~\\$1\\$ erhöht und \\$e\\$" mule-utf-8-unix 
#("r Weise \\$f\\$ um~\\$1\\$ erh\xc2\x81Á\xc2\xb6ht und \\$e\\$" 0 26 nil 26 28 (display "\\201" help-echo utf-8-help-echo untranslated-utf-8 129) 28 29 nil 29 31 (display "\\266" help-echo utf-8-help-echo untranslated-utf-8 182) 31 43 nil)

when this is called in a mule-utf-8-unix buffer with
(preview-error-quote "r Weise $f$ um~$1$ erh^^c3^^b6ht und $e$")

That is, the decoding from UTF-8 does not work: instead of a single
ö, the result contains untranslated bytes.  The original strings are
multibyte before the conversion and look reasonable, with the bytes
produced by char-to-string.
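
As a point of comparison, the intended decoding of the two escaped
bytes can be sketched in isolation.  This is not the preview-latex
code: it assumes a recent Emacs that provides `unibyte-string', and
the helper name `my-hex-pairs-to-utf-8' is made up for illustration.

```elisp
;; Sketch only: decode the bytes written as ^^c3^^b6 in the TeX output.
;; Assumes an Emacs that provides `unibyte-string'; the helper name
;; `my-hex-pairs-to-utf-8' is hypothetical.
(defun my-hex-pairs-to-utf-8 (&rest hex-pairs)
  "Decode HEX-PAIRS (strings such as \"c3\") as one UTF-8 byte sequence."
  (decode-coding-string
   ;; Build a unibyte string from the numeric byte values ...
   (apply #'unibyte-string
          (mapcar (lambda (h) (string-to-number h 16)) hex-pairs))
   ;; ... and decode those raw bytes as UTF-8.
   'utf-8))

;; The two bytes from the example input above.
(my-hex-pairs-to-utf-8 "c3" "b6")
```

Evaluating the last form should give a one-character string, which is
what the decoding step in preview-error-quote is expected to produce
for this part of the input.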

Unfortunately, when I call this by hand instead of from the process
sentinel, it mostly works, so it appears to depend on some
uninitialized state, or something similar, that differs inside the
process sentinel.
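
One way to narrow this down might be to record the same facts in both
contexts and compare them.  The following is a rough sketch, not part
of preview-latex; `my-string-state' is a hypothetical helper, and the
choice of what to record is only a guess at what could differ.

```elisp
;; Sketch only: `my-string-state' is a hypothetical helper, not part of
;; preview-latex.  It collects facts that may differ between a manual
;; call and a call from a process sentinel.
(defun my-string-state (string)
  "Return a plist describing STRING and the current buffer."
  (list :multibyte-string (multibyte-string-p string)
        :length (length string)         ; number of characters
        :bytes (string-bytes string)    ; number of bytes
        :buffer (buffer-name)
        :buffer-multibyte enable-multibyte-characters
        :coding buffer-file-coding-system))

;; Call this once by hand and once from the sentinel, then compare.
(message "%S" (my-string-state "erh"))
```

If the two reports differ, for example in the current buffer or in its
multibyte setting, that would point at the state the sentinel runs in
rather than at the string itself.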

Anybody have a clue what might go wrong here?

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum



Thread overview: 32+ messages
2005-02-13  3:50 decode-coding-string gone awry? David Kastrup
2005-02-14  1:50 ` Kenichi Handa
2005-02-14  2:28   ` David Kastrup
2005-02-15  6:15   ` Richard Stallman
2005-02-15  9:31     ` David Kastrup
2005-02-15 16:17     ` Stefan Monnier
2005-02-17 10:35       ` Richard Stallman
2005-02-17 12:08       ` Kenichi Handa
2005-02-17 13:20         ` Stefan Monnier
2005-02-18  8:30           ` Kenichi Handa
2005-02-18 12:56             ` Stefan Monnier
2005-02-19  9:44             ` Richard Stallman
2005-02-18 14:12           ` Richard Stallman
2005-02-19 20:55             ` Richard Stallman
2005-02-21  1:19               ` Kenichi Handa
2005-02-22  8:41                 ` Richard Stallman
2005-02-18 14:12         ` Richard Stallman
2005-02-14 13:37 ` Stefan Monnier
2005-02-14 13:50   ` David Kastrup
2005-02-14 16:57     ` Stefan Monnier
2005-02-14 17:24       ` David Kastrup
2005-02-14 18:12         ` Stefan Monnier
2005-02-14 18:41           ` David Kastrup
2005-02-14 19:30             ` Stefan Monnier
2005-02-14 20:09               ` David Kastrup
2005-02-14 20:56                 ` Stefan Monnier
2005-02-14 21:07                   ` David Kastrup
2005-02-14 21:29                     ` Stefan Monnier
2005-02-14 21:57                       ` David Kastrup
2005-02-14 21:26                   ` David Kastrup
2005-02-15 17:28         ` Richard Stallman
2005-02-15 21:42           ` David Kastrup
