unofficial mirror of emacs-devel@gnu.org 
* cut-and-paste german quotes
@ 2004-04-23 16:59 Karl Eichwalder
  2004-04-29  2:44 ` Kenichi Handa
  0 siblings, 1 reply; 10+ messages in thread
From: Karl Eichwalder @ 2004-04-23 16:59 UTC


German quotes look like this: „[...]“ (as XML entities:
&#x201E;[...]&#x201C; = low 99 ... upper 66).

I can paste them from Emacs into an xterm, but when I copy them back
from the xterm and paste them into Emacs, the closing "upper 66" is
broken.  It has double width:

Char:  (0150310, 53448, 0xd0c8, file ...) point=533 of 746 (71%) column 6 

initially it was:

Char: “ (01234574, 342396, 0x5397c, file ...) point=337 of 851 (39%) column 42 

And then, Gnus complains (thus I'll remove the offending character for
posting):

Debugger entered--Lisp error: (error "Non-character input-event")
  read-char()
  byte-code("..." [tchar prompt choice buf idx format message "%s (%s): " mapconcat #[(s) "..." [s char-to-string] 2] ", " ", ?" read-char nil get-buffer-create "*Gnus Help*" pop-to-buffer fundamental-mode buffer-disable-undo erase-buffer ":\n\n" -1 1 0 4 window-width delete-char "\n" 3 "%c: %-" int-to-string "s" pad width n i alist list max x] 10)
  gnus-multiple-choice("Non-printable characters found.  Continue sending?" ((100 "Remove non-printable characters and send") (114 "Replace non-printable characters with dots and send") (105 "Ignore non-printable characters and send") (101 "Continue editing")))
  message-fix-before-sending()
  message-send(nil)
  message-send-and-exit(nil)
  call-interactively(message-send-and-exit)

-- 
                                                         |      ,__o
                                                         |    _-\_<,
http://www.gnu.franken.de/ke/                            |   (*)/'(*)


* Re: cut-and-paste german quotes
  2004-04-23 16:59 cut-and-paste german quotes Karl Eichwalder
@ 2004-04-29  2:44 ` Kenichi Handa
  2004-04-29  5:18   ` Karl Eichwalder
  2004-08-04  5:19   ` Karl Eichwalder
  0 siblings, 2 replies; 10+ messages in thread
From: Kenichi Handa @ 2004-04-29  2:44 UTC
  Cc: emacs-devel

In article <shd65yljfg.fsf@tux.gnu.franken.de>, Karl Eichwalder <ke@gnu.franken.de> writes:

> German quotes are looking this way: „[...]“ (as XML entities:
> &#x201E;[...]&#x201C; = low 99 ... upper 66).

> I can paste them from Emacs into an xterm, but back from the xterm and
> pasted into Emacs the closing "upper 66" is broken.
> It has double width:

> Char:  (0150310, 53448, 0xd0c8, file ...) point=533 of 746 (71%) column 6 

> initially it was:

> Char: “ (01234574, 342396, 0x5397c, file ...) point=337 of 851 (39%) column 42 

0xd0c8 is a character of charset japanese-jisx0208.  Emacs
by default requests a selection of type COMPOUND_TEXT.  It
seems that xterm, in responding to that request, encodes
U+201C as a character of japanese-jisx0208.  That in itself
is not a bug, because that character can be mapped to U+201C
according to glibc's charset mapping table.
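
A minimal sketch for checking which charset a pasted character landed
in, assuming point is on the character in question.  The command name
my-show-char-charset is hypothetical; char-after and char-charset are
standard Emacs functions.

```elisp
(defun my-show-char-charset ()
  "Show the code and charset of the character at point.
`my-show-char-charset' is a hypothetical helper, not part of Emacs."
  (interactive)
  (let ((ch (char-after)))
    (if ch
        ;; %c prints the character, %x its internal code in hex,
        ;; %S the charset symbol returned by `char-charset'.
        (message "%c: code #x%x, charset %S" ch ch (char-charset ch))
      (message "No character at point"))))
```

Run it with M-x my-show-char-charset; it complements the output of
C-x = by naming the charset directly.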

Please try this:

(setq x-select-request-type '(UTF8_STRING COMPOUND_TEXT TEXT STRING))

Then Emacs requests a selection of type UTF8_STRING first.

> And then, Gnus complains (thus I'll remove the offending character for
> posting):

> Debugger entered--Lisp error: (error "Non-character input-event")
>   read-char()
>   byte-code("..." [tchar prompt choice buf idx format message "%s (%s): " mapconcat #[(s) "..." [s char-to-string] 2] ", " ", ?" read-char nil get-buffer-create "*Gnus Help*" pop-to-buffer fundamental-mode buffer-disable-undo erase-buffer ":\n\n" -1 1 0 4 window-width delete-char "\n" 3 "%c: %-" int-to-string "s" pad width n i alist list max x] 10)
>   gnus-multiple-choice("Non-printable characters found.  Continue sending?" ((100 "Remove non-printable characters and send") (114 "Replace non-printable characters with dots and send") (105 "Ignore non-printable characters and send") (101 "Continue editing")))
>   message-fix-before-sending()
>   message-send(nil)
>   message-send-and-exit(nil)
>   call-interactively(message-send-and-exit)

It seems that the Gnus included in the latest Emacs doesn't
have this bug.  For one thing, it doesn't have the function
gnus-multiple-choice at all.

---
Ken'ichi HANDA
handa@m17n.org


* Re: cut-and-paste german quotes
  2004-04-29  2:44 ` Kenichi Handa
@ 2004-04-29  5:18   ` Karl Eichwalder
  2004-08-04  5:19   ` Karl Eichwalder
  1 sibling, 0 replies; 10+ messages in thread
From: Karl Eichwalder @ 2004-04-29  5:18 UTC
  Cc: emacs-devel

Kenichi Handa <handa@m17n.org> writes:

> 0xd0c8 is a character of charset japanese-jisx0208.  Emacs
> by default requests a selection of type COMPOUND_TEXT.  It
> seems that xterm, on responding to it, encodes U+201C into a
> character of japanese-jisx0208.  That in itself is not a bug,
> because that character can be mapped to U+201C
> according to glibc's charset mapping table.

Thanks for the explanation.

> Please try this:
>
> (setq x-select-request-type '(UTF8_STRING COMPOUND_TEXT TEXT STRING))

Thanks, this helps.

> Then Emacs requests a selection of type UTF8_STRING at first.

I'm wondering whether this setting would be appropriate as the default
on GNU/Linux distributions which set UTF-8 as the system default
encoding.
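
A minimal sketch of what such a locale-dependent setting could look
like in an init file.  This is only an illustration: the helper name
my-locale-utf8-p and the idea of inspecting LC_ALL, LC_CTYPE and LANG
are assumptions, not something Emacs or any distribution is known to
ship.

```elisp
(defun my-locale-utf8-p ()
  "Return non-nil if the locale environment names a UTF-8 locale.
`my-locale-utf8-p' is a hypothetical helper, not part of Emacs."
  (let ((case-fold-search t)
        (vars '("LC_ALL" "LC_CTYPE" "LANG"))
        value)
    ;; POSIX precedence: LC_ALL, then LC_CTYPE, then LANG.
    ;; Skip variables that are unset or empty.
    (while (and vars (not (and value (> (length value) 0))))
      (setq value (getenv (car vars))
            vars (cdr vars)))
    (and value (string-match "utf-?8" value) t)))

;; Prefer UTF8_STRING only on X and only in a UTF-8 locale;
;; otherwise leave `x-select-request-type' at its default.
(when (and (eq window-system 'x) (my-locale-utf8-p))
  (setq x-select-request-type '(UTF8_STRING COMPOUND_TEXT TEXT STRING)))
```

With LANG=en_US.UTF-8 and no LC_* overrides, my-locale-utf8-p returns
t; with LANG=de_DE@euro it returns nil.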

-- 
                                                         |      ,__o
                                                         |    _-\_<,
http://www.gnu.franken.de/ke/                            |   (*)/'(*)


* Re: cut-and-paste german quotes
  2004-04-29  2:44 ` Kenichi Handa
  2004-04-29  5:18   ` Karl Eichwalder
@ 2004-08-04  5:19   ` Karl Eichwalder
  2004-08-04  9:32     ` Andreas Schwab
  2004-08-09 12:42     ` Kenichi Handa
  1 sibling, 2 replies; 10+ messages in thread
From: Karl Eichwalder @ 2004-08-04  5:19 UTC
  Cc: emacs-devel

Kenichi Handa <handa@m17n.org> writes:

[...]

>> It has double width:

[...]

> 0xd0c8 is a character of charset japanese-jisx0208.  Emacs
> by default requests a selection of type COMPOUND_TEXT.  It
> seems that xterm, on responding to it, encodes U+201C into a
> character of japanese-jisx0208.  That in itself is not a bug,
> because that character can be mapped to U+201C
> according to glibc's charset mapping table.
>
> Please try this:
>
> (setq x-select-request-type '(UTF8_STRING COMPOUND_TEXT TEXT STRING))
>
> Then Emacs requests a selection of type UTF8_STRING at first.

In the past I used this setting successfully.  Unfortunately, it does
not seem to catch all the other characters properly; e.g., the 'lower d with a
dash' in "Dindic" (simplified) as cut from
http://de.wikipedia.org/wiki/Zoran_%C4%90in%C4%91i%C4%87 is too wide:

    Zoran Ðinđić

Char: đ (0212242, 70818, 0x114a2, file ...) point=1406 of 2969 (47%) column 13 

But "Zoran Đinđić" is the version wanted:

Char: đ (01210061, 331825, 0x51031, file ...) point=1506 of 3096 (49%) column 14

What can I do to work around this problem?

-- 
                                                         |      ,__o
                                                         |    _-\_<,
http://www.gnu.franken.de/ke/                            |   (*)/'(*)


* Re: cut-and-paste german quotes
  2004-08-04  5:19   ` Karl Eichwalder
@ 2004-08-04  9:32     ` Andreas Schwab
  2004-08-09 12:42     ` Kenichi Handa
  1 sibling, 0 replies; 10+ messages in thread
From: Andreas Schwab @ 2004-08-04  9:32 UTC
  Cc: emacs-devel, Kenichi Handa

Karl Eichwalder <ke@gnu.franken.de> writes:

> In the past I used this setting successfully.  Unfortunately, it does
> not seem to catch all the other characters properly; e.g., the 'lower d with a
> dash' in "Dindic" (simplified) as cut from
> http://de.wikipedia.org/wiki/Zoran_%C4%90in%C4%91i%C4%87 is too wide:
>
>     Zoran Ðinđić
>
> Char: đ (0212242, 70818, 0x114a2, file ...) point=1406 of 2969 (47%) column 13 
>
> But "Zoran Đinđić" is the version wanted:
>
> Char: đ (01210061, 331825, 0x51031, file ...) point=1506 of 3096 (49%) column 14
>
> What can I do to work around this problem?

Works for me.  Both your mail and the name pasted from the web page use
only the 0x51031 character in my Emacs (which is pretty recent).  But
maybe that's because I'm using a language environment that is slightly
modified from German to favor UTF-8:

(set-language-info-alist
 "German-utf8" '((charset ascii latin-iso8859-1 mule-unicode-0100-24ff
			  mule-unicode-2500-33ff mule-unicode-e000-ffff)
		 (coding-system iso-latin-1 iso-latin-9 mule-utf-8)
		 (coding-priority mule-utf-8 iso-latin-1)
		 (documentation . "\
This language environment is almost the same as German,
but favors UTF-8 encoding.")
		 (unibyte-display . iso-latin-1)
		 (unibyte-syntax . "latin-1")
		 (nonascii-translation . latin-iso8859-1)
		 (input-method . "german-postfix")
		 (tutorial . "TUTORIAL.de"))
 '("European"))

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: cut-and-paste german quotes
  2004-08-04  5:19   ` Karl Eichwalder
  2004-08-04  9:32     ` Andreas Schwab
@ 2004-08-09 12:42     ` Kenichi Handa
  2004-08-09 13:12       ` ke
  1 sibling, 1 reply; 10+ messages in thread
From: Kenichi Handa @ 2004-08-09 12:42 UTC
  Cc: emacs-devel

Sorry for the late response.

In article <shekmn7a5b.fsf@tux.gnu.franken.de>, Karl Eichwalder <ke@gnu.franken.de> writes:

>>  Please try this:
>> 
>>  (setq x-select-request-type '(UTF8_STRING COMPOUND_TEXT TEXT STRING))
>> 
>>  Then Emacs requests a selection of type UTF8_STRING at first.

> In the past I used this setting successfully.  Unfortunately, it does
> not seem to catch all the other characters properly; e.g., the 'lower d with a
> dash' in "Dindic" (simplified) as cut from
> http://de.wikipedia.org/wiki/Zoran_%C4%90in%C4%91i%C4%87 is too wide:

When I open that URL in Mozilla and paste that name into
Emacs, I get 0x51031 (which is what you want) for d-dash.
In what locale are you running your browser?  What is the
value of last-coding-system-used just after you paste the
name?
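
One possible way to capture that value right after pasting, as a
sketch.  The command name my-yank-and-report-coding-system is
hypothetical; yank and last-coding-system-used are standard.  It
assumes that the paste really decodes a fresh selection; if Emacs
reuses text already in the kill ring, the reported value may be stale.

```elisp
(defun my-yank-and-report-coding-system ()
  "Yank, then report the coding system Emacs last used for decoding.
`my-yank-and-report-coding-system' is a hypothetical helper."
  (interactive)
  (yank)
  ;; Read the variable immediately, before another command changes it.
  (message "last-coding-system-used: %S" last-coding-system-used))
```

Invoke it with M-x my-yank-and-report-coding-system in place of C-y.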

---
Ken'ichi HANDA
handa@m17n.org


* Re: cut-and-paste german quotes
  2004-08-09 12:42     ` Kenichi Handa
@ 2004-08-09 13:12       ` ke
  2004-08-19 11:22         ` Kenichi Handa
  0 siblings, 1 reply; 10+ messages in thread
From: ke @ 2004-08-09 13:12 UTC
  Cc: emacs-devel

Kenichi Handa <handa@m17n.org> writes:

> Sorry for the late response.

No problem.  Andreas, who was not able to reproduce the problem, has
already answered; unfortunately, updating to a recent CVS version did
not make the problem go away for me.

> When I open that URL by mozilla and paste that name into
> Emacs, I get 0x51031 (which is what you want) for d-dash.
> In what locale, are you running your browser?

en_US.UTF-8

> What is the value of last-coding-system-used just after you paste the
> name?

last-coding-system-used's value is 
compound-text-with-extensions

Thanks for your help.


* Re: cut-and-paste german quotes
  2004-08-09 13:12       ` ke
@ 2004-08-19 11:22         ` Kenichi Handa
  2004-08-19 15:16           ` Karl Eichwalder
  0 siblings, 1 reply; 10+ messages in thread
From: Kenichi Handa @ 2004-08-19 11:22 UTC
  Cc: emacs-devel

Sorry for the late response again.

In article <sh8ycoe9r3.fsf@frechet.suse.de>, ke@gnu.franken.de writes:

>>  When I open that URL by mozilla and paste that name into
>>  Emacs, I get 0x51031 (which is what you want) for d-dash.
>>  In what locale, are you running your browser?

> en_US.UTF-8

I tested mozilla started with that locale, but still can't
reproduce the problem.   The version of my mozilla is:

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020623 Debian/1.0.0-0.woody.1

What is yours?  I suspect that your mozilla doesn't use
UTF8_STRING somehow.

>>  What is the value of last-coding-system-used just after you paste the
>>  name?

> last-coding-system-used's value is 
> compound-text-with-extensions

I want you to test these two methods (each independently).

(1) Force Emacs to request only UTF8_STRING when receiving a
    selection.

(setq x-select-request-type 'UTF8_STRING)

(2) Force compound-text-with-extensions to translate Latin
characters in korean-ksc5601 to mule-unicode-0100-24ff.

(coding-system-put
 'compound-text-with-extensions
 'translation-table-for-decode
 (make-translation-table
  (let ((row #x21) (row-to #x2F)
	col char unicode map)
    (while (<= row row-to)
      (setq col #x21)
      (while (<= col #x7E)
	(setq char (make-char 'korean-ksc5601 row col)
	      unicode (encode-char char 'ucs))
	(if (and unicode (>= unicode #x80))
	    (setq map (cons (cons char (decode-char 'ucs unicode)) map)))
	(setq col (1+ col)))
      (setq row (1+ row)))
    map)))

---
Ken'ichi HANDA
handa@m17n.org


* Re: cut-and-paste german quotes
  2004-08-19 11:22         ` Kenichi Handa
@ 2004-08-19 15:16           ` Karl Eichwalder
  2004-09-01 13:13             ` Kenichi Handa
  0 siblings, 1 reply; 10+ messages in thread
From: Karl Eichwalder @ 2004-08-19 15:16 UTC
  Cc: emacs-devel

Kenichi Handa <handa@m17n.org> writes:

> Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020623 Debian/1.0.0-0.woody.1
>
> What is yours?  I suspect that your mozilla doesn't use
> UTF8_STRING somehow.

It looks like it.  Mine is

    "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040114"

> (1) Force Emacs to request only UTF8_STRING on receiving
>     selection.
>
> (setq x-select-request-type 'UTF8_STRING)

This way it works.  I'm inclined to switch to this setting.

> (2) Force compound-text-with-extensions to translate latin
> characters in korean-ksc5601 to mule-unicode-0100-24ff.
>
> (coding-system-put
>  'compound-text-with-extensions
>  'translation-table-for-decode
>  (make-translation-table
>   (let ((row #x21) (row-to #x2F)
> 	col char unicode map)
>     (while (<= row row-to)
>       (setq col #x21)
>       (while (<= col #x7E)
> 	(setq char (make-char 'korean-ksc5601 row col)
> 	      unicode (encode-char char 'ucs))
> 	(if (and unicode (>= unicode #x80))
> 	    (setq map (cons (cons char (decode-char 'ucs unicode)) map)))
> 	(setq col (1+ col)))
>       (setq row (1+ row)))
>     map)))

This also works, even if I use it in combination with

    (setq x-select-request-type '(COMPOUND_TEXT UTF8_STRING TEXT STRING))

Thanks for your debugging hints; I hope you can make use of my testing.

Karl

-- 
                                                         |      ,__o
                                                         |    _-\_<,
http://www.gnu.franken.de/ke/                            |   (*)/'(*)


* Re: cut-and-paste german quotes
  2004-08-19 15:16           ` Karl Eichwalder
@ 2004-09-01 13:13             ` Kenichi Handa
  0 siblings, 0 replies; 10+ messages in thread
From: Kenichi Handa @ 2004-09-01 13:13 UTC
  Cc: emacs-devel

Very sorry for the late response.

In article <sh3c2jkvjq.fsf@tux.gnu.franken.de>, Karl Eichwalder <ke@gnu.franken.de> writes:

> Kenichi Handa <handa@m17n.org> writes:
>>  Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020623 Debian/1.0.0-0.woody.1
>> 
>>  What is yours?  I suspect that your mozilla doesn't use
>>  UTF8_STRING somehow.

> It looks like it.  Mine is

>     "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040114"

>>  (1) Force Emacs to request only UTF8_STRING on receiving
>>      selection.
>> 
>>  (setq x-select-request-type 'UTF8_STRING)

> This way it works.  I'm inclined to switch to this setting.

I see.

>>  (2) Force compound-text-with-extensions to translate latin
>>  characters in korean-ksc5601 to mule-unicode-0100-24ff.
>> 
>>  (coding-system-put
>>   'compound-text-with-extensions
>>   'translation-table-for-decode
>>   (make-translation-table
>>    (let ((row #x21) (row-to #x2F)
>>  	col char unicode map)
>>      (while (<= row row-to)
>>        (setq col #x21)
>>        (while (<= col #x7E)
>>  	(setq char (make-char 'korean-ksc5601 row col)
>>  	      unicode (encode-char char 'ucs))
>>  	(if (and unicode (>= unicode #x80))
>>  	    (setq map (cons (cons char (decode-char 'ucs unicode)) map)))
>>  	(setq col (1+ col)))
>>        (setq row (1+ row)))
>>      map)))

> This also works, even if I use it in combination with

>     (setq x-select-request-type '(COMPOUND_TEXT UTF8_STRING TEXT STRING))

> Thanks for your debugging hints; I hope you can make use of my testing.

Yes.  Your testing is helpful.  I think your preference for
mule-unicode-0100-24ff over korean-ksc5601 is reasonable for
non-Korean language environments.  I'll try to find a way to
make Emacs work as you expect.

---
Ken'ichi HANDA
handa@m17n.org
