unofficial mirror of emacs-devel@gnu.org 
* displaying 8bit characters octal sequences
From: Roland Winkler @ 2004-10-10 22:34 UTC


Eli Zaretskii <eliz@is.elta.co.il> writes:
> On 2 Jul 1999, Roland Winkler wrote:
> > My default setting for editing files is unibyte with
> > iso-latin-1. What should I do if in certain buffers I want
> > everything beyond 7-bit ASCII to be displayed as the
> > corresponding octal number?
> 
> Try this:
> 
>  M-: (standard-display-default 128 255) RET

The above is from five years ago. It worked fine up to emacs 21.2.1
(as far back as I can go). Now I am using emacs 21.3.1 or CVS emacs,
and it doesn't work for me anymore.

To be more specific: Say I start a fresh emacs --no-init-file.
Then I load a file that contains some German umlauts (iso-8859-1).
Then I do
 M-: (standard-display-default 128 255) RET

When I do all this with emacs 21.2.1, I see the octal sequences as
expected. When I do all this with emacs 21.3.1 or CVS emacs, the
umlauts are still displayed as umlauts.

The problem is not restricted to CVS emacs, but Eli Zaretskii
suggested that I should post it here.

Roland

* Re: displaying 8bit characters octal sequences
From: Stefan @ 2004-10-10 22:51 UTC
  Cc: emacs-devel

>> > My default setting for editing files is unibyte with
>> > iso-latin-1. What should I do if in certain buffers I want
>> > everything beyond 7bit asci to be displayed with the
>> > corresponding octal number?
>> 
>> Try this:
>> 
>> M-: (standard-display-default 128 255) RET

> The above is from five years ago. It worked fine up to emacs 21.2.1
> (as far as I can go back). Now I am using emacs 21.3.1 or CVS emacs.
> and it doesn't work for me anymore.

My crystal ball tells me you're using `standard-european-display' and you
think that's enough to be in unibyte mode.  Not so any more.
If you want unibyte mode, you need to ask for it explicitly.


        Stefan

* Re: displaying 8bit characters octal sequences
From: Roland Winkler @ 2004-10-10 23:09 UTC
  Cc: emacs-devel

On Sun Oct 10 2004 Stefan wrote:
> >> > My default setting for editing files is unibyte with
> >> > iso-latin-1. What should I do if in certain buffers I want
> >> > everything beyond 7bit asci to be displayed with the
> >> > corresponding octal number?
> >> 
> >> Try this:
> >> 
> >> M-: (standard-display-default 128 255) RET
> 
> > The above is from five years ago. It worked fine up to emacs 21.2.1
> > (as far as I can go back). Now I am using emacs 21.3.1 or CVS emacs.
> > and it doesn't work for me anymore.
> 
> My crystal ball tells me you're using `standard-european-display' and you
> think that's enough to be in unibyte mode.  Not so any more.
> If you want unibyte mode, you need to ask for it explicitly.

I am not sure I understand correctly what you say. No matter whether
I use "emacs --no-init-file" or "emacs --unibyte --no-init-file" 
I will not see the octal sequences. Where does the
standard-european-display come from? (Normally, I always use --unibyte.)

In any case, it seems to me that emacs 21.2.1 and emacs 21.3.1
behave differently here. (There might be good reasons for this. I am
merely asking myself how I can get the behavior I got with emacs
21.2.1.)

Roland

* Re: displaying 8bit characters octal sequences
From: Stefan @ 2004-10-11 14:12 UTC
  Cc: emacs-devel

>> My crystal ball tells me you're using `standard-european-display' and you
>> think that's enough to be in unibyte mode.  Not so any more.
>> If you want unibyte mode, you need to ask for it explicitly.

> I am not sure I understand correctly what you say. No matter whether
> I use "emacs --no-init-file" or "emacs --unibyte --no-init-file" 
> I will not see the octal sequences. Where does the
> standard-european-display come from? (Normally, I always use --unibyte.)

Duh, you're right.

..... yes, now I remember ..... someone changed the default display of
eight-bit-graphic chars: in multibyte buffers it's as before
(i.e. octal sequences), but in unibyte buffers they're displayed as
you're seeing them (i.e. as whichever glyph your default font chose for
those non-ascii chars).

Kim did you do this change?  I couldn't find mention of it in NEWS (looked
for "unibyte" and "eight-bit") and don't know how a user can overrule
this change.


        Stefan

* Re: displaying 8bit characters octal sequences
From: Roland Winkler @ 2004-10-11 14:28 UTC
  Cc: emacs-devel

On Mon Oct 11 2004 Stefan wrote:
> >> My crystal ball tells me you're using `standard-european-display' and you
> >> think that's enough to be in unibyte mode.  Not so any more.
> >> If you want unibyte mode, you need to ask for it explicitly.
> 
> > I am not sure I understand correctly what you say. No matter whether
> > I use "emacs --no-init-file" or "emacs --unibyte --no-init-file" 
> > I will not see the octal sequences. Where does the
> > standard-european-display come from? (Normally, I always use --unibyte.)
> 
> Duh, you're right.
> 
> ..... yes, now I remember ..... someone changed the default display of
> eight-bit-graphic chars: in multibyte buffers it's as before
> (i.e. octal sequences), but in unibyte buffers they're displayed as
> you're seeing them (i.e. as which ever glyph your default font chose for
> those non-ascii chars).
> 
> Kim did you do this change?  I couldn't find mention of it in NEWS (looked
> for "unibyte" and "eight-bit") and don't know how a user can overrule
> this change.

Stefan, thanks a lot, though I am still not sure I understand
correctly what you say. If I understand you right

(standard-display-default 128 255)

should work if I am not in unibyte mode. 

When I load a buffer with German umlauts in "emacs --no-init-file",
according to the mode line Emacs is not using unibyte mode.
Nonetheless, standard-display-default has no effect for me either.
Or am I missing something here?

Roland

* Re: displaying 8bit characters octal sequences
From: Stefan @ 2004-10-11 14:45 UTC
  Cc: emacs-devel

> Stefan, thanks a lot, though I am still not sure I understand
> correctly what you say. If I understand you right

> (standard-display-default 128 255)

> should work if I am not in unibyte mode. 

Yes, but it only applies to chars in the 128-255 range (in the domain of
the chars internal to Emacs, not in latin-1, or koi-8, or whathaveyou).

> When I load a buffer with german umlaute in "emacs --no-init-file",
> according to the modeline emacs is not using the unibyte mode.
> Nonetheless, standard-display-default has no effect for me either.
> Or am I missing something here?

When not in multibyte mode, most likely your umlaute are correctly
recognized as latin-1 and represented as latin-1 chars internally (with
codepoints around 2200 or 2300 IIRC).  Try C-u C-x = on one of those chars.

To see the octal sequence try: open the file in unibyte mode, then type
M-x toggle-enable-multibyte-characters RET.


        Stefan

* Re: displaying 8bit characters octal sequences
From: Kim F. Storm @ 2004-10-11 14:55 UTC
  Cc: emacs-devel, Roland Winkler

Stefan <monnier@iro.umontreal.ca> writes:

> ..... yes, now I remember ..... someone changed the default display of
> eight-bit-graphic chars: in multibyte buffers it's as before
> (i.e. octal sequences), but in unibyte buffers they're displayed as
> you're seeing them (i.e. as which ever glyph your default font chose for
> those non-ascii chars).
>
> Kim did you do this change?  

Not on purpose.

I think Handa did it:

2002-08-27  Kenichi Handa  <handa@etl.go.jp>

	* xdisp.c (get_next_display_element): In unibyte case, don't use
	octal form for such eight-bit characters that can be converted to
	multibyte char.

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

* Re: displaying 8bit characters octal sequences
From: Roland Winkler @ 2004-10-11 17:58 UTC
  Cc: emacs-devel

On Mon Oct 11 2004 Stefan wrote:
> > Stefan, thanks a lot, though I am still not sure I understand
> > correctly what you say. If I understand you right
> 
> > (standard-display-default 128 255)
> 
> > should work if I am not in unibyte mode. 
> 
> Yes, but it only applies to chars in the 128-255 range (in the
> domain of the chars internal to Emacs, not in latin-1, or koi-8,
> or whathaveyou).

Thanks, I believe I understand what you mean.

> When not in multibyte mode, most likely your umlaute are correctly
> recognized as latin-1 and represented as latin-1 chars internally
> (with codepoints around 2200 or 2300 IIRC). Try C-u C-x = on one
> of those chars.

The codepoint in my particular example is 228:

  character: ä (0344, 228, 0xe4)
    charset: eight-bit-graphic (8-bit graphic char (0xA0..0xFF))
 code point: 228
     syntax: w  which means: word
buffer code: 0xE4
  file code: 0xE4 (encoded by coding system raw-text-unix)
    display: by display table entry [?] (see below)

The display table entry is displayed by these fonts (glyph codes):
ä: -Misc-Fixed-Medium-R-Normal--20-200-75-75-C-100-ISO8859-1 (0xE4)

> To see the octal sequence try: open the file in unibyte mode, then type
> M-x toggle-enable-multibyte-characters RET.

When I load the file in an emacs --unibyte, then I type
M-x toggle-enable-multibyte-characters RET
nothing happens.

Roland

* Re: displaying 8bit characters octal sequences
From: Stefan Monnier @ 2004-10-11 19:41 UTC
  Cc: emacs-devel

>> When not in multibyte mode, most likely your umlaute are correctly
        ^^^
>> recognized as latin-1 and represented as latin-1 chars internally
>> (with codepoints around 2200 or 2300 IIRC). Try C-u C-x = on one
>> of those chars.

I meant "when in multibyte mode" or "when not in unibyte mode".


        Stefan

* Re: displaying 8bit characters octal sequences
From: Stefan Monnier @ 2004-10-11 19:45 UTC
  Cc: emacs-devel

>> To see the octal sequence try: open the file in unibyte mode, then type
>> M-x toggle-enable-multibyte-characters RET.

> When I load the file in an emacs --unibyte, then I type
> M-x toggle-enable-multibyte-characters RET
> nothing happens.

Have you done the (standard-display-default 128 255)?
It works for me:

  % ~/src/emacs/trunk/src/emacs --unibyte -Q ~/html/2030/tp2/slip.hs
  M-: (standard-display-default 128 255) RET
  M-x toggle-enable-multibyte-characters RET
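
The two-step recipe above can be wrapped in a small command. This is only a sketch: the name `my-show-8bit-as-octal` is made up for illustration, and it calls `set-buffer-multibyte` (the non-interactive function) on the assumption that it has the same effect here as `M-x toggle-enable-multibyte-characters` in a unibyte buffer. Note that `standard-display-default` acts on the global standard display table, not just on the current buffer.

```elisp
;; Hypothetical helper; the name is not part of Emacs.
(defun my-show-8bit-as-octal ()
  "Show raw 8-bit bytes in the current buffer as octal escapes.
Follows the recipe from the thread: reset the display of codes
128..255 to the default notation, then make the buffer multibyte."
  (interactive)
  ;; Step 1: drop any display-table entries for codes 128..255.
  (standard-display-default 128 255)
  ;; Step 2: switch a unibyte buffer to multibyte representation.
  (unless enable-multibyte-characters
    (set-buffer-multibyte t)))
```

After evaluating the definition, `M-x my-show-8bit-as-octal RET` in a buffer opened with `emacs --unibyte` should correspond to the two manual steps.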


-- Stefan

* Re: displaying 8bit characters octal sequences
From: Roland Winkler @ 2004-10-11 20:02 UTC
  Cc: emacs-devel

On Mon Oct 11 2004 Stefan Monnier wrote:
> >> When not in multibyte mode, most likely your umlaute are correctly
>         ^^^
> >> recognized as latin-1 and represented as latin-1 chars internally
> >> (with codepoints around 2200 or 2300 IIRC). Try C-u C-x = on one
> >> of those chars.
> 
> I meant "when in multibyte mode" or "when not in unibyte mode".

Then the codepoints are still much smaller than 2200:

  character: ä (04344, 2276, 0x8e4, U+00E4)
    charset: latin-iso8859-1
	     (Right-Hand Part of Latin Alphabet 1 (ISO/IEC 8859-1): ISO-IR-100.)
 code point: 100
     syntax: w 	which means: word
   category: l:Latin  
buffer code: 0x81 0xE4
  file code: 0xE4 (encoded by coding system iso-latin-1-unix)
    display: by this font (glyph code)
     -Misc-Fixed-Medium-R-Normal--20-200-75-75-C-100-ISO8859-1 (0xE4)


> Have you done the (standard-display-default 128 255)?
> It works for me:
> 
>   % ~/src/emacs/trunk/src/emacs --unibyte -Q ~/html/2030/tp2/slip.hs
>   M-: (standard-display-default 128 255) RET
>   M-x toggle-enable-multibyte-characters RET

Thank you, that works! Before, I tried either the
standard-display-default or the toggle-enable-multibyte-characters.

Now the only thing I still would like to do is: Display the 8bit
characters as octal sequences without switching to multibyte mode.
(Presently, the above is, however, a reasonable workaround for me.)

Roland

* Re: displaying 8bit characters octal sequences
From: Kenichi Handa @ 2004-10-12  7:38 UTC
  Cc: roland.winkler, monnier, emacs-devel

In article <m3u0t1uwx5.fsf@kfs-l.imdomain.dk>, storm@cua.dk (Kim F. Storm) writes:

> Stefan <monnier@iro.umontreal.ca> writes:
>>  ..... yes, now I remember ..... someone changed the default display of
>>  eight-bit-graphic chars: in multibyte buffers it's as before
>>  (i.e. octal sequences), but in unibyte buffers they're displayed as
>>  you're seeing them (i.e. as which ever glyph your default font chose for
>>  those non-ascii chars).
>> 
>>  Kim did you do this change?  

> Not on purpose.

> I think Handa did it:

> 2002-08-27  Kenichi Handa  <handa@etl.go.jp>

> 	* xdisp.c (get_next_display_element): In unibyte case, don't use
> 	octal form for such eight-bit characters that can be converted to
> 	multibyte char.

I don't remember well :-(, but it seems that the change was meant
to make unibyte-display-via-language-environment work
without setting up standard-display-table.  I've just
installed the attached patch.  Now
    M-: (standard-display-default 128 255) RET
should work.

---
Ken'ichi HANDA
handa@m17n.org

2004-10-12  Kenichi Handa  <handa@m17n.org>

	* xdisp.c (get_next_display_element): If
	unibyte_display_via_language_environment is zero, display 8-bit
	chars in octal in unibyte buffer.

*** xdisp.c	30 Sep 2004 10:23:04 +0900	1.911
--- xdisp.c	12 Oct 2004 16:11:49 +0900	
***************
*** 4895,4901 ****
  			   && it->len == 1)
  			  || !CHAR_PRINTABLE_P (it->c))
  		       : (it->c >= 127
! 			  && it->c == unibyte_char_to_multibyte (it->c))))
  	    {
  	      /* IT->c is a control character which must be displayed
  		 either as '\003' or as `^C' where the '\\' and '^'
--- 4895,4902 ----
  			   && it->len == 1)
  			  || !CHAR_PRINTABLE_P (it->c))
  		       : (it->c >= 127
! 			  && (!unibyte_display_via_language_environment
! 			      || it->c == unibyte_char_to_multibyte (it->c)))))
  	    {
  	      /* IT->c is a control character which must be displayed
  		 either as '\003' or as `^C' where the '\\' and '^'
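
As a reading aid, the unibyte branch of the patched condition can be modelled in Emacs Lisp. This is only a sketch of the logic visible in the diff, not the actual display code: `my-octal-form-p` and its `convert` argument are hypothetical, with `convert` standing in for the C function `unibyte_char_to_multibyte`.

```elisp
;; Hypothetical model of the unibyte branch of the patched test.
;; CONVERT stands in for the C function unibyte_char_to_multibyte.
(defun my-octal-form-p (c via-language-environment convert)
  "Return non-nil if byte C would take the octal/control-char path.
VIA-LANGUAGE-ENVIRONMENT models `unibyte-display-via-language-environment'."
  (and (>= c 127)
       (or (not via-language-environment)
           ;; Conversion left the byte unchanged, i.e. no multibyte
           ;; equivalent was found.
           (= c (funcall convert c)))))

;; With the variable off, every byte >= 127 takes the octal path.
(my-octal-form-p #xE4 nil #'identity)             ; => t
;; With it on, a byte that converts to another character does not.
(my-octal-form-p #xE4 t (lambda (c) (+ c 2048)))  ; => nil
```

Before the patch, the `(not via-language-environment)` alternative was absent, so a convertible byte never reached the octal path in a unibyte buffer regardless of the variable.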

* Re: displaying 8bit characters octal sequences
From: Richard Stallman @ 2004-10-12  8:57 UTC
  Cc: emacs-devel, roland.winkler

    Kim did you do this change?  I couldn't find mention of it in NEWS (looked
    for "unibyte" and "eight-bit") and don't know how a user can overrule
    this change.

A user could overrule this change by setting up a display table
containing strings like "\\300" to display these codes.  For each
code, the appropriate string.  That ought to work straightforwardly in
unibyte mode.  It should work also in multibyte, if you use the proper
Emacs internal character codes starting with 04200 or so.
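
A minimal sketch of this display-table approach for a unibyte buffer, assuming that entries at indices 128..255 are consulted for raw bytes there (this may vary between Emacs versions). The names `my-octal-display-table` and `my-display-8bit-as-octal` are made up for illustration; `make-display-table` and `buffer-display-table` are standard Emacs facilities.

```elisp
;; Hypothetical helpers; only the functions they call are part of Emacs.
(defun my-octal-display-table ()
  "Return a new display table showing codes 128..255 as octal escapes."
  (let ((table (make-display-table)))
    (dotimes (i 128)
      (let ((code (+ 128 i)))
        ;; A display-table entry is a vector of glyphs; here the four
        ;; characters of a string such as "\344".
        (aset table code (vconcat (format "\\%03o" code)))))
    table))

(defun my-display-8bit-as-octal ()
  "Install an octal display table in the current buffer only."
  (interactive)
  (setq buffer-display-table (my-octal-display-table)))
```

Setting `buffer-display-table` rather than `standard-display-table` keeps the effect local to the buffers where it is wanted, which matches the original request for "certain buffers".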

* Re: displaying 8bit characters octal sequences
From: Stefan Monnier @ 2004-10-12 14:30 UTC
  Cc: emacs-devel

>> >> When not in multibyte mode, most likely your umlaute are correctly
>> ^^^
>> >> recognized as latin-1 and represented as latin-1 chars internally
>> >> (with codepoints around 2200 or 2300 IIRC). Try C-u C-x = on one
>> >> of those chars.
>> 
>> I meant "when in multibyte mode" or "when not in unibyte mode".

> Then the codepoints are still much smaller than 2200:

>   character: ä (04344, 2276, 0x8e4, U+00E4)
>     charset: latin-iso8859-1
> 	     (Right-Hand Part of Latin Alphabet 1 (ISO/IEC 8859-1): ISO-IR-100.)
>  code point: 100
>      syntax: w 	which means: word
>    category: l:Latin  
> buffer code: 0x81 0xE4
>   file code: 0xE4 (encoded by coding system iso-latin-1-unix)
>     display: by this font (glyph code)
>      -Misc-Fixed-Medium-R-Normal--20-200-75-75-C-100-ISO8859-1 (0xE4)

Hmm... by code point I meant the 2276 (i.e. the integer used in elisp to
represent the char).  I think the "code point 100" is the code point within
the latin-iso8859-1 charset, rather than within the whole emacs-mule space
of characters.
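
The numbers in the two `C-u C-x =` outputs quoted in this thread can be cross-checked with a few evaluations, for example via `M-:`. Only values shown in the thread are used; the last line is merely an observation about those values, namely that 100 equals the byte 0xE4 with its high bit cleared.

```elisp
;; The multibyte character: octal, decimal and hex forms of 2276.
(format "%o %d %x" 2276 2276 2276)  ; => "4344 2276 8e4"
;; The raw byte seen in the unibyte buffer.
(format "%o %d %x" 228 228 228)     ; => "344 228 e4"
;; The charset-relative code point reported as 100.
(logand #xE4 #x7F)                  ; => 100
```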


        Stefan

* Re: displaying 8bit characters octal sequences
From: Roland Winkler @ 2004-10-13 13:44 UTC
  Cc: emacs-devel, monnier, storm

On Tue Oct 12 2004 Kenichi Handa wrote:
> I don't remember well :-(, but it seems that the change is
> to make unibyte-display-via-language-environment work
> without setting up standard-display-table.  I've just
> installed the attached patch.  Now
>     M-: (standard-display-default 128 255) RET
> should work.

Thank you very much. The patch works perfectly for me.

Roland
