unofficial mirror of emacs-devel@gnu.org 
From: Kenichi Handa <handa@m17n.org>
Cc: emacs-devel@gnu.org
Subject: eight-bit char handling in emacs-unicode
Date: Fri, 14 Nov 2003 09:47:51 +0900 (JST)
Message-ID: <200311140047.JAA06414@etlken.m17n.org>
In-Reply-To: <ilun0b08by1.fsf@latte.josefsson.org> (message from Simon Josefsson on Thu, 13 Nov 2003 17:34:14 +0100)

In article <ilun0b08by1.fsf@latte.josefsson.org>, Simon Josefsson <jas@extundo.com> writes:
> rfc2104.el now works, thanks.  But does the fix really have to
> explicitly mention charsets like iso-latin-1?  Is there no way to
> handle binary octet strings in emacs-unicode?  Preferably in a
> portable way, that works on old Emacs versions and on XEmacs.

>>  This is a typical problem of emacs-unicode in which
>>  characters 128..255 are valid Unicode characters, thus, for
>>  instance, (concat '(?a ?\300)) returns a multibyte string of
>>  `a' and `À'.  But in the current Emacs, it returns a unibyte
>>  string.
>> 
>>  I suspect the similar fix is necessary in several other
>>  places.

> Having a way to deal with data that is a pure single byte, without
> involving coding systems, seems like a rather important thing to me.

I agree with you.  Currently, I can think of these methods:

(1) Perhaps the easiest way.

Check `default-enable-multibyte-characters' or a newly
introduced variable `byte-as-byte' to decide whether an
integer 128..255 must be treated as a Latin-1 char or a
byte.  So,
(concat '(?a ?\300)) => "aÀ" (multibyte string)
(let ((byte-as-byte t))
  (concat '(?a ?\300))) => "a\300" (unibyte string)
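
For illustration only, the effect of method (1) can be approximated in
ordinary Lisp without changing `concat' itself.  This is a sketch under
assumptions, not the proposed implementation: `my-byte-as-byte' and
`my-concat-chars' are hypothetical names, and it assumes an Emacs that
provides `unibyte-string'.

```elisp
;; Hypothetical variable standing in for the proposed `byte-as-byte'.
(defvar my-byte-as-byte nil
  "When non-nil, treat integers 128..255 as raw bytes, not Latin-1 chars.")

(defun my-concat-chars (chars)
  "Build a string from CHARS, a list of character codes.
Return a unibyte string if `my-byte-as-byte' is non-nil (every
element must then be in 0..255), otherwise a string of characters."
  (if my-byte-as-byte
      (apply #'unibyte-string chars)
    (apply #'string chars)))

;; Default: the code 192 is taken as the character À.
(multibyte-string-p (my-concat-chars '(?a ?\300)))

;; With the flag bound, the same list yields the bytes 97 and 192.
(let ((my-byte-as-byte t))
  (multibyte-string-p (my-concat-chars '(?a ?\300))))
```

The flag is dynamically bound, so code that handles octets (such as
rfc2104.el) would only need to wrap its string-building calls in `let'.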

(2) Introduce a new function `eight-bit-char'.

It converts its argument to an ASCII character or an
eight-bit character.
(eight-bit-char ?a) => 97
(eight-bit-char ?\300) => 4194240
Then,
(concat (list ?a (eight-bit-char ?\300))) => "a\300"
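
As a rough sketch of method (2), the conversion could be written on top
of `unibyte-char-to-multibyte', assuming an Emacs in which that function
maps the bytes 128..255 to eight-bit characters.  The name
`my-eight-bit-char' is hypothetical and stands in for the proposed
`eight-bit-char'.

```elisp
(defun my-eight-bit-char (c)
  "Return C unchanged if it is ASCII, else the eight-bit character for byte C.
C must be in 0..255.  Hypothetical stand-in for `eight-bit-char'."
  (unibyte-char-to-multibyte c))

(my-eight-bit-char ?a)      ; ASCII is returned unchanged
(my-eight-bit-char ?\300)   ; a code in the eight-bit (raw byte) range

;; Convert back to a plain byte string when one is needed.
(string-to-unibyte (string ?a (my-eight-bit-char ?\300)))
```

The explicit `string-to-unibyte' call is a design choice for this
sketch: it does not depend on how `concat' decides between a unibyte and
a multibyte result.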

(3) Make a series of new functions (I don't think this is good)

concat vs concat-unibyte
string vs string-unibyte
aset vs aset-unibyte

(4) Most drastic way (the cleanest but requires lots of work)

The basic problem is that we don't distinguish between a
character (code) and a number.  So, we introduce a character
object (like XEmacs).  The function `character' converts a
character code into the corresponding character object.  The
Lisp reader always generates a character object for ?a,
?\300, etc.  So:
 (concat '(?a ?\300)) => "aÀ"
 (concat '(?a #o300)) => "a\300"
 (concat (list ?a (character #o300))) => "aÀ"
 (concat (list ?a #o300 (character #o300))) => "a\300À"

Note: (character X) == (decode-char 'ucs X)
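
The distinction in method (4) can be modelled in plain Lisp, although a
real implementation would need changes to the reader and to the
primitives.  In the sketch below, `my-character', `my-make-character'
and `my-concat' are hypothetical names; a struct plays the role of the
character object, and bare integers are taken as bytes.  It assumes an
Emacs with `cl-lib' and with eight-bit characters for raw bytes.

```elisp
(require 'cl-lib)

;; Hypothetical character object: a wrapper that keeps a character
;; code distinct from a plain integer.
(cl-defstruct (my-character (:constructor my-make-character (code)))
  code)

(defun my-concat (items)
  "Concatenate ITEMS into a string.
A `my-character' object contributes the Unicode character with
its code; a bare integer in 0..255 contributes that byte."
  (mapconcat
   (lambda (item)
     (cond ((my-character-p item)
            (string (decode-char 'ucs (my-character-code item))))
           ((and (integerp item) (<= 0 item 255))
            ;; Bytes 128..255 become eight-bit characters, not Latin-1.
            (string (unibyte-char-to-multibyte item)))
           (t (error "Neither a character object nor a byte: %S" item))))
   items ""))

;; Mixed input: `a', the byte #o300, then the character À.
(my-concat (list (my-make-character ?a) #o300 (my-make-character #o300)))
```

Since the reader cannot be changed from Lisp, the character objects are
built explicitly here rather than coming from ?a and ?\300.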

> It started now, but when I enter a summary buffer it crashed:

> Program received signal SIGSEGV, Segmentation fault.
> 0x081a3c81 in skip_chars (forwardp=1, string=160, lim=36) at syntax.c:1591
> 1591                      char_ranges[n_char_ranges++] = c;
> (gdb) bt
> #0  0x081a3c81 in skip_chars (forwardp=1, string=160, lim=36) at syntax.c:1591

I just tried gnus but I couldn't reproduce it.  So, I need
more help.  Could you show me the results of the following?

(gdb) p n_char_ranges
(gdb) p c
(gdb) p string
(gdb) xstring
(gdb) p *$

---
Ken'ichi HANDA
handa@m17n.org
