all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Stefan Monnier <monnier@IRO.UMontreal.CA>
Cc: emacs-devel@gnu.org, jas@extundo.com
Subject: Re: eight-bit char handling in emacs-unicode
Date: 18 Nov 2003 22:05:39 -0500	[thread overview]
Message-ID: <jwvptfp139w.fsf-monnier+emacs/devel@vor.iro.umontreal.ca> (raw)
In-Reply-To: <200311190006.JAA14847@etlken.m17n.org>

> I see.  Apart from the design itself, I agree that it's difficult to
> introduce a new type.  But, when I discussed with Richard about the
> Character type object a few year ago, he was not that negative provided
> that it gives sure improvement.

Sounds about right to me: we have one free tag that we could use for chars
(and that I currently use to boost the max buffer size from 256MB to 512MB
in my local code).
But it needs to pay for itself.

> Then, we can't use make-string-unibyte for the current case
> because, in emacs-unicode, (concat '(?a 192)) returns a
> multibyte string whose second element is A-grave, not an
> eight-bit-char.  Am I missing something?

Well, obviously we need to make it accept this case (i.e. accept both the
latin-1 192 and the eight-bit-char 192).  I'm sure there'll be other issues.
I haven't had much time to think about it and you're obviously better
placed to foresee potential problems.

>> To do what your string-make-unibyte does you should use
>> `encode-coding-string' where the coding system is passed explicitly.

> Those are conceptually different things (I remember the
> similar discussion we had a while ago).

> encode-coding-string does:
> char-sequence --CCS-set--> (CCS/codepoint-pair)-sequence
>     --CES--> encoded-byte-sequence

> string-make-unibyte does:
> char-sequence --CCS--> code-point-sequence
>     --concat--> code-point-sequence

> These two yield the same result only when CCS support all
> chars in "char-sequence" and CES is stateless
> (e.g. iso-latin-1) and .

You lost me here (I'm a poor soul whose doesn't know much outside of the
latin-1 world).
I thought that string-make-unibyte only behaves meaningfully for
"normal 8bit coding-systems" such as latin-1.

>> I've changed my Emacs so that string-make-unibyte does the above
>> (i.e. signals an error if it encounters a non-byte char) and it works fairly
>> well, except for the few places where the elisp code is sloppy and needs to
>> be fixed.

> How did you change it?  string-make-unibyte internally uses
> the function copy_text.  Did you change it?  But, then, each
> time you copy a multibyte string into a unibyte buffer, you
> should get an error.

Of course: it's an error.  A unibyte buffer cannot represent multibyte
chars, so you need to encode them first (into a unibyte string).

Now to tell you the truth, my change had to accept a few (not so) special
cases and it took a bit of fiddling to make the code lenient enough to
accept elisp code I didn't feel like "fixing".  I can't remember the details
off-hand, but I remember having problems with regexp matching functions
where multibyte regexps are used in unibyte buffers.


-- Stefan

  reply	other threads:[~2003-11-19  3:05 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-11-12 16:11 BIG5-HKSCS? Simon Josefsson
2003-11-13  1:53 ` BIG5-HKSCS? Kenichi Handa
2003-11-13  4:14   ` BIG5-HKSCS? Simon Josefsson
2003-11-13  5:34     ` BIG5-HKSCS? Kenichi Handa
2003-11-13  5:50       ` BIG5-HKSCS? Simon Josefsson
2003-11-13  4:49   ` BIG5-HKSCS? Simon Josefsson
2003-11-13  6:10     ` BIG5-HKSCS? Kenichi Handa
2003-11-13  6:51       ` BIG5-HKSCS? Simon Josefsson
2003-11-13  9:01         ` BIG5-HKSCS? Kenichi Handa
2003-11-13 13:29           ` BIG5-HKSCS? Oliver Scholz
2003-11-13 23:40             ` BIG5-HKSCS? Kenichi Handa
2003-11-14 13:35               ` BIG5-HKSCS? Oliver Scholz
2003-11-13 16:34           ` BIG5-HKSCS? Simon Josefsson
2003-11-14  0:47             ` eight-bit char handling in emacs-unicode Kenichi Handa
2003-11-14 13:25               ` Oliver Scholz
2003-11-15  1:09                 ` Kenichi Handa
2003-11-15 10:26                   ` Oliver Scholz
2003-11-15 21:47                     ` Simon Josefsson
2003-11-15  3:04               ` Simon Josefsson
2003-11-16 15:03                 ` Alex Schroeder
2003-11-17 21:17               ` Stefan Monnier
2003-11-18  7:33                 ` Kenichi Handa
2003-11-18 17:12                   ` Stefan Monnier
2003-11-19  0:06                     ` Kenichi Handa
2003-11-19  3:05                       ` Stefan Monnier [this message]
2003-11-19 10:46                         ` Juri Linkov
2003-11-19 13:48                           ` Stefan Monnier
2003-11-20 23:41                           ` Kenichi Handa
2003-11-21  0:41                         ` Kenichi Handa
2003-11-21  5:27                           ` Stefan Monnier
2003-11-21  6:27                             ` Kenichi Handa
2003-11-21 14:59                               ` Stefan Monnier
2003-11-22  1:25                                 ` Kenichi Handa
2003-11-22 23:53                                   ` Stefan Monnier
2003-11-23  7:30                                     ` Kenichi Handa
2003-11-23 23:48                                       ` Stefan Monnier
2003-11-25  1:07                                         ` Kenichi Handa
     [not found]                                           ` <jwvfzgcsbuv.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>
2003-11-26  0:07                                             ` Kenichi Handa
2003-11-26 14:14                                               ` Stefan Monnier
2003-11-27  1:34                                                 ` Kenichi Handa
2003-11-27 14:23                                                   ` Stefan Monnier
2003-12-01  0:43                                                     ` Kenichi Handa
2003-12-01 16:15                                                       ` Stefan Monnier
2003-12-02 13:07                                                         ` Kenichi Handa
2003-12-02 16:06                                                           ` Stefan Monnier
2003-11-25  4:28                                         ` Richard Stallman
     [not found]                                     ` <jwv7k1gtswz.fsf-monnier+emacs/devel@vor.iro.umontreal.ca>
2003-12-09 21:49                                       ` Richard Stallman
2003-11-15 22:32       ` BIG5-HKSCS? Simon Josefsson
2003-11-17  1:12         ` BIG5-HKSCS? Kenichi Handa
2003-11-17  2:06           ` BIG5-HKSCS? Simon Josefsson
2003-11-17  5:45             ` BIG5-HKSCS? Eli Zaretskii
2003-11-17  7:43               ` BIG5-HKSCS? Simon Josefsson
2003-11-18  7:01                 ` BIG5-HKSCS? Richard Stallman
2003-11-18  8:56                   ` BIG5-HKSCS? Simon Josefsson
2003-11-19  5:15                     ` BIG5-HKSCS? Richard Stallman
2003-11-20  5:48                       ` BIG5-HKSCS? Simon Josefsson
2003-11-20  5:56                         ` BIG5-HKSCS? Eli Zaretskii
2003-11-20  6:20                           ` BIG5-HKSCS? Simon Josefsson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jwvptfp139w.fsf-monnier+emacs/devel@vor.iro.umontreal.ca \
    --to=monnier@iro.umontreal.ca \
    --cc=emacs-devel@gnu.org \
    --cc=jas@extundo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.