From: Eli Zaretskii <eliz@gnu.org>
To: Michal Nazarewicz <mina86@mina86.com>
Cc: 24425@debbugs.gnu.org
Subject: bug#24425: [PATCH] Don’t cast Unicode to 8-bit when casing unibyte strings
Date: Thu, 15 Sep 2016 21:55:20 +0300
Message-ID: <83twdh56xz.fsf@gnu.org>
In-Reply-To: <xa1toa3p44xx.fsf@mina86.com> (message from Michal Nazarewicz on Thu, 15 Sep 2016 16:23:54 +0200)

> From: Michal Nazarewicz <mina86@mina86.com>
> Cc: 24425@debbugs.gnu.org
> Date: Thu, 15 Sep 2016 16:23:54 +0200
> 
> On Tue, Sep 13 2016, Eli Zaretskii wrote:
> > Currently, case changes in unibyte characters and strings are only
> > well defined for pure ASCII text; if the input or the result is not
> > pure ASCII, we produce "undefined behavior".
> 
> Would the following (not tested) make sense then:

AFAIU, it would disallow handling unibyte text by setting up case
tables for 8-bit characters in their multibyte representation,
i.e. above #x3FFF00.  I'd rather not lose that, although I don't think
I've ever seen that used.
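
For readers unfamiliar with the representation mentioned above, here is a
minimal Python sketch of the arithmetic. It assumes, to the best of my
understanding of Emacs's src/character.h, that a raw byte in the range
0x80..0xFF is stored as the character code byte + 0x3FFF00, giving the
range #x3FFF80..#x3FFFFF. The function names are hypothetical and only
mirror the idea; they are not Emacs APIs.

```python
# Sketch only: models how raw 8-bit bytes map to character codes.
# The constants are assumed to match Emacs's internal representation;
# the function names are illustrative, not real Emacs functions.

BYTE8_OFFSET = 0x3FFF00   # added to a raw byte to get its character code
FIRST_BYTE8_CHAR = 0x3FFF80
MAX_CHAR = 0x3FFFFF


def byte8_to_char(byte: int) -> int:
    """Return the character code that stands for the raw byte BYTE."""
    if not 0x80 <= byte <= 0xFF:
        raise ValueError("only bytes 0x80..0xFF have a raw-byte character")
    return byte + BYTE8_OFFSET


def char_is_byte8(char: int) -> bool:
    """Return True if CHAR is in the raw-byte range."""
    return FIRST_BYTE8_CHAR <= char <= MAX_CHAR


def char_to_byte8(char: int) -> int:
    """Return the raw byte that the raw-byte character CHAR stands for."""
    if not char_is_byte8(char):
        raise ValueError("not a raw-byte character")
    return char - BYTE8_OFFSET


if __name__ == "__main__":
    print(hex(byte8_to_char(0xFF)))
    print(char_is_byte8(ord("a")))
```

A case table that wanted to upcase or downcase such bytes would have to
key its entries on these high character codes rather than on 0x80..0xFF.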

> > Properly means that upcasing "istanbul" in the above example will
> > produce "İSTANBUL", not "iSTANBUL", and downcasing "IRMA" will produce
> > "ırma".
> 
> I thought about that but then another corner case is "istanbul\xff"
> which is a unibyte string with 8-bit bytes.

And what is the problem in that case?
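
To make the İSTANBUL/ırma example quoted above concrete, here is a small
Python sketch of Turkish-style casing. It is not how Emacs implements
case tables; it only overrides the two dotted/dotless "i" pairs before
falling back to Python's default case conversion. The function and table
names are hypothetical.

```python
# Sketch only: Turkish-style casing on top of Python's default rules.
# U+0130 is LATIN CAPITAL LETTER I WITH DOT ABOVE,
# U+0131 is LATIN SMALL LETTER DOTLESS I.

TURKISH_UPCASE_FIXUPS = {ord("i"): "\u0130", ord("\u0131"): "I"}
TURKISH_DOWNCASE_FIXUPS = {ord("I"): "\u0131", ord("\u0130"): "i"}


def turkish_upcase(text: str) -> str:
    """Upcase TEXT, mapping i to dotted capital I and dotless i to I."""
    return text.translate(TURKISH_UPCASE_FIXUPS).upper()


def turkish_downcase(text: str) -> str:
    """Downcase TEXT, mapping I to dotless i and dotted capital I to i."""
    return text.translate(TURKISH_DOWNCASE_FIXUPS).lower()


if __name__ == "__main__":
    print(turkish_upcase("istanbul"))
    print(turkish_downcase("IRMA"))
```

Both results contain non-ASCII characters, so they cannot be stored in a
unibyte string without a conversion to multibyte text.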

> I have no strong feelings either way so I’m happy just leaving it as is
> as well.

That is fine with me.

Was there some real-life use case where you bumped into this?  If so,
maybe we should discuss that use case; perhaps the solution, if we
need one, is something other than what we have talked about until now.






Thread overview: 5+ messages
2016-09-12 22:46 bug#24425: [PATCH] Don’t cast Unicode to 8-bit when casing unibyte strings Michal Nazarewicz
2016-09-13 14:33 ` Eli Zaretskii
2016-09-15 14:23   ` Michal Nazarewicz
2016-09-15 18:55     ` Eli Zaretskii [this message]
2016-09-16 17:41       ` Michal Nazarewicz
