unofficial mirror of bug-gnu-emacs@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: johnw@gnu.org, 24206@debbugs.gnu.org
Subject: bug#24206: 25.1; Curly quotes generate invalid strings, leading to a segfault
Date: Thu, 18 Aug 2016 17:30:44 +0300
Message-ID: <83lgzudu97.fsf@gnu.org>
In-Reply-To: <b09bc143-5ae2-d5bc-c7a3-82aa5eca0628@cs.ucla.edu> (message from Paul Eggert on Wed, 17 Aug 2016 13:52:42 -0700)

> Cc: johnw@gnu.org, 24206@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Wed, 17 Aug 2016 13:52:42 -0700
> 
>         Some change in this area was needed because the 'multibyte' flag went away.
> 
>     Only because you removed it.  You could have left it alone, it would
>     have worked
> 
> Sure, but it was no longer necessary, as the code no longer needs to record whether the original string was multibyte. Keeping an unnecessary variable around would make the code harder to read.

The code that got removed was the easy and intuitive part: it dealt
with processing single-byte strings one byte at a time.  The
hard-to-read part of the code is still with us.  We have fewer 'if'
conditionals, but they were hardly the main complication in the
original code.

>     even after the call to Fstring_make_multibyte, for the
>     reasons I explained earlier: the result is not necessarily a multibyte
>     string.
> 
> That doesn't affect the fact that the 'multibyte' variable is no longer necessary. In emacs-25, 'multibyte' does not mean that the result is a multibyte string

You are missing my point: the code on master now processes a string
that could be either unibyte or multibyte, using only multibyte
methods.  With the flag in place, each kind of string would have used
the method that's natural for it.  The way things are now, one has to
think hard about what the code does to convince oneself it's valid.

>     I don't see why it is tricky, we do that in Emacs in other places.
>
> Really? A call to STRING_CHAR_AND_LENGTH followed by a length test followed by a call to memcpy for length > 1 and a special case inline copy for length == 1? When copying multibyte data? Where else does Emacs do that?

What exactly confuses you in that snippet?  The call to
STRING_CHAR_AND_LENGTH itself?  We have that in umpteen other places.
The single-byte optimization of not calling memcpy?  That's standard
practice in C.  If you need an example of using
STRING_CHAR_AND_LENGTH while copying text, you can find one in
copy_text.  I really don't understand what your problem with that
code is.
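
For readers without the source at hand, the pattern under discussion
can be sketched roughly as follows.  This is only an illustration, not
the code from doc.c or copy_text: char_len is a hypothetical stand-in
for the length half of STRING_CHAR_AND_LENGTH, it assumes well-formed
UTF-8-style input, and copy_chars is likewise an invented name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the length half of STRING_CHAR_AND_LENGTH:
   return the byte length of the UTF-8 sequence whose lead byte is *P.
   Assumes well-formed input; this is not the Emacs macro.  */
static size_t
char_len (const unsigned char *p)
{
  if (*p < 0x80)
    return 1;
  if ((*p & 0xE0) == 0xC0)
    return 2;
  if ((*p & 0xF0) == 0xE0)
    return 3;
  return 4;
}

/* Copy NBYTES bytes of multibyte text from SRC to DST one character
   at a time, skipping the memcpy call for single-byte characters.
   Return the number of characters copied.  */
static size_t
copy_chars (unsigned char *dst, const unsigned char *src, size_t nbytes)
{
  const unsigned char *end = src + nbytes;
  size_t nchars = 0;

  while (src < end)
    {
      size_t len = char_len (src);

      /* Never read past the end if the last sequence is truncated.  */
      if (len > (size_t) (end - src))
        len = (size_t) (end - src);

      if (len == 1)
        *dst = *src;            /* single-byte case: plain assignment */
      else
        memcpy (dst, src, len); /* multibyte case: copy the sequence */

      dst += len;
      src += len;
      nchars++;
    }
  return nchars;
}
```

Calling copy_chars on the 5 bytes of "a‘b" (with ‘ encoded as the
three bytes E2 80 98) copies all 5 bytes and counts 3 characters.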

>     it's more clear for you
> 
> Replacing 14 unusually and unnecessarily tricky lines with zero lines should help clarify things for most readers.

They are not unusually tricky at all.  And you replaced them with a
fall-through, which is harder to follow and makes it easier to
introduce subtle bugs.

>     I could simply revert your commit, it would have saved us both quite
>     some time.  Would you prefer that?
> 
> It'd be even simpler to leave things alone, as the master code works better than emacs-25 does.

Sorry, leaving alone changes that I find questionable or gratuitous is
not in the job description.

>         Alan wanted something that he could put into his .emacs that would cause
>         (message PERCENTLESS) to output the string PERCENTLESS as-is, assuming
>         PERCENTLESS lacks %. This was the point of his original bug report; his original
>         example involved ` and ' but he wanted the same behavior for ‘ and ’, a point
>         that became clear during the discussion of Bug#23425.
> 
>     Then why not for '..' as well?  How is that different from ‘..’?
> 
> It's not different. Alan wanted the same behavior for '..', and he got that too.

But the behavior is not the same:

  (let ((text-quoting-style 'curve))
    (substitute-command-keys "'foo'"))
      => ’foo’

but

  (let ((text-quoting-style 'grave))
    (substitute-command-keys "‘foo’"))
      => ‘foo’

I would have expected the first example to yield 'foo', i.e. leave the
apostrophes alone, as we do with curved quotes in the second example.
What we have now is inconsistent, and its rationale evades me.





