unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#16048: 24.3.50; String compare surprise
@ 2013-12-04 11:44 michael.albinus
  2013-12-04 13:07 ` Andreas Schwab
  0 siblings, 1 reply; 14+ messages in thread
From: michael.albinus @ 2013-12-04 11:44 UTC (permalink / raw)
  To: 16048


The following form evals to nil:

  (string-equal "\377" "ÿ")

The character code of "ÿ" is

Char: ÿ (255, #o377, #xff, file ...) point=244 of 5726 (4%) column=23




In GNU Emacs 24.3.50.10 (i686-pc-linux-gnu, GTK+ Version 2.24.10)
 of 2013-12-03 on uw001237
Bzr revision: 115361 rudalics@gmx.at-20131203074554-p6glzuiqh5zp4k97
Windowing system distributor `The X.Org Foundation', version 11.0.11103000
System Description:	Ubuntu 12.04.3 LTS

Important settings:
  value of $LC_MONETARY: en_US.UTF-8
  value of $LC_NUMERIC: en_US.UTF-8
  value of $LC_TIME: en_US.UTF-8
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: Group

Minor modes in effect:
  gnus-undo-mode: t
  erc-notify-mode: t
  erc-list-mode: t
  erc-menu-mode: t
  erc-autojoin-mode: t
  erc-ring-mode: t
  erc-networks-mode: t
  erc-pcomplete-mode: t
  erc-track-mode: t
  erc-match-mode: t
  erc-button-mode: t
  erc-fill-mode: t
  erc-stamp-mode: t
  erc-netsplit-mode: t
  erc-irccontrols-mode: t
  erc-noncommands-mode: t
  erc-move-to-prompt-mode: t
  erc-readonly-mode: t
  display-time-mode: t
  shell-dirtrack-mode: t
  iswitchb-mode: t
  icomplete-mode: t
  show-paren-mode: t
  tooltip-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
<f2> C-g y <escape> x r e p o r t <tab> <return>

Recent messages:
Opening TLS connection to `imap.gmx.net'...done
Opening connection to imap.gmx.net...done
Reading active file via nnml...
Reading incoming mail from file...
nnml: Reading incoming mail (no new mail)...done
Reading active file via nnml...done
Reading active file via nndraft...done
nnimap read 0k from imap.gmx.net
Checking new news...done
Warning: Quit trying to open server nnimap+email.tieto.com

Load-path shadows:
/home/albinmic/src/elpa/packages/debbugs/debbugs hides /home/albinmic/.emacs.d/elpa/debbugs-0.5/debbugs
/home/albinmic/src/elpa/packages/debbugs/debbugs-gnu hides /home/albinmic/.emacs.d/elpa/debbugs-0.5/debbugs-gnu
/home/albinmic/src/elpa/packages/debbugs/debbugs-org hides /home/albinmic/.emacs.d/elpa/debbugs-0.5/debbugs-org
/home/albinmic/src/elpa/packages/debbugs/debbugs-pkg hides /home/albinmic/.emacs.d/elpa/debbugs-0.5/debbugs-pkg
/home/albinmic/src/elpa/packages/debbugs/debbugs-autoloads hides /home/albinmic/.emacs.d/elpa/debbugs-0.5/debbugs-autoloads
~/src/tramp/lisp/tramp-cache hides /home/albinmic/src/emacs/lisp/net/tramp-cache
~/src/tramp/lisp/tramp-cmds hides /home/albinmic/src/emacs/lisp/net/tramp-cmds
~/src/tramp/lisp/tramp-adb hides /home/albinmic/src/emacs/lisp/net/tramp-adb
~/src/tramp/lisp/trampver hides /home/albinmic/src/emacs/lisp/net/trampver
~/src/tramp/lisp/tramp-smb hides /home/albinmic/src/emacs/lisp/net/tramp-smb
~/src/tramp/lisp/tramp hides /home/albinmic/src/emacs/lisp/net/tramp
~/src/tramp/lisp/tramp-ftp hides /home/albinmic/src/emacs/lisp/net/tramp-ftp
~/src/tramp/lisp/tramp-gw hides /home/albinmic/src/emacs/lisp/net/tramp-gw
~/src/tramp/lisp/tramp-gvfs hides /home/albinmic/src/emacs/lisp/net/tramp-gvfs
~/src/tramp/lisp/tramp-uu hides /home/albinmic/src/emacs/lisp/net/tramp-uu
~/src/tramp/lisp/tramp-sh hides /home/albinmic/src/emacs/lisp/net/tramp-sh
~/src/tramp/lisp/tramp-compat hides /home/albinmic/src/emacs/lisp/net/tramp-compat
~/src/tramp/lisp/tramp-loaddefs hides /home/albinmic/src/emacs/lisp/net/tramp-loaddefs

Features:
(shadow sort mail-extr warnings emacsbug utf-7 nndraft nnmh nnml
gnus-agent gnus-srvr gnus-score score-mode nnvirtual gnus-msg gnus-art
mm-uu mml2015 epg-config mm-view mml-smime smime dig mailcap gnus-cache
gnus-sum network-stream starttls nnimap parse-time tls utf7 netrc
smtpmail sendmail gnus-demon nntp gnus-group gnus-undo nnmail
mail-source nnoo gnus-start gnus-spec gnus-int gnus-range message rfc822
mml mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047
rfc2045 ietf-drums mailabbrev gmm-utils mailheader gnus-win gnus
gnus-ems nnheader mail-utils erc-notify erc-list erc-menu erc-join
erc-ring erc-networks erc-pcomplete erc-track erc-match erc-button
wid-edit erc-fill erc-stamp erc-netsplit erc-goodies erc erc-backend
erc-compat thingatpt pp cperl-mode info easymenu package time tramp
tramp-compat auth-source eieio byte-opt bytecomp byte-compile cconv
eieio-core gnus-util mm-util mail-prsvr password-cache tramp-loaddefs
cl-macs gv trampver shell pcomplete comint ansi-color ring format-spec
advice help-fns cl cl-loaddefs cl-lib iswitchb jka-compr icomplete paren
ps-print ps-def lpr vc vc-dispatcher dired time-date tooltip electric
uniquify ediff-hook vc-hooks lisp-float-type mwheel x-win x-dnd tool-bar
dnd fontset image regexp-opt fringe tabulated-list newcomment lisp-mode
prog-mode register page menu-bar rfn-eshadow timer select scroll-bar
mouse jit-lock font-lock syntax facemenu font-core frame cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev
minibuffer nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote make-network-process
dbusbind gfilenotify dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty emacs)






* bug#16048: 24.3.50; String compare surprise
  2013-12-04 11:44 bug#16048: 24.3.50; String compare surprise michael.albinus
@ 2013-12-04 13:07 ` Andreas Schwab
  2013-12-04 14:00   ` Josh
  2013-12-04 14:05   ` Michael Albinus
  0 siblings, 2 replies; 14+ messages in thread
From: Andreas Schwab @ 2013-12-04 13:07 UTC (permalink / raw)
  To: michael.albinus; +Cc: 16048

michael.albinus@gmx.de writes:

> The following form evals to nil:
>
>   (string-equal "\377" "ÿ")

"\377" is a unibyte string.  When converted to multibyte it yields
"\x3fffff".
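
The conversion described above can be checked with a small sketch,
assuming a reasonably recent Emacs. `my-string-kind' is a hypothetical
helper, not part of Emacs, and "\u00ff" is used in place of a literal
"ÿ" so that the result does not depend on how the file is decoded.

```elisp
;; Hypothetical helper (not part of Emacs): report how a string is stored.
(defun my-string-kind (s)
  "Return the symbol `multibyte' or `unibyte' depending on how S is stored."
  (if (multibyte-string-p s) 'multibyte 'unibyte))

(my-string-kind "\377")      ; unibyte: a single raw byte with value 255
(my-string-kind "\u00ff")    ; multibyte: the character U+00FF

;; Converting the unibyte string to multibyte maps the raw byte 255
;; to the eight-bit character #x3fffff, not to U+00FF.
(aref (string-to-multibyte "\377") 0)   ; #x3fffff
(aref "\u00ff" 0)                       ; #xff

;; Hence the comparison from the report fails.
(string-equal "\377" "\u00ff")          ; nil
```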

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."






* bug#16048: 24.3.50; String compare surprise
  2013-12-04 13:07 ` Andreas Schwab
@ 2013-12-04 14:00   ` Josh
  2013-12-04 17:29     ` Eli Zaretskii
  2013-12-04 14:05   ` Michael Albinus
  1 sibling, 1 reply; 14+ messages in thread
From: Josh @ 2013-12-04 14:00 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Michael Albinus, 16048


On Wed, Dec 4, 2013 at 5:07 AM, Andreas Schwab <schwab@linux-m68k.org> wrote:

> michael.albinus@gmx.de writes:
>
> > The following form evals to nil:
> >
> >   (string-equal "\377" "ÿ")
>
> "\377" is a unibyte string.  When converted to multibyte it yields
> "\x3fffff".


At least as of 24.3, the manual[0] suggests that such a conversion
should not occur in this case:

    You can also use hexadecimal escape sequences (`\xN') and octal
    escape sequences (`\N') in string constants.  *But beware:* If a
    string constant contains hexadecimal or octal escape sequences,
    and these escape sequences all specify unibyte characters (i.e.,
    less than 256), and there are no other literal non-ASCII
    characters or Unicode-style escape sequences in the string, then
    Emacs automatically assumes that it is a unibyte string.  That is
    to say, it assumes that all non-ASCII characters occurring in the
    string are 8-bit raw bytes.

[0] (info "(elisp) Non-ASCII in Strings")

Josh



* bug#16048: 24.3.50; String compare surprise
  2013-12-04 13:07 ` Andreas Schwab
  2013-12-04 14:00   ` Josh
@ 2013-12-04 14:05   ` Michael Albinus
  2013-12-04 17:34     ` Eli Zaretskii
  1 sibling, 1 reply; 14+ messages in thread
From: Michael Albinus @ 2013-12-04 14:05 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: 16048-done

Andreas Schwab <schwab@linux-m68k.org> writes:

> michael.albinus@gmx.de writes:
>
>> The following form evals to nil:
>>
>>   (string-equal "\377" "ÿ")
>
> "\377" is a unibyte string.  When converted to multibyte it yields
> "\x3fffff".

Ah, well. In `dbus-unescape-from-identifier', there is

     (lambda (x) (format "%c" (string-to-number (substring x 1) 16)))

If I replace it by

     (lambda (x) (byte-to-string (string-to-number (substring x 1) 16)))

everything works fine.
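
A minimal sketch of why the two lambdas behave differently. The names
`my-unescape-char' and `my-unescape-byte' are hypothetical stand-ins,
and the "_ff" argument is an assumption about what the lambda receives
(an escape marker followed by two hex digits).

```elisp
;; Hypothetical stand-in for the original lambda: `format' with %c
;; produces a character, so a code above 127 yields a multibyte string.
(defun my-unescape-char (x)
  "Decode X, e.g. \"_ff\", into a one-character string via `format'."
  (format "%c" (string-to-number (substring x 1) 16)))

;; Hypothetical stand-in for the replacement: `byte-to-string'
;; produces a unibyte string holding the raw byte.
(defun my-unescape-byte (x)
  "Decode X, e.g. \"_ff\", into a one-byte unibyte string."
  (byte-to-string (string-to-number (substring x 1) 16)))

(multibyte-string-p (my-unescape-char "_ff"))   ; t
(multibyte-string-p (my-unescape-byte "_ff"))   ; nil

;; Only the unibyte result compares equal to the literal "\377".
(string-equal (my-unescape-byte "_ff") "\377")  ; t
(string-equal (my-unescape-char "_ff") "\377")  ; nil
```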

> Andreas.

Thanks, and best regards, Michael (writing dbus-tests.el).






* bug#16048: 24.3.50; String compare surprise
  2013-12-04 14:00   ` Josh
@ 2013-12-04 17:29     ` Eli Zaretskii
  2013-12-04 20:13       ` Josh
  0 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2013-12-04 17:29 UTC (permalink / raw)
  To: Josh; +Cc: michael.albinus, schwab, 16048

> From: Josh <josh@foxtail.org>
> Date: Wed, 4 Dec 2013 06:00:46 -0800
> Cc: Michael Albinus <michael.albinus@gmx.de>, 16048@debbugs.gnu.org
> 
> On Wed, Dec 4, 2013 at 5:07 AM, Andreas Schwab <schwab@linux-m68k.org> wrote:
> 
> > michael.albinus@gmx.de writes:
> >
> > > The following form evals to nil:
> > >
> > >   (string-equal "\377" "ÿ")
> >
> > "\377" is a unibyte string.  When converted to multibyte it yields
> > "\x3fffff".
> 
> 
> At least as of 24.3, the manual[0] suggests that such a conversion
> should not occur in this case:

And it doesn't occur, indeed:

  (multibyte-string-p "\377")

    => nil

>     You can also use hexadecimal escape sequences (`\xN') and octal
>     escape sequences (`\N') in string constants.  *But beware:* If a
>     string constant contains hexadecimal or octal escape sequences,
>     and these escape sequences all specify unibyte characters (i.e.,
>     less than 256), and there are no other literal non-ASCII
>     characters or Unicode-style escape sequences in the string, then
>     Emacs automatically assumes that it is a unibyte string.  That is
>     to say, it assumes that all non-ASCII characters occurring in the
>     string are 8-bit raw bytes.
> 
> [0] (info "(elisp) Non-ASCII in Strings")

Best citation contest? you're on!

   -- Function: string= string1 string2
       This function returns `t' if the characters of the two strings
       match exactly.  Symbols are also allowed as arguments, in which
       case the symbol names are used.  Case is always significant,
       regardless of `case-fold-search'.

   [...]

       For technical reasons, a unibyte and a multibyte string are
       `equal' if and only if they contain the same sequence of character
       codes and all these codes are either in the range 0 through 127
       (ASCII) or 160 through 255 (`eight-bit-graphic').  However, when a
       unibyte string is converted to a multibyte string, all characters
       with codes in the range 160 through 255 are converted to
       characters with higher codes, whereas ASCII characters remain
       unchanged.  Thus, a unibyte string and its conversion to multibyte
       are only `equal' if the string is all ASCII.

Note the last sentence.
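
The last sentence of that citation can be tried out with a short
sketch. `my-survives-conversion-p' is a hypothetical name introduced
only for illustration, and the results assume a reasonably recent
Emacs.

```elisp
;; Hypothetical helper: does a unibyte string stay `equal' to itself
;; after being converted to multibyte?
(defun my-survives-conversion-p (s)
  "Return non-nil if S is `equal' to its multibyte conversion."
  (equal s (string-to-multibyte s)))

(my-survives-conversion-p "abc")      ; t: the string is all ASCII
(my-survives-conversion-p "ab\377")   ; nil: byte 255 becomes #x3fffff
```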






* bug#16048: 24.3.50; String compare surprise
  2013-12-04 14:05   ` Michael Albinus
@ 2013-12-04 17:34     ` Eli Zaretskii
  2013-12-04 19:12       ` Stefan Monnier
  0 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2013-12-04 17:34 UTC (permalink / raw)
  To: Michael Albinus; +Cc: 16048

> From: Michael Albinus <michael.albinus@gmx.de>
> Date: Wed, 04 Dec 2013 15:05:00 +0100
> Cc: 16048-done@debbugs.gnu.org
> 
> Ah, well. In `dbus-unescape-from-identifier', there is
> 
>      (lambda (x) (format "%c" (string-to-number (substring x 1) 16)))
> 
> If I replace it by
> 
>      (lambda (x) (byte-to-string (string-to-number (substring x 1) 16)))
> 
> everything works fine.

Beware: byte-to-string returns a unibyte string.  You do NOT want
unibyte strings in your application code.  The problem you had that
started this thread is a very good demonstration why.

So I would leave dbus-unescape-from-identifier intact, and instead fix
the other side of the string comparison, the one that yields "\377".






* bug#16048: 24.3.50; String compare surprise
  2013-12-04 17:34     ` Eli Zaretskii
@ 2013-12-04 19:12       ` Stefan Monnier
  2013-12-05  7:51         ` Michael Albinus
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2013-12-04 19:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Michael Albinus, 16048

> Beware: byte-to-string returns a unibyte string.  You do NOT want
> unibyte strings in your application code.

IIUC this is dbus code, so it likely handles marshalled data, which
often has to manage bytes rather than chars, so a unibyte string might
be the right thing.


        Stefan "who doesn't actually know what he's talking about"






* bug#16048: 24.3.50; String compare surprise
  2013-12-04 17:29     ` Eli Zaretskii
@ 2013-12-04 20:13       ` Josh
  0 siblings, 0 replies; 14+ messages in thread
From: Josh @ 2013-12-04 20:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Michael Albinus, Andreas Schwab, 16048


On Wed, Dec 4, 2013 at 9:29 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> > From: Josh <josh@foxtail.org>
> > Date: Wed, 4 Dec 2013 06:00:46 -0800
> > Cc: Michael Albinus <michael.albinus@gmx.de>, 16048@debbugs.gnu.org
> > On Wed, Dec 4, 2013 at 5:07 AM, Andreas Schwab <schwab@linux-m68k.org> wrote:
> > > michael.albinus@gmx.de writes:
> > >
> > > > The following form evals to nil:
> > > >
> > > >   (string-equal "\377" "ÿ")
> > >
> > > "\377" is a unibyte string.  When converted to multibyte it yields
> > > "\x3fffff".
> >
> >
> > At least as of 24.3, the manual[0] suggests that such a conversion
> > should not occur in this case:
> And it doesn't occur, indeed:
>
>   (multibyte-string-p "\377")
>
>     => nil
>
> >     You can also use hexadecimal escape sequences (`\xN') and octal
> >     escape sequences (`\N') in string constants.  *But beware:* If a
> >     string constant contains hexadecimal or octal escape sequences,
> >     and these escape sequences all specify unibyte characters (i.e.,
> >     less than 256), and there are no other literal non-ASCII
> >     characters or Unicode-style escape sequences in the string, then
> >     Emacs automatically assumes that it is a unibyte string.  That is
> >     to say, it assumes that all non-ASCII characters occurring in the
> >     string are 8-bit raw bytes.
> >
> > [0] (info "(elisp) Non-ASCII in Strings")
> Best citation contest? you're on!

No, thanks.  I haven't entered such contests in many years.

>    -- Function: string= string1 string2
>        This function returns `t' if the characters of the two strings
>        match exactly.  Symbols are also allowed as arguments, in which
>        case the symbol names are used.  Case is always significant,
>        regardless of `case-fold-search'.
>
>    [...]
>
>        For technical reasons, a unibyte and a multibyte string are
>        `equal' if and only if they contain the same sequence of character
>        codes and all these codes are either in the range 0 through 127
>        (ASCII) or 160 through 255 (`eight-bit-graphic').  However, when a
>        unibyte string is converted to a multibyte string, all characters
>        with codes in the range 160 through 255 are converted to
>        characters with higher codes, whereas ASCII characters remain
>        unchanged.  Thus, a unibyte string and its conversion to multibyte
>        are only `equal' if the string is all ASCII.
>
> Note the last sentence.

Yes, I must have misunderstood Andreas' meaning; I believed he was
suggesting that the two strings compared differently due to "\377"
having been converted to a multibyte string and therefore miscomparing
with the unibyte (or so I thought) string "ÿ".  I see now that I had
it exactly backwards.  Thanks for setting me straight.



* bug#16048: 24.3.50; String compare surprise
  2013-12-04 19:12       ` Stefan Monnier
@ 2013-12-05  7:51         ` Michael Albinus
  2013-12-05 17:38           ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Michael Albinus @ 2013-12-05  7:51 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 16048

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Beware: byte-to-string returns a unibyte string.  You do NOT want
>> unibyte strings in your application code.
>
> IIUC this is dbus code, so it likely handles marshalled data, which
> often has to manage bytes rather than chars, so a unibyte string might
> be the right thing.

Indeed. My ert test case is

  (should
   (string-equal
    (dbus-unescape-from-identifier
     (dbus-escape-as-identifier "0123abc_xyz\x01\xff"))
    "0123abc_xyz\x01\xff"))

`dbus-unescape-from-identifier' cannot know whether the original string
was unibyte or multibyte. So it must decide on one, and unibyte seems
to be the better choice.

I will add to the docstring of `dbus-unescape-from-identifier' that it
always returns a unibyte string.

>         Stefan "who doesn't actually know what he's talking about"

Best regards, Michael.






* bug#16048: 24.3.50; String compare surprise
  2013-12-05  7:51         ` Michael Albinus
@ 2013-12-05 17:38           ` Eli Zaretskii
  2013-12-05 19:11             ` Stefan Monnier
  0 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2013-12-05 17:38 UTC (permalink / raw)
  To: Michael Albinus; +Cc: 16048

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: Eli Zaretskii <eliz@gnu.org>,  16048@debbugs.gnu.org
> Date: Thu, 05 Dec 2013 08:51:41 +0100
> 
> > IIUC this is dbus code, so it likely handles marshalled data, which
> > often has to manage bytes rather than chars, so a unibyte string might
> > be the right thing.
> 
> Indeed. My ert test case is
> 
>   (should
>    (string-equal
>     (dbus-unescape-from-identifier
>      (dbus-escape-as-identifier "0123abc_xyz\x01\xff"))
>     "0123abc_xyz\x01\xff"))

FWIW, I don't see anything in this snippet that requires unibyte
strings.

Just to make it clear: Emacs is perfectly capable of holding raw bytes
in multibyte strings.  That's why we have the eight-bit charset.






* bug#16048: 24.3.50; String compare surprise
  2013-12-05 17:38           ` Eli Zaretskii
@ 2013-12-05 19:11             ` Stefan Monnier
  2013-12-05 19:18               ` Eli Zaretskii
  2013-12-05 19:22               ` Michael Albinus
  0 siblings, 2 replies; 14+ messages in thread
From: Stefan Monnier @ 2013-12-05 19:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Michael Albinus, 16048

> Just to make it clear: Emacs is perfectly capable of holding raw bytes
> in multibyte strings.  That's why we have the eight-bit charset.

When manipulating sequences of bytes (as opposed to sequences of chars),
I find it is preferable to use unibyte strings.

Indeed, multibyte strings can work as well, but they can be trickier
to work with, since `aref' returns an "eight-bit byte" character rather
than a value between 128 and 255.

Of course, if your string can contain a mix of bytes and chars, you
don't have a choice.
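
A rough sketch of the `aref' difference described above. `my-byte-at'
is a hypothetical helper, and the #x3fff80 threshold is the start of
the range Emacs uses for eight-bit characters (#x3fff80 through
#x3fffff).

```elisp
;; Hypothetical helper: read element IDX of S, mapping an eight-bit
;; character in a multibyte string back to the byte it stands for.
(defun my-byte-at (s idx)
  "Return the byte at IDX in S if it is a raw byte, else the character code."
  (let ((c (aref s idx)))
    (if (and (multibyte-string-p s) (>= c #x3fff80))
        (multibyte-char-to-unibyte c)
      c)))

(aref "\377" 0)                                ; 255: unibyte, plain byte
(aref (string-to-multibyte "\377") 0)          ; #x3fffff: eight-bit char
(my-byte-at (string-to-multibyte "\377") 0)    ; 255 again
```

With a unibyte string the extra mapping step is unnecessary, which is
the convenience argued for above.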


        Stefan






* bug#16048: 24.3.50; String compare surprise
  2013-12-05 19:11             ` Stefan Monnier
@ 2013-12-05 19:18               ` Eli Zaretskii
  2013-12-05 19:24                 ` Michael Albinus
  2013-12-05 19:22               ` Michael Albinus
  1 sibling, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2013-12-05 19:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: michael.albinus, 16048

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Michael Albinus <michael.albinus@gmx.de>,  16048@debbugs.gnu.org
> Date: Thu, 05 Dec 2013 14:11:11 -0500
> 
> Of course, if your string can contain a mix of bytes and chars, you
> don't have a choice.

"0123abc_xyz\x01\xff" looks like just such a mix to me.






* bug#16048: 24.3.50; String compare surprise
  2013-12-05 19:11             ` Stefan Monnier
  2013-12-05 19:18               ` Eli Zaretskii
@ 2013-12-05 19:22               ` Michael Albinus
  1 sibling, 0 replies; 14+ messages in thread
From: Michael Albinus @ 2013-12-05 19:22 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 16048

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Just to make it clear: Emacs is perfectly capable of holding raw bytes
>> in multibyte strings.  That's why we have the eight-bit charset.
>
> When manipulating sequences of bytes (as opposed to sequences of chars),
> I find it is preferable to use unibyte strings.

We are speaking about functions of dbus.el, which convert a string into
something with a C-style identifier syntax, and back. Nothing I would
expect to be a multibyte string in real life. (Agreed, my example looks
strange, but this is for the hard test in dbus-tests.el)

> Of course, if your string can contain a mix of bytes and chars, you
> don't have a choice.

For the other function in dbus.el, which handles arrays of bytes (often
used to marshall whatever strings are on the wire) I've added earlier
today the possibility to encode them as unibyte or multibyte. As above,
the function `dbus-byte-array-to-string' cannot decide itself how to
interpret the bytestream, so the caller is requested to decide.

>         Stefan

Best regards, Michael.






* bug#16048: 24.3.50; String compare surprise
  2013-12-05 19:18               ` Eli Zaretskii
@ 2013-12-05 19:24                 ` Michael Albinus
  0 siblings, 0 replies; 14+ messages in thread
From: Michael Albinus @ 2013-12-05 19:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 16048

Eli Zaretskii <eliz@gnu.org> writes:

>> Of course, if your string can contain a mix of bytes and chars, you
>> don't have a choice.
>
> "0123abc_xyz\x01\xff" looks just such a mix to me.

It is a hard-core example, not something from the wild. In practice, I
rather expect ASCII strings to be handled.

Best regards, Michael.






Thread overview: 14+ messages
2013-12-04 11:44 bug#16048: 24.3.50; String compare surprise michael.albinus
2013-12-04 13:07 ` Andreas Schwab
2013-12-04 14:00   ` Josh
2013-12-04 17:29     ` Eli Zaretskii
2013-12-04 20:13       ` Josh
2013-12-04 14:05   ` Michael Albinus
2013-12-04 17:34     ` Eli Zaretskii
2013-12-04 19:12       ` Stefan Monnier
2013-12-05  7:51         ` Michael Albinus
2013-12-05 17:38           ` Eli Zaretskii
2013-12-05 19:11             ` Stefan Monnier
2013-12-05 19:18               ` Eli Zaretskii
2013-12-05 19:24                 ` Michael Albinus
2013-12-05 19:22               ` Michael Albinus
