unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#40702: 28.0.50; (what-cursor-position) barfs on non-ASCII char
@ 2020-04-18 21:27 Dima Kogan
  2020-04-18 21:53 ` Štěpán Němec
  0 siblings, 1 reply; 13+ messages in thread
From: Dima Kogan @ 2020-04-18 21:27 UTC (permalink / raw)
  To: 40702

Hi. I'm using a very recent build of emacs from git. I see this:

1. emacs -Q
   Fresh emacs. Opens in the *scratch* buffer

2. C-x 8 ' e
   i.e. insert some non-ASCII character. Opening any buffer with such
   characters works too

3. Left
   Move the point to this character

4. C-x =
   (what-cursor-position) to ask emacs to tell us about this character.
   I see this:

   cl--assertion-failed: Assertion failed: (not (multibyte-string-p str))

Thanks!






* bug#40702: 28.0.50; (what-cursor-position) barfs on non-ASCII char
  2020-04-18 21:27 bug#40702: 28.0.50; (what-cursor-position) barfs on non-ASCII char Dima Kogan
@ 2020-04-18 21:53 ` Štěpán Němec
  2020-04-18 22:22   ` Dima Kogan
  0 siblings, 1 reply; 13+ messages in thread
From: Štěpán Němec @ 2020-04-18 21:53 UTC (permalink / raw)
  To: Dima Kogan; +Cc: 40702

On Sat, 18 Apr 2020 14:27:39 -0700
Dima Kogan wrote:

> Hi. I'm using a very recent build of emacs from git. I see this:
>
> 1. emacs -Q
>    Fresh emacs. Opens in the *scratch* buffer
>
> 2. C-x 8 ' e
>    i.e. insert some non-ASCII character. Opening any buffer with such
>    characters works too
>
> 3. Left
>    Move the point to this character
>
> 4. C-x =
>    (what-cursor-position) to ask emacs to tell us about this character.
>    I see this:
>
>    cl--assertion-failed: Assertion failed: (not (multibyte-string-p str))

I can't reproduce this on current master (d890e5b73a Fix misnamed
variable breaking GNUstep)

GNU Emacs 28.0.50 (build 26, x86_64-pc-linux-gnu, GTK+ Version 3.24.17, cairo version 1.17.3)

-- 
Štěpán






* bug#40702: 28.0.50; (what-cursor-position) barfs on non-ASCII char
  2020-04-18 21:53 ` Štěpán Němec
@ 2020-04-18 22:22   ` Dima Kogan
  2020-04-19 13:02     ` Štěpán Němec
  2020-04-19 16:44     ` Stefan Monnier
  0 siblings, 2 replies; 13+ messages in thread
From: Dima Kogan @ 2020-04-18 22:22 UTC (permalink / raw)
  To: Štěpán Němec; +Cc: 40702

Štěpán Němec <stepnem@gmail.com> writes:

> I can't reproduce this on current master

Thanks for checking. It's very consistent on my end. I poked at it a
little bit just now.

I see that buffer-file-coding-system is nil

It ends up evaluating

  (encoded-string-description "é" nil)

which looks at the value of

  (multibyte-string-p "é")

[ The string above is supposed to be a single Unicode character; my
  email may mangle it; I don't know ]

On my install this evaluates to t, which is causing the error. Which of
these shouldn't be happening? For the record, it used to work for me.






* bug#40702: 28.0.50; (what-cursor-position) barfs on non-ASCII char
  2020-04-18 22:22   ` Dima Kogan
@ 2020-04-19 13:02     ` Štěpán Němec
  2020-04-19 15:22       ` Eli Zaretskii
  2020-04-19 16:44     ` Stefan Monnier
  1 sibling, 1 reply; 13+ messages in thread
From: Štěpán Němec @ 2020-04-19 13:02 UTC (permalink / raw)
  To: Dima Kogan; +Cc: 40702

On Sat, 18 Apr 2020 15:22:13 -0700
Dima Kogan wrote:

> Thanks for checking. It's very consistent on my end. I poked at it a
> little bit just now.
>
> I see that buffer-file-coding-system is nil
>
> It ends up evaluating
>
>   (encoded-string-description "é" nil)
>
> which looks at the value of
>
>   (multibyte-string-p "é")
>
> [ The string above is supposed to be a single Unicode character; my
>   email may mangle it; I don't know ]
>
> On my install this evaluates to t, which is causing the error. Which of
> these shouldn't be happening? For the record, it used to work for me.

I'm not sure I'll be able to help you given my lack of familiarity with
this and related code, but can you at least post the full backtrace?

Looking at `what-cursor-position', apparently due to your
`buffer-file-coding-system' being nil (which seems a bit strange to me:
is even your (default-value 'buffer-file-coding-system) nil?) the
multibyte string isn't properly encoded and instead passed directly to
`encoded-string-description', leading to the error.

That said, there haven't been any relevant recent changes to
`what-cursor-position'.

In any case, I think more info is needed: backtrace, system/environment.

-- 
Štěpán






* bug#40702: 28.0.50; (what-cursor-position) barfs on non-ASCII char
  2020-04-19 13:02     ` Štěpán Němec
@ 2020-04-19 15:22       ` Eli Zaretskii
  2020-04-19 16:18         ` Štěpán Němec
  0 siblings, 1 reply; 13+ messages in thread
From: Eli Zaretskii @ 2020-04-19 15:22 UTC (permalink / raw)
  To: Štěpán Němec; +Cc: dima, 40702

> From: Štěpán Němec <stepnem@gmail.com>
> Date: Sun, 19 Apr 2020 15:02:24 +0200
> Cc: 40702@debbugs.gnu.org
> 
> Looking at `what-cursor-position', apparently due to your
> `buffer-file-coding-system' being nil (which seems a bit strange to me:
> is even your (default-value 'buffer-file-coding-system) nil?)

buffer-file-coding-system being nil means 'no-conversion'.  You can
easily simulate that yourself, by an explicit setq, and you will then
get the error described in the report.

> the multibyte string isn't properly encoded and instead passed
> directly to `encoded-string-description', leading to the error.

Emacs 26.3 doesn't signal an error in this case, so I think this is a
regression we should fix.

> That said, there haven't been any relevant recent changes to
> `what-cursor-position'.
> 
> In any case, I think more info is needed: backtrace, system/environment.

Here's a backtrace:

  Debugger entered--Lisp error: (cl-assertion-failed ((not (multibyte-string-p str)) nil))
    cl--assertion-failed((not (multibyte-string-p str)))
    encoded-string-description(#("é" 0 1 (charset unicode)) nil)
    describe-char(146)
    what-cursor-position((4))
    funcall-interactively(what-cursor-position (4))
    call-interactively(what-cursor-position nil nil)
    command-execute(what-cursor-position)






* bug#40702: 28.0.50; (what-cursor-position) barfs on non-ASCII char
  2020-04-19 15:22       ` Eli Zaretskii
@ 2020-04-19 16:18         ` Štěpán Němec
  2020-04-19 16:50           ` Eli Zaretskii
  0 siblings, 1 reply; 13+ messages in thread
From: Štěpán Němec @ 2020-04-19 16:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dima, 40702, Stefan Monnier

On Sun, 19 Apr 2020 18:22:30 +0300
Eli Zaretskii wrote:

>> Looking at `what-cursor-position', apparently due to your
>> `buffer-file-coding-system' being nil (which seems a bit strange to me:
>> is even your (default-value 'buffer-file-coding-system) nil?)
>
> buffer-file-coding-system being nil means 'no-conversion'.  You can
> easily simulate that yourself, by an explicit setq, and you will then
> get the error described in the report.

Indeed, thanks, the meaning of `nil' is described in the doc string. I
was more surprised that it ever ends up being nil by default, but that's
probably because I have very little understanding of how the Emacs
coding setup works.

>> the multibyte string isn't properly encoded and instead passed
>> directly to `encoded-string-description', leading to the error.
>
> Emacs 26.3 doesn't signal an error in this case, so I think this is a
> regression we should fix.
>
>> That said, there haven't been any relevant recent changes to
>> `what-cursor-position'.
>> 
>> In any case, I think more info is needed: backtrace, system/environment.
>
> Here's a backtrace:
>
>   Debugger entered--Lisp error: (cl-assertion-failed ((not (multibyte-string-p str)) nil))
>     cl--assertion-failed((not (multibyte-string-p str)))
>     encoded-string-description(#("é" 0 1 (charset unicode)) nil)
>     describe-char(146)
>     what-cursor-position((4))
>     funcall-interactively(what-cursor-position (4))
>     call-interactively(what-cursor-position nil nil)
>     command-execute(what-cursor-position)

Thanks. I was looking at all the wrong places. The problem was simply
introduced by the addition of the assert in

2019-05-28T20:59:35-04:00!monnier@iro.umontreal.ca
146486f8a6 (* mule-cmds.el (encoded-string-description): Require unibyte string as input)
https://git.sv.gnu.org/cgit/emacs.git/commit/?id=146486f8a6

Removing the assertion reverts to the Emacs 26 behaviour.

Unfortunately there is no explanation regarding the change. Maybe Stefan
could provide some insight?

-- 
Štěpán






* bug#40702: 28.0.50; (what-cursor-position) barfs on non-ASCII char
  2020-04-18 22:22   ` Dima Kogan
  2020-04-19 13:02     ` Štěpán Němec
@ 2020-04-19 16:44     ` Stefan Monnier
  2020-04-20  4:16       ` Dima Kogan
  2020-09-30  3:45       ` Lars Ingebrigtsen
  1 sibling, 2 replies; 13+ messages in thread
From: Stefan Monnier @ 2020-04-19 16:44 UTC (permalink / raw)
  To: Dima Kogan; +Cc: Štěpán Němec, 40702

>> I can't reproduce this on current master
> Thanks for checking. It's very consistent on my end. I poked at it a
> little bit just now.
> I see that buffer-file-coding-system is nil

It would be worth looking into how/why you get a nil value here.

> It ends up evaluating
>   (encoded-string-description "é" nil)

This seems to point to a bug in `encode-coding-char`:

    M-: (encode-coding-char ?\é nil) RET

returns "é" which is not a unibyte string and hence is not a valid
encoded string.  Note that

    M-: (encode-coding-char ?\é 'no-conversion) RET

does not suffer from the same problem.  This comes from
`encode-coding-string` which also returns a multibyte string when its
coding arg is nil.
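
The contrast described above can be checked with a short sketch along
these lines.  It is illustrative only: `my/encoded-char-multibyte-p` is
a hypothetical helper name, not something Emacs defines, and the result
for the nil case is the one reported in this thread, which may differ
on other builds.

```elisp
;; Illustrative sketch; `my/encoded-char-multibyte-p' is a hypothetical
;; helper name, not part of Emacs.
(defun my/encoded-char-multibyte-p (coding)
  "Return non-nil if encoding U+00E9 with CODING gives a multibyte string."
  ;; #xe9 is the code point of the character used in the report; using
  ;; the number avoids depending on how this text itself was decoded.
  (multibyte-string-p (encode-coding-char #xe9 coding)))

;; The case reported as fine: an explicit `no-conversion'.
(my/encoded-char-multibyte-p 'no-conversion)

;; The case reported as problematic: a nil coding system.
(my/encoded-char-multibyte-p nil)
```

A non-nil result for the second form is what would trip the assertion
in `encoded-string-description`.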

I'm not sure if `encode-coding-string/char` should accept a nil argument
nor how it should treat it, so maybe it's a bug in `what-cursor-position`
which should not pass a nil argument here.  So maybe the patch below
is a good fix?


        Stefan


diff --git a/lisp/simple.el b/lisp/simple.el
index 8bc84a9dfa..e5180119e8 100644
--- a/lisp/simple.el
+++ b/lisp/simple.el
@@ -1470,7 +1470,11 @@ what-cursor-position
 	    encoded encoding-msg display-prop under-display)
 	(if (or (not coding)
 		(eq (coding-system-type coding) t))
-	    (setq coding (default-value 'buffer-file-coding-system)))
+	    (setq coding (or (default-value 'buffer-file-coding-system)
+                             ;; A nil value of `buffer-file-coding-system'
+                             ;; means "no conversion" which means each byte
+                             ;; is a char and vice versa.
+                             'binary)))
 	(if (eq (char-charset char) 'eight-bit)
 	    (setq encoding-msg
 		  (format "(%d, #o%o, #x%x%s, raw-byte)" char char char char-name-fmt))
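
Read outside the diff, the fallback in this patch amounts to roughly
the following.  This is only a sketch: `my/effective-coding-system` is
a hypothetical name used for illustration, and in the patch the logic
stays inline in `what-cursor-position`.

```elisp
;; Illustrative sketch; the function name is hypothetical.
(defun my/effective-coding-system (coding)
  "Return the coding system to describe a character with, given CODING.
Mirrors the fallback logic shown in the patch above."
  (if (or (not coding)
          (eq (coding-system-type coding) t))
      ;; Fall back to the default value, and to `binary' when even the
      ;; default is nil, so the encoder never sees a nil coding system.
      (or (default-value 'buffer-file-coding-system) 'binary)
    coding))
```

With a nil default (as under the "C" locale in this report), the helper
returns `binary` for a nil argument; a concrete coding system such as
`utf-8-unix` is passed through unchanged.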







* bug#40702: 28.0.50; (what-cursor-position) barfs on non-ASCII char
  2020-04-19 16:18         ` Štěpán Němec
@ 2020-04-19 16:50           ` Eli Zaretskii
  2020-04-19 19:39             ` Štěpán Němec
  0 siblings, 1 reply; 13+ messages in thread
From: Eli Zaretskii @ 2020-04-19 16:50 UTC (permalink / raw)
  To: Štěpán Němec; +Cc: dima, 40702, monnier

> From: Štěpán Němec <stepnem@gmail.com>
> Cc: dima@secretsauce.net,  40702@debbugs.gnu.org, Stefan Monnier
>  <monnier@iro.umontreal.ca>
> Date: Sun, 19 Apr 2020 18:18:13 +0200
> 
> >   Debugger entered--Lisp error: (cl-assertion-failed ((not (multibyte-string-p str)) nil))
> >     cl--assertion-failed((not (multibyte-string-p str)))
> >     encoded-string-description(#("é" 0 1 (charset unicode)) nil)
> >     describe-char(146)
> >     what-cursor-position((4))
> >     funcall-interactively(what-cursor-position (4))
> >     call-interactively(what-cursor-position nil nil)
> >     command-execute(what-cursor-position)
> 
> Thanks. I was looking at all the wrong places. The problem was simply
> introduced by the addition of the assert in
> 
> 2019-05-28T20:59:35-04:00!monnier@iro.umontreal.ca
> 146486f8a6 (* mule-cmds.el (encoded-string-description): Require unibyte string as input)
> https://git.sv.gnu.org/cgit/emacs.git/commit/?id=146486f8a6
> 
> Removing the assertion reverts to the Emacs 26 behaviour.
> 
> Unfortunately there is no explanation regarding the change. Maybe Stefan
> could provide some insight?

Could the discussion below provide such an explanation?

  https://lists.gnu.org/archive/html/emacs-devel/2019-05/msg00949.html






* bug#40702: 28.0.50; (what-cursor-position) barfs on non-ASCII char
  2020-04-19 16:50           ` Eli Zaretskii
@ 2020-04-19 19:39             ` Štěpán Němec
  0 siblings, 0 replies; 13+ messages in thread
From: Štěpán Němec @ 2020-04-19 19:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dima, 40702, monnier

On Sun, 19 Apr 2020 19:50:10 +0300
Eli Zaretskii wrote:

>> 2019-05-28T20:59:35-04:00!monnier@iro.umontreal.ca
>> 146486f8a6 (* mule-cmds.el (encoded-string-description): Require unibyte string as input)
>> https://git.sv.gnu.org/cgit/emacs.git/commit/?id=146486f8a6
>> 
>> Removing the assertion reverts to the Emacs 26 behaviour.
>> 
>> Unfortunately there is no explanation regarding the change. Maybe Stefan
>> could provide some insight?
>
> Could the discussion below provide such an explanation?
>
>   https://lists.gnu.org/archive/html/emacs-devel/2019-05/msg00949.html

Yes, and also a lot of other useful context/reference information.

Thanks!

-- 
Štěpán






* bug#40702: 28.0.50; (what-cursor-position) barfs on non-ASCII char
  2020-04-19 16:44     ` Stefan Monnier
@ 2020-04-20  4:16       ` Dima Kogan
  2020-04-20 13:27         ` Stefan Monnier
  2020-09-30  3:45       ` Lars Ingebrigtsen
  1 sibling, 1 reply; 13+ messages in thread
From: Dima Kogan @ 2020-04-20  4:16 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Štěpán Němec, 40702

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> I see that buffer-file-coding-system is nil
>
> It would be worth looking into how/why you get a nil value here.

Any suggestions about how to do that? For the record, unicode stuff
seems to work in general, this bug excepted. Would you expect stuff to
break with nil here?






* bug#40702: 28.0.50; (what-cursor-position) barfs on non-ASCII char
  2020-04-20  4:16       ` Dima Kogan
@ 2020-04-20 13:27         ` Stefan Monnier
  2020-04-20 21:44           ` Dima Kogan
  0 siblings, 1 reply; 13+ messages in thread
From: Stefan Monnier @ 2020-04-20 13:27 UTC (permalink / raw)
  To: Dima Kogan; +Cc: Štěpán Němec, 40702

>>> I see that buffer-file-coding-system is nil
>> It would be worth looking into how/why you get a nil value here.
> Any suggestions about how to do that?

If you get that in the scratch buffer in `emacs -Q`, then I'd guess it
depends on the locale setting.
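
One way to gather the relevant values is sketched below.  The function
name `my/coding-diagnostics` is hypothetical; the variables and
environment names it reads are standard ones.

```elisp
;; Illustrative sketch; the function name is hypothetical.
(defun my/coding-diagnostics ()
  "Return a plist of settings that influence the default coding system."
  (list :LC_ALL (getenv "LC_ALL")
        :LC_CTYPE (getenv "LC_CTYPE")
        :LANG (getenv "LANG")
        :language-environment current-language-environment
        :locale-coding-system locale-coding-system
        :buffer-file-coding-system buffer-file-coding-system
        :default-buffer-file-coding-system
        (default-value 'buffer-file-coding-system)))

;; Evaluate in *scratch* after starting with `emacs -Q', e.g. via M-:
(my/coding-diagnostics)
```

Comparing the output between a session where the error occurs and one
where it does not should show whether the locale is what makes the
default nil.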


        Stefan







* bug#40702: 28.0.50; (what-cursor-position) barfs on non-ASCII char
  2020-04-20 13:27         ` Stefan Monnier
@ 2020-04-20 21:44           ` Dima Kogan
  0 siblings, 0 replies; 13+ messages in thread
From: Dima Kogan @ 2020-04-20 21:44 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Štěpán Němec, 40702

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> If you get that in the scratch buffer in `emacs -Q`, then I'd guess it
> depends on the locale setting.

  $ locale

  LANG=C
  LANGUAGE=
  LC_CTYPE="C"
  LC_NUMERIC="C"
  LC_TIME="C"
  LC_COLLATE="C"
  LC_MONETARY="C"
  LC_MESSAGES="C"
  LC_PAPER="C"
  LC_NAME="C"
  LC_ADDRESS="C"
  LC_TELEPHONE="C"
  LC_MEASUREMENT="C"
  LC_IDENTIFICATION="C"
  LC_ALL=C

I happen to live in an English-speaking country, so generally doing
everything in ASCII works ok. Is there anything to "fix" here?






* bug#40702: 28.0.50; (what-cursor-position) barfs on non-ASCII char
  2020-04-19 16:44     ` Stefan Monnier
  2020-04-20  4:16       ` Dima Kogan
@ 2020-09-30  3:45       ` Lars Ingebrigtsen
  1 sibling, 0 replies; 13+ messages in thread
From: Lars Ingebrigtsen @ 2020-09-30  3:45 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Štěpán Němec, Dima Kogan, 40702

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> I'm not sure if `encode-coding-string/char` should accept a nil argument
> nor how it should treat it, so maybe it's a bug in `what-cursor-position`
> which should not pass a nil argument here.  So maybe the patch below
> is a good fix?

With

LANG=C LANGUAGE= LC_CTYPE="C" LC_NUMERIC="C" LC_TIME="C" LC_COLLATE="C" LC_MONETARY="C" LC_MESSAGES="C" LC_PAPER="C" LC_NAME="C" LC_ADDRESS="C" LC_TELEPHONE="C" LC_MEASUREMENT="C" LC_IDENTIFICATION="C" LC_ALL=C ./src/emacs -geometry -0+0 -Q  

I can reproduce the bug Dima is seeing, and Stefan's patch fixes the
problem, and seems otherwise unproblematic, so I've pushed it to Emacs
28.

There may be other, more general problems when running under the "C"
locale, but...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






