* bug#73307: Fix ctype(3) usage
@ 2024-09-16 21:25 Thomas Klausner
2024-09-17 11:52 ` Eli Zaretskii
0 siblings, 1 reply; 10+ messages in thread
From: Thomas Klausner @ 2024-09-16 21:25 UTC (permalink / raw)
To: 73307
[-- Attachment #1: Type: text/plain, Size: 1824 bytes --]
Tags: patch
When compiling emacs on NetBSD-10.99.12/amd64, I get the following
warning
In file included from /usr/include/ctype.h:100,
from fns.c:29:
fns.c: In function 'Fyes_or_no_p':
fns.c:3582:33: warning: array subscript has type 'char' [-Wchar-subscripts]
3582 | if ((len > 0) && !isspace (s[len - 1]))
| ^
The NetBSD man page for ctype(3): https://man.netbsd.org/ctype.3
is quite explicit about the problems with this - ctype(3) functions only
accept -1 and "unsigned char" and you can get very weird problems if
this is disregarded.
The attached patch adds the missing cast.
In GNU Emacs 31.0.50 (build 1, x86_64--netbsd, GTK+ Version 3.24.43,
cairo version 1.18.0) of 2024-09-16
Repository revision: f27553c30a772a0103d2e6762e4d7f588f302e4b
Repository branch: HEAD
System Description: NetBSD 10.99.12/amd64
Configured using:
'configure --srcdir=/scratch/wip/emacs-git/work/emacs
--localstatedir=/var --with-native-compilation --without-ns
--without-imagemagick --without-xaw3d --with-x-toolkit=gtk3
--prefix=/usr/pkg --build=x86_64--netbsd --host=x86_64--netbsd
--infodir=/usr/pkg/info --mandir=/usr/pkg/man
--enable-option-checking=yes 'CFLAGS=-O2 -g -g -fstack-clash-protection
-I/usr/pkg/include -I/usr/include -I/usr/pkg/include/freetype2
-I/usr/pkg/include/glib-2.0 -I/usr/pkg/include/gio-unix-2.0
-I/usr/pkg/lib/glib-2.0/include -I/usr/pkg/include/harfbuzz
-I/usr/pkg/include/libdrm' 'CPPFLAGS=-g -I/usr/pkg/include
-I/usr/include -I/usr/pkg/include/freetype2 -I/usr/pkg/include/glib-2.0
-I/usr/pkg/include/gio-unix-2.0 -I/usr/pkg/lib/glib-2.0/include
-I/usr/pkg/include/harfbuzz -I/usr/pkg/include/libdrm'
'LDFLAGS=-Wl,-R/usr/pkg/gcc14/lib -Wl,-zrelro -Wl,-znow -L/usr/pkg/lib
-Wl,-R/usr/pkg/lib -L/usr/lib -Wl,-R/usr/lib''
[-- Attachment #2: patch-src_fns.c --]
[-- Type: text/x-c, Size: 460 bytes --]
$NetBSD$
Fix ctype(3) usage.
--- src/fns.c.orig 2024-09-16 21:11:40.908684144 +0000
+++ src/fns.c
@@ -3579,7 +3579,7 @@ by a mouse, or by some window-system ges
{
char *s = SSDATA (prompt);
ptrdiff_t len = strlen (s);
- if ((len > 0) && !isspace (s[len - 1]))
+ if ((len > 0) && !isspace ((unsigned char)s[len - 1]))
prompt = CALLN (Fconcat, prompt, build_string (" "));
}
prompt = CALLN (Fconcat, prompt, Vyes_or_no_prompt);
^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage
2024-09-16 21:25 bug#73307: Fix ctype(3) usage Thomas Klausner
@ 2024-09-17 11:52 ` Eli Zaretskii
2024-09-17 11:55 ` Thomas Klausner
2024-09-18 0:05 ` Paul Eggert
0 siblings, 2 replies; 10+ messages in thread
From: Eli Zaretskii @ 2024-09-17 11:52 UTC (permalink / raw)
To: Thomas Klausner, Paul Eggert; +Cc: 73307
> Date: Mon, 16 Sep 2024 23:25:16 +0200
> From: Thomas Klausner <wiz@gatalith.at>
>
> When compiling emacs on NetBSD-10.99.12/amd64, I get the following
> warning
>
> In file included from /usr/include/ctype.h:100,
> from fns.c:29:
> fns.c: In function 'Fyes_or_no_p':
> fns.c:3582:33: warning: array subscript has type 'char' [-Wchar-subscripts]
> 3582 | if ((len > 0) && !isspace (s[len - 1]))
> | ^
>
> The NetBSD man page for ctype(3): https://man.netbsd.org/ctype.3
> is quite explicit about the problems with this - ctype(3) functions only
> accept -1 and "unsigned char" and you can get very weird problems if
> this is disregarded.
>
> The attached patch adds the missing cast.
> [...]
>
> --- src/fns.c.orig 2024-09-16 21:11:40.908684144 +0000
> +++ src/fns.c
> @@ -3579,7 +3579,7 @@ by a mouse, or by some window-system ges
> {
> char *s = SSDATA (prompt);
> ptrdiff_t len = strlen (s);
> - if ((len > 0) && !isspace (s[len - 1]))
> + if ((len > 0) && !isspace ((unsigned char)s[len - 1]))
> prompt = CALLN (Fconcat, prompt, build_string (" "));
> }
> prompt = CALLN (Fconcat, prompt, Vyes_or_no_prompt);
Thanks. However, I'm not sure this is the right fix, the function is
defined with argument type of 'int'. Paul, any comments?
^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage
2024-09-17 11:52 ` Eli Zaretskii
@ 2024-09-17 11:55 ` Thomas Klausner
2024-09-18 0:05 ` Paul Eggert
1 sibling, 0 replies; 10+ messages in thread
From: Thomas Klausner @ 2024-09-17 11:55 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 73307, Paul Eggert
On Tue, Sep 17, 2024 at 02:52:20PM +0300, Eli Zaretskii wrote:
> Thanks. However, I'm not sure this is the right fix, the function is
> defined with argument type of 'int'. Paul, any comments?
That is exactly the problem.
If you pass a 'signed char' below -1, it gets converted to the same
integer value, also a negative number below -1, and you end up passing
an invalid argument to the function. The function is defined to only
accept EOF and non-negative values in the range of 'unsigned char'.
-1 is only special because that's usually the value of EOF.
Thomas
^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage
2024-09-17 11:52 ` Eli Zaretskii
2024-09-17 11:55 ` Thomas Klausner
@ 2024-09-18 0:05 ` Paul Eggert
2024-09-18 2:13 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-09-18 11:45 ` Eli Zaretskii
1 sibling, 2 replies; 10+ messages in thread
From: Paul Eggert @ 2024-09-18 0:05 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Po Lu, 73307-done, Thomas Klausner
[-- Attachment #1: Type: text/plain, Size: 1108 bytes --]
On 2024-09-17 04:52, Eli Zaretskii wrote:
> However, I'm not sure this is the right fix, the function is
> defined with argument type of 'int'. Paul, any comments?
Although that patch was an improvement it still had problems, as it
incorrectly assumed the string does not end in a multibyte space, and
that Emacs's locale matches the system's.
Emacs itself should not use <ctype.h> unless it knows the string is
unibyte and the system locale matches Emacs's. I scanned through its
source code looking for all problematic instances of <ctype.h> that have
crept in (except I didn't scan the MS-Windows code, where you're the
expert), and found five other places where ctype.h was obviously
misused. I installed the attached to fix these glitches and am boldly
closing this the report.
I can't easily test patch 0003, which fixes Android-specific code.
Although I think it's an improvement, in unlikely cases I suspect it
still doesn't exactly match what the Android kernel does with #! lines.
I don't know whether that matters. I'll CC this to Po Lu (my goto person
for Android) as a heads-up.
[-- Attachment #2: 0001-Fix-yes-or-no-p-with-multibyte-spaces.patch --]
[-- Type: text/x-patch, Size: 2141 bytes --]
From 43cde03fa5a663a1509a762077c11eb57a60cee8 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 17 Sep 2024 15:22:02 -0700
Subject: [PATCH 1/4] Fix yes-or-no-p with multibyte spaces
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Problem reported by Thomas Klausner (Bug#73307).
Emacs shouldn’t use ctype.h, as it doesn’t work for multibyte
chars and it doesn’t work with Emacs’s locale model anyway.
* src/fns.c: Include syntax.h, not ctype.h.
(Fyes_or_no_p): Check the character category with SYNTAX, not
with isspace, which assumes the current locale and works only
with single-byte characters.
---
src/fns.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/src/fns.c b/src/fns.c
index 57113a8c5ed..370f7711b90 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -26,7 +26,6 @@ Copyright (C) 1985-2024 Free Software Foundation, Inc.
#include <intprops.h>
#include <vla.h>
#include <errno.h>
-#include <ctype.h>
#include <math.h>
#include "lisp.h"
@@ -36,6 +35,7 @@ Copyright (C) 1985-2024 Free Software Foundation, Inc.
#include "composite.h"
#include "buffer.h"
#include "intervals.h"
+#include "syntax.h"
#include "window.h"
#include "puresize.h"
#include "gnutls.h"
@@ -3576,13 +3576,15 @@ DEFUN ("yes-or-no-p", Fyes_or_no_p, Syes_or_no_p, 1, 1, 0,
if (use_short_answers)
return call1 (Qy_or_n_p, prompt);
- {
- char *s = SSDATA (prompt);
- ptrdiff_t len = strlen (s);
- if ((len > 0) && !isspace (s[len - 1]))
- prompt = CALLN (Fconcat, prompt, build_string (" "));
- }
- prompt = CALLN (Fconcat, prompt, Vyes_or_no_prompt);
+ ptrdiff_t promptlen = SCHARS (prompt);
+ bool prompt_ends_in_nonspace
+ = (0 < promptlen
+ && (SYNTAX (XFIXNAT (Faref (prompt, make_fixnum (promptlen - 1))))
+ != Swhitespace));
+ AUTO_STRING (space_string, " ");
+ prompt = CALLN (Fconcat, prompt,
+ prompt_ends_in_nonspace ? space_string : empty_unibyte_string,
+ Vyes_or_no_prompt);
specpdl_ref count = SPECPDL_INDEX ();
specbind (Qenable_recursive_minibuffers, Qt);
--
2.43.0
[-- Attachment #3: 0002-Fix-misuse-of-toupper-in-sfnt_parse_style.patch --]
[-- Type: text/x-patch, Size: 1360 bytes --]
From 58a44b6ac317c9a17bcdd208e4ec33ff9772429e Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 17 Sep 2024 15:23:47 -0700
Subject: [PATCH 2/4] Fix misuse of toupper in sfnt_parse_style
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* src/sfntfont.c: Include c-ctype.h, not ctype.h.
(sfnt_parse_style): Upcase just initial ASCII letters;
that’s good enough here.
---
src/sfntfont.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/sfntfont.c b/src/sfntfont.c
index 8c02c05e7a6..f21b6c0782f 100644
--- a/src/sfntfont.c
+++ b/src/sfntfont.c
@@ -20,7 +20,7 @@ Copyright (C) 2023-2024 Free Software Foundation, Inc.
#include <config.h>
#include <fcntl.h>
-#include <ctype.h>
+#include <c-ctype.h>
#include "lisp.h"
@@ -534,12 +534,12 @@ sfnt_parse_style (Lisp_Object style_name, struct sfnt_font_desc *desc)
}
/* This token is extraneous or was not recognized. Capitalize
- the first letter and set it as the adstyle. */
+ the first letter if it's ASCII lowercase, then set the token as
+ the adstyle. */
if (strlen (single))
{
- if (islower (single[0]))
- single[0] = toupper (single[0]);
+ single[0] = c_toupper (single[0]);
if (NILP (desc->adstyle))
desc->adstyle = build_string (single);
--
2.43.0
[-- Attachment #4: 0003-Fix-some-misparsing-in-check_interpreter.patch --]
[-- Type: text/x-patch, Size: 1889 bytes --]
From e0b027d1215ed32fe0f3d0956d2d7a81552274f2 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 17 Sep 2024 16:38:53 -0700
Subject: [PATCH 3/4] Fix some #! misparsing in check_interpreter
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* exec/exec.c: Do not include ctype.h, as the kernel
does not care about the locale.
(check_interpreter): Treat only spaces and tabs as white space.
Do not inspect more bytes than were read.
Although the resulting code does not exactly match what
the Android kernel does, it’s closer than what it was before.
---
exec/exec.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/exec/exec.c b/exec/exec.c
index 775a8b06b96..f31c9a89744 100644
--- a/exec/exec.c
+++ b/exec/exec.c
@@ -24,7 +24,6 @@ Copyright (C) 2023-2024 Free Software Foundation, Inc.
#include <fcntl.h>
#include <assert.h>
#include <string.h>
-#include <ctype.h>
#include <stdlib.h>
#include <sys/ptrace.h>
@@ -116,11 +115,11 @@ check_interpreter (const char *name, int fd, const char **extra)
/* Strip leading whitespace. */
start = buffer;
- while (*start && ((unsigned char) *start) < 128 && isspace (*start))
+ while (start < buffer + rc && (*start == ' ' || *start == '\t'))
++start;
/* Look for a newline character. */
- end = memchr (start, '\n', rc);
+ end = memchr (start, '\n', buffer + rc - start);
if (!end)
goto fail;
@@ -130,11 +129,12 @@ check_interpreter (const char *name, int fd, const char **extra)
*end = '\0';
/* Now look for any whitespace characters. */
- ws = strchr (start, ' ');
+ for (ws = start; *ws && *ws != ' ' && *ws != '\t'; ws++)
+ continue;
/* If there's no whitespace, return the entire start. */
- if (!ws)
+ if (!*ws)
{
if (lseek (fd, 0, SEEK_SET))
goto fail;
--
2.43.0
[-- Attachment #5: 0004-Use-c-ctype.h-in-lib-src.patch --]
[-- Type: text/x-patch, Size: 4041 bytes --]
From 865b54e2acea4fdaa3f302ed225f50281b371d6e Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 17 Sep 2024 16:54:17 -0700
Subject: [PATCH 4/4] Use c-ctype.h in lib-src
This fixes some unlikely bugs and removes the temptation
of using ctype.h. Although some uses were correct,
many weren't.
* lib-src/ebrowse.c: Include c-ctype.h, not ctype.h.
* lib-src/emacsclient.c: Include c-ctype.h, not ctype.h.
* lib-src/update-game-score.c: Include c-ctype.h, not ctype.h.
All uses changed.
---
lib-src/ebrowse.c | 10 +++++-----
lib-src/emacsclient.c | 6 +++---
lib-src/update-game-score.c | 4 ++--
3 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/lib-src/ebrowse.c b/lib-src/ebrowse.c
index 821c29272a4..13c1befdc3e 100644
--- a/lib-src/ebrowse.c
+++ b/lib-src/ebrowse.c
@@ -22,11 +22,11 @@ Copyright (C) 1992-2024 Free Software Foundation, Inc.
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
-#include <ctype.h>
#include <assert.h>
#include <getopt.h>
#include <attribute.h>
+#include <c-ctype.h>
#include <flexmember.h>
#include <min-max.h>
#include <unlocked-io.h>
@@ -1875,7 +1875,7 @@ #define RSTRING_EOF_CHECK \
int_suffixes:
/* Integer suffixes. */
- while (isalpha (c))
+ while (c_isalpha (c))
GET (c);
UNGET ();
return CINT;
@@ -1907,7 +1907,7 @@ #define RSTRING_EOF_CHECK \
}
/* Optional type suffixes. */
- while (isalpha (c))
+ while (c_isalpha (c))
GET (c);
UNGET ();
return CFLOAT;
@@ -2158,7 +2158,7 @@ init_scanner (void)
/* Set up character class vectors. */
for (i = 0; i < sizeof is_ident; ++i)
{
- if (i == '_' || isalnum (i))
+ if (i == '_' || c_isalnum (i))
is_ident[i] = 1;
if (i >= '0' && i <= '9')
@@ -2946,7 +2946,7 @@ operator_name (int *sc)
MATCH ();
/* If this is a simple operator like `+', stop now. */
- if (!isalpha ((unsigned char) *s) && *s != '(' && *s != '[')
+ if (!c_isalpha (*s) && *s != '(' && *s != '[')
break;
++tokens_matched;
diff --git a/lib-src/emacsclient.c b/lib-src/emacsclient.c
index 8e64f1e92d3..c1ffa1480ec 100644
--- a/lib-src/emacsclient.c
+++ b/lib-src/emacsclient.c
@@ -68,7 +68,6 @@ Copyright (C) 1986-2024 Free Software Foundation, Inc.
#define DEFAULT_TIMEOUT (30)
-#include <ctype.h>
#include <errno.h>
#include <getopt.h>
#include <inttypes.h>
@@ -83,6 +82,7 @@ #define DEFAULT_TIMEOUT (30)
#include <unistd.h>
#include <attribute.h>
+#include <c-ctype.h>
#include <filename.h>
#include <intprops.h>
#include <min-max.h>
@@ -2124,7 +2124,7 @@ main (int argc, char **argv)
unsigned char c;
do
c = *++p;
- while (isdigit (c) || c == ':');
+ while (c_isdigit (c) || c == ':');
if (c == 0)
{
@@ -2136,7 +2136,7 @@ main (int argc, char **argv)
}
#ifdef WINDOWSNT
else if (! IS_ABSOLUTE_FILE_NAME (argv[i])
- && (isalpha (argv[i][0]) && argv[i][1] == ':'))
+ && (c_isalpha (argv[i][0]) && argv[i][1] == ':'))
/* Windows can have a different default directory for each
drive, so the cwd passed via "-dir" is not sufficient
to account for that.
diff --git a/lib-src/update-game-score.c b/lib-src/update-game-score.c
index e3b24ad7717..7545d0390c1 100644
--- a/lib-src/update-game-score.c
+++ b/lib-src/update-game-score.c
@@ -41,11 +41,11 @@ Copyright (C) 2002-2024 Free Software Foundation, Inc.
#include <stdlib.h>
#include <time.h>
#include <pwd.h>
-#include <ctype.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <getopt.h>
+#include <c-ctype.h>
#include <unlocked-io.h>
#ifdef WINDOWSNT
@@ -143,7 +143,7 @@ normalize_integer (char *num)
{
bool neg;
char *p;
- while (*num != '\n' && isspace (*num))
+ while (*num != '\n' && c_isspace (*num))
num++;
neg = *num == '-';
num += neg || *num == '-';
--
2.43.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage
2024-09-18 0:05 ` Paul Eggert
@ 2024-09-18 2:13 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-09-18 11:45 ` Eli Zaretskii
1 sibling, 0 replies; 10+ messages in thread
From: Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-09-18 2:13 UTC (permalink / raw)
To: Paul Eggert; +Cc: Eli Zaretskii, 73307-done, Thomas Klausner
Paul Eggert <eggert@cs.ucla.edu> writes:
> I can't easily test patch 0003, which fixes Android-specific
> code. Although I think it's an improvement, in unlikely cases I
> suspect it still doesn't exactly match what the Android kernel does
> with #! lines. I don't know whether that matters. I'll CC this to Po
> Lu (my goto person for Android) as a heads-up.
It's OK. Though I have never encountered shebangs with tabs.
^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage
2024-09-18 0:05 ` Paul Eggert
2024-09-18 2:13 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-09-18 11:45 ` Eli Zaretskii
2024-09-18 15:18 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-09-18 15:35 ` Paul Eggert
1 sibling, 2 replies; 10+ messages in thread
From: Eli Zaretskii @ 2024-09-18 11:45 UTC (permalink / raw)
To: Paul Eggert, Stefan Monnier; +Cc: luangruo, 73307, wiz
> Date: Tue, 17 Sep 2024 17:05:29 -0700
> Cc: 73307-done@debbugs.gnu.org, Thomas Klausner <wiz@gatalith.at>,
> Po Lu <luangruo@yahoo.com>
> From: Paul Eggert <eggert@cs.ucla.edu>
>
> On 2024-09-17 04:52, Eli Zaretskii wrote:
> > However, I'm not sure this is the right fix, the function is
> > defined with argument type of 'int'. Paul, any comments?
>
> Although that patch was an improvement it still had problems, as it
> incorrectly assumed the string does not end in a multibyte space, and
> that Emacs's locale matches the system's.
>
> Emacs itself should not use <ctype.h> unless it knows the string is
> unibyte and the system locale matches Emacs's. I scanned through its
> source code looking for all problematic instances of <ctype.h> that have
> crept in (except I didn't scan the MS-Windows code, where you're the
> expert), and found five other places where ctype.h was obviously
> misused. I installed the attached to fix these glitches and am boldly
> closing this the report.
>
> I can't easily test patch 0003, which fixes Android-specific code.
> Although I think it's an improvement, in unlikely cases I suspect it
> still doesn't exactly match what the Android kernel does with #! lines.
> I don't know whether that matters. I'll CC this to Po Lu (my goto person
> for Android) as a heads-up.
Thanks.
This part doesn't look right to me:
> - {
> - char *s = SSDATA (prompt);
> - ptrdiff_t len = strlen (s);
> - if ((len > 0) && !isspace (s[len - 1]))
> - prompt = CALLN (Fconcat, prompt, build_string (" "));
> - }
> - prompt = CALLN (Fconcat, prompt, Vyes_or_no_prompt);
> + ptrdiff_t promptlen = SCHARS (prompt);
> + bool prompt_ends_in_nonspace
> + = (0 < promptlen
> + && (SYNTAX (XFIXNAT (Faref (prompt, make_fixnum (promptlen - 1))))
> + != Swhitespace));
> + AUTO_STRING (space_string, " ");
> + prompt = CALLN (Fconcat, prompt,
> + prompt_ends_in_nonspace ? space_string : empty_unibyte_string,
> + Vyes_or_no_prompt);
You are testing characters for whitespace syntax, which is AFAIU
subject to buffer-local syntax tables. Thus, a strange enough setting
of buffer syntax could make this code return unexpected results.
Stefan, am I right?
Why can't we use something closer to the original code, which doesn't
depend on buffer-local customizations? In particular, what's the
problem with using c_isspace (and what do you mean by "multibyte
space"?)
^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage
2024-09-18 11:45 ` Eli Zaretskii
@ 2024-09-18 15:18 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-09-18 15:35 ` Paul Eggert
1 sibling, 0 replies; 10+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-09-18 15:18 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: luangruo, 73307, Paul Eggert, wiz
> You are testing characters for whitespace syntax, which is AFAIU
> subject to buffer-local syntax tables. Thus, a strange enough setting
> of buffer syntax could make this code return unexpected results.
>
> Stefan, am I right?
Yup.
> Why can't we use something closer to the original code, which doesn't
> depend on buffer-local customizations?
We could use the global/default syntax table?
[ Or rely on some of the unicode tables? ]
Stefan
^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage
2024-09-18 11:45 ` Eli Zaretskii
2024-09-18 15:18 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-09-18 15:35 ` Paul Eggert
2024-09-18 16:11 ` Eli Zaretskii
1 sibling, 1 reply; 10+ messages in thread
From: Paul Eggert @ 2024-09-18 15:35 UTC (permalink / raw)
To: Eli Zaretskii, Stefan Monnier; +Cc: luangruo, 73307, wiz
[-- Attachment #1: Type: text/plain, Size: 762 bytes --]
On 2024-09-18 04:45, Eli Zaretskii wrote:
> what's the
> problem with using c_isspace (and what do you mean by "multibyte
> space"
Using c_isspace would be OK, in that it'd handle most practical
examples. I wrote the fancier version to handle prompts that end in
non-ASCII spaces, e.g., (yes-or-no-p "Delete all files? ") where the
last character is actually U+00A0 NO-BREAK_SPACE instead of U+0020
SPACE, so its UTF-8 encoding is multiple bytes. If it's not important to
handle such cases we could just use c_isspace.
> You are testing characters for whitespace syntax, which is AFAIU
> subject to buffer-local syntax tables.
Thanks, I didn't think of that. How about the attached patch? It is an
alternative to just using c_isspace.
[-- Attachment #2: fns.c.diff --]
[-- Type: text/x-patch, Size: 825 bytes --]
diff --git a/src/fns.c b/src/fns.c
index 370f7711b90..d3ee98c3bae 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -35,7 +35,6 @@ Copyright (C) 1985-2024 Free Software Foundation, Inc.
#include "composite.h"
#include "buffer.h"
#include "intervals.h"
-#include "syntax.h"
#include "window.h"
#include "puresize.h"
#include "gnutls.h"
@@ -3579,8 +3578,7 @@ DEFUN ("yes-or-no-p", Fyes_or_no_p, Syes_or_no_p, 1, 1, 0,
ptrdiff_t promptlen = SCHARS (prompt);
bool prompt_ends_in_nonspace
= (0 < promptlen
- && (SYNTAX (XFIXNAT (Faref (prompt, make_fixnum (promptlen - 1))))
- != Swhitespace));
+ && !blankp (XFIXNAT (Faref (prompt, make_fixnum (promptlen - 1)))));
AUTO_STRING (space_string, " ");
prompt = CALLN (Fconcat, prompt,
prompt_ends_in_nonspace ? space_string : empty_unibyte_string,
^ permalink raw reply related [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage
2024-09-18 15:35 ` Paul Eggert
@ 2024-09-18 16:11 ` Eli Zaretskii
2024-09-18 16:22 ` Paul Eggert
0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2024-09-18 16:11 UTC (permalink / raw)
To: Paul Eggert; +Cc: luangruo, 73307, wiz, monnier
> Date: Wed, 18 Sep 2024 08:35:18 -0700
> Cc: 73307@debbugs.gnu.org, wiz@gatalith.at, luangruo@yahoo.com
> From: Paul Eggert <eggert@cs.ucla.edu>
>
> > You are testing characters for whitespace syntax, which is AFAIU
> > subject to buffer-local syntax tables.
>
> Thanks, I didn't think of that. How about the attached patch? It is an
> alternative to just using c_isspace.
That's much better, thanks. But I'm not sure we really want to allow
any Zs character there. For example, what about U+3000 IDEOGRAPHIC
SPACE? But if this is okay with us, please install.
^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage
2024-09-18 16:11 ` Eli Zaretskii
@ 2024-09-18 16:22 ` Paul Eggert
0 siblings, 0 replies; 10+ messages in thread
From: Paul Eggert @ 2024-09-18 16:22 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: luangruo, 73307, wiz, monnier
On 2024-09-18 09:11, Eli Zaretskii wrote:
> what about U+3000 IDEOGRAPHIC SPACE?
That should be OK. For (yes-or-no-p "您想删除所有文件吗? "), where the prompt
ends in U+3000, no space is needed. I installed the patch.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-09-18 16:22 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-16 21:25 bug#73307: Fix ctype(3) usage Thomas Klausner
2024-09-17 11:52 ` Eli Zaretskii
2024-09-17 11:55 ` Thomas Klausner
2024-09-18 0:05 ` Paul Eggert
2024-09-18 2:13 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-09-18 11:45 ` Eli Zaretskii
2024-09-18 15:18 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-09-18 15:35 ` Paul Eggert
2024-09-18 16:11 ` Eli Zaretskii
2024-09-18 16:22 ` Paul Eggert
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/emacs.git
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.