* bug#73307: Fix ctype(3) usage @ 2024-09-16 21:25 Thomas Klausner 2024-09-17 11:52 ` Eli Zaretskii 0 siblings, 1 reply; 10+ messages in thread From: Thomas Klausner @ 2024-09-16 21:25 UTC (permalink / raw) To: 73307 [-- Attachment #1: Type: text/plain, Size: 1824 bytes --] Tags: patch When compiling emacs on NetBSD-10.99.12/amd64, I get the following warning In file included from /usr/include/ctype.h:100, from fns.c:29: fns.c: In function 'Fyes_or_no_p': fns.c:3582:33: warning: array subscript has type 'char' [-Wchar-subscripts] 3582 | if ((len > 0) && !isspace (s[len - 1])) | ^ The NetBSD man page for ctype(3): https://man.netbsd.org/ctype.3 is quite explicit about the problems with this - ctype(3) functions only accept -1 and "unsigned char" and you can get very weird problems if this is disregarded. The attached patch adds the missing cast. In GNU Emacs 31.0.50 (build 1, x86_64--netbsd, GTK+ Version 3.24.43, cairo version 1.18.0) of 2024-09-16 Repository revision: f27553c30a772a0103d2e6762e4d7f588f302e4b Repository branch: HEAD System Description: NetBSD 10.99.12/amd64 Configured using: 'configure --srcdir=/scratch/wip/emacs-git/work/emacs --localstatedir=/var --with-native-compilation --without-ns --without-imagemagick --without-xaw3d --with-x-toolkit=gtk3 --prefix=/usr/pkg --build=x86_64--netbsd --host=x86_64--netbsd --infodir=/usr/pkg/info --mandir=/usr/pkg/man --enable-option-checking=yes 'CFLAGS=-O2 -g -g -fstack-clash-protection -I/usr/pkg/include -I/usr/include -I/usr/pkg/include/freetype2 -I/usr/pkg/include/glib-2.0 -I/usr/pkg/include/gio-unix-2.0 -I/usr/pkg/lib/glib-2.0/include -I/usr/pkg/include/harfbuzz -I/usr/pkg/include/libdrm' 'CPPFLAGS=-g -I/usr/pkg/include -I/usr/include -I/usr/pkg/include/freetype2 -I/usr/pkg/include/glib-2.0 -I/usr/pkg/include/gio-unix-2.0 -I/usr/pkg/lib/glib-2.0/include -I/usr/pkg/include/harfbuzz -I/usr/pkg/include/libdrm' 'LDFLAGS=-Wl,-R/usr/pkg/gcc14/lib -Wl,-zrelro -Wl,-znow -L/usr/pkg/lib -Wl,-R/usr/pkg/lib -L/usr/lib -Wl,-R/usr/lib'' [-- Attachment #2: patch-src_fns.c --] [-- Type: text/x-c, Size: 460 bytes --] $NetBSD$ Fix ctype(3) usage. --- src/fns.c.orig 2024-09-16 21:11:40.908684144 +0000 +++ src/fns.c @@ -3579,7 +3579,7 @@ by a mouse, or by some window-system ges { char *s = SSDATA (prompt); ptrdiff_t len = strlen (s); - if ((len > 0) && !isspace (s[len - 1])) + if ((len > 0) && !isspace ((unsigned char)s[len - 1])) prompt = CALLN (Fconcat, prompt, build_string (" ")); } prompt = CALLN (Fconcat, prompt, Vyes_or_no_prompt); ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage 2024-09-16 21:25 bug#73307: Fix ctype(3) usage Thomas Klausner @ 2024-09-17 11:52 ` Eli Zaretskii 2024-09-17 11:55 ` Thomas Klausner 2024-09-18 0:05 ` Paul Eggert 0 siblings, 2 replies; 10+ messages in thread From: Eli Zaretskii @ 2024-09-17 11:52 UTC (permalink / raw) To: Thomas Klausner, Paul Eggert; +Cc: 73307 > Date: Mon, 16 Sep 2024 23:25:16 +0200 > From: Thomas Klausner <wiz@gatalith.at> > > When compiling emacs on NetBSD-10.99.12/amd64, I get the following > warning > > In file included from /usr/include/ctype.h:100, > from fns.c:29: > fns.c: In function 'Fyes_or_no_p': > fns.c:3582:33: warning: array subscript has type 'char' [-Wchar-subscripts] > 3582 | if ((len > 0) && !isspace (s[len - 1])) > | ^ > > The NetBSD man page for ctype(3): https://man.netbsd.org/ctype.3 > is quite explicit about the problems with this - ctype(3) functions only > accept -1 and "unsigned char" and you can get very weird problems if > this is disregarded. > > The attached patch adds the missing cast. > [...] > > --- src/fns.c.orig 2024-09-16 21:11:40.908684144 +0000 > +++ src/fns.c > @@ -3579,7 +3579,7 @@ by a mouse, or by some window-system ges > { > char *s = SSDATA (prompt); > ptrdiff_t len = strlen (s); > - if ((len > 0) && !isspace (s[len - 1])) > + if ((len > 0) && !isspace ((unsigned char)s[len - 1])) > prompt = CALLN (Fconcat, prompt, build_string (" ")); > } > prompt = CALLN (Fconcat, prompt, Vyes_or_no_prompt); Thanks. However, I'm not sure this is the right fix, the function is defined with argument type of 'int'. Paul, any comments? ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage 2024-09-17 11:52 ` Eli Zaretskii @ 2024-09-17 11:55 ` Thomas Klausner 2024-09-18 0:05 ` Paul Eggert 1 sibling, 0 replies; 10+ messages in thread From: Thomas Klausner @ 2024-09-17 11:55 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 73307, Paul Eggert On Tue, Sep 17, 2024 at 02:52:20PM +0300, Eli Zaretskii wrote: > Thanks. However, I'm not sure this is the right fix, the function is > defined with argument type of 'int'. Paul, any comments? That is exactly the problem. If you pass a 'signed char' below -1, it gets converted to the same integer value, also a negative number below -1, and you end up passing an invalid argument to the function. The function is defined to only accept EOF and non-negative values in the range of 'unsigned char'. -1 is only special because that's usually the value of EOF. Thomas ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage 2024-09-17 11:52 ` Eli Zaretskii 2024-09-17 11:55 ` Thomas Klausner @ 2024-09-18 0:05 ` Paul Eggert 2024-09-18 2:13 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-09-18 11:45 ` Eli Zaretskii 1 sibling, 2 replies; 10+ messages in thread From: Paul Eggert @ 2024-09-18 0:05 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Po Lu, 73307-done, Thomas Klausner [-- Attachment #1: Type: text/plain, Size: 1108 bytes --] On 2024-09-17 04:52, Eli Zaretskii wrote: > However, I'm not sure this is the right fix, the function is > defined with argument type of 'int'. Paul, any comments? Although that patch was an improvement it still had problems, as it incorrectly assumed the string does not end in a multibyte space, and that Emacs's locale matches the system's. Emacs itself should not use <ctype.h> unless it knows the string is unibyte and the system locale matches Emacs's. I scanned through its source code looking for all problematic instances of <ctype.h> that have crept in (except I didn't scan the MS-Windows code, where you're the expert), and found five other places where ctype.h was obviously misused. I installed the attached to fix these glitches and am boldly closing this the report. I can't easily test patch 0003, which fixes Android-specific code. Although I think it's an improvement, in unlikely cases I suspect it still doesn't exactly match what the Android kernel does with #! lines. I don't know whether that matters. I'll CC this to Po Lu (my goto person for Android) as a heads-up. [-- Attachment #2: 0001-Fix-yes-or-no-p-with-multibyte-spaces.patch --] [-- Type: text/x-patch, Size: 2141 bytes --] From 43cde03fa5a663a1509a762077c11eb57a60cee8 Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Tue, 17 Sep 2024 15:22:02 -0700 Subject: [PATCH 1/4] Fix yes-or-no-p with multibyte spaces MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Problem reported by Thomas Klausner (Bug#73307). Emacs shouldn’t use ctype.h, as it doesn’t work for multibyte chars and it doesn’t work with Emacs’s locale model anyway. * src/fns.c: Include syntax.h, not ctype.h. (Fyes_or_no_p): Check the character category with SYNTAX, not with isspace, which assumes the current locale and works only with single-byte characters. --- src/fns.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/src/fns.c b/src/fns.c index 57113a8c5ed..370f7711b90 100644 --- a/src/fns.c +++ b/src/fns.c @@ -26,7 +26,6 @@ Copyright (C) 1985-2024 Free Software Foundation, Inc. #include <intprops.h> #include <vla.h> #include <errno.h> -#include <ctype.h> #include <math.h> #include "lisp.h" @@ -36,6 +35,7 @@ Copyright (C) 1985-2024 Free Software Foundation, Inc. #include "composite.h" #include "buffer.h" #include "intervals.h" +#include "syntax.h" #include "window.h" #include "puresize.h" #include "gnutls.h" @@ -3576,13 +3576,15 @@ DEFUN ("yes-or-no-p", Fyes_or_no_p, Syes_or_no_p, 1, 1, 0, if (use_short_answers) return call1 (Qy_or_n_p, prompt); - { - char *s = SSDATA (prompt); - ptrdiff_t len = strlen (s); - if ((len > 0) && !isspace (s[len - 1])) - prompt = CALLN (Fconcat, prompt, build_string (" ")); - } - prompt = CALLN (Fconcat, prompt, Vyes_or_no_prompt); + ptrdiff_t promptlen = SCHARS (prompt); + bool prompt_ends_in_nonspace + = (0 < promptlen + && (SYNTAX (XFIXNAT (Faref (prompt, make_fixnum (promptlen - 1)))) + != Swhitespace)); + AUTO_STRING (space_string, " "); + prompt = CALLN (Fconcat, prompt, + prompt_ends_in_nonspace ? space_string : empty_unibyte_string, + Vyes_or_no_prompt); specpdl_ref count = SPECPDL_INDEX (); specbind (Qenable_recursive_minibuffers, Qt); -- 2.43.0 [-- Attachment #3: 0002-Fix-misuse-of-toupper-in-sfnt_parse_style.patch --] [-- Type: text/x-patch, Size: 1360 bytes --] From 58a44b6ac317c9a17bcdd208e4ec33ff9772429e Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Tue, 17 Sep 2024 15:23:47 -0700 Subject: [PATCH 2/4] Fix misuse of toupper in sfnt_parse_style MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * src/sfntfont.c: Include c-ctype.h, not ctype.h. (sfnt_parse_style): Upcase just initial ASCII letters; that’s good enough here. --- src/sfntfont.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/sfntfont.c b/src/sfntfont.c index 8c02c05e7a6..f21b6c0782f 100644 --- a/src/sfntfont.c +++ b/src/sfntfont.c @@ -20,7 +20,7 @@ Copyright (C) 2023-2024 Free Software Foundation, Inc. #include <config.h> #include <fcntl.h> -#include <ctype.h> +#include <c-ctype.h> #include "lisp.h" @@ -534,12 +534,12 @@ sfnt_parse_style (Lisp_Object style_name, struct sfnt_font_desc *desc) } /* This token is extraneous or was not recognized. Capitalize - the first letter and set it as the adstyle. */ + the first letter if it's ASCII lowercase, then set the token as + the adstyle. */ if (strlen (single)) { - if (islower (single[0])) - single[0] = toupper (single[0]); + single[0] = c_toupper (single[0]); if (NILP (desc->adstyle)) desc->adstyle = build_string (single); -- 2.43.0 [-- Attachment #4: 0003-Fix-some-misparsing-in-check_interpreter.patch --] [-- Type: text/x-patch, Size: 1889 bytes --] From e0b027d1215ed32fe0f3d0956d2d7a81552274f2 Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Tue, 17 Sep 2024 16:38:53 -0700 Subject: [PATCH 3/4] Fix some #! misparsing in check_interpreter MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * exec/exec.c: Do not include ctype.h, as the kernel does not care about the locale. (check_interpreter): Treat only spaces and tabs as white space. Do not inspect more bytes than were read. Although the resulting code does not exactly match what the Android kernel does, it’s closer than what it was before. --- exec/exec.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/exec/exec.c b/exec/exec.c index 775a8b06b96..f31c9a89744 100644 --- a/exec/exec.c +++ b/exec/exec.c @@ -24,7 +24,6 @@ Copyright (C) 2023-2024 Free Software Foundation, Inc. #include <fcntl.h> #include <assert.h> #include <string.h> -#include <ctype.h> #include <stdlib.h> #include <sys/ptrace.h> @@ -116,11 +115,11 @@ check_interpreter (const char *name, int fd, const char **extra) /* Strip leading whitespace. */ start = buffer; - while (*start && ((unsigned char) *start) < 128 && isspace (*start)) + while (start < buffer + rc && (*start == ' ' || *start == '\t')) ++start; /* Look for a newline character. */ - end = memchr (start, '\n', rc); + end = memchr (start, '\n', buffer + rc - start); if (!end) goto fail; @@ -130,11 +129,12 @@ check_interpreter (const char *name, int fd, const char **extra) *end = '\0'; /* Now look for any whitespace characters. */ - ws = strchr (start, ' '); + for (ws = start; *ws && *ws != ' ' && *ws != '\t'; ws++) + continue; /* If there's no whitespace, return the entire start. */ - if (!ws) + if (!*ws) { if (lseek (fd, 0, SEEK_SET)) goto fail; -- 2.43.0 [-- Attachment #5: 0004-Use-c-ctype.h-in-lib-src.patch --] [-- Type: text/x-patch, Size: 4041 bytes --] From 865b54e2acea4fdaa3f302ed225f50281b371d6e Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Tue, 17 Sep 2024 16:54:17 -0700 Subject: [PATCH 4/4] Use c-ctype.h in lib-src This fixes some unlikely bugs and removes the temptation of using ctype.h. Although some uses were correct, many weren't. * lib-src/ebrowse.c: Include c-ctype.h, not ctype.h. * lib-src/emacsclient.c: Include c-ctype.h, not ctype.h. * lib-src/update-game-score.c: Include c-ctype.h, not ctype.h. All uses changed. --- lib-src/ebrowse.c | 10 +++++----- lib-src/emacsclient.c | 6 +++--- lib-src/update-game-score.c | 4 ++-- 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/lib-src/ebrowse.c b/lib-src/ebrowse.c index 821c29272a4..13c1befdc3e 100644 --- a/lib-src/ebrowse.c +++ b/lib-src/ebrowse.c @@ -22,11 +22,11 @@ Copyright (C) 1992-2024 Free Software Foundation, Inc. #include <stddef.h> #include <stdlib.h> #include <string.h> -#include <ctype.h> #include <assert.h> #include <getopt.h> #include <attribute.h> +#include <c-ctype.h> #include <flexmember.h> #include <min-max.h> #include <unlocked-io.h> @@ -1875,7 +1875,7 @@ #define RSTRING_EOF_CHECK \ int_suffixes: /* Integer suffixes. */ - while (isalpha (c)) + while (c_isalpha (c)) GET (c); UNGET (); return CINT; @@ -1907,7 +1907,7 @@ #define RSTRING_EOF_CHECK \ } /* Optional type suffixes. */ - while (isalpha (c)) + while (c_isalpha (c)) GET (c); UNGET (); return CFLOAT; @@ -2158,7 +2158,7 @@ init_scanner (void) /* Set up character class vectors. */ for (i = 0; i < sizeof is_ident; ++i) { - if (i == '_' || isalnum (i)) + if (i == '_' || c_isalnum (i)) is_ident[i] = 1; if (i >= '0' && i <= '9') @@ -2946,7 +2946,7 @@ operator_name (int *sc) MATCH (); /* If this is a simple operator like `+', stop now. */ - if (!isalpha ((unsigned char) *s) && *s != '(' && *s != '[') + if (!c_isalpha (*s) && *s != '(' && *s != '[') break; ++tokens_matched; diff --git a/lib-src/emacsclient.c b/lib-src/emacsclient.c index 8e64f1e92d3..c1ffa1480ec 100644 --- a/lib-src/emacsclient.c +++ b/lib-src/emacsclient.c @@ -68,7 +68,6 @@ Copyright (C) 1986-2024 Free Software Foundation, Inc. #define DEFAULT_TIMEOUT (30) -#include <ctype.h> #include <errno.h> #include <getopt.h> #include <inttypes.h> @@ -83,6 +82,7 @@ #define DEFAULT_TIMEOUT (30) #include <unistd.h> #include <attribute.h> +#include <c-ctype.h> #include <filename.h> #include <intprops.h> #include <min-max.h> @@ -2124,7 +2124,7 @@ main (int argc, char **argv) unsigned char c; do c = *++p; - while (isdigit (c) || c == ':'); + while (c_isdigit (c) || c == ':'); if (c == 0) { @@ -2136,7 +2136,7 @@ main (int argc, char **argv) } #ifdef WINDOWSNT else if (! IS_ABSOLUTE_FILE_NAME (argv[i]) - && (isalpha (argv[i][0]) && argv[i][1] == ':')) + && (c_isalpha (argv[i][0]) && argv[i][1] == ':')) /* Windows can have a different default directory for each drive, so the cwd passed via "-dir" is not sufficient to account for that. diff --git a/lib-src/update-game-score.c b/lib-src/update-game-score.c index e3b24ad7717..7545d0390c1 100644 --- a/lib-src/update-game-score.c +++ b/lib-src/update-game-score.c @@ -41,11 +41,11 @@ Copyright (C) 2002-2024 Free Software Foundation, Inc. #include <stdlib.h> #include <time.h> #include <pwd.h> -#include <ctype.h> #include <fcntl.h> #include <sys/stat.h> #include <getopt.h> +#include <c-ctype.h> #include <unlocked-io.h> #ifdef WINDOWSNT @@ -143,7 +143,7 @@ normalize_integer (char *num) { bool neg; char *p; - while (*num != '\n' && isspace (*num)) + while (*num != '\n' && c_isspace (*num)) num++; neg = *num == '-'; num += neg || *num == '-'; -- 2.43.0 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage 2024-09-18 0:05 ` Paul Eggert @ 2024-09-18 2:13 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-09-18 11:45 ` Eli Zaretskii 1 sibling, 0 replies; 10+ messages in thread From: Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-09-18 2:13 UTC (permalink / raw) To: Paul Eggert; +Cc: Eli Zaretskii, 73307-done, Thomas Klausner Paul Eggert <eggert@cs.ucla.edu> writes: > I can't easily test patch 0003, which fixes Android-specific > code. Although I think it's an improvement, in unlikely cases I > suspect it still doesn't exactly match what the Android kernel does > with #! lines. I don't know whether that matters. I'll CC this to Po > Lu (my goto person for Android) as a heads-up. It's OK. Though I have never encountered shebangs with tabs. ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage 2024-09-18 0:05 ` Paul Eggert 2024-09-18 2:13 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-09-18 11:45 ` Eli Zaretskii 2024-09-18 15:18 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-09-18 15:35 ` Paul Eggert 1 sibling, 2 replies; 10+ messages in thread From: Eli Zaretskii @ 2024-09-18 11:45 UTC (permalink / raw) To: Paul Eggert, Stefan Monnier; +Cc: luangruo, 73307, wiz > Date: Tue, 17 Sep 2024 17:05:29 -0700 > Cc: 73307-done@debbugs.gnu.org, Thomas Klausner <wiz@gatalith.at>, > Po Lu <luangruo@yahoo.com> > From: Paul Eggert <eggert@cs.ucla.edu> > > On 2024-09-17 04:52, Eli Zaretskii wrote: > > However, I'm not sure this is the right fix, the function is > > defined with argument type of 'int'. Paul, any comments? > > Although that patch was an improvement it still had problems, as it > incorrectly assumed the string does not end in a multibyte space, and > that Emacs's locale matches the system's. > > Emacs itself should not use <ctype.h> unless it knows the string is > unibyte and the system locale matches Emacs's. I scanned through its > source code looking for all problematic instances of <ctype.h> that have > crept in (except I didn't scan the MS-Windows code, where you're the > expert), and found five other places where ctype.h was obviously > misused. I installed the attached to fix these glitches and am boldly > closing this the report. > > I can't easily test patch 0003, which fixes Android-specific code. > Although I think it's an improvement, in unlikely cases I suspect it > still doesn't exactly match what the Android kernel does with #! lines. > I don't know whether that matters. I'll CC this to Po Lu (my goto person > for Android) as a heads-up. Thanks. This part doesn't look right to me: > - { > - char *s = SSDATA (prompt); > - ptrdiff_t len = strlen (s); > - if ((len > 0) && !isspace (s[len - 1])) > - prompt = CALLN (Fconcat, prompt, build_string (" ")); > - } > - prompt = CALLN (Fconcat, prompt, Vyes_or_no_prompt); > + ptrdiff_t promptlen = SCHARS (prompt); > + bool prompt_ends_in_nonspace > + = (0 < promptlen > + && (SYNTAX (XFIXNAT (Faref (prompt, make_fixnum (promptlen - 1)))) > + != Swhitespace)); > + AUTO_STRING (space_string, " "); > + prompt = CALLN (Fconcat, prompt, > + prompt_ends_in_nonspace ? space_string : empty_unibyte_string, > + Vyes_or_no_prompt); You are testing characters for whitespace syntax, which is AFAIU subject to buffer-local syntax tables. Thus, a strange enough setting of buffer syntax could make this code return unexpected results. Stefan, am I right? Why can't we use something closer to the original code, which doesn't depend on buffer-local customizations? In particular, what's the problem with using c_isspace (and what do you mean by "multibyte space"?) ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage 2024-09-18 11:45 ` Eli Zaretskii @ 2024-09-18 15:18 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-09-18 15:35 ` Paul Eggert 1 sibling, 0 replies; 10+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-09-18 15:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, 73307, Paul Eggert, wiz > You are testing characters for whitespace syntax, which is AFAIU > subject to buffer-local syntax tables. Thus, a strange enough setting > of buffer syntax could make this code return unexpected results. > > Stefan, am I right? Yup. > Why can't we use something closer to the original code, which doesn't > depend on buffer-local customizations? We could use the global/default syntax table? [ Or rely on some of the unicode tables? ] Stefan ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage 2024-09-18 11:45 ` Eli Zaretskii 2024-09-18 15:18 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-09-18 15:35 ` Paul Eggert 2024-09-18 16:11 ` Eli Zaretskii 1 sibling, 1 reply; 10+ messages in thread From: Paul Eggert @ 2024-09-18 15:35 UTC (permalink / raw) To: Eli Zaretskii, Stefan Monnier; +Cc: luangruo, 73307, wiz [-- Attachment #1: Type: text/plain, Size: 762 bytes --] On 2024-09-18 04:45, Eli Zaretskii wrote: > what's the > problem with using c_isspace (and what do you mean by "multibyte > space" Using c_isspace would be OK, in that it'd handle most practical examples. I wrote the fancier version to handle prompts that end in non-ASCII spaces, e.g., (yes-or-no-p "Delete all files? ") where the last character is actually U+00A0 NO-BREAK_SPACE instead of U+0020 SPACE, so its UTF-8 encoding is multiple bytes. If it's not important to handle such cases we could just use c_isspace. > You are testing characters for whitespace syntax, which is AFAIU > subject to buffer-local syntax tables. Thanks, I didn't think of that. How about the attached patch? It is an alternative to just using c_isspace. [-- Attachment #2: fns.c.diff --] [-- Type: text/x-patch, Size: 825 bytes --] diff --git a/src/fns.c b/src/fns.c index 370f7711b90..d3ee98c3bae 100644 --- a/src/fns.c +++ b/src/fns.c @@ -35,7 +35,6 @@ Copyright (C) 1985-2024 Free Software Foundation, Inc. #include "composite.h" #include "buffer.h" #include "intervals.h" -#include "syntax.h" #include "window.h" #include "puresize.h" #include "gnutls.h" @@ -3579,8 +3578,7 @@ DEFUN ("yes-or-no-p", Fyes_or_no_p, Syes_or_no_p, 1, 1, 0, ptrdiff_t promptlen = SCHARS (prompt); bool prompt_ends_in_nonspace = (0 < promptlen - && (SYNTAX (XFIXNAT (Faref (prompt, make_fixnum (promptlen - 1)))) - != Swhitespace)); + && !blankp (XFIXNAT (Faref (prompt, make_fixnum (promptlen - 1))))); AUTO_STRING (space_string, " "); prompt = CALLN (Fconcat, prompt, prompt_ends_in_nonspace ? space_string : empty_unibyte_string, ^ permalink raw reply related [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage 2024-09-18 15:35 ` Paul Eggert @ 2024-09-18 16:11 ` Eli Zaretskii 2024-09-18 16:22 ` Paul Eggert 0 siblings, 1 reply; 10+ messages in thread From: Eli Zaretskii @ 2024-09-18 16:11 UTC (permalink / raw) To: Paul Eggert; +Cc: luangruo, 73307, wiz, monnier > Date: Wed, 18 Sep 2024 08:35:18 -0700 > Cc: 73307@debbugs.gnu.org, wiz@gatalith.at, luangruo@yahoo.com > From: Paul Eggert <eggert@cs.ucla.edu> > > > You are testing characters for whitespace syntax, which is AFAIU > > subject to buffer-local syntax tables. > > Thanks, I didn't think of that. How about the attached patch? It is an > alternative to just using c_isspace. That's much better, thanks. But I'm not sure we really want to allow any Zs character there. For example, what about U+3000 IDEOGRAPHIC SPACE? But if this is okay with us, please install. ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#73307: Fix ctype(3) usage 2024-09-18 16:11 ` Eli Zaretskii @ 2024-09-18 16:22 ` Paul Eggert 0 siblings, 0 replies; 10+ messages in thread From: Paul Eggert @ 2024-09-18 16:22 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, 73307, wiz, monnier On 2024-09-18 09:11, Eli Zaretskii wrote: > what about U+3000 IDEOGRAPHIC SPACE? That should be OK. For (yes-or-no-p "您想删除所有文件吗? "), where the prompt ends in U+3000, no space is needed. I installed the patch. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-09-18 16:22 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-09-16 21:25 bug#73307: Fix ctype(3) usage Thomas Klausner 2024-09-17 11:52 ` Eli Zaretskii 2024-09-17 11:55 ` Thomas Klausner 2024-09-18 0:05 ` Paul Eggert 2024-09-18 2:13 ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-09-18 11:45 ` Eli Zaretskii 2024-09-18 15:18 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-09-18 15:35 ` Paul Eggert 2024-09-18 16:11 ` Eli Zaretskii 2024-09-18 16:22 ` Paul Eggert
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).