From: Martin Jerabek <om@mailservice.ms>
To: 58718@debbugs.gnu.org
Subject: bug#58718: Incorrect regex in nXML URI check
Date: Sat, 22 Oct 2022 14:58:28 +0200
Message-ID: <938384610b8c5411944afa7fae860f15e2e40eae.camel@fastmail.fm>

[-- Attachment #1: Type: text/plain, Size: 3212 bytes --]

Hi!

In the function rng-uri-file-name-1 (file lisp/nxml/rng-uri.el line 71)
the regular expression to check the passed URI is wrong. It is intended
to make sure that special characters in the URI are correctly encoded,
i.e. that the percent sign is followed by exactly two hex digits. The
relevant part of the regular expression is

%[[:xdigit:]]{2}

However, the curly braces only have their special meaning of specifying
the number of repetitions if they are escaped by a backslash. Otherwise
they are interpreted as literal braces, so the current regular
expression would only match a percent sign followed by one hex digit
followed by the literal string "{2}".
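
The difference can be checked directly, for example with M-: or in the
*scratch* buffer. The following is only a minimal sketch on the escape
fragment itself (anchored with \` and \' for the test), not the code from
rng-uri.el:

```elisp
;; Unescaped braces are ordinary characters in Emacs regexps.
(string-match "\\`%[[:xdigit:]]{2}\\'" "%20")      ; => nil, input has no literal "{2}"
(string-match "\\`%[[:xdigit:]]{2}\\'" "%2{2}")    ; => 0, one hex digit plus literal "{2}"

;; Backslash-escaped braces are the repetition operator.
(string-match "\\`%[[:xdigit:]]\\{2\\}\\'" "%20")  ; => 0, exactly two hex digits
(string-match "\\`%[[:xdigit:]]\\{2\\}\\'" "%2")   ; => nil, only one hex digit
```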

I stumbled upon this problem when trying to edit an XML file located in
a path whose name contained space characters. Associating a RELAX-NG
schema with this file (in the schemas.xml file) and reloading the file
with rng-auto-set-schema-and-validate resulted in the error message

"Bad escapes in URI 'file:///home/user/path%20with%20spaces/foo.rnc'"
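
The failure can be reproduced without nXML by applying the two versions of
the regular expression to the URI from the error message. This is a sketch
only; the names prefixed with my/ are hypothetical helpers for
illustration and do not exist in rng-uri.el:

```elisp
;; Pattern currently in rng-uri-file-name-1: the braces are literal.
(defconst my/uri-escape-re-old
  "\\`\\(?:[^%]\\|%[[:xdigit:]]{2}\\)*\\'")

;; Pattern with escaped braces: "%" must be followed by two hex digits.
(defconst my/uri-escape-re-new
  "\\`\\(?:[^%]\\|%[[:xdigit:]]\\{2\\}\\)*\\'")

(defun my/uri-escapes-ok-p (regexp uri)
  "Return t if URI matches REGEXP as a whole, nil otherwise."
  (and (string-match-p regexp uri) t))

(let ((uri "file:///home/user/path%20with%20spaces/foo.rnc"))
  (list (my/uri-escapes-ok-p my/uri-escape-re-old uri)    ; nil, valid URI rejected
        (my/uri-escapes-ok-p my/uri-escape-re-new uri)))  ; t, valid URI accepted

;; A malformed escape is still rejected by the corrected pattern.
(my/uri-escapes-ok-p my/uri-escape-re-new "file:///tmp/100%.rnc")  ; nil
```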

I used a path without spaces to work around this problem for the time
being. As far as I can tell, this bug has existed since the first
version of rng-uri.el, i.e. it has never worked as intended (unless the
Emacs regular expression syntax changed in the meantime).

I am using Emacs 28.1 from the current Fedora 36 package but I checked
out the master branch of the Emacs source code to make sure that the
problem still exists there.

Find attached a patch to fix this problem. I hope it is trivial enough
to be applied without a formal copyright assignment (it just adds four
backslashes).

Best regards
Martin Jerabek


In GNU Emacs 28.1 (build 1, x86_64-redhat-linux-gnu, GTK+ Version
3.24.34, cairo version 1.17.6)
 of 2022-07-15 built on buildhw-x86-02.iad2.fedoraproject.org
Windowing system distributor 'The X.Org Foundation', version 11.0.12014000
System Description: Fedora Linux 36 (Workstation Edition)

Configured using:
 'configure --build=x86_64-redhat-linux-gnu
 --host=x86_64-redhat-linux-gnu --program-prefix=
 --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr
 --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc
 --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64
 --libexecdir=/usr/libexec --localstatedir=/var
 --sharedstatedir=/var/lib --mandir=/usr/share/man
 --infodir=/usr/share/info --with-dbus --with-gif --with-jpeg
 --with-png --with-rsvg --with-tiff --with-xpm --with-x-toolkit=gtk3
 --with-gpm=no --with-xwidgets --with-modules --with-harfbuzz
 --with-cairo --with-json --with-native-compilation
 build_alias=x86_64-redhat-linux-gnu
 host_alias=x86_64-redhat-linux-gnu CC=gcc 'CFLAGS=-DMAIL_USE_LOCKF -O2
 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches
 -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2
 -Wp,-D_GLIBCXX_ASSERTIONS
 -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1
 -fstack-protector-strong
 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic
 -fasynchronous-unwind-tables -fstack-clash-protection
 -fcf-protection' LDFLAGS=-Wl,-z,relro
 PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig'


[-- Attachment #2: rng-uri.patch --]
[-- Type: text/x-patch, Size: 571 bytes --]

diff --git a/lisp/nxml/rng-uri.el b/lisp/nxml/rng-uri.el
index 77fed8c32d..0a6fa39acb 100644
--- a/lisp/nxml/rng-uri.el
+++ b/lisp/nxml/rng-uri.el
@@ -68,7 +68,7 @@ Signal an error if URI is not a valid file URL."
 
 ;; pattern is either nil or match or replace
 (defun rng-uri-file-name-1 (uri pattern)
-  (unless (string-match "\\`\\(?:[^%]\\|%[[:xdigit:]]{2}\\)*\\'" uri)
+  (unless (string-match "\\`\\(?:[^%]\\|%[[:xdigit:]]\\{2\\}\\)*\\'" uri)
     (rng-uri-error "Bad escapes in URI `%s'" uri))
   (setq uri (rng-uri-unescape-multibyte uri))
   (let* ((components
