unofficial mirror of bug-gnu-emacs@gnu.org 
From: David Ponce <da_vid@orange.fr>
To: 65496@debbugs.gnu.org
Subject: bug#65496: 30.0.50; Issue with the regexp used to auto-detect PBM image data
Date: Mon, 4 Sep 2023 18:32:22 +0200
Message-ID: <bc7d3182-9647-05be-f2c0-b27275a824e5@orange.fr>
In-Reply-To: <2fea228e-a8e8-5b8e-b91d-2d808d624649@orange.fr>

On 24/08/2023 12:55, David Ponce wrote:
> Hello,
> 
> While experimenting with code to create an image from data, I encountered
> an issue with the regexp in `image-type-header-regexps' used to
> auto-detect PBM image type from the first bytes of image data. That is:
> 
> "\\`P[1-6]\\(?:\
> \\(?:\\(?:#[^\r\n]*[\r\n]\\)*[[:space:]]\\)+\
> \\(?:\\(?:#[^\r\n]*[\r\n]\\)*[0-9]\\)+\
> \\)\\{2\\}"
> 
> Here is a simple recipe to illustrate the issue:
> 
> In *scratch* buffer eval:
> -------------------------
> ;; Get content of a pbm file.
> (setq test-data
>        (with-current-buffer
>            (find-file-noselect "[YourEmacsPath]/etc/images/splash.pbm")
>          (prog1 (buffer-substring-no-properties (point-min) (point-max))
>            (kill-buffer (current-buffer)))))
> 
> ;; Checking the string data fails to detect the pbm image type!
> (image-type-from-data test-data)
>>>> nil
> ;; With a temp buffer current, the same test works!
> (with-temp-buffer
>   (image-type-from-data test-data))
>>>> pbm
> -------------------------
> 
> After further digging, I found that the problem seems to be due to the use
> of the [:space:] character class, whose meaning, according to the manual,
> depends on which characters have whitespace syntax in the current buffer.
> So, using explicit characters in place of the character class seems to
> solve the issue:
> 
> (setcar (nth 1 image-type-header-regexps)
>          "\\`P[1-6]\\(?:\
> \\(?:\\(?:#[^\r\n]*[\r\n]\\)*[ \t\r\n]\\)+\
> \\(?:\\(?:#[^\r\n]*[\r\n]\\)*[0-9]\\)+\
> \\)\\{2\\}")
> 
> (image-type-from-data test-data)
>>>> pbm
> 
> I attached a patch proposal.
> Hope it will help.
> Regards
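
Until the regexp itself is changed, a caller could sidestep the problem by
forcing the standard syntax table around the call.  The following is only a
minimal sketch: `my-image-type-from-data' is a hypothetical helper name, not
part of Emacs, and it assumes that `[[:space:]]' is the only
buffer-dependent part of the detection.

```elisp
(require 'image)

;; Hypothetical helper (not part of Emacs): detect the image type of
;; DATA with the standard syntax table in effect, so that the
;; `[[:space:]]' class in `image-type-header-regexps' does not depend
;; on the syntax table of whatever buffer happens to be current.
(defun my-image-type-from-data (data)
  "Return the image type detected from string DATA, or nil."
  (with-syntax-table (standard-syntax-table)
    (image-type-from-data data)))
```

With this wrapper, (my-image-type-from-data test-data) should give the same
answer from *scratch* as from a temporary buffer.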

Some additions.

Basic string matching recipe:

In *scratch* buffer eval:
-------------------------

(let ((re "\\`P[1-6]\\(?:\
\\(?:\\(?:#[^\r\n]*[\r\n]\\)*[[:space:]]\\)+\
\\(?:\\(?:#[^\r\n]*[\r\n]\\)*[0-9]\\)+\
\\)\\{2\\}")
      (text "P4
333 233"))
  (string-match-p re text))
>>> nil

(with-syntax-table (standard-syntax-table)
  (let ((re "\\`P[1-6]\\(?:\
\\(?:\\(?:#[^\r\n]*[\r\n]\\)*[[:space:]]\\)+\
\\(?:\\(?:#[^\r\n]*[\r\n]\\)*[0-9]\\)+\
\\)\\{2\\}")
        (text "P4
333 233"))
    (string-match-p re text)))
>>> 0
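
A likely explanation for the difference, assuming *scratch* is in its default
`lisp-interaction-mode' (which inherits the Emacs Lisp syntax table): there,
newline has comment-end syntax rather than whitespace syntax.  The sketch
below compares the two tables; `my-newline-syntax-class' is a hypothetical
helper introduced only for this illustration.

```elisp
(require 'elisp-mode)

;; Hypothetical helper: return the syntax class designator character
;; of newline when TABLE is the current syntax table.
(defun my-newline-syntax-class (table)
  (with-syntax-table table
    (char-syntax ?\n)))

(list (my-newline-syntax-class (standard-syntax-table))
      (my-newline-syntax-class emacs-lisp-mode-syntax-table))
;; => (32 62), that is (?\s ?>): whitespace vs. comment ender
```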

I wonder whether it is expected that matching a regular expression against a
string object depends on the syntax table in effect in the current buffer.
Shouldn't (standard-syntax-table) be implied when matching a regexp against a
string object, that is, regardless of any buffer context?

Regards
