From mboxrd@z Thu Jan 1 00:00:00 1970
From: Francis Wright
Newsgroups: gmane.emacs.devel
Subject: Emacs pretest list; wildcard-to-regexp
Date: Sat, 28 Apr 2012 18:21:10 +0100
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_NextPart_000_0056_01CD256B.B018DC00"
Xref: news.gmane.org gmane.emacs.devel:150123

------=_NextPart_000_0056_01CD256B.B018DC00
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Can you please change my email address on the pretest list from
f.j.wright@qmul.ac.uk to f.j.wright@live.co.uk? (I don't think I can do
this myself, can I?) Thanks.

About 10 years ago, I wrote a replacement for `wildcard-to-regexp'
(defined in files.el) that supports bash-style {a,b,...} expansion
controlled by the customizable option `wildcard-to-regexp-expand-{}'.
I have just updated this for Emacs 24 at the request of a user who said
he has found it useful. (I had forgotten that I ever released it, but I
must have done!) I attach my current version.

Would you like to include this modification in a future version of
Emacs or as a package? If the former then I'll provide it as a file of
diffs to files.el. If the latter then I'll provide it in the right
format for a package. (If neither then I'll think about releasing it
somewhere else, such as EmacsWiki.)

I have already assigned to the FSF the copyright of any Emacs
modifications that I might make.

Best wishes,
Francis

------=_NextPart_000_0056_01CD256B.B018DC00
Content-Type: application/octet-stream; name="wildcard-to-regexp.el"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="wildcard-to-regexp.el"

;;; wildcard-to-regexp.el --- with bash-style {a,b,...} expansion

;; Modified by: Francis J. Wright
;; Time-stamp: <2012-04-28 17:53:03 fjw>

;;; Commentary:

;; This version of `wildcard-to-regexp' adds optional bash-style
;; {a,b,...} expansion controlled by the customizable option
;; `wildcard-to-regexp-expand-{}'. In particular, this gives users of
;; `ls-lisp' (the default version of `ls' used by non-UNIX ports of
;; Emacs) the main functionality of running `ls' under bash. (The only
;; other standard Emacs 24 packages that call `wildcard-to-regexp' are
;; `grep' and `vhdl-mode'.) It also supports escaping by \ better, but
;; quoting is not supported.

;; The function `wildcard-to-regexp' (defined in files.el) is
;; preloaded, so just byte-compile this package somewhere in your
;; `load-path' and (require 'wildcard-to-regexp) in order to redefine
;; `wildcard-to-regexp'.

;; Test examples:

;; Note that in these examples ^@ represents the ASCII NULL character.
;; Hence [^^@] represents a regexp that matches any non-NULL ASCII
;; character. (This code will probably not work correctly with
;; non-ASCII filenames!) This file contains only US-ASCII characters.

;; These examples are all expanded:

;; (wildcard-to-regexp "*.{c,h,cpp}") -> "\\`[^^@]*\\.\\(c\\|h\\|cpp\\)\\'"
;; (wildcard-to-regexp "*.{txt,texi}") -> "\\`[^^@]*\\.\\(txt\\|texi\\)\\'"
;; (wildcard-to-regexp "*.{txt,}") -> "\\`[^^@]*\\.\\(txt\\|\\)\\'"
;; (wildcard-to-regexp "*.{,txt}") -> "\\`[^^@]*\\.\\(\\|txt\\)\\'"

;; but these are not:

;; (wildcard-to-regexp "*.{txt}") -> "\\`[^^@]*\\.{txt}\\'"
;; (wildcard-to-regexp "*.{txt\\,texi}") -> "\\`[^^@]*\\.{txt\\,texi}\\'"
;; (wildcard-to-regexp "*.\\{txt,texi}") -> "\\`[^^@]*\\.\\{txt,texi}\\'"

;; From bash manual:

;; (wildcard-to-regexp "/usr/local/src/bash/{old,new,dist,bugs}") ->
;; "\\`/usr/local/src/bash/\\(old\\|new\\|dist\\|bugs\\)\\'"

;; (wildcard-to-regexp "/usr/{ucb/{ex,edit},lib/{ex?.?*,how_ex}}") ->
;; "\\`/usr/\\(ucb/\\(ex\\|edit\\)\\|lib/\\(ex[^^@]\\.[^^@][^^@]*\\|how_ex\\)\\)\\'"

;;; History:

;; This file is based closely on `wildcard-to-regexp' defined in
;; `files.el', with a small addition that I originally wrote in July
;; 2001 for use with Emacs 21. I subsequently updated it in April 2012
;; for use with Emacs 24.

;;; Code:

(eval-when-compile
  (require 'cl)) ; for incf, where `(incf i)' equiv `(setq i (1+ i))'

(defcustom wildcard-to-regexp-expand-{} t
  "*Non-nil causes `wildcard-to-regexp' to expand {a,b,...} like bash.
This affects `grep', `ls-lisp' (which `dired' may use, see
`insert-directory') and `vhdl-mode'."
  :type 'boolean
  :group 'grep
  :group 'ls-lisp
  :group 'vhdl-mode
  :version "21.2")

(defun wildcard-to-regexp (wildcard)
  "Given a shell file name pattern WILDCARD, return an equivalent regexp.
The generated regexp will match a filename only if the filename
matches that wildcard according to shell rules.
If `wildcard-to-regexp-expand-{}' is non-nil then expand `{a,b,...}'
like bash, allowing arbitrary nesting. To use `{', `,' and `}' for
any other purpose they must be escaped by a preceding `\\'."
  ;; Shell wildcards should match the entire filename,
  ;; not its part. Make the regexp say so.
  (concat "\\`" (wildcard-to-regexp-1 wildcard) "\\'"))

(defun wildcard-to-regexp-1 (wildcard)
  "As `wildcard-to-regexp' (WILDCARD) but without the \\`...\\'.
Calls itself recursively."
  (let* ((i (string-match "[[.*+\\^$?{]" wildcard))
         ;; Copy the initial run of non-special characters.
         (result (substring wildcard 0 i))
         (len (length wildcard)))
    ;; If no special characters, we're almost done.
    (if i
        (while (< i len)
          (let ((ch (aref wildcard i))
                j)
            (setq
             result
             (concat result
                     (cond
                      ((and (eq ch ?\[)
                            (< (1+ i) len)
                            (eq (aref wildcard (1+ i)) ?\]))
                       "\\[")
                      ((eq ch ?\[) ; [...] maps to regexp char class
                       (incf i)
                       (concat
                        (cond
                         ((eq (aref wildcard i) ?!) ; [!...] -> [^...]
                          (incf i)
                          (if (eq (aref wildcard i) ?\])
                              (progn
                                (incf i)
                                "[^]")
                            "[^"))
                         ((eq (aref wildcard i) ?^)
                          ;; Found "[^". Insert a `\0' character
                          ;; (which cannot happen in a filename)
                          ;; into the character class, so that `^'
                          ;; is not the first character after `[',
                          ;; and thus non-special in a regexp.
                          (incf i)
                          "[\000^")
                         ((eq (aref wildcard i) ?\])
                          ;; I don't think `]' can appear in a
                          ;; character class in a wildcard, but
                          ;; let's be general here.
                          (incf i)
                          "[]")
                         (t "["))
                        (prog1 ; copy everything upto next `]'.
                            (substring wildcard
                                       i
                                       (setq j (string-match
                                                "]" wildcard i)))
                          (setq i (if j (1- j) (1- len))))))
                      ((eq ch ?.) "\\.")
                      ((eq ch ?*) "[^\000]*")
                      ((eq ch ?+) "\\+")
                      ((eq ch ?^) "\\^")
                      ((eq ch ?$) "\\$")
                      ((eq ch ?\\) ; FJW
                       (incf i)
                       (if (< i len)
                           (concat "\\" (char-to-string (aref wildcard i)))
                         "\\\\"))
                      ((eq ch ??) "[^\000]")
                      ((and (eq ch ?{)
                            wildcard-to-regexp-expand-{})
                       ;; FJW: {a,b,...} -> \(a\|b\|...\)
                       ;; Return regexp equivalent to bash
                       ;; `{a,b,...}'-pattern in string wildcard
                       ;; beginning at index i.
                       ;; [Note that wildcard-to-regexp-find-\,} start
                       ;; index must allow for a preceding character
                       ;; [^\], and so is i rather than (1+ i), etc.]
                       ;; Find first comma:
                       (let (s j ii)
                         (if (not (and (setq j (wildcard-to-regexp-find-\,} wildcard i))
                                       (eq (aref wildcard j) ?,)))
                             "{" ; does not match {a,...}
                           (setq s (concat "\\(" ; Emacs 21: use shy group "\\(?:" ?
                                           (wildcard-to-regexp-1
                                            (substring wildcard (1+ i) j)))
                                 ii j)
                           ;; Find subsequent commas or closing brace:
                           (while (and (setq j (wildcard-to-regexp-find-\,} wildcard ii))
                                       (eq (aref wildcard j) ?,))
                             (setq s (concat s "\\|"
                                             (wildcard-to-regexp-1
                                              (substring wildcard (1+ ii) j)))
                                   ii j))
                           ;; Found closing brace or failed:
                           (cond
                            (j (setq s (concat s "\\|"
                                               (wildcard-to-regexp-1
                                                (substring wildcard (1+ ii) j)))
                                     i j) ; update i
                               (concat s "\\)")) ; return regexp
                            (t "{")) ; does not match {a,...}
                           )))
                      (t (char-to-string ch)))))
            (incf i))))
    result))

(defun wildcard-to-regexp-find-\,} (s i)
  "Return index of first top-level `,' or `}' after `{' in string S at index I.
Allow nested `{...}' and ignore characters escaped by a preceding `\\'."
  (setq i (string-match "[^\\][{,}]" s i))
  (while (and i (eq (aref s (1+ i)) ?{))
    (setq i (wildcard-to-regexp-skip-{} s (1+ i)))
    (if i (setq i (string-match "[^\\][{,}]" s i))))
  (and i (1+ i)))

(defun wildcard-to-regexp-skip-{} (s i)
  "Return index of `}' matching `{' in string S at index I.
Allow nested `{...}' and ignore characters escaped by a preceding `\\'."
  (setq i (string-match "[^\\][{}]" s i))
  (while (and i (eq (aref s (1+ i)) ?{))
    (setq i (wildcard-to-regexp-skip-{} s (1+ i)))
    (if i (setq i (string-match "[^\\][{}]" s i))))
  (and i (1+ i)))

(provide 'wildcard-to-regexp)

;;; wildcard-to-regexp.el ends here
------=_NextPart_000_0056_01CD256B.B018DC00--
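
A small illustration of what the expanded regexps do in practice may help when reading the "Test examples" in the attachment's commentary. The sketch below hard-codes the result that the commentary lists for "*.{c,h,cpp}", writing the NUL character shown there as ^@ in its octal form \000, and uses it to filter a list of file names. The names `my-c-sources-regexp' and `my-filter-names' are hypothetical helpers invented for this example and are not part of the attachment; the sketch does not call the redefined `wildcard-to-regexp' itself.

```elisp
;; Regexp that the attachment's commentary gives for "*.{c,h,cpp}",
;; copied here as a literal (assumption: ^@ there stands for \000).
(defvar my-c-sources-regexp "\\`[^\000]*\\.\\(c\\|h\\|cpp\\)\\'"
  "Hypothetical example value: expanded form of \"*.{c,h,cpp}\".")

(defun my-filter-names (regexp names)
  "Return the elements of NAMES that match REGEXP, in their original order."
  (let (matches)
    (dolist (name names (nreverse matches))
      (when (string-match-p regexp name)
        (push name matches)))))

;; Only names ending in .c, .h or .cpp survive the filter.
(my-filter-names my-c-sources-regexp
                 '("main.c" "util.h" "app.cpp" "notes.txt" "x.cc"))
;; => ("main.c" "util.h" "app.cpp")
```

Because the regexp is anchored with \` and \', a name such as "x.cc" is rejected even though it contains ".c": the alternation must be followed directly by the end of the string.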