From: David Hansen <david.hansen@gmx.net>
To: emacs-devel@gnu.org
Subject: Re: [david.hansen@gmx.net: Re: comint's directory tracking doesn't understand \( or \)]
Date: Wed, 07 Mar 2007 15:49:33 +0100
Message-ID: <873b4hjegy.fsf@localhorst.mine.nu>
In-Reply-To: 87irddev4c.fsf@catnip.gol.com

On Wed, 07 Mar 2007 09:48:51 +0900 Miles Bader wrote:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> Wouldn't it be better to move it to shell.el?
>> I.e. create a new function shell-argument?
>
> I think it would be a nicer interface if it were split into two
> functions:  one which would just parse the arguments and return a
> lisp list of them (e.g. `shell-split') and one which would call
> shell-split and do the choose-N-through-M-and-apply-mapconcat stuff to
> return a string (e.g., `shell-arguments-string').
>
> E.g.:
>
>    (shell-split "this is \"a test\"") => ("this" "is" "a test")
>
>    (shell-arguments-string "this is \"a test\"" 1 2) => "is \"a test\""
>
> Simple uses might use the latter but the former seems a generally
> cleaner interface and building block for other uses.

OK, first of all I'm really, really sorry, but I just couldn't resist
allowing "&", ";" and "|" in directory names too.  So I dropped what
Stefan called `shell-arguments-string'.  In this case it would just
make things more complicated.
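
To illustrate why separator characters inside directory names are
awkward for a purely regexp-based approach, here is a minimal sketch.
It is not the code from shell.el; `my-naive-split-commands' is a
hypothetical name, and it only shows the simplest possible guess,
which ignores escaping altogether.

```elisp
;; Illustration only: naive splitting on separator characters.  This is
;; NOT how shell.el does it, just the simplest regexp-based guess.
(defun my-naive-split-commands (string)
  "Split STRING at runs of `;', `&' and `|', ignoring any escaping."
  (split-string string "[;&|]+" t))

;; The backslash-escaped `;' belongs to the directory name, but the
;; naive split cuts the command in two anyway.
(my-naive-split-commands "cd foo\\;bar")
;; => ("cd foo\\" "bar")
```

A splitter that tracks quoting and escaping state keeps "foo\;bar"
together as one argument.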

I have renamed `shell-split' to `shell-split-arguments' and introduced
a little helper function, `shell-split-commands'.

`shell-split-commands' replaces the old regular-expression-based
guessing of where the different commands in the string are (e.g. "cd foo;
cd bar && cd baz").  Now that we have the list of arguments from
`shell-split-arguments', this can be done far more reliably.
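
The grouping step can be sketched on its own, roughly as follows.
This is a self-contained illustration rather than the patch itself:
`my-group-commands' is a hypothetical name, the input is assumed to be
an argument list that has already been split, and "[^;&|\n]+" is
assumed as a stand-in for the value of `shell-command-regexp'.

```elisp
;; Hypothetical helper, not part of the patch: group an already-split
;; argument list into commands.  COMMAND-REGEXP stands in for
;; `shell-command-regexp'; "[^;&|\n]+" is assumed as its value here.
(defun my-group-commands (arguments &optional command-regexp)
  "Group ARGUMENTS into a list of commands, each a list of arguments.
Arguments that do not start with COMMAND-REGEXP act as separators."
  (let ((regexp (concat "\\`\\(?:" (or command-regexp "[^;&|\n]+") "\\)"))
        (commands '())
        (command '()))
    (dolist (arg arguments)
      (if (string-match regexp arg)
          (push arg command)            ; ordinary argument
        ;; ARG is a separator such as ";" or "&&": close the command.
        (when command
          (push (nreverse command) commands)
          (setq command nil))))
    (when command
      (push (nreverse command) commands))
    ;; Commands were collected back to front; restore input order.
    (nreverse commands)))

(my-group-commands '("cd" "foo" ";" "cd" "bar" "&&" "cd" "baz"))
;; => (("cd" "foo") ("cd" "bar") ("cd" "baz"))
```

Because the separators arrive as arguments of their own, an argument
such as "foo;bar" that merely contains a separator character is kept
as part of its command.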

I feel a bit bad that this became such a long discussion.  I think
this is an improvement to shell-mode, but it's not worth delaying the
release of the one true editor.  If these changes go too far, let's
just drop them for now and resume the discussion after the release.

David

*** shell.el	05 Mar 2007 01:10:08 +0100	1.149
--- shell.el	07 Mar 2007 15:48:35 +0100	
***************
*** 105,110 ****
--- 105,111 ----
  ;;; Code:
  
  (require 'comint)
+ (eval-when-compile (require 'cl))
  
  ;;; Customization and Buffer Variables
  
***************
*** 569,574 ****
--- 570,681 ----
  ;; Don't do this when shell.el is loaded, only while dumping.
  ;;;###autoload (add-hook 'same-window-buffer-names "*shell*")
  
+ ;;; Argument Splitting
+ 
+ (defun shell-split-arguments (string)
+   "Split STRING into its arguments and return a list of arguments.
+ We assume members of `comint-delimiter-argument-list' and
+ whitespace separate arguments, except within quotes or (unless
+ on MS-DOS) if escaped by a backslash character.  A run of more
+ than one character in `comint-delimiter-argument-list' is treated
+ as a single argument."
+   (let ((len (length string))
+         (esc (unless (and (fboundp 'w32-shell-dos-semantics)
+                           (w32-shell-dos-semantics))
+                ?\\))          ; backslash escape character, none on MS-DOS
+         (ifs '(?\n ?\t ?\ ))  ; whitespace word delimiters (the bash default)
+         (quo '(?\" ?\' ?\`))  ; string quoting characters
+         (i 0)                 ; character index of the string
+         (beg 0)               ; beginning of the currently parsed argument
+         state                 ; stack of parsing states (see below for details)
+         args)                 ; list of arguments parsed so far
+     (flet ((push-arg (new-beg)
+              ;; With the index `i' at the end of an argument push it to the
+              ;; list `args' and set the beginning of the next argument `beg'
+              ;; to NEW-BEG.
+              (push (substring string beg i) args)
+              (setq beg new-beg)))
+       ;; Loop over the characters of STRING and maintain a stack of "parser
+       ;; states".  Each parser state is a character.
+       ;;
+       ;; If it is a member of the list `quo' we are within a quoted string that
+       ;; is delimited by this character.
+       ;;
+       ;; If it is a member of `comint-delimiter-argument-list' it is the value
+       ;; of the previously scanned character.  We need to keep track of it
+       ;; because a sequence of equal elements of
+       ;; `comint-delimiter-argument-list' is one argument (like '>>' or '&&').
+       ;;
+       ;; If it is `esc' (a backslash on most systems) the current character is
+       ;; escaped by it and treated like any ordinary non-special character.
+       (while (<= i len)
+         (let ((s (car state))                      ; current state
+               (c (and (< i len) (aref string i)))) ; current character
+           (cond
+             ((and esc (eq esc s))       ; backslash escaped
+              (pop state))
+             ;; If within a sequence of `comint-delimiter-argument-list'
+             ;; characters check for the end of it (some different character).
+             ((and (member s comint-delimiter-argument-list) (not (eq s c)))
+              (push-arg i)
+              (pop state)
+              (decf i))                  ; parse this character again
+             ((member c quo)             ; quote character
+              (if (eq c s)
+                  ;; We are within a quote delimited string and the current
+                  ;; character is the same as the one that started the string.
+                  ;; We reached the end.  Update `state'.
+                  (pop state)
+                ;; The current character only starts a new quote delimited
+                ;; string if we aren't already in such a construct (which is
+                ;; equivalent to `s' being nil).  Keeping track of nested
+                ;; constructs doesn't make any sense when splitting arguments.
+                (or s (push c state))))
+             ;; If the current character is a backslash it quotes the next
+             ;; character unless we are within a `'' or ``' delimited string.
+             ((and esc (eq esc c) (not (or (eq ?\' s) (eq ?\` s))))
+              (push c state))
+             ((and (not s) (member c ifs)) ; space delimiters
+              (if (= beg i)
+                  ;; Some other character before this space already delimited an
+                  ;; argument.  Just adjust the beginning of the next argument.
+                  (incf beg)
+                ;; We found the end of an argument.
+                (push-arg (1+ i))))
+             ;; Check for special argument delimiting characters.
+             ((and (not s) (member c comint-delimiter-argument-list))
+              (push c state)
+              (when (/= beg i)
+                ;; This character ends the previous argument (there is no
+                ;; whitespace before it).
+                (push-arg i)))
+             ((not c)                    ; end of the string
+              (unless (= beg len)        ; no whitespace at the end
+                (push-arg len))))
+           (incf i)))
+       (nreverse args))))
+ 
+ (defun shell-split-commands (string)
+   ;; Split STRING into a list of commands.
+   ;; First STRING is split into its arguments.  Then every argument that does
+   ;; not start with `shell-command-regexp' is assumed to be a separator of two
+   ;; commands.
+   ;; Return a list of commands where each command itself is a list of all its
+   ;; arguments.
+   (let ((regexp (concat "^" shell-command-regexp))
+         (arguments (shell-split-arguments string))
+         (command-list '())
+         (command '())
+         arg)
+     (while (setq arg (pop arguments))
+       (if (string-match regexp arg)
+           (push arg command)
+         (when command
+           (push (nreverse command) command-list)
+           (setq command nil))))
+     (and command (push (nreverse command) command-list))
+     (nreverse command-list)))
+ 
  ;;; Directory tracking
  ;;
  ;; This code provides the shell mode input sentinel
***************
*** 626,663 ****
    (if shell-dirtrackp
        ;; We fail gracefully if we think the command will fail in the shell.
        (condition-case chdir-failure
! 	  (let ((start (progn (string-match
! 			       (concat "^" shell-command-separator-regexp)
! 			       str) ; skip whitespace
! 			      (match-end 0)))
! 		end cmd arg1)
! 	    (while (string-match shell-command-regexp str start)
! 	      (setq end (match-end 0)
! 		    cmd (comint-arguments (substring str start end) 0 0)
! 		    arg1 (comint-arguments (substring str start end) 1 1))
! 	      (if arg1
! 		  (setq arg1 (shell-unquote-argument arg1)))
! 	      (cond ((string-match (concat "\\`\\(" shell-popd-regexp
! 					   "\\)\\($\\|[ \t]\\)")
! 				   cmd)
! 		     (shell-process-popd (comint-substitute-in-file-name arg1)))
! 		    ((string-match (concat "\\`\\(" shell-pushd-regexp
! 					   "\\)\\($\\|[ \t]\\)")
! 				   cmd)
! 		     (shell-process-pushd (comint-substitute-in-file-name arg1)))
! 		    ((string-match (concat "\\`\\(" shell-cd-regexp
! 					   "\\)\\($\\|[ \t]\\)")
! 				   cmd)
! 		     (shell-process-cd (comint-substitute-in-file-name arg1)))
! 		    ((and shell-chdrive-regexp
! 			  (string-match (concat "\\`\\(" shell-chdrive-regexp
! 						"\\)\\($\\|[ \t]\\)")
! 					cmd))
! 		     (shell-process-cd (comint-substitute-in-file-name cmd))))
! 	      (setq start (progn (string-match shell-command-separator-regexp
! 					       str end)
! 				 ;; skip again
! 				 (match-end 0)))))
  	(error "Couldn't cd"))))
  
  (defun shell-unquote-argument (string)
--- 733,759 ----
    (if shell-dirtrackp
        ;; We fail gracefully if we think the command will fail in the shell.
        (condition-case chdir-failure
!           (loop for args in (shell-split-commands str)
!              as cmd = (pop args)
!              as arg1 = (shell-unquote-argument (or (pop args) ""))
!              do (cond
!                   ((string-match (concat "\\`\\(" shell-popd-regexp
!                                          "\\)\\($\\|[ \t]\\)")
!                                  cmd)
!                    (shell-process-popd (comint-substitute-in-file-name arg1)))
!                   ((string-match (concat "\\`\\(" shell-pushd-regexp
!                                          "\\)\\($\\|[ \t]\\)")
!                                  cmd)
!                    (shell-process-pushd (comint-substitute-in-file-name arg1)))
!                   ((string-match (concat "\\`\\(" shell-cd-regexp
!                                          "\\)\\($\\|[ \t]\\)")
!                                  cmd)
!                    (shell-process-cd (comint-substitute-in-file-name arg1)))
!                   ((and shell-chdrive-regexp
!                         (string-match (concat "\\`\\(" shell-chdrive-regexp
!                                               "\\)\\($\\|[ \t]\\)")
!                                       cmd))
!                    (shell-process-cd (comint-substitute-in-file-name cmd)))))
  	(error "Couldn't cd"))))
  
  (defun shell-unquote-argument (string)
