From: ishikawa@yk.rim.or.jp
Newsgroups: gmane.emacs.bugs
Subject: vc directory mode confused by Japanese date string.
Date: Sun, 24 Nov 2002 14:51:48 +0900 (JST)
Reply-To: ishikawa@yk.rim.or.jp
Original-To: bug-gnu-emacs@gnu.org
Cc: Andre Spiegel

In GNU Emacs 21.2.1 (i686-pc-linux-gnu, X toolkit)
 of 2002-04-06 on duron
Important settings:
  value of $LC_ALL: ja_JP.ujis
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: ja_JP.ujis
  locale-coding-system: japanese-iso-8bit
  default-enable-multibyte-characters: t

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

Hi,

I recently started using CVS mode extensively in Emacs 21.2 and
noticed a potential problem in vc.el.

When I use vc-directory mode, invoked from the "Tools" menu, it
refuses my attempts to mark a file, claiming that there is no file on
the line, or something to that effect.

This rings a bell. I have seen similar bugs with the standard dired.el
(actually electric-buffer mode) before. I use Emacs on Linux (and Sun
Solaris at the office) and have used a Japanese locale for some time
now. The localized output of the "ls" command contains a Japanese date
string, and in the past this confused Emacs directory mode until a
proper regular expression was added to the Lisp library to match the
Japanese date in the ls output. That is, the Japanese date string
broke dired mode, and the file name was not recognized until the fix
was in place. The problem only surfaced when OS vendors such as Sun
Microsystems, HP, DEC and others began offering I18N/L10N binaries.

So I checked the source files dired.el and vc.el to see whether they
differ in how they match the Japanese date string in the ls output
lines.

Indeed, there is a difference between the two.

The following pieces of code are what I believe to be the relevant
parts (actually the regular expressions) that match the Japanese date
string. The first is from dired.el and the second is from vc.el. You
can see a subtle difference in how the regular expression for the
Japanese date is built. I think this is why vc directory mode fails on
the Japanese date string in the ls output line.

The relevant lines are marked with "--->" at the beginning of the line
below.

May I suggest that you change the "(japanese ...)" line in vc.el to
match the one in dired.el? (dired.el works just fine and has done so
ever since the localization issues surfaced several years ago.)

(1) From dired.el

(defvar dired-move-to-filename-regexp
  (let* ((l "\\([A-Za-z]\\|[^\0-\177]\\)")
         ;; In some locales, month abbreviations are as short as 2 letters,
         ;; and they can be padded on the right with spaces.
         ;; weiand: changed: month ends potentially with . or , or .,
;;old    (month (concat l l "+ *"))
         (month (concat l l "+[.]?,? *"))
         ;; Recognize any non-ASCII character.
         ;; The purpose is to match a Kanji character.
         (k "[^\0-\177]")
         ;; (k "[^\x00-\x7f\x80-\xff]")
         (s " ")
         (yyyy "[0-9][0-9][0-9][0-9]")
         (mm "[ 0-1]?[0-9]")
;;old    (dd "[ 0-3][0-9]")
         (dd "[ 0-3][0-9][.]?")
         (HH:MM "[ 0-2][0-9]:[0-5][0-9]")
         (seconds "[0-6][0-9]\\([.,][0-9]+\\)?")
         (zone "[-+][0-2][0-9][0-5][0-9]")
         (iso-mm-dd "[01][0-9]-[0-3][0-9]")
         (iso-time (concat HH:MM "\\(:" seconds "\\( ?" zone "\\)?\\)?"))
         (iso (concat "\\(\\(" yyyy "-\\)?" iso-mm-dd "[ T]" iso-time
                      "\\|" yyyy "-" iso-mm-dd " ?\\)"))
         (western (concat "\\(" month s dd "\\|" dd s month "\\)"
         ;; weiand: changed: year potentially unaligned
;;old                     s "\\(" HH:MM "\\|" s yyyy "\\|" yyyy s "\\)"))
                          s "\\(" HH:MM "\\|" yyyy s s "?" "\\|" s "?" yyyy "\\)"))
--->     (japanese
--->      (concat mm k "?" s dd k "?" s "+"
--->              "\\(" HH:MM "\\|" yyyy k "?" "\\)")))
    ;; The "[0-9]" below requires the previous column to end in a digit.
    ;; This avoids recognizing `1 may 1997' as a date in the line:
    ;; -r--r--r-- 1 may 1997 1168 Oct 19 16:49 README
    ;; The "[kMGTPEZY]?" below supports "ls -alh" output.
    ;; The ".*" below finds the last match if there are multiple matches.
    ;; This avoids recognizing `jservice 10 1024' as a date in the line:
    ;; drwxr-xr-x 3 jservice 10 1024 Jul 2 1997 esg-host
    (concat ".*[0-9][kMGTPEZY]?"
            s "\\(" western "\\|" japanese "\\|" iso "\\)" s))
  "Regular expression to match up to the file name in a directory listing.
The default value is designed to recognize dates and times
regardless of the language.")

|CI's comment: The above will produce something like the following.
|
|   mm k "?" s dd k "?" s "+" ( "HH:MM" | yyyy k "?" )

(2) From vc.el.

(define-derived-mode vc-dired-mode dired-mode "Dired under VC"
  "The major mode used in VC directory buffers. It works like Dired,
but lists only files under version control, with the current VC state of
each file being indicated in the place of the file's link count, owner,
group and size. Subdirectories are also listed, and you may insert them
into the buffer as desired, like in Dired.
  All Dired commands operate normally, with the exception of `v', which
is redefined as the version control prefix, so that you can type
`vl', `v=' etc. to invoke `vc-print-log', `vc-diff', and the like on
the file named in the current Dired buffer line. `vv' invokes
`vc-next-action' on this file, or on all files currently marked.
There is a special command, `*l', to mark all files currently locked."
  ;; define-derived-mode does it for us in Emacs-21, but not in Emacs-20.
  ;; We do it here because dired might not be loaded yet
  ;; when vc-dired-mode-map is initialized.
  (set-keymap-parent vc-dired-mode-map dired-mode-map)
  (make-local-hook 'dired-after-readin-hook)
  (add-hook 'dired-after-readin-hook 'vc-dired-hook nil t)
  ;; The following is slightly modified from dired.el,
  ;; because file lines look a bit different in vc-dired-mode.
  (set (make-local-variable 'dired-move-to-filename-regexp)
       (let* ((l "\\([A-Za-z]\\|[^\0-\177]\\)")
              ;; In some locales, month abbreviations are as short as 2 letters,
              ;; and they can be padded on the right with spaces.
              (month (concat l l "+ *"))
              ;; Recognize any non-ASCII character.
              ;; The purpose is to match a Kanji character.
              (k "[^\0-\177]")
              ;; (k "[^\x00-\x7f\x80-\xff]")
              (s " ")
              (yyyy "[0-9][0-9][0-9][0-9]")
              (mm "[ 0-1][0-9]")
              (dd "[ 0-3][0-9]")
              (HH:MM "[ 0-2][0-9]:[0-5][0-9]")
              (western (concat "\\(" month s dd "\\|" dd s month "\\)"
                               s "\\(" HH:MM "\\|" s yyyy"\\|" yyyy s "\\)"))
--->          (japanese (concat mm k s dd k s "\\(" s HH:MM "\\|" yyyy k "\\)")))
         ;; the .* below ensures that we find the last match on a line
         (concat ".*" s "\\(" western "\\|" japanese "\\)" s)))
  (and (boundp 'vc-dired-switches)
       vc-dired-switches
       (set (make-local-variable 'dired-actual-switches)
            vc-dired-switches))
  (set (make-local-variable 'vc-dired-terse-mode)
       vc-dired-terse-display)
  (setq vc-dired-mode t))

========================================

The above will produce something like the following.

    mm k s dd k s ( s HH:MM | yyyy k )

This differs from the one in dired.el mentioned above.

>   mm k "?" s dd k "?" s "+" ( "HH:MM" | yyyy k "?" )

That is, the vc.el version requires a Kanji character after both the
month and the day and exactly two spaces before the time, whereas the
dired.el version makes each Kanji character optional and accepts one
or more spaces.

I think you may even want to simply reuse the value of
"(defvar dired-move-to-filename-regexp ..." from dired.el rather than
defining a new value locally here. dired.el is more likely to be
updated as I18N/L10N issues come up, and if you simply reuse its
value, you don't have to track those issues yourself.

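To make the difference concrete, here is a minimal sketch that isolates
only the two "japanese" sub-expressions quoted above and tries them on a
few date strings. The `sketch-' names are hypothetical, and the sample
date strings are made up for illustration; they are not captured from
real ls output, whose exact spacing depends on the ls version and locale.

```elisp
;; Sketch only.  The `sketch-' names and the sample date strings are
;; hypothetical; they are not taken from dired.el, vc.el, or real ls output.

(defconst sketch-dired-japanese
  (let ((k "[^\0-\177]")                 ; any non-ASCII character, e.g. a Kanji
        (s " ")
        (HH:MM "[ 0-2][0-9]:[0-5][0-9]")
        (yyyy "[0-9][0-9][0-9][0-9]"))
    ;; dired.el form: each Kanji is optional, one or more spaces before the time.
    (concat "[ 0-1]?[0-9]" k "?" s "[ 0-3][0-9][.]?" k "?" s "+"
            "\\(" HH:MM "\\|" yyyy k "?" "\\)"))
  "The `japanese' sub-expression as quoted from dired.el.")

(defconst sketch-vc-japanese
  (let ((k "[^\0-\177]")
        (s " ")
        (HH:MM "[ 0-2][0-9]:[0-5][0-9]")
        (yyyy "[0-9][0-9][0-9][0-9]"))
    ;; vc.el form: each Kanji is mandatory, exactly two spaces before the time.
    (concat "[ 0-1][0-9]" k s "[ 0-3][0-9]" k s
            "\\(" s HH:MM "\\|" yyyy k "\\)"))
  "The `japanese' sub-expression as quoted from vc.el.")

(defun sketch-date-matches (date)
  "Return a list (DIRED VC) telling whether each sub-expression matches DATE."
  (list (and (string-match sketch-dired-japanese date) t)
        (and (string-match sketch-vc-japanese date) t)))

(mapcar #'sketch-date-matches
        '("11月 24 14:51"       ; no Kanji after the day
          "11月 24日 14:51"     ; Kanji after the day, one space before the time
          "11月 24日  14:51"))  ; Kanji after the day, two spaces before the time
```

If the transcription is right, all three strings match the dired.el form,
but only the last one matches the vc.el form. That would be consistent
with vc directory mode failing to find the file name on lines whose date
lacks the second Kanji or the extra space.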
[end of message]