From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andreas Schwab
Newsgroups: gmane.emacs.devel,gmane.comp.gnu.core-utils.bugs
Subject: Re: dired doesn't work properly with a multibyte locale
Date: Mon, 27 Jan 2003 11:56:48 +0100
References: <200301151043.TAA09856@etlken.m17n.org> <200301230602.PAA07755@etlken.m17n.org> <200301250049.JAA11674@etlken.m17n.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Cc: Kenichi Handa
Original-To: Miles Bader
In-Reply-To: (Miles Bader's message of "27 Jan 2003 13:17:01 +0900")
User-Agent: Gnus/5.090013 (Oort Gnus v0.13) Emacs/21.3.50 (ia64-suse-linux)
Original-cc: bug-coreutils@gnu.org
Original-cc: emacs-pretest-bug@gnu.org
Original-cc: emacs-devel@gnu.org

Miles Bader writes:

|> Kenichi Handa writes:
|> > > If there's a file containing a newline, then if LANG=C, dired can
|> > > correctly deal with it (e.g., I can put the cursor on it and hit RET,
|> > > and it visits that file), but if LANG=ja_JP.eucjp, then it correctly
|> > > displays all _other_ files, but you can't use RET to visit the
|> > > newline-in-the-file-name file (it says `File no longer exists; type `g'
|> > > to update Dired buffer').
|> >
|> > So, I have no idea what's wrong with the current code.
|> > Could you debug it?
|>
|> Hmm, it actually seems to be a bug with `ls'!
|>
|> I created two files, one called `abc\ndef' (where \n is a newline), and
|> one called `1234567'.
|> Here's what ls prints if stdout is a tty (I've
|> indented the output by 3 spaces):
|>
|>    (tmp) LANG=ja_JP.eucJP ls -l --dired abc* 123*
|>    -rw-rw---- 1 miles 6 2003-01-27 13:03 1234567
|>    -rw-rw---- 1 miles 6 2003-01-27 12:58 abc?def
|>    //DIRED// 53 60 114 121
|>    //DIRED-OPTIONS// --quoting-style=literal
|>
|> [note that the start/end offsets of each filename differ by 7]
|>
|> But here's what the _same_ command prints if stdout is a pipe (which I
|> presume is the case for dired):
|>
|>    (tmp) LANG=ja_JP.eucJP ls -l --dired abc* 123* | cat
|>    -rw-rw---- 1 miles 6 2003-01-27 13:03 1234567
|>    -rw-rw---- 1 miles 6 2003-01-27 12:58 abc
|>    def
|>    //DIRED// 53 60 114 120
|>    //DIRED-OPTIONS// --quoting-style=literal
|>
|> Now the start/end offsets of `abc\ndef' only differ by 6 (which is
|> obviously wrong, since the filename is 7 characters long)!  Moreover
|> this problem only seems to occur if LANG=ja_JP.eucJP, _not_ if LANG=C.

Here is a patch.  The dired offsets are documented as being byte counts,
not character counts.  The bug happens in any multibyte locale.

Andreas.

2003-01-27  Andreas Schwab

	* src/ls.c (quote_name): Add fourth parameter width into which to
	store the screen columns and return number of bytes instead.
	(print_dir): Pass NULL as fourth parameter of quote_name.
	(print_name_with_quoting): Likewise.
	(length_of_file_name_and_frills): Get the width from the fourth
	parameter of quote_name instead of return value.
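To make the byte-count versus column-count distinction concrete, here is
a minimal sketch in C.  It is not code from ls.c: `measure_name' is a
hypothetical helper that assumes the default "C" locale and imitates only
the single-byte fallback of the patch, where each printable byte occupies
one column and an unprintable byte (such as the newline) occupies none.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of ls.c.  Return the number of bytes
   in NAME; if WIDTH is non-NULL, store the number of screen columns
   there.  Assumption: single-byte ("C") locale, so every printable
   byte is one column and unprintable bytes are counted as zero.  */
static size_t
measure_name (const char *name, size_t *width)
{
  assert (name != NULL);
  size_t len = strlen (name);

  if (width != NULL)
    {
      size_t columns = 0;
      for (size_t i = 0; i < len; i++)
        if (isprint ((unsigned char) name[i]))
          columns++;
      *width = columns;
    }

  /* The byte count is what a --dired style offset has to advance by.  */
  return len;
}
```

For the name `abc\ndef' this returns 7 bytes while storing a width of 6,
which is the same 7-versus-6 mismatch that shows up in the //DIRED//
offsets quoted above; for `1234567' both values are 7.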
--- src/ls.c	2002/12/16 18:58:01	1.3
+++ src/ls.c	2003/01/27 10:50:51
@@ -255,7 +255,8 @@ char *getgroup ();
 char *getuser ();
 
 static size_t quote_name PARAMS ((FILE *out, const char *name,
-                                  struct quoting_options const *options));
+                                  struct quoting_options const *options,
+                                  size_t *width));
 static char *make_link_path PARAMS ((const char *path, const char *linkname));
 static int decode_switches PARAMS ((int argc, char **argv));
 static int file_interesting PARAMS ((const struct dirent *next));
@@ -2222,7 +2223,7 @@ print_dir (const char *name, const char
       DIRED_INDENT ();
       PUSH_CURRENT_DIRED_POS (&subdired_obstack);
       dired_pos += quote_name (stdout, realname ? realname : name,
-                               dirname_quoting_options);
+                               dirname_quoting_options, NULL);
       PUSH_CURRENT_DIRED_POS (&subdired_obstack);
       DIRED_FPUTS_LITERAL (":\n", stdout);
     }
@@ -3064,11 +3065,13 @@ print_long_format (const struct fileinfo
 
 /* Output to OUT a quoted representation of the file name NAME,
    using OPTIONS to control quoting.  Produce no output if OUT is NULL.
-   Return the number of screen columns occupied by NAME's quoted
-   representation.  */
+   Store the number of screen columns occupied by NAME's quoted
+   representation into WIDTH, if non-NULL.  Return the number of bytes
+   produced.  */
 
 static size_t
-quote_name (FILE *out, const char *name, struct quoting_options const *options)
+quote_name (FILE *out, const char *name, struct quoting_options const *options,
+            size_t *width)
 {
   char smallbuf[BUFSIZ];
   size_t len = quotearg_buffer (smallbuf, sizeof smallbuf, name, -1, options);
@@ -3203,20 +3206,32 @@ quote_name (FILE *out, const char *name,
               displayed_width = len;
             }
         }
-      else
+      else if (width != NULL)
         {
-          /* Assume unprintable characters have a displayed_width of 1.  */
 #if HAVE_MBRTOWC
           if (MB_CUR_MAX > 1)
             displayed_width = mbsnwidth (buf, len, 0);
           else
 #endif
-            displayed_width = len;
+            {
+              char *p = buf;
+              char const *plimit = buf + len;
+
+              displayed_width = 0;
+              while (p < plimit)
+                {
+                  if (ISPRINT ((unsigned char) *p))
+                    displayed_width++;
+                  p++;
+                }
+            }
         }
 
   if (out != NULL)
     fwrite (buf, 1, len, out);
-  return displayed_width;
+  if (width != NULL)
+    *width = displayed_width;
+  return len;
 }
 
 static void
@@ -3229,7 +3244,7 @@ print_name_with_quoting (const char *p,
   if (stack)
     PUSH_CURRENT_DIRED_POS (stack);
 
-  dired_pos += quote_name (stdout, p, filename_quoting_options);
+  dired_pos += quote_name (stdout, p, filename_quoting_options, NULL);
 
   if (stack)
     PUSH_CURRENT_DIRED_POS (stack);
@@ -3395,6 +3410,7 @@ static int
 length_of_file_name_and_frills (const struct fileinfo *f)
 {
   register int len = 0;
+  size_t name_width;
 
   if (print_inode)
     len += INODE_DIGITS + 1;
@@ -3402,7 +3418,8 @@ length_of_file_name_and_frills (const st
   if (print_block_size)
     len += 1 + block_size_size;
 
-  len += quote_name (NULL, f->name, filename_quoting_options);
+  quote_name (NULL, f->name, filename_quoting_options, &name_width);
+  len += name_width;
 
   if (indicator_style != none)
     {

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."