From: Kenichi Handa <handa@m17n.org>
Cc: emacs-devel@gnu.org
Subject: Re: dired doesn't work properly with a multibyte locale
Date: Wed, 15 Jan 2003 19:43:55 +0900 (JST)
Message-ID: <200301151043.TAA09856@etlken.m17n.org>
In-Reply-To: <buok7hibyqd.fsf@mcspd15.ucom.lsi.nec.co.jp> (message from Miles Bader on 06 Jan 2003 15:04:26 +0900)

Sorry for the late reply.

In article <buok7hibyqd.fsf@mcspd15.ucom.lsi.nec.co.jp>, Miles Bader <miles@lsi.nec.co.jp> writes:

> I'm now using a multibyte locale (LANG=ja_JP.eucJP), and dired is
> screwed up: it can't properly find filenames in the directory listing.

> The reason seems to be that dired uses `ls --dired', which encodes the
> positions of filenames as byte-offsets into the ls output.  However, my
> system's `ls' program sees the non-C LANG, and so the `total' line at the
> beginning of the ls output is now a multibyte-encoded word.  Emacs decodes
> this fine, but the number of characters in the decoded word is _not_ the
> same as the number of bytes in the original ls output, so all the offsets
> from --dired are wrong.  [note that if there are multibyte-encoded
> filenames, the offsets will get screwed up further later in the listing]

> It doesn't seem simple to get the byte offset information, so perhaps the
> best thing to do is simply not use --dired if `file-name-coding-system' is
> a multibyte encoding.  That change is simple to make in dired (and I just
> manually set `dired-use-ls-dired' to nil), but I'm not sure how to tell if
> a particular coding system is multibyte or not.  It'd be nice if there was
> a function like `coding-system-multibyte-p'...

Even if we had such a function, it would be very hard to
correct the byte offset information for a multibyte coding
system.
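
To make the mismatch concrete, here is a small sketch (not part of the
patch).  It assumes the `total' line begins with a two-character
Japanese word encoded in euc-jp; the word that ls actually prints
depends on the system's message catalog, and the function name is
hypothetical.

```elisp
;; Hypothetical helper: compare the character and byte lengths of a
;; header line similar to what ls might print under LANG=ja_JP.eucJP.
(defun my-total-line-lengths ()
  "Return (CHARS . BYTES) for a sample header line encoded in euc-jp."
  (let* ((word (string #x5408 #x8A08))   ; assumed two-character word
         (line (concat word " 8\n"))     ; decoded text, as Emacs sees it
         (bytes (encode-coding-string line 'euc-jp))) ; bytes ls would write
    (cons (length line) (length bytes))))
```

Evaluating (my-total-line-lengths) gives (5 . 7): the decoded line is
two positions shorter than the byte sequence, so every --dired offset
after it would point two characters too far.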

Miles Bader <miles@gnu.org> writes:
> On Sat, Jan 11, 2003 at 03:00:12PM -0500, Stefan Monnier wrote:
>>  > It doesn't seem simple to get the byte offset
>>  > information, so perhaps the best thing to do is simply
>>  > not use --dired if `file-name-coding-system' is a
>>  > multibyte encoding.  That change is simple to make in
>>  > dired (and I just manually set `dired-use-ls-dired' to
>>  > nil), but I'm not sure how to tell if a particular
>>  > coding system is multibyte or not.  It'd be nice if
>>  > there was a function like
>>  > `coding-system-multibyte-p'...
>>  
>>  The other solution is to get "ls --dired" output with a "binary"
>>  coding system, then use the byte-offsets to add text-properties, and
>>  then do the decode-coding-region.

Yes.  I think that is the correct fix.
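
As a rough sketch of why this works (an illustration, not the code in
the patch): when the listing is inserted without conversion, each byte
occupies one buffer position, so a byte offset can be added directly to
the start position.  The function names and the (START . END) offset
format below are hypothetical; parsing the real `ls --dired' trailer is
left out.

```elisp
(defun my-mark-filenames-by-byte-offset (beg offsets)
  "Put `dired-filename' on byte ranges of undecoded text starting at BEG.
OFFSETS is a list of (START . END) byte offsets relative to BEG."
  (dolist (range offsets)
    (put-text-property (+ beg (car range)) (+ beg (cdr range))
                       'dired-filename t)))

(defun my-mark-filenames-demo ()
  "Mark one file name in a sample undecoded euc-jp listing."
  (with-temp-buffer
    (let ((beg (point)))
      ;; Inserting a unibyte string puts one raw byte at each position.
      (insert (encode-coding-string
               (concat (string #x5408 #x8A08) " 8\n"  ; 7-byte header
                       (string #x65E5 #x672C) "\n")   ; 4-byte file name
               'euc-jp))
      ;; Assumed offsets: the file name occupies bytes 7 up to 11.
      (my-mark-filenames-by-byte-offset beg '((7 . 11)))
      (list (get-text-property (+ beg 6) 'dired-filename)    ; header newline
            (get-text-property (+ beg 7) 'dired-filename)    ; first name byte
            (get-text-property (+ beg 10) 'dired-filename)   ; last name byte
            (get-text-property (+ beg 11) 'dired-filename))))) ; final newline
```

Here (my-mark-filenames-demo) returns (nil t t nil), that is, the
property covers exactly the four bytes of the file name.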

> Won't the decode-coding-region smash all the text-properties?

Yes, it removes all text properties.  But we can preserve
the `dired-filename' text property by decoding the text one
chunk at a time.  Could you please try the attached patch?
I have not installed it yet because I don't have such a
system at hand and can't test it.
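
For readers who want to try the idea outside `insert-directory', here
is a self-contained sketch of the same loop in a temporary buffer.  It
is only an illustration under assumptions: the function names are
hypothetical, the sample bytes are made up, and euc-jp is hard-coded
where the patch picks the coding system at run time.

```elisp
(defun my-decode-keeping-dired-filename (coding)
  "Decode the accessible region with CODING, one property run at a time.
Each run is decoded separately, and `dired-filename' is put back on
the runs that carried it before decoding."
  (let (pos val)
    (goto-char (point-min))
    (while (not (eobp))
      (setq pos (point)
            val (get-text-property pos 'dired-filename))
      ;; Move to the end of the current run of equal property values.
      (goto-char (next-single-property-change
                  pos 'dired-filename nil (point-max)))
      ;; Decoding drops text properties; point ends after the new text.
      (decode-coding-region pos (point) coding)
      (when val
        (put-text-property pos (point) 'dired-filename t)))))

(defun my-decode-demo ()
  "Return the decoded text that still carries `dired-filename'."
  (with-temp-buffer
    (insert (encode-coding-string
             (concat (string #x5408 #x8A08) " 8\n") 'euc-jp))
    (let ((start (point)))
      (insert (encode-coding-string (string #x65E5 #x672C) 'euc-jp))
      (put-text-property start (point) 'dired-filename t))
    (insert "\n")
    (my-decode-keeping-dired-filename 'euc-jp)
    (let ((beg (text-property-any (point-min) (point-max)
                                  'dired-filename t)))
      (buffer-substring-no-properties
       beg (next-single-property-change
            beg 'dired-filename nil (point-max))))))
```

After decoding, (my-decode-demo) returns the two-character file name,
which shows that the property survived and now covers characters
rather than bytes.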

---
Ken'ichi HANDA
handa@m17n.org

2003-01-15  Kenichi Handa  <handa@m17n.org>

	* files.el (insert-directory): Read the output of "ls" with
	no-conversion, and decode it later while preserving the
	`dired-filename' property.

*** files.el.~1.630.~	Wed Jan 15 13:12:22 2003
--- files.el	Wed Jan 15 17:44:45 2003
***************
*** 4017,4028 ****
  
  	  ;; Read the actual directory using `insert-directory-program'.
  	  ;; RESULT gets the status code.
! 	  (let* ((coding-system-for-read
  		  (and enable-multibyte-characters
  		       (or file-name-coding-system
! 			   default-file-name-coding-system)))
! 		 ;; This is to control encoding the arguments in call-process.
! 		 (coding-system-for-write coding-system-for-read))
  	    (setq result
  		  (if wildcard
  		      ;; Run ls in the directory part of the file pattern
--- 4017,4031 ----
  
  	  ;; Read the actual directory using `insert-directory-program'.
  	  ;; RESULT gets the status code.
! 	  (let* (;; First read with no-conversion; then, after putting
! 		 ;; the text property `dired-filename', decode one chunk
! 		 ;; at a time to preserve that property.
! 		 (coding-system-for-read 'no-conversion)
! 		 ;; This is to control encoding the arguments in call-process.
! 		 (coding-system-for-write 
  		  (and enable-multibyte-characters
  		       (or file-name-coding-system
! 			   default-file-name-coding-system))))
  	    (setq result
  		  (if wildcard
  		      ;; Run ls in the directory part of the file pattern
***************
*** 4105,4110 ****
--- 4108,4130 ----
  	      (goto-char end)
  	      (beginning-of-line)
  	      (delete-region (point) (progn (forward-line 2) (point)))))
+ 
+ 	  ;; Now decode what was read, if necessary.
+ 	  (let ((coding (or coding-system-for-write
+ 			    (detect-coding-region beg (point) t)))
+ 		val pos)
+ 	    (if (not (eq (coding-system-base coding) 'undecided))
+ 		(save-restriction
+ 		  (narrow-to-region beg (point))
+ 		  (goto-char (point-min))
+ 		  (while (not (eobp))
+ 		    (setq pos (point)
+ 			  val (get-text-property (point) 'dired-filename))
+ 		    (goto-char (next-single-property-change
+ 				(point) 'dired-filename nil (point-max)))
+ 		    (decode-coding-region pos (point) coding)
+ 		    (if val
+ 			(put-text-property pos (point) 'dired-filename t))))))
  
  	  (if full-directory-p
  	      ;; Try to insert the amount of free space.
