unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#19591: 24.4; file & buffer compare failures
@ 2015-01-13 19:56 Glenn Linderman
  2015-01-14 18:28 ` Eli Zaretskii
  2019-09-30  1:08 ` Stefan Kangas
  0 siblings, 2 replies; 4+ messages in thread
From: Glenn Linderman @ 2015-01-13 19:56 UTC (permalink / raw)
  To: 19591

I'm delighted that emacs 24.4 can now open all files, even those
that have characters in their names that are not part of the current
ANSI set.

However, the auxiliary program diff when launched by emacs still doesn't
accept files with such characters. The latest version of diff for
windows that I can find is 2.8.7. The error message from diff in the
error buffer seems to contain the proper characters for the file name,
but diff reports it cannot find the file, so I think it is a deficiency
in diff, like the one in emacs versions prior to 24.4: using the
"bytes" version of open instead of the "widechars" version.

While it may be somewhat inefficient, it would be possible for emacs to
work around the deficiency of diff by saving temporary copies of the
buffers to be compared using generated names in the ANSI subset.

Obviously I can achieve that myself, and have a number of times, but then
one must be careful to copy the fixed data back to the original file.


In GNU Emacs 24.4.1 (x86_64-w64-mingw32)
  of 2014-10-20 on KAEL
Windowing system distributor `Microsoft Corp.', version 6.1.7601
Configured using:
  `configure --prefix=/z/emacs --host=x86_64-w64-mingw32
  --target=x86_64-w64-mingw32 --build=x86_64-w64-mingw32 --with-wide-int
  --with-jpeg --with-xpm --with-png --with-tiff --with-rsvg --with-xml2
  --with-gnutls --with-xft --with-sound=yes --with-file-notification=yes
  --without-dbus --without-imagemagick 'CFLAGS=-Ofast
  -fomit-frame-pointer -funroll-loops -g0 -pipe' 'CPPFLAGS=-DNDEBUG
  -DDBUS_STATIC_BUILD' 'LDFLAGS=-static-libgcc -static-libstdc++ -static
  -s -Wl,-s''

Important settings:
   value of $LANG: ENU
   locale-coding-system: cp1252

Major mode: Emacs-Lisp

Minor modes in effect:
   shell-dirtrack-mode: t
   which-function-mode: t
   tooltip-mode: t
   electric-indent-mode: t
   mouse-wheel-mode: t
   tool-bar-mode: t
   menu-bar-mode: t
   file-name-shadow-mode: t
   global-font-lock-mode: t
   font-lock-mode: t
   blink-cursor-mode: t
   auto-composition-mode: t
   auto-encryption-mode: t
   auto-compression-mode: t
   size-indication-mode: t
   column-number-mode: t
   line-number-mode: t
   transient-mark-mode: t

Recent input:
<backspace> <backspace> o <backspace> <backspace> o
p e n s SPC o n e SPC t h a t SPC i t SPC p <backspace>
<backspace> <return> p r e v i o u s l y SPC c o u
l d n ' <backspace> ' t SPC ( b e c a u s e SPC t h
e SPC c h a a r <backspace> <backspace> r a c e <backspace>
t e r SPC i s SPC n o t SPC i n SPC t h e SPC c u r
r e n t A N S I <backspace> <backspace> <backspace>
<backspace> SPC A N S I <return> s e t ) , SPC t h
e SPC d i s p l a y SPC o f SPC t h e SPC f i l e SPC
n a m e SPC i n SPC t h e SPC t i t l e SPC b a r SPC
h a s SPC c h o s e <backspace> <backspace> <backspace>
<backspace> <backspace> s u c h SPC c h a r a c t e
r s <return> o m i t t e d . <help-echo> <down-mouse-1>
<mouse-1> C-n C-SPC C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
<escape> w <help-echo> <down-mouse-1> <mouse-1> C-SPC
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b <escape> w <help-echo> <down-mouse-1>
<mouse-1> C-SPC <escape> > <escape> w <down-mouse-1>
<mouse-1> <escape> < <help-echo> <down-mouse-1> <mouse-1>
C-x k <return> y e s <return> <escape> x r e p o r
t <tab> <return>

Recent messages:
Checking 151 files in d:/emacs/share/emacs/24.4/lisp/emacs-lisp...
Checking 24 files in d:/emacs/share/emacs/24.4/lisp/cedet...
Checking 57 files in d:/emacs/share/emacs/24.4/lisp/calendar...
Checking 87 files in d:/emacs/share/emacs/24.4/lisp/calc...
Checking 95 files in d:/emacs/share/emacs/24.4/lisp/obsolete...
Checking for load-path shadows...done
Auto-saving...done
Mark set [3 times]
Saved text from "
I'm delighted that emacs 24.4 can, at l"

Load-path shadows:
None found.

Features:
(pp shadow sort gnus-util mail-extr emacsbug message format-spec rfc822
mml mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev
gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util
mail-prsvr mail-utils add-log python-mode derived skeleton advice
help-fns edmacro kmacro cl-macs thingatpt flymake rx shell pcomplete
cc-cmds cc-engine cc-vars cc-defs compile cl gv cl-loaddefs cl-lib
comint ansi-color ring misearch multi-isearch help-mode easymenu
whitespace which-func imenu time-date tooltip electric uniquify
ediff-hook vc-hooks lisp-float-type mwheel dos-w32 ls-lisp
w32-common-fns disp-table w32-win w32-vars tool-bar dnd fontset image
regexp-opt fringe tabulated-list newcomment lisp-mode prog-mode register
page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock
font-lock syntax facemenu font-core frame cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese hebrew
greek romanian slovak czech european ethiopic indian cyrillic chinese
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer nadvice
loaddefs button faces cus-face macroexp files text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote make-network-process w32notify w32
multi-tty emacs)

Memory information:
((conses 16 169994 18131)
  (symbols 56 23492 0)
  (miscs 48 92 169)
  (strings 32 28274 6229)
  (string-bytes 1 978775)
  (vectors 16 21772)
  (vector-slots 8 1297449 203189)
  (floats 8 77 457)
  (intervals 56 1134 37)
  (buffers 960 17))



* bug#19591: 24.4; file & buffer compare failures
  2015-01-13 19:56 bug#19591: 24.4; file & buffer compare failures Glenn Linderman
@ 2015-01-14 18:28 ` Eli Zaretskii
       [not found]   ` <54B6C620.6050909@g.nevcal.com>
  2019-09-30  1:08 ` Stefan Kangas
  1 sibling, 1 reply; 4+ messages in thread
From: Eli Zaretskii @ 2015-01-14 18:28 UTC (permalink / raw)
  To: Glenn Linderman; +Cc: 19591

> Date: Tue, 13 Jan 2015 11:56:54 -0800
> From: Glenn Linderman <v+python@g.nevcal.com>
> 
> However, the auxiliary program diff when launched by emacs still doesn't
> accept files with such characters. The latest version of diff for
> windows that I can find is 2.8.7. The error message from diff in the
> error buffer seems to contain the proper characters for the file name,
> but diff reports it cannot find the file, so I think it is a deficiency
> in diff, like the one in emacs versions prior to 24.4: using the
> "bytes" version of open instead of the "widechars" version.

Yes, Diff, as all the other native ports of GNU software to Windows,
uses the ANSI APIs to access files and its command-line arguments.

It is hardly the job of the Emacs team to fix programs that are not
part of the Emacs package.  So I'm not sure what exactly you
expected of the Emacs project in this matter.

You should know that the Emacs support for non-ASCII characters
outside of the current system codepage stops short of extending that
support to subprocesses invoked by Emacs, for this very reason: there
are no native ports known to me of popular programs, such as Diff,
Grep, find/xargs, etc. that can handle such file names.  So being able
to pass such non-ASCII file names to those programs would be a waste
of effort, since they cannot handle them.
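The codepage limitation described above can be illustrated with a minimal Python sketch. It only approximates what happens when a file name is squeezed through a single-byte codepage such as cp1252 (the locale-coding-system shown in the report); the function names are hypothetical, and the real Windows conversion may apply best-fit mappings rather than plain "?" substitution.

```python
def survives_codepage(file_name, codepage="cp1252"):
    """Return True if every character of file_name exists in the codepage."""
    try:
        file_name.encode(codepage)
    except UnicodeEncodeError:
        return False
    return True


def ansi_view(file_name, codepage="cp1252"):
    """Approximate the name a codepage-limited program would receive.

    Characters outside the codepage are replaced with '?'; this is an
    assumption standing in for the actual Windows conversion.
    """
    encoded = file_name.encode(codepage, errors="replace")
    return encoded.decode(codepage)


if __name__ == "__main__":
    for name in ("café.txt", "тест.txt"):
        print(name, survives_codepage(name), ansi_view(name))
```

A name such as "café.txt" round-trips through cp1252 unchanged, while a Cyrillic name does not, so a program limited to the codepage would be asked to open a file that does not exist under the mangled name.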

> While it may be somewhat inefficient, it would be possible for emacs to
> work around the deficiency of diff by saving temporary copies of the
> buffers to be compared using generated names in the ANSI subset.

This is not practical.  The place in Emacs sources where command-line
arguments of subprocesses are constructed and encoded has no idea
which of these arguments are file names and which aren't.  (There are
also additional technical difficulties to do that, too boring to go
into here.)  Only the application level -- the Lisp program that needs
to invoke Diff or whatever -- knows that.  So what you suggest would
mean we need to add this kind of work-around in each and every place
where some Lisp invokes some program, and there are too many such
places.  On top of that, this would be inefficient: a file could be
very large.

So I don't think this problem could or should be solved in Emacs.  Let
people who produce the ports of Diff etc. add support for these
characters first, then there will be a good reason for Emacs to do the
same.

Thanks.






* bug#19591: 24.4; file & buffer compare failures
       [not found]   ` <54B6C620.6050909@g.nevcal.com>
@ 2015-01-14 19:53     ` Eli Zaretskii
  0 siblings, 0 replies; 4+ messages in thread
From: Eli Zaretskii @ 2015-01-14 19:53 UTC (permalink / raw)
  To: Glenn Linderman; +Cc: 19591

[Please keep the bug address on the CC list.]

> Date: Wed, 14 Jan 2015 11:40:16 -0800
> From: Glenn Linderman <v+python@g.nevcal.com>
> 
> I didn't expect a fix for diff from the emacs team, but I do know that the
> excellent file comparison is one huge reason that people use emacs... I know
> people that use vi for most editing, but fire up emacs for file comparison...
> probably works on Unix even with funny names...

Most Unix systems use UTF-8 to encode file names, which is why Diff
doesn't have a problem on such systems.

> I was sort of thinking, though, that the case of buffer comparison is a case
> where emacs is creating the files to do the diff, and that it creates temp
> files with names derived from the buffer name, which is, I suppose somewhat
> mnemonic when looking at the error message, but temporary file names such as
> "compare-buffer-1.txt" and "compare-buffer-2.txt" would be just as useful. And
> the file has to be written before the compare can be done in that case anyway.

If you are talking about comparing buffers, not files, then yes,
perhaps Emacs can do something about the issue, if it exists.  But
please provide a reproducible recipe, starting from "emacs -Q", that
shows the problem.
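The buffer-comparison idea quoted above (fixed temporary names such as "compare-buffer-1.txt") might look roughly like the following Python sketch. It is not how Emacs does it; the function name and the directory prefix are hypothetical, and the temporary directory itself could still contain non-ASCII characters (for example from a user name), which this sketch does not address.

```python
import os
import tempfile


def write_compare_files(text_a, text_b, encoding="utf-8"):
    """Write two texts to temporary files whose base names are pure ASCII.

    Returns the two paths.  The caller is responsible for deleting the
    directory afterwards.
    """
    tmpdir = tempfile.mkdtemp(prefix="compare-")
    paths = []
    for index, text in enumerate((text_a, text_b), start=1):
        # Generated names do not depend on the buffer or file name at all.
        path = os.path.join(tmpdir, f"compare-buffer-{index}.txt")
        # newline="" writes the text as-is, without line-ending translation.
        with open(path, "w", encoding=encoding, newline="") as handle:
            handle.write(text)
        paths.append(path)
    return paths
```

The returned paths could then be handed to an external diff, for instance with `subprocess.run(["diff", "-u", *paths])`, assuming a diff executable is on the search path.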

> Of course, the other approach, since diff is invoked with very specific options
> by buffer/file compare, would be to reimplement that aspect of diff internally,
> which would actually be an optimization (no need to write the files, call
> the external program, and read its results) that would also sidestep the need
> for file names at all.

Emacs tries not to reinvent the wheels that already exist.
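For illustration only, the in-process comparison suggested in the quoted paragraph can be sketched in Python with the standard difflib module. This is not something Emacs provides; the function name and labels are hypothetical. It shows that when the comparison never touches the file system, the names are used only as labels in the output, so their characters do not matter.

```python
import difflib


def compare_texts(text_a, text_b, label_a="buffer-a", label_b="buffer-b"):
    """Return a unified diff of two strings as a list of lines.

    No files are written and no external program is run; the labels
    appear only in the diff header.
    """
    lines_a = text_a.splitlines(keepends=True)
    lines_b = text_b.splitlines(keepends=True)
    return list(
        difflib.unified_diff(lines_a, lines_b, fromfile=label_a, tofile=label_b)
    )


if __name__ == "__main__":
    for line in compare_texts("a\nb\n", "a\nc\n", "déjà.txt", "日本語.txt"):
        print(line, end="")
```

The trade-off raised in the reply still applies: this duplicates functionality that an external diff already offers.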

> It does seem, though, that the correct file names are being passed to the
> external programs, at least, the error message seen in emacs has the correct
> file name... it is just that diff isn't smart enough to use the right API to
> open it. Or else the incorrect name being passed isn't being included in the
> error message.

I think only the error message, being generated inside Emacs, shows
the correct file names; what Diff gets are file names butchered by
conversion to the ANSI codepage.  Once again, if you show the command
you issued and the error message you've got in response, we could look
into that and tell what really happens in your case.






* bug#19591: 24.4; file & buffer compare failures
  2015-01-13 19:56 bug#19591: 24.4; file & buffer compare failures Glenn Linderman
  2015-01-14 18:28 ` Eli Zaretskii
@ 2019-09-30  1:08 ` Stefan Kangas
  1 sibling, 0 replies; 4+ messages in thread
From: Stefan Kangas @ 2019-09-30  1:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 19591-done, Glenn Linderman

Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Tue, 13 Jan 2015 11:56:54 -0800
>> From: Glenn Linderman <v+python@g.nevcal.com>
>>
>> However, the auxiliary program diff when launched by emacs still doesn't
>> accept files with such characters. The latest version of diff for
>> windows that I can find is 2.8.7. The error message from diff in the
>> error buffer seems to contain the proper characters for the file name,
>> but diff reports it cannot find the file, so I think it is a deficiency
>> in diff, like the one in emacs versions prior to 24.4: using the
>> "bytes" version of open instead of the "widechars" version.
>
> Yes, Diff, as all the other native ports of GNU software to Windows,
> uses the ANSI APIs to access files and its command-line arguments.
>
> It is hardly the job of the Emacs team to fix programs that are not
> part of the Emacs package.  So I'm not sure what exactly you
> expected of the Emacs project in this matter.
>
> You should know that the Emacs support for non-ASCII characters
> outside of the current system codepage stops short of extending that
> support to subprocesses invoked by Emacs, for this very reason: there
> are no native ports known to me of popular programs, such as Diff,
> Grep, find/xargs, etc. that can handle such file names.  So being able
> to pass such non-ASCII file names to those programs would be a waste
> of effort, since they cannot handle them.
>
>> While it may be somewhat inefficient, it would be possible for emacs to
>> work around the deficiency of diff by saving temporary copies of the
>> buffers to be compared using generated names in the ANSI subset.
>
> This is not practical.  The place in Emacs sources where command-line
> arguments of subprocesses are constructed and encoded has no idea
> which of these arguments are file names and which aren't.  (There are
> also additional technical difficulties to do that, too boring to go
> into here.)  Only the application level -- the Lisp program that needs
> to invoke Diff or whatever -- knows that.  So what you suggest would
> mean we need to add this kind of work-around in each and every place
> where some Lisp invokes some program, and there are too many such
> places.  On top of that, this would be inefficient: a file could be
> very large.
>
> So I don't think this problem could or should be solved in Emacs.  Let
> people who produce the ports of Diff etc. add support for these
> characters first, then there will be a good reason for Emacs to do the
> same.

With the above explanation, I'm closing this bug report.

Best regards,
Stefan Kangas





