unofficial mirror of emacs-devel@gnu.org 
* [wenbinye@gmail.com: hexl-max-address in hexl-mode is incorrect]
@ 2006-11-19  7:59 Richard Stallman
  2006-11-22 16:02 ` Chong Yidong
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Stallman @ 2006-11-19  7:59 UTC


Would someone please DTRT and ack?

------- Start of forwarded message -------
To: "emacs-pretest-bug@gnu.org" <emacs-pretest-bug@gnu.org>
From: "Ye Wenbin" <wenbinye@gmail.com>
Organization: Personal
Content-Type: text/plain; format=flowed; delsp=yes; charset=utf-8
MIME-Version: 1.0
Date: Tue, 14 Nov 2006 12:53:04 +0800
Subject: hexl-max-address in hexl-mode is incorrect
X-Spam-Status: No, score=0.5 required=5.0 tests=RCVD_BY_IP,TO_ADDRESS_EQ_REAL 
	autolearn=no version=3.0.4

You can test like this:
Test case 1:
In the *scratch* buffer, erase the buffer, choose the TeX input method,
and input two multibyte characters, such as "«»", which my Emacs can
display. Then switch to hexl-mode. Use C-f to move point. You can see
that the cursor cannot reach the last byte. Use C-h v hexl-max-address
to see the value of hexl-max-address. It is 2, which is the
buffer-size of *scratch*.

Test case 2:
Open a new file, such as /tmp/test.txt. Use C-x RET f to set the file
coding system to utf-16. Input any letters, such as "ab", and save the
buffer. Then switch to hexl-mode. C-h v hexl-max-address shows that the
value is still 2, which is the buffer-size rather than the size of the
file.

====================================================================


In GNU Emacs 22.0.50.1 (i486-pc-linux-gnu, GTK+ Version 2.8.17)
  of 2006-08-24 on vernadsky, modified by Debian
  (Debian emacs-snapshot package, version 1:20060707-1~dapper1)
X server distributor `The X.Org Foundation', version 11.0.70000000
configured using `configure '--build' 'i486-linux-gnu' '--host'  
'i486-linux-gnu' '--prefix=/usr' '--sharedstatedir=/var/lib'  
'--libexecdir=/usr/lib' '--localstatedir=/var' '--infodir=/usr/share/info'  
'--mandir=/usr/share/man' '--with-pop=yes'  
'--enable-locallisppath=/etc/emacs-snapshot:/etc/emacs:/usr/local/share/emacs/22.0.50/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/22.0.50/site-lisp:/usr/share/emacs/site-lisp:/usr/share/emacs/22.0.50/leim'  
'--with-x=yes' '--with-x-toolkit=gtk' 'CFLAGS=-DDEBIAN -g -O2'  
'build_alias=i486-linux-gnu' 'host_alias=i486-linux-gnu''

Important settings:
   value of $LC_ALL: nil
   value of $LC_COLLATE: nil
   value of $LC_CTYPE: nil
   value of $LC_MESSAGES: nil
   value of $LC_MONETARY: nil
   value of $LC_NUMERIC: nil
   value of $LC_TIME: nil
   value of $LANG: zh_CN.UTF-8
   locale-coding-system: utf-8
   default-enable-multibyte-characters: t

Major mode: Hexl

Minor modes in effect:
   shell-dirtrack-mode: t
   icomplete-mode: t
   partial-completion-mode: t
   desktop-save-mode: t
   auto-image-file-mode: t
   show-paren-mode: t
   encoded-kbd-mode: t
   tooltip-mode: t
   mouse-wheel-mode: t
   menu-bar-mode: t
   file-name-shadow-mode: t
   global-font-lock-mode: t
   font-lock-mode: t
   blink-cursor-mode: t
   unify-8859-on-encoding-mode: t
   utf-translate-cjk-mode: t
   auto-compression-mode: t
   column-number-mode: t
   line-number-mode: t
   transient-mark-mode: t

Recent input:
<help-echo> <help-echo> <help-echo> <help-echo> <help-echo>
<help-echo> <help-echo> <help-echo> <help-echo> <help-echo>
<help-echo> <down-mouse-1> <mouse-1> SPC <down-mouse-5>
<mouse-5> <down-mouse-5> <mouse-5> <double-down-mouse-5>
<double-mouse-5> <triple-down-mouse-5> <triple-mouse-5>
<triple-down-mouse-5> <triple-mouse-5> <down-mouse-5>
<mouse-5> <double-down-mouse-5> <double-mouse-5> <triple-down-mouse-5>
<triple-mouse-5> <triple-down-mouse-5> <triple-mouse-5>
<triple-down-mouse-5> <triple-mouse-5> <down-mouse-5>
<mouse-5> <double-down-mouse-5> <double-mouse-5> <triple-down-mouse-5>
<triple-mouse-5> <triple-down-mouse-5> <triple-mouse-5>
<triple-down-mouse-5> <triple-mouse-5> <down-mouse-5>
<mouse-5> <double-down-mouse-5> <double-mouse-5> <triple-down-mouse-5>
<triple-mouse-5> <triple-down-mouse-5> <triple-mouse-5>
<triple-down-mouse-5> <triple-mouse-5> <triple-down-mouse-5>
<triple-mouse-5> SPC <down-mouse-5> <mouse-5> <double-down-mouse-5>
<double-mouse-5> <triple-down-mouse-5> <triple-mouse-5>
<triple-down-mouse-5> <triple-mouse-5> <triple-down-mouse-5>
<triple-mouse-5> <triple-down-mouse-5> <triple-mouse-5>
<triple-down-mouse-5> <triple-mouse-5> <triple-down-mouse-5>
<triple-mouse-5> <triple-down-mouse-5> <triple-mouse-5>
<down-mouse-4> <mouse-4> <double-down-mouse-4> <double-mouse-4>
<triple-down-mouse-4> <triple-mouse-4> <triple-down-mouse-4>
<triple-mouse-4> <triple-down-mouse-4> <triple-mouse-4>
<help-echo> <help-echo> <help-echo> <down-mouse-1>
<mouse-1> M-x r e p o r - <backspace> <tab> <return>

Recent messages:
Char: » (2235, #o4273, #x8bb, file ...) point=2 of 2 (50%) column=1
Loading hexl...done
Converting to hexl format discards undo info; ok? (y or n)
hexl-goto-address: Out of hexl region [6 times]
uncompressing emacs.gz...done
uncompressing emacs-1.gz...done
Mark saved where search started [2 times]
uncompressing emacs-5.gz...done
byte-code: End of buffer [3 times]
Loading emacsbug...done


- -- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/


------- End of forwarded message -------


* Re: [wenbinye@gmail.com: hexl-max-address in hexl-mode is incorrect]
  2006-11-19  7:59 [wenbinye@gmail.com: hexl-max-address in hexl-mode is incorrect] Richard Stallman
@ 2006-11-22 16:02 ` Chong Yidong
  2006-11-22 16:59   ` Chong Yidong
  2006-11-22 17:15   ` Kevin Rodgers
  0 siblings, 2 replies; 7+ messages in thread
From: Chong Yidong @ 2006-11-22 16:02 UTC
  Cc: emacs-devel

> hexl-max-address is usually set to buffer-size, but when the buffer
> contains a multibyte character, or the file associated with the buffer
> is encoded in a multibyte coding system such as utf-16,
> hexl-max-address is usually less than the real byte count of the buffer.
>
> You can test like this:
>
> Test case 2:
> Open a new file, such as /tmp/test.txt. Use C-x RET f to set the file
> coding system to utf-16. Input any letters, such as "ab", and save the
> buffer. Then switch to hexl-mode. C-h v hexl-max-address shows that the
> value is still 2, which is the buffer-size rather than the size of the
> file.
>
> Here is my solution to set hexl-max-address which might help:
> (setq hexl-max-address
>       (1- (if buffer-file-name
>               (nth 7 (file-attributes buffer-file-name))
>             (length
>              (decode-coding-string (buffer-string)
>                                    buffer-file-coding-system)))))

The (nth 7 (file-attributes buffer-file-name)) method returns the
correct byte count.  However, the
(decode-coding-string (buffer-string) buffer-file-coding-system)
method doesn't seem to work for me; it returns erratic incorrect
results.  In the case of a utf-16 buffer containing just "ab" without
an associated file, it returns 1; if there is an associated file, it
returns 0.
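
The file-size approach can be checked with a minimal sketch like the one below. It assumes a recent Emacs in which the `utf-16le` coding system writes no byte-order mark; `my-file-size-after-write` is a hypothetical helper made up for illustration and is not part of hexl.el.

```elisp
;; Hypothetical helper for illustration; not part of hexl.el.
(defun my-file-size-after-write (string coding-system)
  "Write STRING to a temporary file using CODING-SYSTEM; return its size in bytes."
  (let ((file (make-temp-file "hexl-size-"))
        ;; Force the coding system that `write-region' will use.
        (coding-system-for-write coding-system))
    (unwind-protect
        (progn
          (with-temp-file file
            (insert string))
          ;; Element 7 of `file-attributes' is the file size in bytes.
          (nth 7 (file-attributes file)))
      (delete-file file))))

;; "ab" is 2 characters, but 4 bytes on disk as UTF-16LE without a BOM.
(my-file-size-after-write "ab" 'utf-16le)
```

The value read from the file is the size on disk, so it only reflects the buffer when the buffer has been saved and not modified since.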


* Re: [wenbinye@gmail.com: hexl-max-address in hexl-mode is incorrect]
  2006-11-22 16:02 ` Chong Yidong
@ 2006-11-22 16:59   ` Chong Yidong
  2006-11-22 17:15   ` Kevin Rodgers
  1 sibling, 0 replies; 7+ messages in thread
From: Chong Yidong @ 2006-11-22 16:59 UTC
  Cc: rms, emacs-devel

>> hexl-max-address is usually set to buffer-size, but when the buffer
>> contains a multibyte character, or the file associated with the buffer
>> is encoded in a multibyte coding system such as utf-16,
>> hexl-max-address is usually less than the real byte count of the buffer.
>
> However, the (decode-coding-string (buffer-string) buffer-file-coding-system)
> method doesn't seem to work for me; it returns erratic incorrect
> results.

Actually, I think the correct thing to do is to ENcode the buffer
string, not DEcode it.

(length (encode-coding-string (buffer-string) buffer-file-coding-system))

This seems to produce the correct results.
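
The difference between character count and encoded byte count can be sketched as follows. This assumes a recent Emacs where `utf-16le` emits no byte-order mark; `my-encoded-byte-count` is a hypothetical helper name used only for illustration.

```elisp
;; Hypothetical helper for illustration; not part of hexl.el.
(defun my-encoded-byte-count (string coding-system)
  "Return the number of bytes STRING occupies when encoded in CODING-SYSTEM."
  ;; `encode-coding-string' returns a unibyte string, so `length' counts bytes.
  (length (encode-coding-string string coding-system)))

(let ((guillemets (string #xAB #xBB)))             ; the two characters «»
  (list (length guillemets)                        ; 2 characters
        (my-encoded-byte-count guillemets 'utf-8)  ; 4 bytes
        (my-encoded-byte-count "ab" 'utf-16le)))   ; 4 bytes, no BOM
```

A coding system that writes a byte-order mark adds the size of the mark to the count, so the result depends on exactly which utf-16 variant is in effect.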


* Re: [wenbinye@gmail.com: hexl-max-address in hexl-mode is incorrect]
  2006-11-22 16:02 ` Chong Yidong
  2006-11-22 16:59   ` Chong Yidong
@ 2006-11-22 17:15   ` Kevin Rodgers
  2006-11-22 22:02     ` Chong Yidong
  1 sibling, 1 reply; 7+ messages in thread
From: Kevin Rodgers @ 2006-11-22 17:15 UTC


Chong Yidong wrote:
>> hexl-max-address is usually set to buffer-size, but when the buffer
>> contains a multibyte character, or the file associated with the buffer
>> is encoded in a multibyte coding system such as utf-16,
>> hexl-max-address is usually less than the real byte count of the buffer.
>>
>> You can test like this:
>>
>> Test case 2:
>> Open a new file, such as /tmp/test.txt. Use C-x RET f to set the file
>> coding system to utf-16. Input any letters, such as "ab", and save the
>> buffer. Then switch to hexl-mode. C-h v hexl-max-address shows that the
>> value is still 2, which is the buffer-size rather than the size of the
>> file.
>>
>> Here is my solution to set hexl-max-address which might help:
>> (setq hexl-max-address
>>       (1- (if buffer-file-name
>>               (nth 7 (file-attributes buffer-file-name))
>>             (length
>>              (decode-coding-string (buffer-string)
>>                                    buffer-file-coding-system)))))
> 
> The (nth 7 (file-attributes buffer-file-name)) method returns the
> correct byte count.  However, the
> (decode-coding-string (buffer-string) buffer-file-coding-system)
> method doesn't seem to work for me; it returns erratic incorrect
> results.  In the case of a utf-16 buffer containing just "ab" without
> an associated file, it returns 1; if there is an associated file, it
> returns 0.

How about: (- (position-bytes (point-max)) (position-bytes (point-min)))

-- 
Kevin


* Re: [wenbinye@gmail.com: hexl-max-address in hexl-mode is incorrect]
  2006-11-22 17:15   ` Kevin Rodgers
@ 2006-11-22 22:02     ` Chong Yidong
  2006-11-27 19:09       ` Stuart D. Herring
  0 siblings, 1 reply; 7+ messages in thread
From: Chong Yidong @ 2006-11-22 22:02 UTC
  Cc: emacs-devel

> How about: (- (position-bytes (point-max)) (position-bytes (point-min)))

I already looked into this, and couldn't find a way to get it to work.
Hexl mode works by passing the buffer contents to an external program,
lib-src/hexl; thus, the byte values seen by hexl-mode depend on
buffer-file-coding-system.  The value of position-bytes is independent
of this.


* Re: [wenbinye@gmail.com: hexl-max-address in hexl-mode is incorrect]
  2006-11-22 22:02     ` Chong Yidong
@ 2006-11-27 19:09       ` Stuart D. Herring
  2006-11-28  0:48         ` Kenichi Handa
  0 siblings, 1 reply; 7+ messages in thread
From: Stuart D. Herring @ 2006-11-27 19:09 UTC
  Cc: Kevin Rodgers, emacs-devel

>> How about: (- (position-bytes (point-max)) (position-bytes (point-min)))
>
> I already looked into this, and couldn't find a way to get it to work.
> Hexl mode works by passing the buffer contents to an external program,
> lib-src/hexl; thus, the byte values seen by hexl-mode depend on
> buffer-file-coding-system.  The value of position-bytes is independent
> of this.

Is there some other coding system variable that does affect position-bytes
(which we could then let-bind), or is it based on Emacs' internal,
invariant encoding?

Davis

-- 
This product is sold by volume, not by mass.  If it appears too dense or
too sparse, it is because mass-energy conversion has occurred during
shipping.


* Re: [wenbinye@gmail.com: hexl-max-address in hexl-mode is incorrect]
  2006-11-27 19:09       ` Stuart D. Herring
@ 2006-11-28  0:48         ` Kenichi Handa
  0 siblings, 0 replies; 7+ messages in thread
From: Kenichi Handa @ 2006-11-28  0:48 UTC
  Cc: cyd, ihs_4664, emacs-devel

In article <49866.128.165.123.18.1164654589.squirrel@webmail.lanl.gov>, "Stuart D. Herring" <herring@lanl.gov> writes:

>>> How about: (- (position-bytes (point-max)) (position-bytes (point-min)))
> >
> > I already looked into this, and couldn't find a way to get it to work.
> > Hexl mode works by passing the buffer contents to an external program,
> > lib-src/hexl; thus, the byte values seen by hexl-mode depend on
> > buffer-file-coding-system.  The value of position-bytes is independent
> > of this.

> Is there some other coding system variable that does affect position-bytes
> (which we could then let-bind), or is it based on Emacs' internal,
> invariant encoding?

The latter.  After decoding, the byte sequence of a buffer is
always in emacs-mule (in the emacs-unicode-2 branch, it's
utf-8).  So changing buffer-file-coding-system or any other
coding-system-related variable doesn't affect
position-bytes.
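
A minimal sketch of this invariance, assuming Emacs 23 or later (where the internal representation is UTF-8 based) and a multibyte temporary buffer; `my-size-report` is a hypothetical helper used only for illustration.

```elisp
;; Hypothetical helper for illustration; not part of hexl.el.
(defun my-size-report (string coding-systems)
  "Return a list of (CODING-SYSTEM INTERNAL-BYTES ENCODED-BYTES) for STRING."
  (with-temp-buffer
    (insert string)
    (mapcar (lambda (cs)
              ;; Changing this variable does not touch the buffer's bytes.
              (setq buffer-file-coding-system cs)
              (list cs
                    ;; Size in the internal representation.
                    (- (position-bytes (point-max))
                       (position-bytes (point-min)))
                    ;; Size after encoding for the outside world.
                    (length (encode-coding-string (buffer-string) cs))))
            coding-systems)))

;; The internal size stays at 4 for «»; only the encoded size varies.
(my-size-report (string #xAB #xBB) '(utf-8 iso-8859-1))
;; => ((utf-8 4 4) (iso-8859-1 4 2))
```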

---
Kenichi Handa
handa@m17n.org

