* suggestion: function: buffer-bytes
@ 2007-07-01 0:18 T. V. Raman
2007-07-01 1:39 ` Stefan Monnier
0 siblings, 1 reply; 14+ messages in thread
From: T. V. Raman @ 2007-07-01 0:18 UTC (permalink / raw)
To: emacs-devel
Emacs' built-in buffer-size returns the number of characters ---
in some situations one needs the count of bytes.
Here is a small function that does this --- perhaps it could be
included in subr.el?
(defsubst buffer-bytes (&optional buffer)
  "Return number of bytes in a buffer."
  (save-excursion
    (and buffer (set-buffer buffer))
    (1- (position-bytes (point-max)))))
--
Best Regards,
--raman
Email: raman@users.sf.net
WWW: http://emacspeak.sf.net/raman/
AIM: emacspeak GTalk: tv.raman.tv@gmail.com
PGP: http://emacspeak.sf.net/raman/raman-almaden.asc
Google: tv+raman
IRC: irc://irc.freenode.net/#emacs
* Re: suggestion: function: buffer-bytes
2007-07-01 0:18 suggestion: function: buffer-bytes T. V. Raman
@ 2007-07-01 1:39 ` Stefan Monnier
2007-07-01 2:40 ` T. V. Raman
0 siblings, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2007-07-01 1:39 UTC (permalink / raw)
To: raman; +Cc: emacs-devel
> Emacs built-in buffer-size returns the number of characters ---
> in some situations one needs the count of bytes.
> Here is a small function that does this --- perhaps it could be
> ncluded in subr.el?
> (defsubst buffer-bytes (&optional buffer)
> "Return number of bytes in a buffer."
> (save-excursion
> (and buffer (set-buffer buffer))
> (1- (position-bytes (point-max)))))
This function is very unlikely to ever be useful: the number of bytes needed to
represent a particular sequence of characters depends on the encoding used.
So, for example, the result will differ for the exact same text when run
in Emacs-22 or in Emacs-unicode. And it will most likely differ from the
number of bytes in the file associated with the buffer.
Stefan
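As a rough illustration of the point above, the following sketch encodes the same five-character string with three coding systems and reports the byte count for each. The helper name sketch-encoded-sizes and the sample text are made up for this example; encode-coding-string is the standard Emacs function.

```elisp
;; Hypothetical helper, not part of Emacs: byte length of TEXT per coding system.
(defun sketch-encoded-sizes (text codings)
  "Return an alist mapping each coding system in CODINGS to the
number of bytes TEXT occupies once encoded with it."
  (mapcar (lambda (coding)
            ;; `encode-coding-string' returns a unibyte string,
            ;; so `length' counts bytes here, not characters.
            (cons coding (length (encode-coding-string text coding))))
          codings))

;; "na\u00efve" is five characters; U+00EF is the only non-ASCII one.
(sketch-encoded-sizes "na\u00efve" '(utf-8 iso-latin-1 utf-16le))
;; => ((utf-8 . 6) (iso-latin-1 . 5) (utf-16le . 10))
```

The text is identical in all three cases; only the chosen coding system changes the count.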
* Re: suggestion: function: buffer-bytes
2007-07-01 1:39 ` Stefan Monnier
@ 2007-07-01 2:40 ` T. V. Raman
2007-07-01 6:31 ` David Kastrup
2007-07-01 8:22 ` Kenichi Handa
0 siblings, 2 replies; 14+ messages in thread
From: T. V. Raman @ 2007-07-01 2:40 UTC (permalink / raw)
To: monnier; +Cc: raman, emacs-devel
Stephane,
Where I used this:
Package g-client
http://emacspeak.googlecode.com/svn/trunk/lisp/g-client
I use curl to talk HTTP in that package -- uses Atom Publishing
Protocol to talk to servers --
and I needed the byte count for computing HTTP headers
correctly.
It does appear to work, but also because I do set buffer-encoding
appropriately in those buffers where I am building up the HTTP
message being posted.
buffer-size definitely bombs in that use case -- do you have a
better suggestion for how one might count bytes?
>>>>> "Stefan" == Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> Emacs built-in buffer-size returns the number of
>> characters --- in some situations one needs the count of
>> bytes. Here is a small function that does this ---
>> perhaps it could be ncluded in subr.el?
Stefan>
>> (defsubst buffer-bytes (&optional buffer)
>>   "Return number of bytes in a buffer."
>>   (save-excursion
>>     (and buffer (set-buffer buffer))
>>     (1- (position-bytes (point-max)))))
Stefan>
Stefan> This function is very unlikely to ever be useful: the
Stefan> number of bytes to represent a particular sequence of
Stefan> characters depends on the encoding used. So for
Stefan> example the result will be different for the exact
Stefan> same text when run in Emacs-22 or in Emacs-unicode.
Stefan> And it most likely will different from the number of
Stefan> bytes of the file associated with the buffer.
Stefan>
Stefan>
Stefan> Stefan
* Re: suggestion: function: buffer-bytes
2007-07-01 2:40 ` T. V. Raman
@ 2007-07-01 6:31 ` David Kastrup
2007-07-01 8:22 ` Kenichi Handa
1 sibling, 0 replies; 14+ messages in thread
From: David Kastrup @ 2007-07-01 6:31 UTC (permalink / raw)
To: raman; +Cc: monnier, emacs-devel
"T. V. Raman" <raman@users.sf.net> writes:
> Stephane,
>
> Where I used this:
>
> Package g-client
> http://emacspeak.googlecode.com/svn/trunk/lisp/g-client
>
> I use curl to talk HTTP in that package -- uses Atom Publishing
> Protocol to talk to servers --
> and I needed the byte count for computing HTTP headers
> correctly.
> It does appear to work, but also because I do set buffer-encoding
> appropriately in those buffers where I am building up the HTTP
> message being posted.
You can't: buffers are always encoded in Emacs-mule (or its own
version of utf-8 in Emacs 23), or in unibyte (in which case the
position-bytes function becomes rather pointless). I don't know what
you mean by "set buffer-encoding appropriately".
You can, presumably, talk in unibyte with your server and do the
encoding and decoding on the way to the buffer manually. In that
case you can just treat buffer positions in characters as synonymous with
those in bytes.
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
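One way to picture the unibyte approach described above is the sketch below: the text is encoded by hand and inserted into a unibyte buffer, where character counts and byte counts coincide. The function name sketch-unibyte-sizes is hypothetical, and the choice of utf-8 is only an assumption for the example.

```elisp
;; Hypothetical helper: encode TEXT by hand, then inspect a unibyte buffer.
(defun sketch-unibyte-sizes (text coding)
  "Encode TEXT with CODING into a unibyte temporary buffer.
Return a list (CHARS BYTES) describing that buffer."
  (with-temp-buffer
    ;; Switch to unibyte while the buffer is still empty, so every
    ;; buffer element is exactly one byte.
    (set-buffer-multibyte nil)
    (insert (encode-coding-string text coding))
    (list (buffer-size)
          (1- (position-bytes (point-max))))))

(sketch-unibyte-sizes "na\u00efve" 'utf-8)
;; => (6 6)
```

Both numbers agree because, in a unibyte buffer, a position in characters is also a position in bytes.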
* Re: suggestion: function: buffer-bytes
2007-07-01 2:40 ` T. V. Raman
2007-07-01 6:31 ` David Kastrup
@ 2007-07-01 8:22 ` Kenichi Handa
2007-07-01 12:39 ` Stefan Monnier
` (2 more replies)
1 sibling, 3 replies; 14+ messages in thread
From: Kenichi Handa @ 2007-07-01 8:22 UTC (permalink / raw)
To: raman; +Cc: emacs-devel, monnier, raman
In article <18055.5127.154758.705881@gargle.gargle.HOWL>, "T. V. Raman" <raman@users.sourceforge.net> writes:
> Package g-client
> http://emacspeak.googlecode.com/svn/trunk/lisp/g-client
> I use curl to talk HTTP in that package -- uses Atom Publishing
> Protocol to talk to servers --
> and I needed the byte count for computing HTTP headers
> correctly.
> It does appear to work, but also because I do set buffer-encoding
> appropriately in those buffers where I am building up the HTTP
> message being posted.
> buffer-size definitely bombs in that use case -- do you have a
> better suggestion for how one might count bytes?
Then perhaps what you need is this.
(defun buffer-encoded-size (&optional buffer coding)
  "Return the encoded size of the current buffer in bytes.
..."
  (save-excursion
    (and buffer (set-buffer buffer))
    (or coding
        (setq coding buffer-file-coding-system))
    (length (encode-coding-string (buffer-string) coding))))
In emacs-unicode-2, you can use a slightly faster version.
(defun buffer-encoded-size (&optional buffer coding)
  "Return the encoded size of the current buffer in bytes.
..."
  (save-excursion
    (and buffer (set-buffer buffer))
    (or coding
        (setq coding buffer-file-coding-system))
    (length (encode-coding-region (point-min) (point-max) coding t))))
---
Kenichi Handa
handa@m17n.org
* Re: suggestion: function: buffer-bytes
2007-07-01 8:22 ` Kenichi Handa
@ 2007-07-01 12:39 ` Stefan Monnier
2007-07-02 0:48 ` Kenichi Handa
2007-07-01 17:47 ` T. V. Raman
2007-07-01 19:34 ` Eli Zaretskii
2 siblings, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2007-07-01 12:39 UTC (permalink / raw)
To: Kenichi Handa; +Cc: emacs-devel, raman
> Then perhaps what you need is this.
> (defun buffer-encoded-size (&optional buffer coding)
But in his case, he's probably better off just encoding the buffer manually
before passing the data to the process/network, so that he can get his hands
on the number of bytes with just buffer-size.
Stefan
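A minimal sketch of the suggestion above, assuming the message body is available as a string: copy it into a scratch buffer, encode that buffer in place, and read the byte count off buffer-size. The function name sketch-content-length-header is invented for illustration, and nothing here is taken from g-client itself.

```elisp
;; Hypothetical helper: compute a Content-Length header from encoded text.
(defun sketch-content-length-header (body coding)
  "Return a Content-Length header line for BODY encoded with CODING."
  (with-temp-buffer
    (insert body)
    ;; Encoding in place replaces the text with its encoded bytes,
    ;; one buffer element per byte, so `buffer-size' now counts bytes.
    (encode-coding-region (point-min) (point-max) coding)
    (format "Content-Length: %d" (buffer-size))))

(sketch-content-length-header "na\u00efve" 'utf-8)
;; => "Content-Length: 6"
```

The already encoded buffer contents would then be what gets handed to the process, so the header and the payload cannot disagree.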
* Re: suggestion: function: buffer-bytes
2007-07-01 8:22 ` Kenichi Handa
2007-07-01 12:39 ` Stefan Monnier
@ 2007-07-01 17:47 ` T. V. Raman
2007-07-02 1:11 ` Kenichi Handa
2007-07-01 19:34 ` Eli Zaretskii
2 siblings, 1 reply; 14+ messages in thread
From: T. V. Raman @ 2007-07-01 17:47 UTC (permalink / raw)
To: handa; +Cc: emacs-devel, monnier, raman
So someone on the g-client group originally proposed computing
the length of buffer-string using string-encoding; I proposed
using position-bytes as a simpler alternative.
>>>>> "Kenichi" == Kenichi Handa <handa@m17n.org> writes:
Kenichi> In article
Kenichi> <18055.5127.154758.705881@gargle.gargle.HOWL>,
Kenichi> "T. V. Raman" <raman@users.sourceforge.net> writes:
>> Package g-client
>> http://emacspeak.googlecode.com/svn/trunk/lisp/g-client
Kenichi>
>> I use curl to talk HTTP in that package -- uses Atom
>> Publishing Protocol to talk to servers -- and I needed the
>> byte count for computing HTTP headers correctly. It does
>> appear to work, but also because I do set buffer-encoding
>> appropriately in those buffers where I am building up the
>> HTTP message being posted. buffer-size definitely bombs
>> in that use case -- do you have a better suggestion for
>> how one might count bytes?
Kenichi>
Kenichi> Then perhaps what you need is this.
Kenichi>
Kenichi> (defun buffer-encoded-size (&optional buffer coding)
Kenichi>   "Return the encoded size of the current buffer in bytes.
Kenichi> ..."
Kenichi>   (save-excursion
Kenichi>     (and buffer (set-buffer buffer))
Kenichi>     (or coding
Kenichi>         (setq coding buffer-file-coding-system))
Kenichi>     (length (encode-coding-string (buffer-string) coding))))
Kenichi>
Kenichi> In emacs-unicode-2, you can use a little bit faster
Kenichi> version.
Kenichi>
Kenichi> (defun buffer-encoded-size (&optional buffer coding)
Kenichi>   "Return the encoded size of the current buffer in bytes.
Kenichi> ..."
Kenichi>   (save-excursion
Kenichi>     (and buffer (set-buffer buffer))
Kenichi>     (or coding
Kenichi>         (setq coding buffer-file-coding-system))
Kenichi>     (length (encode-coding-region (point-min) (point-max) coding t))))
Kenichi>
Kenichi> --- Kenichi Handa handa@m17n.org
* Re: suggestion: function: buffer-bytes
2007-07-01 8:22 ` Kenichi Handa
2007-07-01 12:39 ` Stefan Monnier
2007-07-01 17:47 ` T. V. Raman
@ 2007-07-01 19:34 ` Eli Zaretskii
2007-07-02 0:55 ` Kenichi Handa
2 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2007-07-01 19:34 UTC (permalink / raw)
To: Kenichi Handa; +Cc: emacs-devel, raman
> From: Kenichi Handa <handa@m17n.org>
> Date: Sun, 01 Jul 2007 17:22:29 +0900
> Cc: emacs-devel@gnu.org, monnier@iro.umontreal.ca, raman@users.sourceforge.net
>
> In emacs-unicode-2, you can use a little bit faster version.
>
> (defun buffer-encoded-size (&optional buffer coding)
> "Return the encoded size of the current byffer in bytes.
> ..."
> (save-excursion
> (and buffer (set-buffer buffer))
> (or coding
> (setq coding buffer-file-coding-system))
> (length (encode-coding-region (point-min) (point-max) coding t))))
Why can't he use this version in Emacs 22?
* Re: suggestion: function: buffer-bytes
2007-07-01 12:39 ` Stefan Monnier
@ 2007-07-02 0:48 ` Kenichi Handa
0 siblings, 0 replies; 14+ messages in thread
From: Kenichi Handa @ 2007-07-02 0:48 UTC (permalink / raw)
To: Stefan Monnier; +Cc: raman, emacs-devel
In article <jwvabug711b.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:
> > Then perhaps what you need is this.
> > (defun buffer-encoded-size (&optional buffer coding)
> But in his case, he's probably better off just encoding the buffer manually
> before passing the data to the process/network, so that he can get his hands
> on the number of bytes with just buffer-size.
Perhaps. But, I don't know how and when the data is passed
to the process and when he needs the encoded byte size.
---
Kenichi Handa
handa@m17n.org
* Re: suggestion: function: buffer-bytes
2007-07-01 19:34 ` Eli Zaretskii
@ 2007-07-02 0:55 ` Kenichi Handa
0 siblings, 0 replies; 14+ messages in thread
From: Kenichi Handa @ 2007-07-02 0:55 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel, raman
In article <uved3lxy8.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:
> > From: Kenichi Handa <handa@m17n.org>
> > Date: Sun, 01 Jul 2007 17:22:29 +0900
> > Cc: emacs-devel@gnu.org, monnier@iro.umontreal.ca, raman@users.sourceforge.net
> >
> > In emacs-unicode-2, you can use a little bit faster version.
> >
> > (defun buffer-encoded-size (&optional buffer coding)
> > "Return the encoded size of the current byffer in bytes.
> > ..."
> > (save-excursion
> > (and buffer (set-buffer buffer))
> > (or coding
> > (setq coding buffer-file-coding-system))
> > (length (encode-coding-region (point-min) (point-max) coding t))))
> Why can't he use this version in Emacs 22?
Because the 4th argument DESTINATION of encode-coding-region
was introduced in emacs-unicode-2.
Optional 4th argument DESTINATION specifies where the encoded text goes.
If nil, the region between start and end is replaced by the encoded text.
If buffer, the encoded text is inserted in the buffer.
If t, the encoded text is returned.
---
Kenichi Handa
handa@m17n.org
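To make the DESTINATION behaviour described above concrete, here is a small sketch for an Emacs that has the 4-argument form (emacs-unicode-2, i.e. Emacs 23 and later). The function name sketch-encode-leaving-buffer-alone is hypothetical; it only shows that passing t returns the bytes and leaves the buffer text as it was.

```elisp
;; Hypothetical helper: encode with DESTINATION t and compare sizes.
(defun sketch-encode-leaving-buffer-alone (text coding)
  "Return (ENCODED-BYTES BUFFER-CHARS) for TEXT encoded with CODING."
  (with-temp-buffer
    (insert text)
    ;; DESTINATION t: the encoded text is returned as a unibyte
    ;; string and the region itself is not replaced.
    (let ((encoded (encode-coding-region (point-min) (point-max) coding t)))
      (list (length encoded) (buffer-size)))))

(sketch-encode-leaving-buffer-alone "na\u00efve" 'utf-8)
;; => (6 5)
```

The buffer still holds five characters afterwards, while the returned string is six bytes long.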
* Re: suggestion: function: buffer-bytes
2007-07-01 17:47 ` T. V. Raman
@ 2007-07-02 1:11 ` Kenichi Handa
2007-07-04 16:31 ` T. V. Raman
0 siblings, 1 reply; 14+ messages in thread
From: Kenichi Handa @ 2007-07-02 1:11 UTC (permalink / raw)
To: raman; +Cc: raman, monnier, emacs-devel
In article <18055.59548.547994.901837@gargle.gargle.HOWL>, "T. V. Raman" <raman@users.sourceforge.net> writes:
> So someone on the g-client group originally proposed computing
> the length of buffer-string using string-encoding; I proposed
> using position-bytes as a simpler alternative.
I see. And, do you see why position-bytes can't be used in
the current case now?
---
Kenichi Handa
handa@m17n.org
* Re: suggestion: function: buffer-bytes
2007-07-02 1:11 ` Kenichi Handa
@ 2007-07-04 16:31 ` T. V. Raman
2007-07-05 14:42 ` Stefan Monnier
0 siblings, 1 reply; 14+ messages in thread
From: T. V. Raman @ 2007-07-04 16:31 UTC (permalink / raw)
To: handa; +Cc: emacs-devel, monnier, raman
I can see how position-bytes is not a generic solution for writing
a buffer-bytes function.
But for the use case I needed, namely someone composing blog
entries using nxml-mode,
and then building up an http post message from what they created,
the position-bytes solution does work.
>>>>> "Kenichi" == Kenichi Handa <handa@m17n.org> writes:
Kenichi> In article
Kenichi> <18055.59548.547994.901837@gargle.gargle.HOWL>,
Kenichi> "T. V. Raman" <raman@users.sourceforge.net> writes:
>> So someone on the g-client group originally proposed
>> computing the length of buffer-string using
>> string-encoding; I proposed using position-bytes as a
>> simpler alternative.
Kenichi>
Kenichi> I see. And, do you see why position-bytes can't be
Kenichi> used in the current case now?
Kenichi>
Kenichi> --- Kenichi Handa handa@m17n.org
* Re: suggestion: function: buffer-bytes
2007-07-04 16:31 ` T. V. Raman
@ 2007-07-05 14:42 ` Stefan Monnier
2007-07-09 2:44 ` Kenichi Handa
0 siblings, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2007-07-05 14:42 UTC (permalink / raw)
To: raman; +Cc: raman, emacs-devel, handa
> But for the use case I needed, namely someone composing blog
> entries using nxml-mode,
> and then building up an http post message from what they created,
> the position-bytes solution does work.
If it worked in some case, it was only by accident, unless they specifically
decided to use emacs-mule as the coding system for their file (and to switch
wholesale to utf-8 at the exact time they upgrade their Emacs), which would
be rather unusual.
Stefan
* Re: suggestion: function: buffer-bytes
2007-07-05 14:42 ` Stefan Monnier
@ 2007-07-09 2:44 ` Kenichi Handa
0 siblings, 0 replies; 14+ messages in thread
From: Kenichi Handa @ 2007-07-09 2:44 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel, raman
In article <jwvk5teoqvj.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:
> > But for the use case I needed, namely someone composing blog
> > entries using nxml-mode,
> > and then building up an http post message from what they created,
> > the position-bytes solution does work.
> If it worked in some case, it was only by accident, unless they specifically
> decided to use emacs-mule as the coding system for their file (and to switch
> wholesale to utf-8 at the exact time they upgrade their Emacs), which would
> be rather unusual.
Right. By the way, emacs-mule currently uses 2 bytes for, for
instance, Latin-1 characters, and UTF-8 also uses
2 bytes for them. So, when sending Latin-1 text
to a process as UTF-8, the position-bytes solution works just
by chance.
---
Kenichi Handa
handa@m17n.org
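The coincidence described above can be checked with a sketch like the following. It assumes Emacs 23 or later, where the internal buffer representation is UTF-8 based rather than emacs-mule; the function name sketch-compare-byte-counts is made up for this example.

```elisp
;; Hypothetical helper: internal byte count versus two wire encodings.
(defun sketch-compare-byte-counts (text)
  "Return (INTERNAL UTF-8 LATIN-1) byte counts for TEXT."
  (with-temp-buffer
    (insert text)
    (list
     ;; Bytes used by the buffer's internal representation.
     (1- (position-bytes (point-max)))
     ;; Bytes actually sent if the text is encoded as UTF-8.
     (length (encode-coding-string text 'utf-8))
     ;; Bytes actually sent if the text is encoded as Latin-1.
     (length (encode-coding-string text 'iso-latin-1)))))

(sketch-compare-byte-counts "na\u00efve")
;; => (6 6 5)
```

The internal count matches the UTF-8 size only because the two representations happen to agree for this text; it already disagrees with the Latin-1 size.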