From: Joseph Turner
Newsgroups: gmane.emacs.devel
Subject: Re: How to get buffer byte length (not number of characters)?
Date: Wed, 21 Aug 2024 02:20:09 -0700
Message-ID: <87bk1lhkvg.fsf@ushin.org>
References: <87wmkbekjp.fsf@ushin.org> <86o75nwilg.fsf@gnu.org>
To: Eli Zaretskii
Cc: emacs-devel@gnu.org, Andreas Schwab, Adam Porter
In-Reply-To: <86o75nwilg.fsf@gnu.org> (Eli Zaretskii's message of "Tue, 20 Aug 2024 14:15:39 +0300")

Eli Zaretskii writes:

>> From: Joseph Turner
>> Date: Tue, 20 Aug 2024 00:10:50 -0700
>>
>> How can I get a buffer's byte length without writing to a file?
>
> This depends on why do you need the byte length of the buffer.
>
> If I interpret your question literally, then this is the answer:
>
>   (position-bytes (point-max))
>
> perhaps preceded by a call to 'widen'.  But that returns the number of
> bytes that the buffer's characters take when represented in the
> internal Emacs representation of characters, which is not necessarily
> useful to Lisp programs.  For example, if you need to know how many
> bytes will Emacs write to a file if you save the buffer, or to a
> network connection or a sub-process if you send the buffer there, then
> you need to consider the encoding process: Emacs always encodes the
> buffer text on output to the external world.  If this is what you
> want, then you need to use bufferpos-to-filepos, and make sure you
> pass the correct coding-system argument to it.
>
> If you need this for something else, please tell the details.

Thank you, Eli, Andreas!

Eli's crystal ball is correct: I'd like to know how many bytes Emacs
will send when passing buffer contents (or a string) to a subprocess,
and first I need to figure out which coding system is appropriate.

The hyperdrive.el package provides a UI for creating and accessing
shared virtual filesystems.  hyperdrive.el uses plz.el as an Elisp API
for curl in order to communicate with a local HTTP server.

We want to be able to create hyperdrive "files" in an Emacs buffer and
then upload them with the correct encoding.  We also want to know how
large they will be before uploading them.

A couple of examples:

Let's say I create a textual hyperdrive file using hyperdrive.el, and
then I upload it by sending its contents via curl to the local HTTP
server.  What coding system should be used when the file is uploaded?

Let's say I have an `iso-latin-1'-encoded file "foo.txt" on my local
filesystem.  I upload this encoded file to my hyperdrive by passing the
filename to curl, which uploads the bytes with no conversion.
Then I open the "foo.txt" hyperdrive file using hyperdrive.el, which
receives the contents via curl from the local HTTP server.  In the
hyperdrive file buffer, buffer-file-coding-system should be
`iso-latin-1' (right?).  Then, I edit the buffer and save it to the
hyperdrive again with hyperdrive.el, which this time sends the modified
contents over the wire to curl.  The uploaded file should be
`iso-latin-1'-encoded (right?).

Currently, plz.el always creates the curl subprocess like so:

    (make-process :coding 'binary ...)

https://git.savannah.gnu.org/cgit/emacs/elpa.git/tree/plz.el?h=externals-release/plz#n519

Does this DTRT?  Should we use buffer-file-coding-system instead of
'binary?

Thank you for helping me understand encodings in Emacs.

Joseph
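
A minimal sketch of the distinction discussed in this message, offered
as an illustration rather than as what hyperdrive.el or plz.el actually
does.  It compares the internal byte count from `position-bytes' with
the number of bytes produced by explicitly encoding the buffer text.
The helper name `my/buffer-encoded-byte-length' is hypothetical, and the
sketch uses `encode-coding-string' instead of the `bufferpos-to-filepos'
approach suggested in the quoted reply, because it makes the encoding
step explicit at the cost of copying the buffer text.

```elisp
;; Hypothetical helper; not part of Emacs, hyperdrive.el, or plz.el.
(defun my/buffer-encoded-byte-length (&optional coding-system)
  "Return how many bytes the whole buffer takes when encoded.
CODING-SYSTEM defaults to the buffer's `buffer-file-coding-system'."
  (save-restriction
    ;; Measure the entire buffer, not just the narrowed region.
    (widen)
    ;; `encode-coding-string' returns a unibyte string, so its
    ;; `length' is the number of encoded bytes.
    (length (encode-coding-string
             (buffer-string)
             (or coding-system buffer-file-coding-system)))))

(with-temp-buffer
  (insert "café")
  (list
   ;; Bytes in Emacs's internal representation; buffer byte
   ;; positions start at 1, hence the `1-'.
   (1- (position-bytes (point-max)))
   ;; Bytes after encoding depend on the coding system chosen.
   (my/buffer-encoded-byte-length 'utf-8)
   (my/buffer-encoded-byte-length 'iso-latin-1)))
```

The same text yields different byte counts under different coding
systems, which is why the size to be uploaded can only be known once
the coding system has been decided.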