From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: How to get buffer byte length (not number of characters)?
Date: Tue, 20 Aug 2024 14:20:38 +0300
Message-ID: <86msl7wid5.fsf@gnu.org>
References: <87wmkbekjp.fsf@ushin.org> <87o75neio9.fsf@ushin.org>
In-Reply-To: <87o75neio9.fsf@ushin.org> (message from Joseph Turner on Tue, 20 Aug 2024 00:51:18 -0700)
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
To: Joseph Turner
Cc: emacs-devel@gnu.org
Xref: news.gmane.io gmane.emacs.devel:322956

> From: Joseph Turner
> Date: Tue, 20 Aug 2024 00:51:18 -0700
>
> Joseph Turner writes:
>
> > How can I get a buffer's byte length without writing to a file?
>
> This seems to work:
>
> (with-temp-buffer
>   (insert "你好")
>   (set-buffer-multibyte nil)
>   (buffer-size)) ;; 6
>
> although, curiously, this does not:
>
> (with-temp-buffer
>   (set-buffer-multibyte nil)
>   (insert "你好")
>   (buffer-size)) ;; 2
>
> Is the `set-buffer-multibyte' approach the best solution?

No, as you already discovered.  Unibyte buffers and strings are messy
and full of surprises, so my suggestion is to stay away from them as
much as you can.

> If I have a multibyte string and I want the byte length, do I need to
> insert it into a buffer and perform the same dance as above?

No, you can use string-bytes instead.

But again: whether the result is useful for whatever needs triggered
these questions is uncertain, and my crystal ball says that this is
not what you want.  For example, raw bytes sometimes take 2 bytes in
the internal Emacs representation, something that will get in the way
of most uses of these results.

So please tell more about the background and the context of these
questions.
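
As an illustration of the distinction drawn in the message above, here is a minimal sketch contrasting `string-bytes` (length in the internal Emacs representation) with the length of an explicitly encoded string. The helper name `my-utf-8-byte-length` is hypothetical and not part of Emacs, and the choice of UTF-8 is an assumption, since the message does not say which encoding the original poster needs.

```elisp
;; Hypothetical helper (not part of Emacs): byte length of STRING
;; after explicitly encoding it as UTF-8.  The coding system is an
;; assumption; substitute whichever one the consumer of the bytes expects.
(defun my-utf-8-byte-length (string)
  "Return the number of bytes STRING occupies when encoded as UTF-8."
  (length (encode-coding-string string 'utf-8)))

;; Bytes in the internal representation of a multibyte string.
(string-bytes "你好")            ;; 6

;; Bytes after explicit encoding; happens to agree for this text.
(my-utf-8-byte-length "你好")    ;; 6

;; A raw byte converted to its multibyte form: the internal
;; representation is wider than the single byte it stands for.
(string-bytes (string-to-multibyte "\300"))
```

The two counts agree for ordinary text such as "你好", but they can differ once raw bytes or a non-UTF-8 target encoding are involved, which may be why the intended use matters before choosing between them.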