From: Brennan Vincent
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] Add a mechanism for passing unibyte strings from lisp to modules.
Date: Fri, 21 Jun 2024 16:14:05 -0400
Message-ID: <225D336D-933E-4CA3-B245-89992D7E6C41@umanwizard.com>
In-Reply-To: <86v822jeqh.fsf@gnu.org>

> On Jun 21, 2024, at 15:08, Eli Zaretskii wrote:
>
>> From: Brennan Vincent
>> Date: Fri, 21 Jun 2024 14:13:14 -0400
>>
>> Since the introduction of make_unibyte_string, it has been possible to pass
>> raw binary data from modules to lisp, but not the other way around
>> (except by using vectors of bytes, which is inefficient). This
>> patch implements that feature so that raw binary data can be sent both ways.
>
> Please describe the motivation and real-life use cases for this.

As far as I know, unibyte strings are the only efficient way to represent
arbitrary binary buffers in Emacs. If that's not true, I'd be happy to be
corrected.

I think there are many possible cases where module authors will want to
communicate binary data, but I'll just describe one (my own). I'm working on
a major mode that reads ELF files (whose contents it stores in a unibyte
buffer) and provides various features like disassembling code. To do this it
passes chunks of code to a module which in turn passes them to the Capstone
disassembly library.
To do this without being able to pass unibyte strings, I have to take the
string of bytes, expand it to a vector of bytes, pass that to the module, and
have the module copy each byte back out in a loop. This is very inefficient.

> In general, we want to minimize the use of unibyte strings in Emacs.

Why? What else should be used instead to represent arbitrary bytes?

>
> I also don't understand the need for unibyte-string-p, since we
> already have multibyte-string-p.

That's fair, I only added it so I could use it as an argument to CHECK_TYPE.
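
To make the cost of the vector workaround described above concrete, here is a minimal, self-contained sketch in C. It does not use the real emacs-module.h: the types `boxed_int` and `byte_vector` and the helpers `vec_size`, `vec_get` and `extract_integer` are hypothetical stand-ins that only mimic the shape of the per-element accessors a module would call (in the real module API these are reached through the `emacs_env` pointer). The second function shows what becomes possible if the module can see the string's bytes directly, which is what the patch is asking for; the name `copy_from_bytes` is likewise an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a Lisp integer: every element of the vector
   is a separate boxed value that has to be fetched and unboxed. */
typedef struct { intmax_t value; } boxed_int;

/* Hypothetical stand-in for a Lisp vector holding one integer per byte. */
typedef struct {
    const boxed_int *elems;
    ptrdiff_t len;
} byte_vector;

/* Stand-ins for the per-element accessors a module would go through. */
static ptrdiff_t vec_size(const byte_vector *v) { return v->len; }
static const boxed_int *vec_get(const byte_vector *v, ptrdiff_t i) { return &v->elems[i]; }
static intmax_t extract_integer(const boxed_int *b) { return b->value; }

/* Workaround: one accessor round trip plus a range check per byte.
   Returns the number of bytes copied, or -1 on error. */
static ptrdiff_t copy_from_vector(const byte_vector *v,
                                  unsigned char *out, ptrdiff_t cap)
{
    assert(v != NULL && out != NULL);
    ptrdiff_t n = vec_size(v);
    if (n < 0 || n > cap)
        return -1;
    for (ptrdiff_t i = 0; i < n; i++) {
        intmax_t x = extract_integer(vec_get(v, i));
        if (x < 0 || x > 255)
            return -1;          /* element is not a valid byte */
        out[i] = (unsigned char) x;
    }
    return n;
}

/* With direct access to the bytes, the whole transfer is a single copy.
   Returns the number of bytes copied, or -1 on error. */
static ptrdiff_t copy_from_bytes(const unsigned char *src, ptrdiff_t n,
                                 unsigned char *out, ptrdiff_t cap)
{
    assert(src != NULL && out != NULL);
    if (n < 0 || n > cap)
        return -1;
    memcpy(out, src, (size_t) n);
    return n;
}
```

Both functions produce the same output buffer; the difference is that the first performs a fetch, an unbox and a range check for every byte, on top of the Lisp-side work of building the vector in the first place, while the second is one bulk copy.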