From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] Add a mechanism for passing unibyte strings from lisp to modules.
Date: Mon, 24 Jun 2024 14:45:34 +0300
Message-ID: <861q4md0o1.fsf@gnu.org>
References: <86v822jeqh.fsf@gnu.org> <225D336D-933E-4CA3-B245-89992D7E6C41@umanwizard.com> <86frt5jwtc.fsf@gnu.org> <87o77t6lyn.fsf@taipei.mail-host-address-is-not-set> <867cehgdn1.fsf@gnu.org>
In-Reply-To: (message from Andrea Corallo on Sun, 23 Jun 2024 17:15:39 -0400)
To: Andrea Corallo
Cc: brennan@umanwizard.com, stefankangas@gmail.com, emacs-devel@gnu.org

> From: Andrea Corallo
> Cc: "Brennan Vincent", Stefan Kangas, emacs-devel@gnu.org
> Date: Sun, 23 Jun 2024 17:15:39 -0400
>
> Eli Zaretskii writes:
>
> >> From: "Brennan Vincent"
> >> Date: Sat, 22 Jun 2024 11:22:56 -0400
> >>
> >> Eli Zaretskii writes:
> >>
> >> > Why can't you have the module code itself read the file, instead of
> >> > getting the bytes from Emacs?  Passing large amounts of bytes from
> >> > Emacs to a module is a very inefficient way of talking to modules
> >> > anyway, because Emacs is not optimized for moving text to and fro in
> >> > the shape of Lisp strings.  To say nothing of the GC pressure you will
> >> > have in your mode, due to a constant consing of strings.  It is best
> >> > to avoid all that to begin with.
> >>
> >> Of course it's possible to do that, but I wanted to write my mode in
> >> elisp as much as possible and keep the C side minimal, simply because I
> >> find elisp a much more enjoyable language to use. But if
> >> you are opposed to adding this code I can go with that approach.
> >>
> >> Another possibility which would avoid adding specifically
> >> unibyte-related surface area to the modules API would be to create an
> >> extended version of copy_string_contents which can take any coding
> >> system, rather than forcing UTF-8.
> >>
> >> Would you be open to such an approach? If so, I will send an updated patch.
> >
> > I very much dislike the idea of letting modules deal with unibyte
> > strings, for the reasons I explained.  Basically, it will open a large
> > Pandora box by allowing people who don't know enough about the
> > subtleties of unibyte strings in Emacs to write buggy modules which
> > will crash Emacs.
> >
> > But let's hear the other co-maintainers.  Stefan and Andrea, what is
> > your POV on these issues?
>
> I, for one, would be not too much worried.  People writing modules
> should be already very responsible for what they write as they have
> already plenty of ways to shoot in their feet 🤷.

The problem is that we get to clean up their mess in too many cases.
Especially when the package is on ELPA.

> Perhaps we could mitigate the risk with some doc/comment explaining the
> specific usecase this interface is meant to serve so it's not miss-used?

If we want to allow Emacs to send binary data, I'd rather come up with
a specialized interface to do just that.  Explaining the subtleties of
using unibyte text in Emacs is a tough job, since it involves a lot of
low-level technical details.  When unibyte text comes from encoding
human-readable text, that is at least justified, since that's what
Emacs was designed to do, among other things.  But using Emacs as a
handy method of reading binary data, to avoid doing that in the module
itself, and asking us to add an interface for that use case is too
much for my palate.