From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Brennan Vincent"
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] Add a mechanism for passing unibyte strings from lisp to modules.
Date: Tue, 25 Jun 2024 13:36:31 -0400
Message-ID: <87jzickjq8.fsf@taipei.mail-host-address-is-not-set>
References: <86v822jeqh.fsf@gnu.org> <225D336D-933E-4CA3-B245-89992D7E6C41@umanwizard.com> <86frt5jwtc.fsf@gnu.org> <87o77t6lyn.fsf@taipei.mail-host-address-is-not-set> <867cehgdn1.fsf@gnu.org> <861q4md0o1.fsf@gnu.org>
In-Reply-To: <861q4md0o1.fsf@gnu.org>
Cc: stefankangas@gmail.com, emacs-devel@gnu.org
To: Eli Zaretskii, Andrea Corallo

Eli Zaretskii writes:

>> From: Andrea Corallo
>> Cc: "Brennan Vincent", Stefan Kangas, emacs-devel@gnu.org
>> Date: Sun, 23 Jun 2024 17:15:39 -0400
>>
>> Eli Zaretskii writes:
>>
>> >> From: "Brennan Vincent"
>> >> Date: Sat, 22 Jun 2024 11:22:56 -0400
>> >>
>> >> Eli Zaretskii writes:
>> >>
>> >> > Why can't you have the module code itself read the file, instead of
>> >> > getting the bytes from Emacs? Passing large amounts of bytes from
>> >> > Emacs to a module is a very inefficient way of talking to modules
>> >> > anyway, because Emacs is not optimized for moving text to and fro in
>> >> > the shape of Lisp strings. To say nothing of the GC pressure you will
>> >> > have in your mode, due to a constant consing of strings. It is best
>> >> > to avoid all that to begin with.
>> >>
>> >> Of course it's possible to do that, but I wanted to write my mode in
>> >> elisp as much as possible and keep the C side minimal, simply because I
>> >> find elisp a much more enjoyable language to use. But if
>> >> you are opposed to adding this code I can go with that approach.
>> >>
>> >> Another possibility which would avoid adding specifically
>> >> unibyte-related surface area to the modules API would be to create an
>> >> extended version of copy_string_contents which can take any coding
>> >> system, rather than forcing UTF-8.
>> >>
>> >> Would you be open to such an approach? If so, I will send an updated patch.
>> >
>> > I very much dislike the idea of letting modules deal with unibyte
>> > strings, for the reasons I explained. Basically, it will open a large
>> > Pandora's box by allowing people who don't know enough about the
>> > subtleties of unibyte strings in Emacs to write buggy modules which
>> > will crash Emacs.
>> >
>> > But let's hear the other co-maintainers. Stefan and Andrea, what is
>> > your POV on these issues?
>>
>> I, for one, would not be too worried. People writing modules
>> should already be very responsible for what they write, as they
>> already have plenty of ways to shoot themselves in the foot 🤷.
>
> The problem is that we get to clean up their mess in too many cases.
> Especially when the package is on ELPA.
>
>> Perhaps we could mitigate the risk with some doc/comment explaining the
>> specific use case this interface is meant to serve so it's not misused?
>
> If we want to allow Emacs to send binary data, I'd rather come up with
> a specialized interface to do just that. Explaining the subtleties of
> using unibyte text in Emacs is a tough job, since it involves a lot of
> low-level technical details. When unibyte text comes from encoding
> human-readable text that is at least justified, since that's what
> Emacs was designed to do, among other things. But using Emacs as a
> handy method of reading binary data, to avoid doing that in the module
> itself, and asking us to add an interface for that use case is too
> much for my palate.

I think it would be great if Emacs grew a specialized vector-of-bytes type.

BTW, I have already rewritten my mode to not attempt to pass data with unibyte strings, and to read/write the file in C. So this is no longer relevant to me personally. But I think other module writers will hit a similar issue, and it will be good to have something in place for this use case.