From: "J.P."
Newsgroups: gmane.emacs.bugs
Subject: bug#49860: 28.0.50; add IRCv3 building blocks to ERC
Date: Mon, 15 Jul 2024 23:35:40 -0700
Message-ID: <87h6cp6dz7.fsf__22255.4758469172$1721111790$gmane$org@neverwas.me>
In-Reply-To: <87pmuuvx3p.fsf@neverwas.me> (J. P.'s message of "Tue, 03 Aug 2021 18:04:42 -0700")
To: 49860@debbugs.gnu.org
Cc: emacs-erc@gnu.org

Here's an update on the current state of the proposed implementation. As of now, the target version remains ERC 5.7, which will coincide with Emacs 31 at the earliest. To get involved in this initiative, please comment in this thread or in the channel.

Modeling an IRCv3 extension
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Modules are the primary means of extending ERC. They're basically ERC-managed minor modes and customization groups. IRCv3 extensions are likewise modular and also provide functionality that's activated in a declarative manner.

In many ways, it would make sense to model extensions as local modules. However, extensions are protocol driven and depend on coordination with the server. This means successful activation depends on details discovered from the logical connection long after modules are typically initialized. Extensions also generally describe and affect a lower level of functionality than do traditional modules.

For the sake of simplicity, I believe we should frame extensions as supplementary in nature and as providing additional functionality to modules. For example, `nickbar' might display account names and away statuses when extensions providing such awareness become active.

And even when an extension's functionality seems inherently coupled to a module's feature set, I think we should resist the urge to let extensions and modules activate or otherwise control one another [1]. For example, the `read-marker' extension seems naturally suited for integration with `keep-place-indicator', so much so that a hypothetical implementation would likely be a no-op when the module's disabled. In these situations, I'd actually prefer silently dropping functionality over engaging in active feedback, e.g., by enabling the module on the user's behalf or issuing a didactic warning. Instead, we can explain these nuanced relationships in the manual and mention in doc strings that modules can exhibit alternative functionality when various extensions are active.

With this in mind, I'm proposing we adopt a mostly traditional, object-oriented approach to modeling extensions, specifically one that marries a new struct-based type with the convenience and familiarity of minor modes, all without exposing either as part of the public interface. Users will instead mostly interact with extensions indirectly, through a new `v3' module. The subset of extensions slated for activation will itself be a user option. As for a serious library API, I'd rather wait a release or two.

Under the proposed scheme, extensions will be actual minor modes with an associated mode variable and a non-interactive toggle function (rather than a traditional mode command). These modes will be kept internal and modified slightly to meet the demands of your typical extension. Most importantly, instead of being t, a mode variable's enabled value will be an instance of a new `extension' "type," a hierarchical data structure to be shared among all buffers of a session as a first-class citizen. Its purpose: to describe the extension's health and lifecycle stage and its relationship to other extensions. It may also contain arbitrary application state relevant to the session.

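To make the "mode variable holds an object" idea concrete, here is a minimal sketch. Every name in it (`my-ext', `my-ext--spam', and the slots) is hypothetical and chosen only for illustration; it is not the proposed ERC code, and it leaves out everything a real minor mode definition would add.

```elisp
(require 'cl-lib)

;; Hypothetical stand-in for the proposed `extension' type.
(cl-defstruct (my-ext (:constructor my-ext-create))
  (name nil :type symbol)
  (stage 'pending :type symbol)   ; lifecycle stage, e.g., `pending' or `active'
  (depends nil :type list)        ; names of other extensions
  (state nil))                    ; arbitrary session-wide data

(defvar-local my-ext--spam nil
  "Non-nil when the hypothetical `spam' extension is enabled.
The enabled value is a `my-ext' instance rather than t.")

(defun my-ext--spam (arg &optional instance)
  "Enable the extension if ARG is positive; otherwise disable it.
When enabling, adopt INSTANCE, if given, so that several buffers
of a session can share a single object."
  (setq my-ext--spam
        (and (> arg 0)
             (or instance
                 (my-ext-create :name 'spam :stage 'active)))))
```

Because the variable and the toggle live in separate namespaces, they can share one name, as minor modes do. Passing the same INSTANCE to the toggle in each buffer of a session is one way the object could end up shared.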
The bulk of the technical challenges arising from this design are well understood and thus solvable by prior art. For example, an extension's definition includes its dependencies, which will be resolved and loaded in topological order.

The more novel challenges mostly involve making the design play nice with ERC's existing architecture. For example, modules persist state between IRC sessions by inspecting and possibly assuming ownership of assets they manage from the prior session. This ritual normally occurs just after major-mode activation and before dialing. As mentioned previously, extensions are subject to discovery and possibly negotiation, meaning if they're to manage modules, they'll need the ritual to be postponed or prolonged so they can participate.

Complicating matters slightly is the proposed means of presenting users access to extensions and IRCv3 functionality as a whole: being ERC, this interface must be compatibility focused and optional by default. History would thus dictate we do this by encapsulating it all behind a single `v3' library and an accompanying local module.

Here's an example of an extension's definition:

  (erc-v3--define-capability spam
    :depends '(batch)
    :supports '(labeled-response multiline)
    :aliases '(draft/spam)
    :enablep #'erc-spam--extract-cap-values
    :slots ((foo 0 :type integer)
            (bar nil :type list))
    :keymap erc-v3--spam-mode-map
    (if erc-v3--spam
        (do-init-stuff)
      (undo-init-stuff)))

This can be thought of as a quasi "class" definition for a so-called `capability', which is a type of `extension' that has additional methods and attributes relevant to IRCv3 capability negotiation. Under the hood, this defines a minor mode named `erc-v3--spam' whose activation function and local mode variable share the same name. The lack of a "-mode" suffix is an obfuscation tactic to dissuade users and package authors from discovering and handling it directly. We can provide analogs for user-defined extensions, if necessary.

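The dependency ordering mentioned earlier can be illustrated with a small depth-first topological sort. This is only a sketch under stated assumptions: the function names and the alist format for dependencies are hypothetical, and the real resolver may treat hard and soft dependencies differently.

```elisp
(defun my-ext--visit (ext deps done visiting)
  "Return DONE extended with EXT and its dependencies from DEPS.
DEPS is an alist mapping an extension name to the names it
depends on.  DONE is in reverse load order.  VISITING lists the
extensions on the current path and serves to detect cycles."
  (cond ((memq ext done) done)
        ((memq ext visiting)
         (error "Dependency cycle at `%s'" ext))
        (t
         (dolist (dep (alist-get ext deps))
           (setq done (my-ext--visit dep deps done (cons ext visiting))))
         (cons ext done))))

(defun my-ext--resolve (wanted deps)
  "Return WANTED and their dependencies from DEPS in load order.
Every dependency precedes the extensions that depend on it."
  (let (done)
    (dolist (ext wanted)
      (setq done (my-ext--visit ext deps done nil)))
    (reverse done)))
```

For instance, with DEPS set to '((spam batch) (multiline batch)), resolving '(spam multiline) places `batch' ahead of both requested extensions.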
Continuing with the example, the `:depends' and `:supports' items declare hard and soft dependencies respectively, which are guaranteed to precede this extension if selected for activation. If a hard dependency is missing, ERC silently skips activation. Everything after the final keyword pair becomes the body of the mode's toggle function. As mentioned earlier, the non-nil (enabled) value of the mode variable is actually an instance of a `capability' (subtype) and is instantiated via the mode's activation toggle, which doubles as a constructor.

Historical insertions and deletions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A few key extensions pretty much require some minimally invasive means of inserting and deleting messages at various historical points in a buffer (rather than just appending). I've heard various IRC experts refer to this as the "mutable buffer" requirement.

These operations don't demand "random access" in the constant-time sense. In our case, such insertions and deletions will be reserved for relatively rare occasions, so we can likely afford to scan for a single UUID or a timestamp when necessary. And of those operations stemming from a single application action, all but the first will be sequential and won't need scanning. This should afford us the liberty of not having to retain and associate a marker with every inserted message but instead only track important ones demarcating interesting regions, such as datelines and history playback intervals.

The end goal of adding such an infrastructure is a richer, somewhat more web-like and less terminal-like experience, in which a buffer's visible contents update dynamically in response to arriving messages.

By far the biggest challenge to providing this functionality is accommodating so-called "stateful" features, namely, those that treat messages as quasi "recurrences" where the appearance and behavior of a message depends on the appearance and behavior of the one preceding it.

Modules providing such features need a way to hook into supported splicing and excising operations to run integrating code of a healing or "mending" nature. In practice, this sort of manual interpolation will likely rely on crude approximations and heuristics, although having a persistent message store may help minimize any visible scarring.

Here are some examples of stateful features that require nuanced mending:

- Invisibility. Many modules add their own `invisible' text-property tokens for controlling the visibility of affected portions of the buffer. For example, timestamp and fool visibility can overlap yet be toggled independently. This is made possible by the meticulous merging and teasing apart of adjacent regions of invisible properties so that intervening newlines and neutral regions also abide.

- Smart folding. A hypothetical future module might provide dynamic folding and hiding of successive "JOIN" and "QUIT" messages from the same user. When enabled, certain sequences of alternating message types would be automatically hidden but could later be revealed (unfolded) for inspection.

- Amalgamated reactions. Unseen reactions can show up as normal messages, e.g., "* bob reacted with a :thumbs up: to alice saying 'ok, you?' 30 min ago", but then vanish when marked as read or scrolled off screen (or clicked on), at which point their reaction will be aggregated in the summary displayed on the referenced message, down-buffer.

The current proposal for an infrastructure supporting such mending operations boils down to intercepting the execution of insertion-hook members in `erc-insert-line'. Basically, we'll have a function-valued variable that modules can decorate as needed with local advice in order to exert influence over insertion-hook members as they're visited by `run-hook-wrapped'. In most cases, a wrapper will inspect surrounding messages before and/or after insertion and take care to alter the behavior of its own hook members as needed.

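The interception pattern just described might look roughly like the sketch below. It relies only on `run-hook-wrapped' and on `add-function' with a `(local ...)' place, both of which exist in Emacs. All `my-insert-' names are hypothetical, and the "skip JOIN lines" rule is an invented stand-in for real mending logic.

```elisp
(require 'nadvice)

(defvar my-insert-functions nil
  "Hypothetical abnormal hook whose members receive an inserted message.")

(defvar my-insert--run-member-function #'funcall
  "Function called with a hook member and the message.
Modules may decorate this with buffer-local advice.")

(defvar my-insert--log nil
  "Messages recorded by `my-insert--record'.")

(defun my-insert--run-hooks (msg)
  "Run `my-insert-functions' on MSG through the wrapper variable."
  (run-hook-wrapped 'my-insert-functions
                    (lambda (member msg)
                      (funcall my-insert--run-member-function member msg)
                      nil)              ; nil means keep visiting members
                    msg))

(defun my-insert--record (msg)
  "Example hook member: push MSG onto `my-insert--log'."
  (push msg my-insert--log))

(defun my-insert--skip-joins (orig member msg)
  "Call ORIG with MEMBER and MSG unless MEMBER should sit this one out.
Here, `my-insert--record' is suppressed for messages starting
with \"JOIN\"."
  (unless (and (eq member #'my-insert--record)
               (string-prefix-p "JOIN" msg))
    (funcall orig member msg)))

;; Usage: the advice and the hook member are both local to this buffer.
(with-temp-buffer
  (add-hook 'my-insert-functions #'my-insert--record nil t)
  (add-function :around (local 'my-insert--run-member-function)
                #'my-insert--skip-joins)
  (my-insert--run-hooks "JOIN #chan")
  (my-insert--run-hooks "PRIVMSG #chan :hi"))
```

The module never edits the hook itself; it only wraps the function that runs each member, so it can veto or adjust its own members per message.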
An analogous and somewhat simpler interface will be offered for deletions.

In general, the proposed approach is messy and inelegant, with plenty of unwanted cross pollination and implementation leakage. While a tidier and more abstraction-preserving pattern would be much preferred, we can at least find some solace in knowing that these ungainly interfaces will all be kept internal.

A backing store
~~~~~~~~~~~~~~~

An efficient storage mechanism for structured message data will allow us to "repaint" portions of a buffer for various needs. This means we can discard similar info currently retained in text properties and buffer-local variables, often for relatively rare uses. For example, we might want to retain the account name of a speaker on all their messages in order to know whether they were logged in at the time a certain message attributed to their nick was inserted (not so much for forensics as UI enhancement). But if we can query that information at will, we can instead dispense with keeping it in-buffer. We'll also be free to destructively modify inserted messages in order to display them in an abbreviated or idealized way instead of dressing them up in a veneer of `display' props merely to ensure their underlying text remains unadulterated (for faithful logging and killing, etc.).

It's worth emphasizing that this proposal does not currently advocate for a meaningfully persistent nonvolatile storage solution spanning Emacs sessions. Although the store should be resilient enough to survive reconnects, its primary purpose will be to facilitate the lessening and simplification of in-buffer, per-message data.

It's also worth noting that this addition won't really help with historical insertions and deletions, which describe an orthogonal concern. And although the current WIP implementation has yet to be fleshed out and wired in, it's pretty much a given it'll rely on SQLite as a back end, meaning we'll be needing a fallback solution for older Emacsen.

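As a rough illustration of a volatile store with a fallback, the sketch below uses the built-in SQLite functions from Emacs 29 when present and a hash table otherwise. The table layout, the `my-store-' names, and the choice of a hash table as the fallback are all assumptions made for this example, not details of the WIP implementation.

```elisp
(defun my-store--sqlite-p ()
  "Return non-nil if this Emacs has working SQLite support."
  (and (fboundp 'sqlite-available-p) (sqlite-available-p)))

(defun my-store-open ()
  "Return a new in-memory message store.
Use SQLite when available and a hash table otherwise."
  (if (my-store--sqlite-p)
      (let ((db (sqlite-open)))         ; no file name: in-memory database
        (sqlite-execute
         db "CREATE TABLE msgs (id TEXT PRIMARY KEY, account TEXT)")
        db)
    (make-hash-table :test #'equal)))

(defun my-store-put (store id account)
  "Record in STORE that message ID was sent by ACCOUNT."
  (if (hash-table-p store)
      (puthash id account store)
    (sqlite-execute store "INSERT OR REPLACE INTO msgs VALUES (?, ?)"
                    (list id account))))

(defun my-store-get (store id)
  "Return the account recorded in STORE for message ID, or nil."
  (if (hash-table-p store)
      (gethash id store)
    (caar (sqlite-select store "SELECT account FROM msgs WHERE id = ?"
                         (list id)))))
```

Callers see the same three functions either way, which is the property a fallback for older Emacsen would need to preserve.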
Generic response handlers
~~~~~~~~~~~~~~~~~~~~~~~~~

One key to minimizing the maintenance footprint of this initiative is somehow finding a way to override long-established response-processing behavior. Most of it originates from default response handlers, like `erc-server-PRIVMSG', which run on abnormal response hooks, like `erc-server-PRIVMSG-functions'. The traditional way of doing this involves preempting the default handler (e.g., `erc-server-PRIVMSG') by adding an overriding hook member that returns non-nil.

I'm proposing we add another, internal means of overriding such behavior, namely, by converting a small subset of default handlers to generic functions. Going this route should reduce the presence of response-hook members managed by ERC while also sparing third-party members likely churn (as well as the hassle of learning about hook depth, which currently only affects insertion hooks).

Folks who worry about generics proliferation are usually referring to public functions designed to be overridden by users. In our case, only a handful of implementations for a given handler will ever exist, and they'll all be internal, so the "polymorphic" dispatch penalty should be kept relatively negligible. This penalty is usually at most a minor concern for high-level code like ours that runs relatively infrequently. For example, in CPython 3.9, a "BINARY_ADD" on two lists passes through the interpreter's generic binary-operator dispatch before reaching list concatenation, whereas "LIST_EXTEND" (star syntax) goes straight to the list-specific routine. Both are linear in the combined length of the lists, so the dispatch amounts to a small, constant cost per call that only matters in hot code paths.

One potential complication to be mindful of with these generic handlers is how they'll intersect with handler aliases. For the purpose of code reuse, some default handlers have aliases, like `erc-server-NOTICE' for `erc-server-PRIVMSG'. When converted to generics, these handlers will still always share the same code as their referent.

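A small sketch may help show what sharing code across an alias means for a generic handler. The names below are hypothetical, and dispatching on a response subtype is only one assumed way an override might be selected; the point is that a method on the generic is reached through the alias too, so it must decide for itself which commands to leave alone.

```elisp
(require 'cl-lib)

;; Hypothetical stand-ins for a response type and a more specific subtype.
(cl-defstruct my-response command contents)
(cl-defstruct (my-zresponse (:include my-response)))

(cl-defgeneric my-server-PRIVMSG (parsed)
  "Default handler for \"PRIVMSG\" and, through an alias, \"NOTICE\"."
  (list 'default (my-response-command parsed)))

;; The alias shares every method of its referent.
(defalias 'my-server-NOTICE #'my-server-PRIVMSG)

(cl-defmethod my-server-PRIVMSG ((parsed my-zresponse))
  "Override \"PRIVMSG\" handling only; defer to the default otherwise."
  (if (equal (my-response-command parsed) "NOTICE")
      (cl-call-next-method)
    (list 'override (my-response-command parsed))))
```

Calling `my-server-NOTICE' with a `my-zresponse' still enters the overriding method first, which is why the explicit "NOTICE" check is there.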
To put it another way, barring some terrible abuse of `&context' specializers, generics can't help us override only one among a set of aliased handlers (something that's occasionally desirable). For example, if we only want to override "PRIVMSG" handling, we must code that into the method's implementation, e.g., by running `cl-call-next-method' on receiving a "NOTICE", so the message gets the default treatment. This may seem obvious, but a historical quirk makes it easy to confuse with related hook behavior: ERC doesn't `defvaralias' the corresponding response hooks, so modifying one never modifies the others.

Reusable response handling
~~~~~~~~~~~~~~~~~~~~~~~~~~

A common gripe regarding ERC's response handling API is that there's no way to pass refined, processed data down the line to other handlers. Although annoying, it's only meaningful to the extent suitable library functions exist to process such data. This proposal includes a plan to address both deficiencies, but only in service of the use case explained in the previous section about overriding default handlers.

The idea is to leverage a common message-handling paradigm that preserves work artifacts derived from a raw message and other inputs. A shared message object retains these products for the remainder of its life. The object typically offers a set of methods and properties for handlers to perform common operations on inputs, often repeatedly, without being wasteful or fussing over complicated implementation details.

At present, ERC's main message type is the `erc-response', which at face value is inadequate for this purpose. However, if we indulge the notion of its "substitutability" in existing infrastructure and pretend that functions and variables expecting a traditional `erc-response' won't balk when handed a subtype, a wealth of possibilities emerges. (I suggest we do this.)

Ignoring whatever performance gains this reuse-focused scheme may provide, the main win here, from a maintenance standpoint, is that a module can override some or all of a default handler's duties without additional upkeep. IOW, the "downstream" library no longer has to study the default handler and replicate choice bits of copy pasta. Rather, the library merely wires together whatever combination of getters and properties it desires, a la carte.

The proposed implementation demonstrated below may seem a bit heavy on magic, but it makes adapting existing code to these new, more specific response objects relatively seamless and transparent (aside from the requisite symbol renaming). The variant being proposed doesn't actually use traditional methods but rather slot accessors themselves as caching getters. Regardless, the key takeaway is that it introduces a set of `erc-response' subtypes for commonly overridden responses, each with relevant slots that it initializes lazily, on first use.

Pros include code reuse and encapsulation to better isolate concerns as well as (likely infinitesimal) performance gains. Cons include additional onboarding overhead for new contributors and a slightly elevated risk of misuse due to faulty assumptions about its nonstandard struct behavior.

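For readers unfamiliar with lazily initialized slots, here is a deliberately simplified sketch of a caching getter. It is not the proposed mechanism: it wraps a hidden slot in an ordinary function instead of turning the generated accessor itself into the getter, and every `my-zprivmsg' name is hypothetical.

```elisp
(require 'cl-lib)

(cl-defstruct (my-zprivmsg (:constructor my-zprivmsg-create))
  (target nil :type string)
  (mynick nil :type string)
  ;; Hidden slot: holds `unset' until the getter below first computes it.
  (-query-p 'unset))

(defun my-zprivmsg-query-p (parsed)
  "Return non-nil if PARSED is addressed to our own nick.
Compute the answer on first use and cache it in PARSED."
  (let ((val (my-zprivmsg--query-p parsed)))
    (when (eq val 'unset)
      (setq val (string= (downcase (my-zprivmsg-target parsed))
                         (downcase (my-zprivmsg-mynick parsed))))
      (setf (my-zprivmsg--query-p parsed) val))
    val))
```

Using a sentinel like `unset' rather than nil matters here because nil is itself a valid cached answer for a boolean slot.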
Here's an example response definition specific to a "PRIVMSG":

  (erc--define-zresponse (PRIVMSG NOTICE)
    :include erc--zstatused
    ( buffer (erc-get-buffer (if (erc--zPRIVMSG-query-p parsed)
                                 (car (erc--zresponse-nuh parsed))
                               (erc--zstatused-target parsed))
                             erc-server-process)
      :type buffer)
    ( query-p (equal (erc-downcase (erc--zstatused-target parsed))
                     (erc--zresponse-mynick-d parsed))
      :type boolean)
    ( speaker (car (erc--zPRIVMSG-nuh parsed)) :type string)
    ( notice-p (string= (erc-response.command parsed) "NOTICE")
      :type boolean)
    ( input-p (string= (erc--zPRIVMSG-mynick-d parsed)
                       (car (erc--zPRIVMSG-nuh parsed)))
      :type boolean))

In the definition above, the init forms are basically single-use method bodies for the generated accessors (subsequent calls return cached results). The init forms themselves are free to reference other accessors generated by this definition, which in turn initialize _their_ slots, if necessary, in a cascading fashion. The variable `parsed' is a reference to the `zresponse' instance, i.e., "this"/"self".

Some care has been taken to replicate the inlining benefits and gv-place awareness provided by definitions normally generated by `cl-defstruct' (although review by an expert in this area would be most welcome).

Notes
~~~~~

[1] IMO, allowing services to "pull in" one another will only lead to unwanted complications. Systems that allow for this already have a sophisticated foundation in place to manage intricate interactions between producers and consumers.

A note on terminology: I used to make a point of distinguishing between "active" and "enabled" when referring to modules: "enabled" meant present in `erc-modules' and "active" meant activated for the session as a minor mode. I've since abandoned trying to advocate for this usage or any such distinction and am fully resigned to the fact that others will always use them interchangeably.