From: Juliana Sims via Guix-patches
Subject: [bug#72027] [PATCH v6] gnu: Add whisper-cpp.
Date: Fri, 23 Aug 2024 12:05:58 -0400
To: atai@atai.org
Cc: 72027@debbugs.gnu.org

Well, I know I just asked if this was ready for review, but I figured I'd go ahead and do some "low-hanging fruit review" while I wait for you to get a chance to respond ;)

Fortunately, most of what I've noticed so far is minor and stylistic. There are some bigger, discussion-opening comments near the end.

Firstly, TODO comments are conventionally spelled as one word. This is not grammatically correct, nor is it a requirement, but it is the convention and folks searching files for such comments will likely search for "TODO" rather than "TO DO".

Secondly, the synopsis should probably either drop the "Port of" or change "in" to "to". I don't think this is a major issue; it just flows better like that imo. I wouldn't oppose merging over this so if you prefer it the way it is, that's fine.

Next, the description wants expansion. It should be three to five sentences.
The sentence there now is a fine start, though it needs ending punctuation. Perhaps a sentence explaining how this package can be used ("whisper-cpp can be integrated with other programs to provide speech-to-text support" or something) and one explaining what makes it unique ("because whisper-cpp uses a leading speech recognition model, it is able to perform speech recognition rapidly with relatively few resources"). You could perhaps mention that it can run on CPU as well as GPU, that it offers some integrations with certain hardware, or so on. Whatever you think is important or interesting; these are just some ideas :)

Now we enter the discussion-launching comments.

I notice that ggml is vendored. This one is tricky. Firstly, there is no standalone ggml package (yet; I saw your patch 65284). Usually, Guix only asks that dependencies be unvendored if there is already a standalone package for the dependency, so unless and until that is merged there is no real issue.

Secondly, though, and this touches on the ggml patches, it seems that there are no formal releases of ggml yet, and development is happening in the repositories for whisper-cpp and llama-cpp as well as its own repository. It seems these three versions of ggml are all slightly out of sync with each other. It would be nice if upstream used git submodules to ensure their work was synchronized. Alas, we as Guix can't do anything about that, and we must ensure the packages we offer work correctly. The inconsistencies between these versions of ggml make me think packaging it separately would risk breakage.

With that in mind, might it be best to drop the standalone ggml patchset and just let llama-cpp and whisper-cpp vendor their versions? While suboptimal because it results in building "the same package" multiple times, I would argue that the divergence in the code means they are not, in fact, the same package.

Finally, is this package complete?
Looking at the store directory for the package, I see headers and the like but no actual models. Is this sufficient for using the inference? Are client libraries or programs supposed to install models themselves? Or can this package be used to generate models as described in the project's README? If not, should it be able to? I am admittedly fairly ignorant about the machine learning ecosystem, so feel free to explain as much as you think may be necessary. My goal in these questions is ensuring users get what they expect from this package.

Relatedly, if this package is complete but requires further setup, I would strongly support explaining that in the package description. As a user, I've encountered a few packages that require more setup and don't mention that they do, and I'm then frustrated and confused when I learn this from trying to use the package and then have no support from Guix in trying to make things work properly. (Another step beyond this may be offering a system service which performs configuration, but that can be a future, separate patch.)

To circle back to what I mentioned in my first email, I would like to package the whisper.el Emacs mode[1]. Currently, whisper.el plans to install and compile whisper.cpp and its models itself; I think we as Guix should make this unnecessary for an imagined future emacs-whisper package.

Looking forward to hearing from you,
Juli

[1] https://github.com/natrys/whisper.el