Subject: [bug#57151] [PATCH 1/2] gnu: Add tesseract-ocr-tessdata-fast.
From: Simon South
To: Maxim Cournoyer
Cc: 57151@debbugs.gnu.org
Date: Fri, 12 Aug 2022 07:27:35 -0400
Message-ID: <87czd57lco.fsf@simonsouth.net>
In-Reply-To: <20220812050752.3980-1-maxim.cournoyer@gmail.com> (Maxim Cournoyer's message of "Fri, 12 Aug 2022 01:07:51 -0400")
References: <20220812050543.3923-1-maxim.cournoyer@gmail.com> <20220812050752.3980-1-maxim.cournoyer@gmail.com>

Maxim Cournoyer writes:

> * gnu/packages/ocr.scm (tesseract-ocr-tessdata-fast): New variable.

Maxim,

Would it not be better to generate a separate package for each of the
languages and scripts this data covers, as is done by Debian for
instance? The entire dataset is about a gigabyte in size and supports
more than a hundred languages yet I imagine most people would be using
only one or two.

This would mean tesseract-ocr could simply propagate the
"tesseract-ocr-tessdata-fast-eng" package rather than cherry-picking a
specific file, and would establish a convention that would be necessary
for packaging the "best" dataset as well, if that's desired.

(Thanks for working on this; it's been on my to-do list for a while as
well.)

-- 
Simon South
simon@simonsouth.net