From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: 57151@debbugs.gnu.org
Cc: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Subject: [bug#57151] [PATCH 1/2] gnu: Add tesseract-ocr-tessdata-fast.
Date: Fri, 12 Aug 2022 01:07:51 -0400
Message-ID: <20220812050752.3980-1-maxim.cournoyer@gmail.com>
In-Reply-To: <20220812050543.3923-1-maxim.cournoyer@gmail.com>

* gnu/packages/ocr.scm (tesseract-ocr-tessdata-fast): New variable.
---
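Not part of the commit: a minimal sketch of how the new package might be
pulled into an environment once this series is applied.  The file name
manifest.scm is hypothetical, and whether tesseract finds the data without
further setup (for example through TESSDATA_PREFIX) is an assumption that
depends on search-path handling not shown in this patch.

```scheme
;; manifest.scm (hypothetical file name, not part of this patch).
;; Assumes the tesseract-ocr-tessdata-fast package added below is
;; available in the Guix checkout being used.
(specifications->manifest
 '("tesseract-ocr"                  ;the OCR engine itself
   "tesseract-ocr-tessdata-fast"))  ;trained models added by this patch
```

With such a manifest, something like
`guix shell -m manifest.scm -- tesseract --list-langs` could be used to check
which models the engine sees; per the install plan in the diff, they should
land under share/tesseract-ocr/tessdata in the profile.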
gnu/packages/ocr.scm | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/gnu/packages/ocr.scm b/gnu/packages/ocr.scm
index e28bd17668..e2c9f561cc 100644
--- a/gnu/packages/ocr.scm
+++ b/gnu/packages/ocr.scm
@@ -29,6 +29,7 @@ (define-module (gnu packages ocr)
   #:use-module (guix gexp)
   #:use-module (guix git-download)
   #:use-module (guix build-system cmake)
+  #:use-module (guix build-system copy)
   #:use-module (guix build-system gnu)
   #:use-module (guix build-system python)
   #:use-module (gnu packages)
@@ -74,6 +75,32 @@ (define-public ocrad
 it produces text in 8-bit or UTF-8 formats.")
     (license license:gpl3+)))
 
+(define-public tesseract-ocr-tessdata-fast
+  (package
+    (name "tesseract-ocr-tessdata-fast")
+    (version "4.1.0")
+    (source (origin
+              (method git-fetch)
+              (uri (git-reference
+                    (url "https://github.com/tesseract-ocr/tessdata_fast")
+                    (commit version)))
+              (file-name (git-file-name name version))
+              (sha256
+               (base32
+                "1m310cpb87xx8l8q7jy9fvzf6a0m8rm0dmjpbiwhc2mi6w4gn084"))))
+    (build-system copy-build-system)
+    (arguments (list #:install-plan #~'(("." "share/tesseract-ocr/tessdata"))
+                     #:phases #~(modify-phases %standard-phases
+                                  (add-after 'unpack 'delete-broken-links
+                                    (lambda _
+                                      (delete-file "configs")
+                                      (delete-file "pdf.ttf"))))))
+    (home-page "https://github.com/tesseract-ocr/tessdata_fast")
+    (synopsis "Fast integer versions of trained LSTM models")
+    (description "This repository contains fast integer versions of trained
+models for the Tesseract OCR Engine.")
+    (license license:asl2.0)))
+
 (define-public tesseract-ocr
   (package
     (name "tesseract-ocr")
--
2.36.1
Thread overview: 6+ messages
2022-08-12 5:05 [bug#57151] [PATCH 0/2] *** Add trained data models for Tesseract OCR *** Maxim Cournoyer
2022-08-12 5:07 ` Maxim Cournoyer [this message]
2022-08-12 5:07 ` [bug#57151] [PATCH 2/2] gnu: tesseract-ocr: Make the default install minimally useful Maxim Cournoyer
2022-08-12 11:27 ` [bug#57151] [PATCH 1/2] gnu: Add tesseract-ocr-tessdata-fast Simon South
2022-08-12 12:52 ` Maxim Cournoyer
[not found] ` <87bksp61wn.fsf@simonsouth.net>
2022-08-12 20:08 ` bug#57151: " Maxim Cournoyer