From: Nicolas Graves via Guix-patches via <guix-patches@gnu.org>
To: 73106@debbugs.gnu.org
Cc: ngraves@ngraves.fr
Subject: [bug#73106] [PATCH 08/10] gnu: Add rust-tokenizers.
Date: Sat, 7 Sep 2024 18:56:14 +0200
Message-ID: <20240907165626.22651-8-ngraves@ngraves.fr>
In-Reply-To: <20240907165626.22651-1-ngraves@ngraves.fr>

* gnu/packages/machine-learning.scm (rust-tokenizers): New variable.

Change-Id: I3189a2d826f072f65ad053d77eb39be39775f1c2
---
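Note on the source snippet: substitute* patterns are regular expressions,
so the unescaped dots in "0.1.12" and "6.4" match any character, and the
bare "0.1.12" pattern would rewrite that string wherever it appears in
Cargo.toml. A tighter variant could look like the sketch below. It assumes
that the rust-monostate requirement is written as `version = "0.1.12"` in
the crate's Cargo.toml, the same form the rust-onig pattern already relies
on; that has not been checked against the tarball.

```scheme
;; Sketch only: same substitutions as in the patch, with escaped dots and
;; the monostate pattern tied to a `version = "..."' line (assumed form).
(snippet
 #~(substitute* "Cargo.toml"
     (("version = \"0\\.1\\.12\"")    ; rust-monostate
      "version = \"0.1.11\"")
     (("version = \"6\\.4\"")         ; rust-onig
      "version = \"6.1.1\"")))
```

If the monostate requirement is spelled differently in Cargo.toml, the first
pattern needs adjusting, otherwise it would silently match nothing.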
gnu/packages/machine-learning.scm | 60 +++++++++++++++++++++++++++++++
1 file changed, 60 insertions(+)
diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 27d7f0526b..3b601f6c91 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -5675,6 +5675,66 @@ (define-public rust-hf-hub-0.3
 python package, but only implements a smaller subset of functions.")
     (license license:asl2.0)))
 
+(define-public rust-tokenizers
+  (package
+    (name "rust-tokenizers")
+    (version "0.19.1")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (crate-uri "tokenizers" version))
+       (file-name (string-append name "-" version ".tar.gz"))
+       (sha256
+        (base32 "1zg6ffpllygijb5bh227m9p4lrhf0pjkysky68kddwrsvp8zl075"))
+       (modules '((guix build utils)))
+       (snippet
+        #~(substitute* "Cargo.toml"
+            (("0.1.12")     ; rust-monostate requires a rust-syn-2 update
+             "0.1.11")
+            (("version = \"6.4\"")      ; rust-onig
+             "version = \"6.1.1\"")))))
+    (build-system cargo-build-system)
+    (arguments
+     (list
+      #:tests? #f                       ; Tests rely on missing data.
+      #:cargo-inputs
+      `(("rust-aho-corasick" ,rust-aho-corasick-1)
+        ("rust-derive-builder" ,rust-derive-builder-0.20)
+        ("rust-esaxx-rs" ,rust-esaxx-rs-0.1)
+        ("rust-fancy-regex" ,rust-fancy-regex-0.13)
+        ("rust-getrandom" ,rust-getrandom-0.2)
+        ("rust-hf-hub" ,rust-hf-hub-0.3)
+        ("rust-indicatif" ,rust-indicatif-0.17)
+        ("rust-itertools" ,rust-itertools-0.12)
+        ("rust-lazy-static" ,rust-lazy-static-1)
+        ("rust-log" ,rust-log-0.4)
+        ("rust-macro-rules-attribute" ,rust-macro-rules-attribute-0.2)
+        ("rust-monostate" ,rust-monostate-0.1)
+        ("rust-onig" ,rust-onig-6)
+        ("rust-paste" ,rust-paste-1)
+        ("rust-rand" ,rust-rand-0.8)
+        ("rust-rayon" ,rust-rayon-1)
+        ("rust-rayon-cond" ,rust-rayon-cond-0.3)
+        ("rust-regex" ,rust-regex-1)
+        ("rust-regex-syntax" ,rust-regex-syntax-0.8)
+        ("rust-serde" ,rust-serde-1)
+        ("rust-serde-json" ,rust-serde-json-1)
+        ("rust-spm-precompiled" ,rust-spm-precompiled-0.1)
+        ("rust-thiserror" ,rust-thiserror-1)
+        ("rust-unicode-normalization-alignments" ,rust-unicode-normalization-alignments-0.1)
+        ("rust-unicode-segmentation" ,rust-unicode-segmentation-1)
+        ("rust-unicode-categories" ,rust-unicode-categories-0.1))
+      #:cargo-development-inputs
+      `(("rust-assert-approx-eq" ,rust-assert-approx-eq-1)
+        ("rust-criterion" ,rust-criterion-0.5)
+        ("rust-tempfile" ,rust-tempfile-3))))
+    (home-page "https://github.com/huggingface/tokenizers")
+    (synopsis "Implementation of various popular tokenizers")
+    (description
+     "This package provides a Rust implementation of today's most used
+tokenizers, with a focus on performance and versatility.")
+    (license license:asl2.0)))
+
 (define-public python-hmmlearn
   (package
     (name "python-hmmlearn")
--
2.45.2
Thread overview: 11+ messages
2024-09-07 16:21 [bug#73106] [PATCH 00/10] Add python-tokenizers Nicolas Graves via Guix-patches via
2024-09-07 16:56 ` [bug#73106] [PATCH 01/10] gnu: Add rust-esaxx-rs-0.1 Nicolas Graves via Guix-patches via
2024-09-07 16:56 ` [bug#73106] [PATCH 02/10] gnu: Add rust-spm-precompiled-0.1 Nicolas Graves via Guix-patches via
2024-09-07 16:56 ` [bug#73106] [PATCH 03/10] gnu: Add rust-macro-rules-attribute-proc-macro-0.2 Nicolas Graves via Guix-patches via
2024-09-07 16:56 ` [bug#73106] [PATCH 04/10] gnu: Add rust-macro-rules-attribute-0.2 Nicolas Graves via Guix-patches via
2024-09-07 16:56 ` [bug#73106] [PATCH 05/10] gnu: Add rust-hf-hub-0.3 Nicolas Graves via Guix-patches via
2024-09-07 16:56 ` [bug#73106] [PATCH 06/10] gnu: Add rust-monostate-impl-0.1 Nicolas Graves via Guix-patches via
2024-09-07 16:56 ` [bug#73106] [PATCH 07/10] gnu: Add rust-monostate-0.1 Nicolas Graves via Guix-patches via
2024-09-07 16:56 ` Nicolas Graves via Guix-patches via [this message]
2024-09-07 16:56 ` [bug#73106] [PATCH 09/10] gnu: Add rust-numpy-0.21 Nicolas Graves via Guix-patches via
2024-09-07 16:56 ` [bug#73106] [PATCH 10/10] gnu: Add python-tokenizers Nicolas Graves via Guix-patches via