unofficial mirror of guix-patches@gnu.org 
* [bug#73106] [PATCH 00/10] Add python-tokenizers.
@ 2024-09-07 16:21 Nicolas Graves via Guix-patches via
  2024-09-07 16:56 ` [bug#73106] [PATCH 01/10] gnu: Add rust-esaxx-rs-0.1 Nicolas Graves via Guix-patches via
  0 siblings, 1 reply; 11+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2024-09-07 16:21 UTC (permalink / raw)
  To: 73106; +Cc: ngraves

This patch series adds the package python-tokenizers, which is a
prerequisite for packaging python-transformers.

Nicolas Graves (10):
  gnu: Add rust-esaxx-rs-0.1.
  gnu: Add rust-spm-precompiled-0.1.
  gnu: Add rust-macro-rules-attribute-proc-macro-0.2.
  gnu: Add rust-macro-rules-attribute-0.2.
  gnu: Add rust-hf-hub-0.3.
  gnu: Add rust-monostate-impl-0.1.
  gnu: Add rust-monostate-0.1.
  gnu: Add rust-tokenizers.
  gnu: Add rust-numpy-0.21.
  gnu: Add python-tokenizers.

 gnu/packages/crates-io.scm        | 133 +++++++++++++++
 gnu/packages/machine-learning.scm | 266 ++++++++++++++++++++++++++++++
 2 files changed, 399 insertions(+)

-- 
2.45.2






* [bug#73106] [PATCH 01/10] gnu: Add rust-esaxx-rs-0.1.
  2024-09-07 16:21 [bug#73106] [PATCH 00/10] Add python-tokenizers Nicolas Graves via Guix-patches via
@ 2024-09-07 16:56 ` Nicolas Graves via Guix-patches via
  2024-09-07 16:56   ` [bug#73106] [PATCH 02/10] gnu: Add rust-spm-precompiled-0.1 Nicolas Graves via Guix-patches via
                     ` (8 more replies)
  0 siblings, 9 replies; 11+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2024-09-07 16:56 UTC (permalink / raw)
  To: 73106; +Cc: ngraves

* gnu/packages/machine-learning.scm (rust-esaxx-rs-0.1): New variable.

Change-Id: I38a666dd5b9f20dc721e0a28ad718ff5f227b708
---
 gnu/packages/machine-learning.scm | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 12be1d7bf6..4385603a4a 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -5580,6 +5580,26 @@ (define-public python-torchfile
 Python.")
     (license license:bsd-3)))
 
+(define-public rust-esaxx-rs-0.1
+  (package
+    (name "rust-esaxx-rs")
+    (version "0.1.10")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (crate-uri "esaxx-rs" version))
+       (file-name (string-append name "-" version ".tar.gz"))
+       (sha256
+        (base32 "1rm6vm5yr7s3n5ly7k9x9j6ra5p2l2ld151gnaya8x03qcwf05yq"))))
+    (build-system cargo-build-system)
+    (arguments
+     `(#:cargo-inputs (("rust-cc" ,rust-cc-1))))
+    (home-page "https://github.com/Narsil/esaxx-rs")
+    (synopsis "Wrapper for sentencepiece's esaxxx library")
+    (description
+     "This package provides a wrapper around sentencepiece's esaxxx library.")
+    (license license:asl2.0)))
+
 (define-public python-hmmlearn
   (package
     (name "python-hmmlearn")
-- 
2.45.2






* [bug#73106] [PATCH 02/10] gnu: Add rust-spm-precompiled-0.1.
  2024-09-07 16:56 ` [bug#73106] [PATCH 01/10] gnu: Add rust-esaxx-rs-0.1 Nicolas Graves via Guix-patches via
@ 2024-09-07 16:56   ` Nicolas Graves via Guix-patches via
  2024-09-07 16:56   ` [bug#73106] [PATCH 03/10] gnu: Add rust-macro-rules-attribute-proc-macro-0.2 Nicolas Graves via Guix-patches via
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2024-09-07 16:56 UTC (permalink / raw)
  To: 73106; +Cc: ngraves

* gnu/packages/machine-learning.scm (rust-spm-precompiled-0.1): New variable.

Change-Id: I622c1a875e10041703ef0a32e7c35074f534276b
---
 gnu/packages/machine-learning.scm | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 4385603a4a..d3f76ebeba 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -5600,6 +5600,33 @@ (define-public rust-esaxx-rs-0.1
      "This package provides a wrapper around sentencepiece's esaxxx library.")
     (license license:asl2.0)))
 
+(define-public rust-spm-precompiled-0.1
+  (package
+    (name "rust-spm-precompiled")
+    (version "0.1.4")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (crate-uri "spm_precompiled" version))
+       (file-name (string-append name "-" version ".tar.gz"))
+       (sha256
+        (base32 "09pkdk2abr8xf4pb9kq3rk80dgziq6vzfk7aywv3diik82f6jlaq"))))
+    (build-system cargo-build-system)
+    (arguments
+     `(#:cargo-inputs
+       (("rust-base64" ,rust-base64-0.13)
+        ("rust-nom" ,rust-nom-7)
+        ("rust-serde" ,rust-serde-1)
+        ("rust-unicode-segmentation" ,rust-unicode-segmentation-1))))
+    (home-page "https://github.com/huggingface/spm_precompiled")
+    (synopsis "Emulate sentencepiece's DoubleArray")
+    (description
+     "This crate aims to emulate
+@url{https://github.com/google/sentencepiece,sentencepiece}
+Dart::@code{DoubleArray} struct and its Normalizer.  This crate is highly
+specialized and not intended for general use.")
+    (license license:asl2.0)))
+
 (define-public python-hmmlearn
   (package
     (name "python-hmmlearn")
-- 
2.45.2






* [bug#73106] [PATCH 03/10] gnu: Add rust-macro-rules-attribute-proc-macro-0.2.
  2024-09-07 16:56 ` [bug#73106] [PATCH 01/10] gnu: Add rust-esaxx-rs-0.1 Nicolas Graves via Guix-patches via
  2024-09-07 16:56   ` [bug#73106] [PATCH 02/10] gnu: Add rust-spm-precompiled-0.1 Nicolas Graves via Guix-patches via
@ 2024-09-07 16:56   ` Nicolas Graves via Guix-patches via
  2024-09-07 16:56   ` [bug#73106] [PATCH 04/10] gnu: Add rust-macro-rules-attribute-0.2 Nicolas Graves via Guix-patches via
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2024-09-07 16:56 UTC (permalink / raw)
  To: 73106; +Cc: ngraves

* gnu/packages/crates-io.scm (rust-macro-rules-attribute-proc-macro-0.2): New variable.

Change-Id: I1fab6de81c897643cae52e733bd06bb00ea1bd7f
---
 gnu/packages/crates-io.scm | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/gnu/packages/crates-io.scm b/gnu/packages/crates-io.scm
index 36ecbe4430..d04f8723fd 100644
--- a/gnu/packages/crates-io.scm
+++ b/gnu/packages/crates-io.scm
@@ -41076,6 +41076,27 @@ (define-public rust-macaddr-1
     (description "This pakcage provides MAC address types.")
     (license (list license:asl2.0 license:expat))))
 
+(define-public rust-macro-rules-attribute-proc-macro-0.2
+  (package
+    (name "rust-macro-rules-attribute-proc-macro")
+    (version "0.2.0")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (crate-uri "macro_rules_attribute-proc_macro" version))
+       (file-name (string-append name "-" version ".tar.gz"))
+       (sha256
+        (base32 "0s45j4zm0a5d041g3vcbanvr76p331dfjb7gw9qdmh0w8mnqbpdq"))))
+    (build-system cargo-build-system)
+    (home-page
+     "https://github.com/danielhenrymantilla/macro_rules_attribute-rs")
+    (synopsis "Use declarative macros in Rust")
+    (description
+     "This package provides the ability to use Rust declarative macros as
+proc_macro attributes or derives.  This package provides implementation
+details to @code{rust-macro-rules-attribute}.")
+    (license license:expat)))
+
 (define-public rust-macrotest-1
   (package
     (name "rust-macrotest")
-- 
2.45.2






* [bug#73106] [PATCH 04/10] gnu: Add rust-macro-rules-attribute-0.2.
  2024-09-07 16:56 ` [bug#73106] [PATCH 01/10] gnu: Add rust-esaxx-rs-0.1 Nicolas Graves via Guix-patches via
  2024-09-07 16:56   ` [bug#73106] [PATCH 02/10] gnu: Add rust-spm-precompiled-0.1 Nicolas Graves via Guix-patches via
  2024-09-07 16:56   ` [bug#73106] [PATCH 03/10] gnu: Add rust-macro-rules-attribute-proc-macro-0.2 Nicolas Graves via Guix-patches via
@ 2024-09-07 16:56   ` Nicolas Graves via Guix-patches via
  2024-09-07 16:56   ` [bug#73106] [PATCH 05/10] gnu: Add rust-hf-hub-0.3 Nicolas Graves via Guix-patches via
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2024-09-07 16:56 UTC (permalink / raw)
  To: 73106; +Cc: ngraves

* gnu/packages/crates-io.scm (rust-macro-rules-attribute-0.2): New variable.

Change-Id: I62c9ba35a8a9f71f05f0f3c5307d7abe11f408c8
---
 gnu/packages/crates-io.scm | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/gnu/packages/crates-io.scm b/gnu/packages/crates-io.scm
index d04f8723fd..658721b123 100644
--- a/gnu/packages/crates-io.scm
+++ b/gnu/packages/crates-io.scm
@@ -41097,6 +41097,34 @@ (define-public rust-macro-rules-attribute-proc-macro-0.2
 details to @code{rust-macro-rules-attribute}.")
     (license license:expat)))
 
+(define-public rust-macro-rules-attribute-0.2
+  (package
+    (name "rust-macro-rules-attribute")
+    (version "0.2.0")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (crate-uri "macro_rules_attribute" version))
+       (file-name (string-append name "-" version ".tar.gz"))
+       (sha256
+        (base32 "04waa4qm28adwnxsxhx9135ki68mwkikr6m5pi5xhcy0gcgjg0la"))))
+    (build-system cargo-build-system)
+    (arguments
+     `(#:cargo-inputs
+       (("rust-macro-rules-attribute-proc-macro"
+         ,rust-macro-rules-attribute-proc-macro-0.2)
+        ("rust-paste" ,rust-paste-1))
+       #:cargo-development-inputs
+       (("rust-once-cell" ,rust-once-cell-1)
+        ("rust-pin-project-lite" ,rust-pin-project-lite-0.2)
+        ("rust-serde" ,rust-serde-1))))
+    (home-page "https://crates.io/crates/macro_rules_attribute")
+    (synopsis "Use declarative macros in Rust")
+    (description
+     "This package provides the ability to use Rust declarative macros as
+proc_macro attributes or derives.")
+    (license license:expat)))
+
 (define-public rust-macrotest-1
   (package
     (name "rust-macrotest")
-- 
2.45.2






* [bug#73106] [PATCH 05/10] gnu: Add rust-hf-hub-0.3.
  2024-09-07 16:56 ` [bug#73106] [PATCH 01/10] gnu: Add rust-esaxx-rs-0.1 Nicolas Graves via Guix-patches via
                     ` (2 preceding siblings ...)
  2024-09-07 16:56   ` [bug#73106] [PATCH 04/10] gnu: Add rust-macro-rules-attribute-0.2 Nicolas Graves via Guix-patches via
@ 2024-09-07 16:56   ` Nicolas Graves via Guix-patches via
  2024-09-07 16:56   ` [bug#73106] [PATCH 06/10] gnu: Add rust-monostate-impl-0.1 Nicolas Graves via Guix-patches via
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2024-09-07 16:56 UTC (permalink / raw)
  To: 73106; +Cc: ngraves

* gnu/packages/machine-learning.scm (rust-hf-hub-0.3): New variable.

Change-Id: I9e64c316dde8094e6142785af8549556953513e0
---
 gnu/packages/machine-learning.scm | 48 +++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index d3f76ebeba..27d7f0526b 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -78,7 +78,10 @@ (define-module (gnu packages machine-learning)
   #:use-module (gnu packages cmake)
   #:use-module (gnu packages cpp)
   #:use-module (gnu packages cran)
+  #:use-module (gnu packages crates-crypto)
   #:use-module (gnu packages crates-io)
+  #:use-module (gnu packages crates-tls)
+  #:use-module (gnu packages crates-web)
   #:use-module (gnu packages databases)
   #:use-module (gnu packages dejagnu)
   #:use-module (gnu packages documentation)
@@ -5627,6 +5630,51 @@ (define-public rust-spm-precompiled-0.1
 specialized and not intended for general use.")
     (license license:asl2.0)))
 
+(define-public rust-hf-hub-0.3
+  (package
+    (name "rust-hf-hub")
+    (version "0.3.2")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (crate-uri "hf-hub" version))
+       (file-name (string-append name "-" version ".tar.gz"))
+       (sha256
+        (base32 "0cnpivy9fn62lm1fw85kmg3ryvrx8drq63c96vq94gabawshcy1b"))))
+    (build-system cargo-build-system)
+    (arguments
+     `(#:tests? #f  ; require network connection
+       #:cargo-inputs
+       (("rust-dirs" ,rust-dirs-5)
+        ("rust-futures" ,rust-futures-0.3)
+        ("rust-indicatif" ,rust-indicatif-0.17)
+        ("rust-log" ,rust-log-0.4)
+        ("rust-native-tls" ,rust-native-tls-0.2)
+        ("rust-num-cpus" ,rust-num-cpus-1)
+        ("rust-rand" ,rust-rand-0.8)
+        ("rust-reqwest" ,rust-reqwest-0.11)
+        ("rust-serde" ,rust-serde-1)
+        ("rust-serde-json" ,rust-serde-json-1)
+        ("rust-thiserror" ,rust-thiserror-1)
+        ("rust-tokio" ,rust-tokio-1)
+        ("rust-ureq" ,rust-ureq-2))
+       #:cargo-development-inputs
+       (("rust-hex-literal" ,rust-hex-literal-0.4)
+        ("rust-sha2" ,rust-sha2-0.10)
+        ("rust-tokio-test" ,rust-tokio-test-0.4))))
+    (native-inputs
+     (list pkg-config))
+    (inputs
+     (list openssl))
+    (home-page "https://github.com/huggingface/hf-hub")
+    (synopsis "Interact with HuggingFace in Rust")
+    (description
+     "This crate aims to ease interaction with
+@url{https://huggingface.co/,huggingface}.  It aims to be compatible with
+@url{https://github.com/huggingface/huggingface_hub/,huggingface_hub}
+python package, but only implements a smaller subset of functions.")
+    (license license:asl2.0)))
+
 (define-public python-hmmlearn
   (package
     (name "python-hmmlearn")
-- 
2.45.2






* [bug#73106] [PATCH 06/10] gnu: Add rust-monostate-impl-0.1.
  2024-09-07 16:56 ` [bug#73106] [PATCH 01/10] gnu: Add rust-esaxx-rs-0.1 Nicolas Graves via Guix-patches via
                     ` (3 preceding siblings ...)
  2024-09-07 16:56   ` [bug#73106] [PATCH 05/10] gnu: Add rust-hf-hub-0.3 Nicolas Graves via Guix-patches via
@ 2024-09-07 16:56   ` Nicolas Graves via Guix-patches via
  2024-09-07 16:56   ` [bug#73106] [PATCH 07/10] gnu: Add rust-monostate-0.1 Nicolas Graves via Guix-patches via
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2024-09-07 16:56 UTC (permalink / raw)
  To: 73106; +Cc: ngraves

* gnu/packages/crates-io.scm (rust-monostate-impl-0.1): New variable.

Change-Id: Ica72fb8bce3589ed1ee5b08c3d96dcc24aaee279
---
 gnu/packages/crates-io.scm | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/gnu/packages/crates-io.scm b/gnu/packages/crates-io.scm
index 658721b123..28ff81c801 100644
--- a/gnu/packages/crates-io.scm
+++ b/gnu/packages/crates-io.scm
@@ -43718,6 +43718,29 @@ (define-public rust-modifier-0.1
       "Chaining APIs for both self -> Self and &mut self methods.")
     (license license:expat)))
 
+(define-public rust-monostate-impl-0.1
+  (package
+    (name "rust-monostate-impl")
+    (version "0.1.11")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (crate-uri "monostate-impl" version))
+       (file-name (string-append name "-" version ".tar.gz"))
+       (sha256
+        (base32 "1km6kc6yxvpsxciaj02zar8cx1sq142s6jn6saqn77h7165dd1pn"))))
+    (build-system cargo-build-system)
+    (arguments
+     `(#:cargo-inputs
+       (("rust-proc-macro2" ,rust-proc-macro2-1)
+        ("rust-quote" ,rust-quote-1)
+        ("rust-syn" ,rust-syn-2))))
+    (home-page "https://github.com/dtolnay/monostate")
+    (synopsis "Implementation detail of the monostate crate")
+    (description
+     "This package provides Implementation detail of the monostate crate.")
+    (license (list license:expat license:asl2.0))))
+
 (define-public rust-more-asserts-0.3
   (package
     (name "rust-more-asserts")
-- 
2.45.2






* [bug#73106] [PATCH 07/10] gnu: Add rust-monostate-0.1.
  2024-09-07 16:56 ` [bug#73106] [PATCH 01/10] gnu: Add rust-esaxx-rs-0.1 Nicolas Graves via Guix-patches via
                     ` (4 preceding siblings ...)
  2024-09-07 16:56   ` [bug#73106] [PATCH 06/10] gnu: Add rust-monostate-impl-0.1 Nicolas Graves via Guix-patches via
@ 2024-09-07 16:56   ` Nicolas Graves via Guix-patches via
  2024-09-07 16:56   ` [bug#73106] [PATCH 08/10] gnu: Add rust-tokenizers Nicolas Graves via Guix-patches via
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2024-09-07 16:56 UTC (permalink / raw)
  To: 73106; +Cc: ngraves

* gnu/packages/crates-io.scm (rust-monostate-0.1): New variable.

Change-Id: I53f1ebfaf98e785eedeb3293f211bffa6f44bc76
---
 gnu/packages/crates-io.scm | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/gnu/packages/crates-io.scm b/gnu/packages/crates-io.scm
index 28ff81c801..7a8f090fd9 100644
--- a/gnu/packages/crates-io.scm
+++ b/gnu/packages/crates-io.scm
@@ -43741,6 +43741,32 @@ (define-public rust-monostate-impl-0.1
      "This package provides Implementation detail of the monostate crate.")
     (license (list license:expat license:asl2.0))))
 
+(define-public rust-monostate-0.1
+  (package
+    (name "rust-monostate")
+    (version "0.1.11")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (crate-uri "monostate" version))
+       (file-name (string-append name "-" version ".tar.gz"))
+       (sha256
+        (base32 "0xchz8cs990g7g5f8jjybjnyi9xnhykiq44gl97p5rbh3hgjm347"))))
+    (build-system cargo-build-system)
+    (arguments
+     `(#:cargo-inputs
+       (("rust-monostate-impl" ,rust-monostate-impl-0.1)
+        ("rust-serde" ,rust-serde-1))
+       #:cargo-development-inputs
+       (("rust-serde" ,rust-serde-1)
+        ("rust-serde-json" ,rust-serde-json-1))))
+    (home-page "https://github.com/dtolnay/monostate")
+    (synopsis "Type that deserializes only from one specific value")
+    (description
+     "This package provides a Rust type that deserializes only from one
+specific value.")
+    (license (list license:expat license:asl2.0))))
+
 (define-public rust-more-asserts-0.3
   (package
     (name "rust-more-asserts")
-- 
2.45.2






* [bug#73106] [PATCH 08/10] gnu: Add rust-tokenizers.
  2024-09-07 16:56 ` [bug#73106] [PATCH 01/10] gnu: Add rust-esaxx-rs-0.1 Nicolas Graves via Guix-patches via
                     ` (5 preceding siblings ...)
  2024-09-07 16:56   ` [bug#73106] [PATCH 07/10] gnu: Add rust-monostate-0.1 Nicolas Graves via Guix-patches via
@ 2024-09-07 16:56   ` Nicolas Graves via Guix-patches via
  2024-09-07 16:56   ` [bug#73106] [PATCH 09/10] gnu: Add rust-numpy-0.21 Nicolas Graves via Guix-patches via
  2024-09-07 16:56   ` [bug#73106] [PATCH 10/10] gnu: Add python-tokenizers Nicolas Graves via Guix-patches via
  8 siblings, 0 replies; 11+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2024-09-07 16:56 UTC (permalink / raw)
  To: 73106; +Cc: ngraves

* gnu/packages/machine-learning.scm (rust-tokenizers): New variable.

Change-Id: I3189a2d826f072f65ad053d77eb39be39775f1c2
---
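Note for reviewers (a sketch, not part of the patch): the source snippet
in this patch relaxes two version requirements in Cargo.toml with
substitute*.  Outside a Guix build, where (guix build utils) is not at
hand, roughly the same rewrite can be tried in plain Guile with
(ice-9 regex).  The sample text and the helper name loosen-versions are
made up for illustration.  The sketch escapes the dots, whereas the
"0.1.12" pattern in the snippet leaves them as regex wildcards.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical excerpt; the real Cargo.toml of tokenizers is not shown here.
(define sample
  "monostate = \"0.1.12\"\n[dependencies.onig]\nversion = \"6.4\"\n")

;; Hypothetical helper: apply both replacements in sequence and
;; return the rewritten text as a string (port #f means "return a string").
(define (loosen-versions text)
  (let ((step1 (regexp-substitute/global
                #f "0\\.1\\.12" text 'pre "0.1.11" 'post)))
    (regexp-substitute/global
     #f "version = \"6\\.4\"" step1 'pre "version = \"6.1.1\"" 'post)))

(display (loosen-versions sample))
```
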
 gnu/packages/machine-learning.scm | 60 +++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 27d7f0526b..3b601f6c91 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -5675,6 +5675,66 @@ (define-public rust-hf-hub-0.3
 python package, but only implements a smaller subset of functions.")
     (license license:asl2.0)))
 
+(define-public rust-tokenizers
+  (package
+    (name "rust-tokenizers")
+    (version "0.19.1")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (crate-uri "tokenizers" version))
+       (file-name (string-append name "-" version ".tar.gz"))
+       (sha256
+        (base32 "1zg6ffpllygijb5bh227m9p4lrhf0pjkysky68kddwrsvp8zl075"))
+       (modules '((guix build utils)))
+       (snippet
+        #~(substitute* "Cargo.toml"
+            (("0.1.12") ; rust-monostate requires a rust-syn-2 update
+             "0.1.11")
+            (("version = \"6.4\"")  ; rust-onig
+             "version = \"6.1.1\"")))))
+    (build-system cargo-build-system)
+    (arguments
+     (list
+      #:tests? #f  ; tests are relying on missing data.
+      #:cargo-inputs
+      `(("rust-aho-corasick" ,rust-aho-corasick-1)
+        ("rust-derive-builder" ,rust-derive-builder-0.20)
+        ("rust-esaxx-rs" ,rust-esaxx-rs-0.1)
+        ("rust-fancy-regex" ,rust-fancy-regex-0.13)
+        ("rust-getrandom" ,rust-getrandom-0.2)
+        ("rust-hf-hub" ,rust-hf-hub-0.3)
+        ("rust-indicatif" ,rust-indicatif-0.17)
+        ("rust-itertools" ,rust-itertools-0.12)
+        ("rust-lazy-static" ,rust-lazy-static-1)
+        ("rust-log" ,rust-log-0.4)
+        ("rust-macro-rules-attribute" ,rust-macro-rules-attribute-0.2)
+        ("rust-monostate" ,rust-monostate-0.1)
+        ("rust-onig" ,rust-onig-6)
+        ("rust-paste" ,rust-paste-1)
+        ("rust-rand" ,rust-rand-0.8)
+        ("rust-rayon" ,rust-rayon-1)
+        ("rust-rayon-cond" ,rust-rayon-cond-0.3)
+        ("rust-regex" ,rust-regex-1)
+        ("rust-regex-syntax" ,rust-regex-syntax-0.8)
+        ("rust-serde" ,rust-serde-1)
+        ("rust-serde-json" ,rust-serde-json-1)
+        ("rust-spm-precompiled" ,rust-spm-precompiled-0.1)
+        ("rust-thiserror" ,rust-thiserror-1)
+        ("rust-unicode-normalization-alignments" ,rust-unicode-normalization-alignments-0.1)
+        ("rust-unicode-segmentation" ,rust-unicode-segmentation-1)
+        ("rust-unicode-categories" ,rust-unicode-categories-0.1))
+      #:cargo-development-inputs
+      `(("rust-assert-approx-eq" ,rust-assert-approx-eq-1)
+        ("rust-criterion" ,rust-criterion-0.5)
+        ("rust-tempfile" ,rust-tempfile-3))))
+    (home-page "https://github.com/huggingface/tokenizers")
+    (synopsis "Implementation of various popular tokenizers")
+    (description
+     "This package provides a Rust implementation of today's most used
+tokenizers, with a focus on performances and versatility.")
+    (license license:asl2.0)))
+
 (define-public python-hmmlearn
   (package
     (name "python-hmmlearn")
-- 
2.45.2






* [bug#73106] [PATCH 09/10] gnu: Add rust-numpy-0.21.
  2024-09-07 16:56 ` [bug#73106] [PATCH 01/10] gnu: Add rust-esaxx-rs-0.1 Nicolas Graves via Guix-patches via
                     ` (6 preceding siblings ...)
  2024-09-07 16:56   ` [bug#73106] [PATCH 08/10] gnu: Add rust-tokenizers Nicolas Graves via Guix-patches via
@ 2024-09-07 16:56   ` Nicolas Graves via Guix-patches via
  2024-09-07 16:56   ` [bug#73106] [PATCH 10/10] gnu: Add python-tokenizers Nicolas Graves via Guix-patches via
  8 siblings, 0 replies; 11+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2024-09-07 16:56 UTC (permalink / raw)
  To: 73106; +Cc: ngraves

* gnu/packages/crates-io.scm (rust-numpy-0.21): New variable.

Change-Id: Idae5915f3cefa47c16c4bf9a5679f55621e35da7
---
 gnu/packages/crates-io.scm | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/gnu/packages/crates-io.scm b/gnu/packages/crates-io.scm
index 7a8f090fd9..ba5cb75d2c 100644
--- a/gnu/packages/crates-io.scm
+++ b/gnu/packages/crates-io.scm
@@ -48734,6 +48734,41 @@ (define-public rust-number-prefix-0.3
 giga, kibi.")
     (license license:expat)))
 
+(define-public rust-numpy-0.21
+  (package
+    (name "rust-numpy")
+    (version "0.21.0")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (crate-uri "numpy" version))
+       (file-name (string-append name "-" version ".tar.gz"))
+       (sha256
+        (base32 "1x1p5x7lwfc5nsccwj98sln5vx3g3n8sbgm5fmfmy5rpr8rhf5zc"))))
+    (build-system cargo-build-system)
+    (arguments
+     `(#:cargo-inputs
+       (("rust-half" ,rust-half-2)
+        ("rust-libc" ,rust-libc-0.2)
+        ("rust-nalgebra" ,rust-nalgebra-0.32)
+        ("rust-ndarray" ,rust-ndarray-0.13)
+        ("rust-num-complex" ,rust-num-complex-0.2)
+        ("rust-num-integer" ,rust-num-integer-0.1)
+        ("rust-num-traits" ,rust-num-traits-0.2)
+        ("rust-pyo3" ,rust-pyo3-0.21)
+        ("rust-rustc-hash" ,rust-rustc-hash-1))
+       #:cargo-development-inputs
+       (("rust-nalgebra" ,rust-nalgebra-0.32)
+        ("rust-pyo3" ,rust-pyo3-0.21))))
+    (native-inputs (list python-minimal
+                         (@ (gnu packages python-xyz) python-numpy)))
+    (home-page "https://github.com/PyO3/rust-numpy")
+    (synopsis "Rust bindings for the NumPy C-API")
+    (description
+     "This package provides @code{PyO3-based} Rust bindings of the
+@code{NumPy} C-API.")
+    (license license:bsd-2)))
+
 (define-public rust-numtoa-0.2
   (package
     (name "rust-numtoa")
-- 
2.45.2






* [bug#73106] [PATCH 10/10] gnu: Add python-tokenizers.
  2024-09-07 16:56 ` [bug#73106] [PATCH 01/10] gnu: Add rust-esaxx-rs-0.1 Nicolas Graves via Guix-patches via
                     ` (7 preceding siblings ...)
  2024-09-07 16:56   ` [bug#73106] [PATCH 09/10] gnu: Add rust-numpy-0.21 Nicolas Graves via Guix-patches via
@ 2024-09-07 16:56   ` Nicolas Graves via Guix-patches via
  8 siblings, 0 replies; 11+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2024-09-07 16:56 UTC (permalink / raw)
  To: 73106; +Cc: ngraves

* gnu/packages/machine-learning.scm (python-tokenizers): New variable.

Change-Id: I5db95172255dc4635c2a417f3b7252454eea27d7
---
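Note for reviewers (a sketch, not part of the patch): the second half of
the 'inject-tokenizers phase keeps only the part of Cargo.toml that
precedes the first "[dependencies.tokenizers" marker.  The idea can be
tried in isolation in plain Guile; the sample contents and the helper
name drop-tokenizers-section are assumptions for illustration.  Unlike
the phase, which calls match:prefix directly, this sketch returns the
text unchanged when the marker is absent.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical Cargo.toml contents, for illustration only.
(define sample
  "[dependencies]\nserde = \"1.0\"\n\n[dependencies.tokenizers]\npath = \"../../tokenizers\"\n")

;; Hypothetical helper: return TEXT up to the first
;; "[dependencies.tokenizers", or TEXT unchanged when no such marker exists.
(define (drop-tokenizers-section text)
  (let ((m (string-match "\\[dependencies.tokenizers" text)))
    (if m
        (match:prefix m)
        text)))

(display (drop-tokenizers-section sample))
```
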
 gnu/packages/machine-learning.scm | 111 ++++++++++++++++++++++++++++++
 1 file changed, 111 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 3b601f6c91..412499d424 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -5735,6 +5735,117 @@ (define-public rust-tokenizers
 tokenizers, with a focus on performances and versatility.")
     (license license:asl2.0)))
 
+(define-public python-tokenizers
+  (package
+    (name "python-tokenizers")
+    (version "0.19.1")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (pypi-uri "tokenizers" version))
+       (sha256
+        (base32 "1qw8mjp0q9w7j1raq1rvcbfw38000kbqpwscf9mvxzfh1rlfcngf"))
+       (modules '((guix build utils)
+                  (ice-9 ftw)))
+       (snippet
+        #~(begin  ;; Only keeping bindings.
+            (for-each (lambda (file)
+                        (unless (member file '("." ".." "bindings" "PKG-INFO"))
+                          (delete-file-recursively file)))
+                      (scandir "."))
+            (for-each (lambda (file)
+                        (unless (member file '("." ".."))
+                          (rename-file (string-append "bindings/python/" file) file)))
+                      (scandir "bindings/python"))
+            (delete-file-recursively ".cargo")))))
+    (build-system cargo-build-system)
+    (arguments
+     (list
+      #:cargo-test-flags ''("--no-default-features")
+      #:imported-modules `(,@%cargo-build-system-modules
+                           ,@%pyproject-build-system-modules)
+      #:modules '((guix build cargo-build-system)
+                  ((guix build pyproject-build-system) #:prefix py:)
+                  (guix build utils)
+                  (ice-9 regex)
+                  (ice-9 textual-ports))
+      #:phases
+      #~(modify-phases %standard-phases
+          (add-after 'unpack-rust-crates 'inject-tokenizers
+            (lambda _
+              (substitute* "Cargo.toml"
+                (("\\[dependencies\\]")
+                 (format #f "
+[dev-dependencies]
+tempfile = ~s
+pyo3 = { version = ~s, features = [\"auto-initialize\"] }
+
+[dependencies]
+tokenizers = ~s"
+                         #$(package-version rust-tempfile-3)
+                         #$(package-version rust-pyo3-0.21)
+                         #$(package-version rust-tokenizers))))
+              (let ((file-path "Cargo.toml"))
+                (call-with-input-file file-path
+                  (lambda (port)
+                    (let* ((content (get-string-all port))
+                           (top-match (string-match
+                                       "\\[dependencies.tokenizers" content)))
+                      (call-with-output-file file-path
+                        (lambda (out)
+                          (format out "~a" (match:prefix top-match))))))))))
+          (add-after 'patch-cargo-checksums 'loosen-requirements
+            (lambda _
+              (substitute* "Cargo.toml"
+                (("version = \"6.4\"")
+                 (format #f "version = ~s"
+                         #$(package-version rust-onig-6))))))
+          (add-after 'check 'python-check
+            (lambda _
+              (copy-file "target/release/libtokenizers.so"
+                         "py_src/tokenizers/tokenizers.so")
+              (invoke "python3"
+                      "-c" (format #f
+                                   "import sys; sys.path.append(\"~a/py_src\")"
+                                   (getcwd))
+                      "-m" "pytest"
+                      "-s" "-v" "./tests/")))
+          (add-after 'install 'install-python
+            (lambda _
+              (let* ((pversion #$(version-major+minor (package-version python)))
+                     (lib (string-append #$output "/lib/python" pversion
+                                         "/site-packages/"))
+                     (info (string-append lib "tokenizers-"
+                                        #$(package-version this-package)
+                                        ".dist-info")))
+                (mkdir-p info)
+                (copy-file "PKG-INFO" (string-append info "/METADATA"))
+                (copy-recursively
+                 "py_src/tokenizers"
+                 (string-append lib "tokenizers"))))))
+      #:cargo-inputs
+      `(("rust-rayon" ,rust-rayon-1)
+        ("rust-serde" ,rust-serde-1)
+        ("rust-serde-json" ,rust-serde-json-1)
+        ("rust-libc" ,rust-libc-0.2)
+        ("rust-env-logger" ,rust-env-logger-0.11)
+        ("rust-pyo3" ,rust-pyo3-0.21)
+        ("rust-numpy" ,rust-numpy-0.21)
+        ("rust-ndarray" ,rust-ndarray-0.15)
+        ("rust-onig" ,rust-onig-6)
+        ("rust-itertools" ,rust-itertools-0.12)
+        ("rust-tokenizers" ,rust-tokenizers))
+      #:cargo-development-inputs
+      `(("rust-tempfile" ,rust-tempfile-3))))
+    (native-inputs
+     (list python-minimal python-pytest))
+    (home-page "https://huggingface.co/docs/tokenizers")
+    (synopsis "Implementation of various popular tokenizers")
+    (description
+     "This package provides bindings to a Rust implementation of the most used
+tokenizers, @code{rust-tokenizers}.")
+    (license license:asl2.0)))
+
 (define-public python-hmmlearn
   (package
     (name "python-hmmlearn")
-- 
2.45.2






Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git
