From: guix-patches--- via <guix-patches@gnu.org>
To: 69794@debbugs.gnu.org
Cc: "Nguyễn Gia Phong" <mcsinyx@disroot.org>,
	"Lars-Dominik Braun" <lars@6xq.net>,
	"Marius Bakke" <marius@gnu.org>,
	"Munyoki Kilyungi" <me@bonfacemunyoki.com>,
	"Sharlatan Hellseher" <sharlatanus@gmail.com>,
	jgart <jgart@dismail.de>
Subject: [bug#69794] [PATCH 1/2] gnu: Add python-sacremoses.
Date: Thu, 14 Mar 2024 17:32:22 +0900
Message-ID: <03cb7e5cac1e4af60d9e655285b76bfd8dbf76c9.1710404630.git.mcsinyx@disroot.org>
In-Reply-To: <cover.1710404630.git.mcsinyx@disroot.org>

* gnu/packages/python-xyz.scm (python-sacremoses): New variable.

Change-Id: I2c2cd94c054d7e952ffb4b3afdedd2ee8ce905bf
---
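Note on the replaced check phase: the package runs an explicit list of
unittest modules so that the truecaser tests, which download
https://norvig.com/big.txt, never run. The same selection can be sketched
in Python as a filter over the test directory. This is only an illustration
under stated assumptions: the helper name offline_unittest_command and the
file name test_truecaser.py are hypothetical and not taken from the patch;
the other four paths are the ones the patch passes to unittest.

```python
from pathlib import PurePosixPath

# Four paths come from the patch; the truecaser entry is an ASSUMED name
# standing in for the network-bound test module that the patch skips.
TEST_FILES = [
    "sacremoses/test/test_corpus.py",
    "sacremoses/test/test_no_redos_has_numeric_only.py",
    "sacremoses/test/test_normalizer.py",
    "sacremoses/test/test_tokenizer.py",
    "sacremoses/test/test_truecaser.py",  # hypothetical file name
]


def offline_unittest_command(test_files, skip_substring="truecaser"):
    """Build a `python -m unittest` argv that omits network-bound tests.

    A test file is dropped when its base name contains `skip_substring`.
    """
    kept = [
        path for path in test_files
        if skip_substring not in PurePosixPath(path).name
    ]
    return ["python", "-m", "unittest", *kept]


if __name__ == "__main__":
    # Print the command instead of running it; sacremoses may not be present.
    print(" ".join(offline_unittest_command(TEST_FILES)))
```

The patch hard-codes the four modules rather than filtering, which keeps
the phase explicit about what is being tested.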
 gnu/packages/python-xyz.scm | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/gnu/packages/python-xyz.scm b/gnu/packages/python-xyz.scm
index 232b5d69993c..ad33d98db142 100644
--- a/gnu/packages/python-xyz.scm
+++ b/gnu/packages/python-xyz.scm
@@ -149,6 +149,7 @@
 ;;; Copyright © 2024 Timothee Mathieu <timothee.mathieu@inria.fr>
 ;;; Copyright © 2024 Ian Eure <ian@retrospec.tv>
 ;;; Copyright © 2024 Adriel Dumas--Jondeau <leirda@disroot.org>
+;;; Copyright © 2024 Nguyễn Gia Phong <mcsinyx@disroot.org>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -21897,6 +21898,39 @@ (define-public python-nltk
      reasoning, wrappers for natural language processing libraries.")
     (license license:asl2.0)))
 
+(define-public python-sacremoses
+  (package
+    (name "python-sacremoses")
+    (version "0.1.0")
+    (source (origin
+              (method git-fetch)
+              (uri (git-reference
+                     (url "https://github.com/hplt-project/sacremoses")
+                     (commit version)))
+              (sha256
+                (base32
+                  "0g70vchfniknp65n4wnx7chg6g49d4xrz1wagv7f7ir2swdzyn9b"))))
+    (build-system python-build-system)
+    (arguments
+      '(#:phases
+         (modify-phases %standard-phases
+           (replace 'check
+             (lambda* (#:key tests? #:allow-other-keys)
+               (when tests?
+                 ;; Skip truecaser tests which fetch https://norvig.com/big.txt
+                 (invoke "python" "-m" "unittest"
+                         "sacremoses/test/test_corpus.py"
+                         "sacremoses/test/test_no_redos_has_numeric_only.py"
+                         "sacremoses/test/test_normalizer.py"
+                         "sacremoses/test/test_tokenizer.py")))))))
+    (propagated-inputs
+      (list python-click-7 python-joblib python-regex python-tqdm))
+    (home-page "https://github.com/hplt-project/sacremoses")
+    (synopsis "Natural language tokenizer, truecaser and normalizer")
+    (description "SacreMoses is a Python port of Moses'
+tokenizer, detokenizer, truecaser and punctuation normalizer.")
+    (license license:expat)))
+
 (define-public python-pymongo
   (package
     (name "python-pymongo")
-- 
2.41.0





Thread overview: 3+ messages
2024-03-14  8:29 [bug#69794] [PATCH 0/2] Package some dependencies for Argos Translate guix-patches--- via
2024-03-14  8:32 ` guix-patches--- via [this message]
2024-03-14  8:32 ` [bug#69794] [PATCH 2/2] gnu: Add python-stanza guix-patches--- via
