* [bug#62443] LLaMA.cpp
@ 2023-03-25 15:05 Nicolas Graves via Guix-patches via
2023-03-25 15:32 ` [bug#62443] [PATCH 1/3] gnu: Add sentencepiece Nicolas Graves via Guix-patches via
2023-04-08 12:07 ` [bug#62443] LLaMA.cpp Nicolas Goaziou
0 siblings, 2 replies; 5+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2023-03-25 15:05 UTC
To: 62443
Here are 3 patches introducing the LLaMA CPP implementation. Since the
weights are available as a torrent download, this makes the whole model
usable with a local config.
Basic information for preparing the model is available in the
README.
--
Best regards,
Nicolas Graves
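As a rough illustration of the "local config" mentioned above, a Guix manifest along the following lines might be enough once the series is applied. This is a sketch under assumptions: the file name manifest.scm is hypothetical, and only the package name llama-cpp is taken from patch 3/3, which propagates python-pytorch, python-sentencepiece and python-numpy.

```scheme
;; manifest.scm (hypothetical file name)
;; llama-cpp is the package added by patch 3/3; its propagated inputs
;; come along automatically, so they are not listed here.
(specifications->manifest
 (list "llama-cpp"))
```

A shell with these packages could then be entered with `guix shell -m manifest.scm`. Per patch 3/3, the commands installed under bin/ are llama, quantize, quantize.py, convert-pth-to-ggml and convert-gptq-to-ggml; the model weights still have to be obtained separately.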
* [bug#62443] [PATCH 1/3] gnu: Add sentencepiece.
From: Nicolas Graves via Guix-patches via @ 2023-03-25 15:32 UTC
To: 62443; +Cc: ngraves
* gnu/packages/machine-learning.scm (sentencepiece): New variable.
---
gnu/packages/machine-learning.scm | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 37d4ef78ad..f6996af77b 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -583,6 +583,33 @@ (define openfst-for-vosk
'("--enable-shared" "--enable-far" "--enable-ngram-fsts"
"--enable-lookahead-fsts" "--with-pic" "--disable-bin")))))
+(define-public sentencepiece
+  (package
+    (name "sentencepiece")
+    (version "0.1.97")
+    (source
+     (origin
+       (method git-fetch)
+       (uri (git-reference
+             (url "https://github.com/google/sentencepiece")
+             (commit (string-append "v" version))))
+       (file-name (git-file-name name version))
+       (sha256
+        (base32 "1kzfkp2pk0vabyw3wmkh16h11chzq63mzc20ddhsag5fp6s91ajg"))))
+    (build-system cmake-build-system)
+    (arguments '(#:tests? #f))
+    (native-inputs (list gperftools))
+    (home-page "https://github.com/google/sentencepiece")
+    (synopsis "Unsupervised tokenizer for Neural Network-based text generation")
+    (description "SentencePiece is an unsupervised text tokenizer and
+detokenizer mainly for Neural Network-based text generation systems where the
+vocabulary size is predetermined prior to the neural model training.
+SentencePiece implements subword units (e.g., byte-pair-encoding
+(BPE) and unigram language model) with the extension of direct training from
+raw sentences. SentencePiece allows us to make a purely end-to-end system
+that does not depend on language-specific pre/postprocessing.")
+    (license license:asl2.0)))
+
 (define-public shogun
   (package
     (name "shogun")
--
2.39.2
* [bug#62443] [PATCH 2/3] gnu: Add python-sentencepiece.
From: Nicolas Graves via Guix-patches via @ 2023-03-25 15:32 UTC
To: 62443; +Cc: ngraves
* gnu/packages/machine-learning.scm (python-sentencepiece): New variable.
---
gnu/packages/machine-learning.scm | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index f6996af77b..df1989d316 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -610,6 +610,25 @@ (define-public sentencepiece
 that does not depend on language-specific pre/postprocessing.")
     (license license:asl2.0)))

+(define-public python-sentencepiece
+  (package
+    (name "python-sentencepiece")
+    (version "0.1.97")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (pypi-uri "sentencepiece" version))
+       (sha256
+        (base32 "0v0z9ryl66432zajp099bcbnwkkldzlpjvgnjv9bq2vi19g300f9"))))
+    (build-system python-build-system)
+    (propagated-inputs (list sentencepiece))
+    (native-inputs (list pkg-config))
+    (home-page "https://github.com/google/sentencepiece")
+    (synopsis "SentencePiece Python wrapper")
+    (description "This package provides a Python wrapper for the SentencePiece
+unsupervised text tokenizer.")
+    (license license:asl2.0)))
+
 (define-public shogun
   (package
     (name "shogun")
--
2.39.2
* [bug#62443] [PATCH 3/3] gnu: Add llama-cpp.
From: Nicolas Graves via Guix-patches via @ 2023-03-25 15:32 UTC
To: 62443; +Cc: ngraves
* gnu/packages/machine-learning.scm (llama-cpp): New variable.
---
gnu/packages/machine-learning.scm | 64 +++++++++++++++++++++++++++++++
1 file changed, 64 insertions(+)
diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index df1989d316..6c78b14fc6 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -400,6 +400,70 @@ (define-public guile-aiscm
 (define-public guile-aiscm-next
   (deprecated-package "guile-aiscm-next" guile-aiscm))

+(define-public llama-cpp
+  (let ((commit "3cd8dde0d1357b7f11bdd25c45d5bf5e97e284a0")
+        (revision "0"))
+    (package
+      (name "llama-cpp")
+      (version (git-version "0.0.0" revision commit))
+      (source
+       (origin
+         (method git-fetch)
+         (uri (git-reference
+               (url "https://github.com/ggerganov/llama.cpp")
+               (commit (string-append "master-" (string-take commit 7)))))
+         (file-name (git-file-name name version))
+         (sha256
+          (base32 "0i7c92cxqs31xklrn688978kk29agivgxjgvsb45wzm65gc6hm5c"))))
+      (build-system cmake-build-system)
+      (arguments
+       (list
+        #:modules '((ice-9 textual-ports)
+                    (guix build utils)
+                    ((guix build python-build-system) #:prefix python:)
+                    (guix build cmake-build-system))
+        #:imported-modules `(,@%cmake-build-system-modules
+                             (guix build python-build-system))
+        #:phases
+        #~(modify-phases %standard-phases
+            (add-before 'install 'install-python-scripts
+              (lambda _
+                (let ((bin (string-append #$output "/bin/")))
+                  (define (make-script script)
+                    (let ((suffix (if (string-suffix? ".py" script) "" ".py")))
+                      (call-with-input-file
+                          (string-append "../source/" script suffix)
+                        (lambda (input)
+                          (call-with-output-file (string-append bin script)
+                            (lambda (output)
+                              (format output "#!~a/bin/python3\n~a"
+                                      #$(this-package-input "python")
+                                      (get-string-all input))))))
+                      (chmod (string-append bin script) #o555)))
+                  (mkdir-p bin)
+                  (make-script "convert-pth-to-ggml")
+                  (make-script "convert-gptq-to-ggml")
+                  (make-script "quantize.py")
+                  (substitute* (string-append bin "quantize.py")
+                    (("os\\.getcwd\\(\\), quantize_script_binary")
+                     (string-append "\"" bin "\", quantize_script_binary"))))))
+            (add-after 'install-python-scripts 'wrap-python-scripts
+              (assoc-ref python:%standard-phases 'wrap))
+            (replace 'install
+              (lambda _
+                (let ((bin (string-append #$output "/bin/")))
+                  (install-file "bin/quantize" bin)
+                  (copy-file "bin/main" (string-append bin "llama"))))))))
+      (propagated-inputs
+       (list python-pytorch python-sentencepiece python-numpy))
+      (inputs (list python))
+      (home-page "https://github.com/ggerganov/llama.cpp")
+      (synopsis "Port of Facebook's LLaMA model in C/C++")
+      (description "This package provides a port of Facebook's LLaMA collection
+of foundation language models. It requires model parameters to be downloaded
+independently to be able to run a LLaMA model.")
+      (license license:expat))))
+
 (define-public mcl
   (package
     (name "mcl")
--
2.39.2
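The make-script helper in the install-python-scripts phase above copies each Python script into bin/ with an interpreter line prepended, then marks it executable. Outside a Guix build, the same idea can be sketched in plain Guile as follows. This is only an illustration: the procedure name install-with-shebang, the /tmp paths and the interpreter string are hypothetical, whereas the patch builds the interpreter path from the package's python input.

```scheme
(use-modules (ice-9 textual-ports))     ; provides get-string-all

;; Hypothetical helper, not part of the patch: copy SOURCE to TARGET with a
;; "#!INTERPRETER" line prepended, then mark TARGET read-only and executable.
(define (install-with-shebang interpreter source target)
  (call-with-input-file source
    (lambda (input)
      (call-with-output-file target
        (lambda (output)
          (format output "#!~a\n~a" interpreter (get-string-all input))))))
  (chmod target #o555))

;; Demo with made-up paths under /tmp.
(define demo-source "/tmp/demo-script.py")
(define demo-target "/tmp/demo-script")

(call-with-output-file demo-source
  (lambda (port) (display "print('hello')\n" port)))
;; A previous run leaves a read-only target behind; remove it first.
(when (file-exists? demo-target)
  (delete-file demo-target))
(install-with-shebang "/usr/bin/env python3" demo-source demo-target)
```

In the patch the target directory is a fresh store output, so no cleanup step is needed there; the delete-file call only keeps this demo re-runnable.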
* [bug#62443] LLaMA.cpp
From: Nicolas Goaziou @ 2023-04-08 12:07 UTC
To: 62443; +Cc: Nicolas Graves, 62443-done
Hello,
Nicolas Graves via Guix-patches via <guix-patches@gnu.org> writes:
> Here are 3 patches introducing the LLaMA CPP implementation. Since
> weights are available as torrent download, this makes the whole model
> usable with a local config.
Applied. Thank you.
Regards,
--
Nicolas Goaziou
Thread overview: 5+ messages
2023-03-25 15:05 [bug#62443] LLaMA.cpp Nicolas Graves via Guix-patches via
2023-03-25 15:32 ` [bug#62443] [PATCH 1/3] gnu: Add sentencepiece Nicolas Graves via Guix-patches via
2023-03-25 15:32 ` [bug#62443] [PATCH 2/3] gnu: Add python-sentencepiece Nicolas Graves via Guix-patches via
2023-03-25 15:32 ` [bug#62443] [PATCH 3/3] gnu: Add llama-cpp Nicolas Graves via Guix-patches via
2023-04-08 12:07 ` [bug#62443] LLaMA.cpp Nicolas Goaziou