unofficial mirror of guix-patches@gnu.org 
* [bug#62443] LLaMA.cpp
@ 2023-03-25 15:05 Nicolas Graves via Guix-patches via
  2023-03-25 15:32 ` [bug#62443] [PATCH 1/3] gnu: Add sentencepiece Nicolas Graves via Guix-patches via
  2023-04-08 12:07 ` [bug#62443] LLaMA.cpp Nicolas Goaziou
  0 siblings, 2 replies; 5+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2023-03-25 15:05 UTC (permalink / raw)
  To: 62443


Here are 3 patches introducing the LLaMA CPP implementation. Since
weights are available as a torrent download, this makes the whole model
usable with a local config.

Basic information for preparing the model is available in the
README.
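
As a rough sketch of what such a local config could look like, assuming
the three patches are applied so that a package named llama-cpp exists
(the name comes from patch 3/3), a manifest might be written as below.
The file name manifest.scm is only an illustration, and the model
weights still have to be obtained separately.

```scheme
;; manifest.scm (hypothetical file name, chosen for illustration)
;; Assumes the llama-cpp package from patch 3/3 is available in the
;; Guix checkout or channel being used.
(specifications->manifest
 (list "llama-cpp"))
```

Running `guix shell -m manifest.scm` should then provide the commands
that patch 3/3 installs into bin/, namely llama, quantize,
convert-pth-to-ggml, convert-gptq-to-ggml and quantize.py.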

-- 
Best regards,
Nicolas Graves





* [bug#62443] [PATCH 1/3] gnu: Add sentencepiece.
  2023-03-25 15:05 [bug#62443] LLaMA.cpp Nicolas Graves via Guix-patches via
@ 2023-03-25 15:32 ` Nicolas Graves via Guix-patches via
  2023-03-25 15:32   ` [bug#62443] [PATCH 2/3] gnu: Add python-sentencepiece Nicolas Graves via Guix-patches via
  2023-03-25 15:32   ` [bug#62443] [PATCH 3/3] gnu: Add llama-cpp Nicolas Graves via Guix-patches via
  2023-04-08 12:07 ` [bug#62443] LLaMA.cpp Nicolas Goaziou
  1 sibling, 2 replies; 5+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2023-03-25 15:32 UTC (permalink / raw)
  To: 62443; +Cc: ngraves

* gnu/packages/machine-learning.scm (sentencepiece): New variable.
---
 gnu/packages/machine-learning.scm | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 37d4ef78ad..f6996af77b 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -583,6 +583,33 @@ (define openfst-for-vosk
        '("--enable-shared" "--enable-far" "--enable-ngram-fsts"
          "--enable-lookahead-fsts" "--with-pic" "--disable-bin")))))
 
+(define-public sentencepiece
+  (package
+    (name "sentencepiece")
+    (version "0.1.97")
+    (source
+     (origin
+       (method git-fetch)
+       (uri (git-reference
+             (url "https://github.com/google/sentencepiece")
+             (commit (string-append "v" version))))
+       (file-name (git-file-name name version))
+       (sha256
+        (base32 "1kzfkp2pk0vabyw3wmkh16h11chzq63mzc20ddhsag5fp6s91ajg"))))
+    (build-system cmake-build-system)
+    (arguments '(#:tests? #f))
+    (native-inputs (list gperftools))
+    (home-page "https://github.com/google/sentencepiece")
+    (synopsis "Unsupervised tokenizer for Neural Network-based text generation")
+    (description "SentencePiece is an unsupervised text tokenizer and
+detokenizer mainly for Neural Network-based text generation systems where the
+vocabulary size is predetermined prior to the neural model training.
+SentencePiece implements subword units (e.g., byte-pair-encoding
+(BPE) and unigram language model) with the extension of direct training from
+raw sentences.  SentencePiece allows us to make a purely end-to-end system
+that does not depend on language-specific pre/postprocessing.")
+    (license license:asl2.0)))
+
 (define-public shogun
   (package
     (name "shogun")
-- 
2.39.2






* [bug#62443] [PATCH 2/3] gnu: Add python-sentencepiece.
  2023-03-25 15:32 ` [bug#62443] [PATCH 1/3] gnu: Add sentencepiece Nicolas Graves via Guix-patches via
@ 2023-03-25 15:32   ` Nicolas Graves via Guix-patches via
  2023-03-25 15:32   ` [bug#62443] [PATCH 3/3] gnu: Add llama-cpp Nicolas Graves via Guix-patches via
  1 sibling, 0 replies; 5+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2023-03-25 15:32 UTC (permalink / raw)
  To: 62443; +Cc: ngraves

* gnu/packages/machine-learning.scm (python-sentencepiece): New variable.
---
 gnu/packages/machine-learning.scm | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index f6996af77b..df1989d316 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -610,6 +610,25 @@ (define-public sentencepiece
 that does not depend on language-specific pre/postprocessing.")
     (license license:asl2.0)))
 
+(define-public python-sentencepiece
+  (package
+    (name "python-sentencepiece")
+    (version "0.1.97")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (pypi-uri "sentencepiece" version))
+       (sha256
+        (base32 "0v0z9ryl66432zajp099bcbnwkkldzlpjvgnjv9bq2vi19g300f9"))))
+    (build-system python-build-system)
+    (propagated-inputs (list sentencepiece))
+    (native-inputs (list pkg-config))
+    (home-page "https://github.com/google/sentencepiece")
+    (synopsis "SentencePiece Python wrapper")
+    (description "This package provides a Python wrapper for the SentencePiece
+unsupervised text tokenizer.")
+    (license license:asl2.0)))
+
 (define-public shogun
   (package
     (name "shogun")
-- 
2.39.2






* [bug#62443] [PATCH 3/3] gnu: Add llama-cpp.
  2023-03-25 15:32 ` [bug#62443] [PATCH 1/3] gnu: Add sentencepiece Nicolas Graves via Guix-patches via
  2023-03-25 15:32   ` [bug#62443] [PATCH 2/3] gnu: Add python-sentencepiece Nicolas Graves via Guix-patches via
@ 2023-03-25 15:32   ` Nicolas Graves via Guix-patches via
  1 sibling, 0 replies; 5+ messages in thread
From: Nicolas Graves via Guix-patches via @ 2023-03-25 15:32 UTC (permalink / raw)
  To: 62443; +Cc: ngraves

* gnu/packages/machine-learning.scm (llama-cpp): New variable.
---
 gnu/packages/machine-learning.scm | 64 +++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index df1989d316..6c78b14fc6 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -400,6 +400,70 @@ (define-public guile-aiscm
 (define-public guile-aiscm-next
   (deprecated-package "guile-aiscm-next" guile-aiscm))
 
+(define-public llama-cpp
+  (let ((commit "3cd8dde0d1357b7f11bdd25c45d5bf5e97e284a0")
+        (revision "0"))
+    (package
+      (name "llama-cpp")
+      (version (git-version "0.0.0" revision commit))
+      (source
+       (origin
+         (method git-fetch)
+         (uri (git-reference
+               (url "https://github.com/ggerganov/llama.cpp")
+               (commit (string-append "master-" (string-take commit 7)))))
+         (file-name (git-file-name name version))
+         (sha256
+          (base32 "0i7c92cxqs31xklrn688978kk29agivgxjgvsb45wzm65gc6hm5c"))))
+      (build-system cmake-build-system)
+      (arguments
+       (list
+        #:modules '((ice-9 textual-ports)
+                    (guix build utils)
+                    ((guix build python-build-system) #:prefix python:)
+                    (guix build cmake-build-system))
+        #:imported-modules `(,@%cmake-build-system-modules
+                             (guix build python-build-system))
+        #:phases
+        #~(modify-phases %standard-phases
+            (add-before 'install 'install-python-scripts
+              (lambda _
+                (let ((bin (string-append #$output "/bin/")))
+                  (define (make-script script)
+                    (let ((suffix (if (string-suffix? ".py" script) "" ".py")))
+                      (call-with-input-file
+                          (string-append "../source/" script suffix)
+                        (lambda (input)
+                          (call-with-output-file (string-append bin script)
+                            (lambda (output)
+                              (format output "#!~a/bin/python3\n~a"
+                                      #$(this-package-input "python")
+                                      (get-string-all input))))))
+                      (chmod (string-append bin script) #o555)))
+                  (mkdir-p bin)
+                  (make-script "convert-pth-to-ggml")
+                  (make-script "convert-gptq-to-ggml")
+                  (make-script "quantize.py")
+                  (substitute* (string-append bin "quantize.py")
+                    (("os\\.getcwd\\(\\), quantize_script_binary")
+                     (string-append "\"" bin "\", quantize_script_binary"))))))
+            (add-after 'install-python-scripts 'wrap-python-scripts
+              (assoc-ref python:%standard-phases 'wrap))
+            (replace 'install
+              (lambda _
+                (let ((bin (string-append #$output "/bin/")))
+                  (install-file "bin/quantize" bin)
+                  (copy-file "bin/main" (string-append bin "llama"))))))))
+      (propagated-inputs
+       (list python-pytorch python-sentencepiece python-numpy))
+      (inputs (list python))
+      (home-page "https://github.com/ggerganov/llama.cpp")
+      (synopsis "Port of Facebook's LLaMA model in C/C++")
+      (description "This package provides a port of Facebook's LLaMA collection
+of foundation language models.  It requires model parameters to be downloaded
+independently to be able to run a LLaMA model.")
+      (license license:expat))))
+
 (define-public mcl
   (package
     (name "mcl")
-- 
2.39.2






* [bug#62443] LLaMA.cpp
  2023-03-25 15:05 [bug#62443] LLaMA.cpp Nicolas Graves via Guix-patches via
  2023-03-25 15:32 ` [bug#62443] [PATCH 1/3] gnu: Add sentencepiece Nicolas Graves via Guix-patches via
@ 2023-04-08 12:07 ` Nicolas Goaziou
  1 sibling, 0 replies; 5+ messages in thread
From: Nicolas Goaziou @ 2023-04-08 12:07 UTC (permalink / raw)
  To: 62443; +Cc: Nicolas Graves, 62443-done

Hello,

Nicolas Graves via Guix-patches via <guix-patches@gnu.org> writes:

> Here are 3 patches introducing the LLaMA CPP implementation. Since
> weights are available as a torrent download, this makes the whole model
> usable with a local config.

Applied. Thank you.

Regards,
-- 
Nicolas Goaziou





Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git
