unofficial mirror of guile-user@gnu.org 
* [HELP] a search engine in GNU Guile
@ 2016-08-13 15:25 Amirouche Boubekki
From: Amirouche Boubekki @ 2016-08-13 15:25 UTC
  To: Guile User

Héllo,


The goal of Culturia is to create a framework that makes it easy
to tap into Natural Language Understanding algorithms (and NLP)
and provide an interface for common tasks.

Culturia is intelligence augmentation software.

Its primary interface is a search engine. Another important aspect
of the project is that it should be usable offline, so it will
come with infrastructure to dump, load and store datasets for offline
use.

The current state of the project can be described as a big ball of mud.
There is a tiny search engine with crawling skills and that's basically
all of it.

The immediate changes that should happen are, in order of preference:

- offline Stack Overflow (cf. sotoki.scm), using the generated
   website to create a ZIM file for Kiwix [0]. This is a great occasion to
   show how great GNU Guile is!
- port Whoosh/Lucene to Guile to improve text search
- offline Hacker News, Wikidata, Wikipedia, Wiktionary
- implement BM25F

Culturia is a reference to _Culture and Empire_ by Pieter Hintjens.

Sparse documentation is available online [1].
The project is hosted on GitHub [2] (this can change if contributors
don't want to use GitHub).

The TODO list is big, here is some stuff that needs to be done:

- finish GrammarLink bindings
- create sophia [3] bindings
- implement TextRank
- implement PageRank
- create a GUI using sly or html
- explore ways to easily share a database among several processes
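
To give an idea of the size of one of these tasks, here is a minimal
sketch of PageRank by power iteration. It is not Culturia code: it is
written in Python only for brevity, and the function name, the
dict-of-lists graph format, the damping factor of 0.85 and the fixed
iteration count are all assumptions made for illustration. A Guile
version would follow the same steps.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return a dict mapping each page to its PageRank score.

    `links` maps a page to the list of pages it links to. This input
    format is an assumption of the sketch, not a Culturia data structure.
    """
    # Collect every page, including pages that only appear as link targets.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    if not pages:
        return {}

    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank held by dangling pages (no outgoing links) is spread evenly,
        # so the scores keep summing to 1.
        dangling = sum(rank[page] for page in pages if not links.get(page))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {page: base for page in pages}

        # Each page shares its damped rank equally among its outgoing links.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share

        rank = new_rank

    return rank


if __name__ == "__main__":
    # Hypothetical four-page graph, for illustration only.
    graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 4))
```

A fixed iteration count keeps the sketch short; a real implementation
would more likely stop once the scores change by less than some
tolerance between two iterations.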

And many other things! Newbies are welcome, of course!

Send me an email or find me in #guile on irc.freenode.net, where I am amz3.


Happy hacking!


[0] http://www.kiwix.org/wiki/Main_Page
[1] https://amirouche.github.io/Culturia/doc/
[2] https://github.com/amirouche/Culturia
[3] http://sophia.systems/


-- 
Amirouche ~ amz3 ~ http://www.hyperdev.fr




