unofficial mirror of guix-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Bengt Richter <bokr@bokr.com>
To: Jonathan Brielmaier <jonathan.brielmaier@web.de>
Cc: guix-devel@gnu.org
Subject: Re: Hunspell dictionaries
Date: Mon, 2 Nov 2020 14:56:48 +0100	[thread overview]
Message-ID: <20201102135648.GA3468@LionPure> (raw)
In-Reply-To: <2ce626a7-db79-bdf0-5339-9f0f12dcc60e@web.de>

Hi,

On +2020-11-01 10:41:23 +0100, Jonathan Brielmaier wrote:
> On 31.10.20 21:03, paul wrote:> Dear Guixers,
> > 
> > I was packaging an Italian dictionary for Hunspell, when I found this
> > link listing all the Hunspell dictionaries supported by Libreoffice [0].
> > 
> > This lead me to think about a function for defining dictionary packages
> > (very much based on aspell-dictionary from (gnu packages aspell)).
> 
> Hi Paul, that is nice. I think it's absolutly worth it making support of
> more languages easier :)
> 
> I worked back then on the hunspell-dict-de and we discussed that we
> should move the hunspell dicts out of libreoffice.scm as they are used
> by other programs as well.
> 
> So when you make a patch you could save that stuff in
> gnu/packages/hunspell.scm...
> 

Please forgive jumping in, but the the topic is dictionaries, soo...

For the heck of it, I blindly egrepped all the texi files of my clone
of the guix repo (for acronyms, I was guessing :)
My clone is in ~/wb/guix, currently about a week old.
--8<---------------cut here---------------start------------->8---
(cd ~/wb/guix;git log|head) =->
commit d22d129da903cf6c3382cff5226d81a881fed2aa
Author: Leo Prikler <leo.prikler@student.tugraz.at>
Date:   Tue Oct 27 13:55:53 2020 +0100

    gnu: Add guile-filesystem.
    
    * gnu/packages/guile-xyz.scm (guile-filesystem): New variable.
    (guile2.0-filesystem guile2.2-filesystem): New variable.
    
    Signed-off-by: Ludovic Courtès <ludo@gnu.org>
--8<---------------cut here---------------end--------------->8---
egrepping was for sequences of upper case Alpha chars at least 2 long.

I think it would be nice if this raw source could be turned into
a guix glossary/acronym dictionary with easy UI alongside
manual web and/or info pages ;)

--8<---------------cut here---------------start------------->8---
egrep -oh '([A-Z]{2,})' $(find ~/wb/guix -name '*texi')|sort|uniq -c|sort -h|sed -E 's/([0-9]+)[ \t]([A-Z]+)/\1:\2/g'|blockjust -w=64 > texi-nyms.txt

    1:ABCDEFGHIJKLMNOPQRSTUVWXYZ 1:ABOUT 1:ACLOCAL 1:ACPI 1:ACTION
    1:AD 1:ADDENDUM 1:ADH 1:AEBB 1:AF 1:AGGREGATION 1:AHCI 1:ALPM
    1:AM 1:ANONYMOUS 1:ANOVA 1:ANY 1:APPLICABILITY 1:ATTR 1:AUTH
    1:AUTO 1:AZERTY 1:BALLOON 1:BCM 1:BN 1:BREAK 1:BTHL 1:CABBA
    1:CAML 1:CAPABILITY 1:CE 1:CET 1:CGIT 1:CH 1:CHANGED 1:CIFS
    1:CIPHER 1:CLI 1:CLIENTCONFIG 1:CLOCAL 1:CM 1:CMD 1:CODE
    1:COLLATE 1:COLLECTIONS 1:COMBINING 1:CONDUCT 1:CONNECT 1:CPATH
    1:CRLF 1:CTEST 1:CTS 1:DATA 1:DBD 1:DBP 1:DBUS 1:DEFINITIONS
    1:DEV 1:DEVPTS 1:DH 1:DIRS 1:DNSKEY 1:DOING 1:DOMAIN 1:DOMAINS
    1:DPMS 1:DS 1:DSS 1:EA 1:EAAAAEO 1:EABI 1:ED 1:EDITION 1:EDITOR
    1:ELS 1:EMACSLOADPATH 1:ENABLE 1:EUC 1:EXISTS 1:EXPORT 1:EXWM
    1:FAILURE 1:FATAL 1:FB 1:FDC 1:FFF 1:FORMAT 1:FPM 1:FUTURE
    1:GECOS 1:GHC 1:GOBJECT 1:GPT 1:GS 1:GSSD 1:GUID 1:HACKING 1:HAS
    1:HC 1:HOST 1:HP 1:HTTPT 1:IANA 1:ICMP 1:IDENTIFICATION 1:IEEE
    1:IMPORTANT 1:IMPURITIES 1:INDEPENDENT 1:INDEX 1:INSTANCES 1:IO
    1:IPA 1:IT 1:ITIL 1:JDK 1:JFS 1:JLY 1:JPG 1:JULIA 1:KPI 1:KQ
    1:LABEL 1:LIBRESSL 1:LICENSE 1:LINEAGE 1:LIRC 1:LOGINDISABLED
    1:LOW 1:LSUB 1:LU 1:LV 1:LXDE 1:LXQT 1:MDA 1:ME 1:MIME 1:MMIO
    1:MODIFICATIONS 1:MP 1:MTU 1:MUC 1:MULTIPLE 1:MX 1:MY 1:MZ
    1:NASTY 1:NAT 1:NEON 1:NIC 1:NID 1:NIST 1:NMI 1:NOOP 1:NTFS
    1:NTLM 1:OASIS 1:OE 1:OPEN 1:OPENSSL 1:OSX 1:PETS 1:PHC 1:PI
    1:PKCS 1:PLATFORM 1:PO 1:POSSIBLE 1:PREAMBLE 1:PRIVATE 1:PS
    1:PXE 1:QAAAQEA 1:QML 1:QMP 1:QP 1:QPA 1:QT 1:QUANTITY 1:QWERTY
     1:RADIUS 1:RCPT 1:READ 1:READERS 1:RECENT 1:RELICENSING 1:REST
    1:REVISIONS 1:ROM 1:RP 1:RTS 1:RUN 1:SAM 1:SANE 1:SECRET
    1:SESSION 1:SFTP 1:SHELL 1:SIGHUP 1:SIL 1:SITE 1:SLA 1:SOMEONE
    1:SOMETHING 1:SONAME 1:SPECIAL 1:SPKI 1:SPNEGO 1:SRP 1:SSD
    1:STARTTLS 1:STRENGTH 1:SUBSYSTEM 1:TERM 1:TERMINATION
    1:TEXINPUTS 1:THAT 1:THIS 1:TRANSLATION 1:UPDATED 1:URLAUTH
    1:UTC 1:UW 1:VALIDATION 1:VDAGENT 1:VERBATIM 1:VGA 1:VISUAL
    1:VNC 1:VSZ 1:WARNINGS 1:WEB 1:WITH 1:WORKS 1:WRAPPER 1:WRITERS
    1:WRS 1:WWAN 1:XA 1:XBAR 1:XCF 1:XEP 1:XFOO 1:XLFD 1:XPV 1:YAS
    2:AAAAB 2:ACME 2:AE 2:ALLOW 2:AND 2:ARGRX 2:AT 2:BBB 2:BIND
    2:BLK 2:BSD 2:BUNDLE 2:BY 2:CAINFO 2:CBC 2:CEA 2:CI 2:CIDR
    2:COLORTERM 2:COPYING 2:CORES 2:CPE 2:CR 2:CURL 2:DATE 2:DDF
    2:DE 2:DES 2:DESCRIPTION 2:DF 2:DIGEST 2:DISPLAY 2:DN
    2:DOCUMENTS 2:DPI 2:DPM 2:DRIVER 2:DTD 2:DUSE 2:EDU 2:EF
    2:ENGINE 2:EPOCH 2:EXAMINE 2:EXCL 2:EXECUTION 2:FA 2:FAT 2:FDE
    2:FLAGS 2:FORWARD 2:FUSE 2:GITHUB 2:GPL 2:HDA 2:HMAC 2:HPC 2:IA
    2:IMAPSSLR 2:INIT 2:INSTALL 2:IS 2:ISC 2:KB 2:KEYRING 2:LADSPA
    2:LAN 2:LAYOUT 2:LC 2:LD 2:LDA 2:LEGU 2:LF 2:LGPL 2:LIBRARY
    2:LIBS 2:LOGIN 2:LXQ 2:MAC 2:MAINTENANCE 2:MELPA 2:MIPS 2:MIT
    2:MODULES 2:NAME 2:NAMESPACE 2:NDK 2:NOPASSWD 2:NORMAL 2:NOTE
    2:NSEC 2:NULL 2:OL 2:OPAM 2:PATCH 2:PCSPKR 2:PKG 2:PNG 2:POSIX
    2:PSK 2:PULSE 2:PWD 2:PYTHONPATH 2:QCOW 2:QWERTZ 2:RAPI 2:REJECT
    2:REMOTE 2:RENEWED 2:RR 2:SA 2:SATA 2:SCHEME 2:SCM 2:SDL
    2:SELECT 2:SGML 2:SLURM 2:SOCKS 2:SVN 2:SXML 2:TERMINAL 2:THANKS
    2:TO 2:TYPE 2:UA 2:UMCUG 2:VIR 2:XXXX 3:AA 3:AAAAC 3:AMD 3:AR
    3:CD 3:COMMIT 3:CRL 3:DC 3:DICT 3:DIR 3:EBB 3:EEC 3:ESP
    3:EXPUNGE 3:GMP 3:GUILE 3:HTTPD 3:IDLE 3:IDMAP 3:INFOPATH 3:ISO
    3:JP 3:LEVEL 3:LIST 3:LSH 3:LVM 3:MD 3:MIN 3:MPC 3:MPFR 3:NET
    3:NIS 3:NIX 3:NTPD 3:OF 3:ON 3:OPTIONS 3:PDF 3:PM 3:RUNPATH
    3:RYF 3:SOA 3:SSID 3:TOKEN 3:UD 3:UP 3:VT 3:XDG 3:XKB 3:YYYY
    4:ASCII 4:ASDF 4:AUTHORS 4:CABB 4:CERTBOT 4:CFB 4:CPAN 4:DRBD
    4:ECDSA 4:ELPA 4:EXAMPLE 4:FF 4:FTP 4:GB 4:GIT 4:GSSAPI 4:GUI
    4:HDPI 4:HEAD 4:INFO 4:LMTP 4:LOAD 4:LOG 4:LTS 4:MANPATH 4:MUA
    4:NNN 4:NS 4:OWNER 4:PACKAGES 4:PC 4:PHP 4:PREFIX 4:RET 4:RSA
    4:SMTPD 4:SOURCE 4:TAB 4:TAP 4:TMPDIR 4:TUN 4:USE 4:VCL 4:VPS
    4:XXX 4:ZSK 5:AAAA 5:ABCD 5:ALL 5:BUILD 5:CHECK 5:CN 5:COM
    5:CTAN 5:DAEMON 5:DEBUG 5:ELF 5:ERROR 5:FILE 5:GPM 5:INBOX 5:KDE
    5:KEY 5:MTA 5:NG 5:OOM 5:OPENPGP 5:OUTPUT 5:PR 5:RC 5:README
    5:SC 5:SIGNING 5:SIGUSR 5:SOCKET 5:SRFI 5:SYSTEM 5:TESTS 5:TTY
    5:VCS 5:XML 5:XMPP 6:ABI 6:BASE 6:CERT 6:CRAN 6:ENVIRONMENT
    6:GID 6:IN 6:IPP 6:KSK 6:LUKS 6:OK 6:PCI 6:POP 6:RFC 6:TODO
    6:USER 6:XYZ 7:FS 7:GPG 7:GSS 7:TRANSLATORS 7:VIRTIO 7:WM 7:WPA
    8:ACCEPT 8:CHECKOUT 8:FIXME 8:GTK 8:KVM 8:PEM 8:PG 8:PL 8:SD
    8:UDP 9:ALSA 9:CC 9:HTML 9:INPUT 9:MMC 9:SMTP 9:UEFI 9:UTF
    9:WARNING 10:BIOS 10:DB 10:MATE 10:MB 10:MPD 10:NGINX 10:OC
    10:SL 10:TLP 11:ARM 11:CGI 11:SDDM 11:TTL 11:UID 12:AC 12:ACL
    12:GDB 12:IRC 12:RAID 13:US 14:DAG 14:DHCP 14:DVD 14:JSON 14:NTP
    14:UNIX 15:EFI 15:GL 15:LOCPATH 15:PROFILE 15:PROFILES 15:RPC
    16:SASL 17:PACKAGE 18:GDM 18:IMAP 18:SHA 19:CA 19:CONFIG 19:CVE
    19:EXTRA 19:GC 19:SEL 20:VERSION 23:BAT 23:NSS 24:CUPS 26:PAM
    27:CPU 27:RAM 27:SQL 27:UUID 28:OS 29:PGP 29:REPL 31:PID 31:TFTP
    32:URI 33:GNOME 33:TCP 34:USB 34:VPN 35:API 35:LDAP 36:HTTPS
    36:ID 36:QEMU 37:NFS 38:HOME 38:SERVER 40:VM 41:GCC 43:SSL
    45:DNS 46:SUBSTITUTE 48:PATH 54:GRUB 61:HTTP 70:TLS 82:IP
    90:GUIX 103:SSH 119:URL 454:GNU
--8<---------------cut here---------------end--------------->8---
The numbers are the occurrence counts. GNU is the most popular :)

HTH in some way.
-- 
Regards,
Bengt Richter


  reply	other threads:[~2020-11-02 13:57 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-31 20:03 Hunspell dictionaries paul
2020-11-01  9:41 ` Jonathan Brielmaier
2020-11-02 13:56   ` Bengt Richter [this message]
2020-11-05  1:20   ` paul

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201102135648.GA3468@LionPure \
    --to=bokr@bokr.com \
    --cc=guix-devel@gnu.org \
    --cc=jonathan.brielmaier@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).