From: "Ludovic Courtès" <ludo@gnu.org>
To: Leo Famulari <leo@famulari.name>
Cc: guix-devel@gnu.org
Subject: Re: Frequent locales problems for new users
Date: Sat, 21 Mar 2020 16:37:05 +0100
Message-ID: <87pnd51zz2.fsf@gnu.org>
In-Reply-To: <20200318183622.GA25087@jasmine.lan> (Leo Famulari's message of "Wed, 18 Mar 2020 14:36:22 -0400")

[-- Attachment #1: Type: text/plain, Size: 2242 bytes --]

Hi Leo,

Leo Famulari <leo@famulari.name> skribis:

> On Wed, Mar 18, 2020 at 04:07:22PM +0100, Ludovic Courtès wrote:
>> As for ‘glibc-utf8-locales’ vs. ‘glibc-locales’: the reason for choosing
>> the former by default over the latter is size (14 MiB vs. 917 MiB).
>
> Oof! I was going by the manual, which says 110 MiB. That does change
> things...

Yes, I was also surprised.

The patch below produces a package that includes all the UTF-8 locales
(actually, I wrote that patch long ago; it feels like we’re running
in circles :-)).

It takes ages to build, and when it’s finally done:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix build -e '((@@ (gnu packages base) make-glibc-utf8-locales/full))' 
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
substituting /gnu/store/jdfs3xvlnj272475yja6bjrprfsgnkdd-glibc-2.29...
downloading from https://ci.guix.gnu.org/nar/lzip/jdfs3xvlnj272475yja6bjrprfsgnkdd-glibc-2.29...
 glibc-2.29  8.2MiB                                                       1.8MiB/s 00:05 [##################] 100.0%

building /gnu/store/w08zi9vnkd7bxpfvm5lgjyb30i7k7sw4-glibc-supported-utf8-locales.scm.drv...
successfully built /gnu/store/w08zi9vnkd7bxpfvm5lgjyb30i7k7sw4-glibc-supported-utf8-locales.scm.drv
building /gnu/store/ps6wh05pwjp5b0l9rh2yglv3sggpgcw4-glibc-utf8-locales-2.29.drv...
successfully built /gnu/store/ps6wh05pwjp5b0l9rh2yglv3sggpgcw4-glibc-utf8-locales-2.29.drv
/gnu/store/p0knl9ggxk91x87ww702g2x78jxy1vgf-glibc-utf8-locales-2.29
$ guix size /gnu/store/p0knl9ggxk91x87ww702g2x78jxy1vgf-glibc-utf8-locales-2.29 | tail -1
total: 855.7 MiB
--8<---------------cut here---------------end--------------->8---

So I think that’s when we concluded that we needed either
parameterized packages, so that users can choose the locale(s) they
need, or special support in ‘guix package’.

:-/
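
To make “choose the locale(s) they need” concrete: building one UTF-8
locale comes down to one localedef run, so a user-chosen list means one
run per name.  Below is a dry-run sketch that only prints the commands.
‘locale_commands’ and the output directory are hypothetical, and the
flags are standard localedef ones, not necessarily exactly those the
package uses:

```shell
# Print (do not run) one localedef command per requested locale.
# Hypothetical helper, for illustration only.
locale_commands() {
  outdir=$1
  shift
  for locale in "$@"; do
    # -i: locale definition to compile, -f: charmap, last argument: output path
    printf 'localedef --no-archive -i %s -f UTF-8 %s/%s.UTF-8\n' \
      "$locale" "$outdir" "$locale"
  done
}

locale_commands ./locale ja_JP pt_BR
```

Each printed line could then be run by hand to see how much a single
locale weighs.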

Attached is the list of supported UTF-8 locales, 312 in total.
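
For reference, the filtering step in the patch keeps the entries whose
character set is UTF-8 and drops a trailing “.UTF-8” from the name.  A
rough shell equivalent is sketched here, assuming input lines in the
‘NAME/CHARSET \’ form of glibc’s localedata/SUPPORTED file rather than
the alist the patch actually reads; ‘utf8_locales’ is a made-up name:

```shell
# Read "NAME/CHARSET" entries on stdin and print NAME for each UTF-8
# entry, without a trailing ".UTF-8".  Assumed input format; the patch
# itself works on a Scheme alist instead.
utf8_locales() {
  awk -F/ '{
    charset = $2
    # Drop trailing blanks and the line-continuation backslash.
    sub(/[ \t\\]+$/, "", charset)
    if (charset == "UTF-8") {
      name = $1
      sub(/\.UTF-8$/, "", name)
      print name
    }
  }'
}

printf '%s\n' 'de_DE.UTF-8/UTF-8 \' 'de_DE/ISO-8859-1 \' 'aa_ER@saaho/UTF-8 \' |
  utf8_locales
```

On that sample it prints ‘de_DE’ and ‘aa_ER@saaho’ and skips the
ISO-8859-1 entry.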

Thoughts?  How do other distros deal with this?  Are we missing some
trick to compress locale data?
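
As a rough sanity check on the numbers in this thread, here is a sketch
that assumes the sizes quoted are dominated by locale data
(‘per_locale’ is just an illustrative helper):

```shell
# Average size per locale, in MiB.
# $1 = total size in MiB, $2 = number of locales
per_locale() {
  awk -v total="$1" -v count="$2" 'BEGIN { printf "%.2f\n", total / count }'
}

per_locale 855.7 312   # full UTF-8 set built above; prints 2.74
per_locale 14 5        # default five-locale package; prints 2.80
```

Both come out a bit under 3 MiB per locale, so the full set looks like
the default set scaled up rather than a packaging mistake.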

Ludo’.


[-- Attachment #2: Type: text/x-patch, Size: 4435 bytes --]

diff --git a/gnu/packages/base.scm b/gnu/packages/base.scm
index e8150708c0..98b413da13 100644
--- a/gnu/packages/base.scm
+++ b/gnu/packages/base.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2014, 2019 Andreas Enge <andreas@enge.fr>
 ;;; Copyright © 2012 Nikita Karetnikov <nikita@karetnikov.org>
 ;;; Copyright © 2014, 2015, 2016, 2018 Mark H Weaver <mhw@netris.org>
@@ -52,6 +52,8 @@
   #:use-module (gnu packages python)
   #:use-module (gnu packages gettext)
   #:use-module (guix utils)
+  #:use-module (guix gexp)
+  #:use-module (guix modules)
   #:use-module (guix packages)
   #:use-module (guix download)
   #:use-module (guix git-download)
@@ -61,6 +63,8 @@
   #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-26)
   #:export (glibc
+            %default-utf8-locales
+            make-glibc-utf8-locales
             libiconv-if-needed))
 
 ;;; Commentary:
@@ -1076,7 +1080,12 @@ to the @code{share/locale} sub-directory of this package.")
                                         ,(version-major+minor
                                           (package-version glibc)))))))))))
 
-(define-public (make-glibc-utf8-locales glibc)
+(define %default-utf8-locales
+  '("de_DE" "el_GR" "en_US" "fr_FR" "tr_TR"))
+
+(define* (make-glibc-utf8-locales glibc #:key
+                                  (locales %default-utf8-locales)
+                                  (locale-file #f))
   (package
     (name "glibc-utf8-locales")
     (version (package-version glibc))
@@ -1115,10 +1124,17 @@ to the @code{share/locale} sub-directory of this package.")
 
                                ;; These are the locales commonly used for
                                ;; tests---e.g., in Guile's i18n tests.
-                               '("de_DE" "el_GR" "en_US" "fr_FR" "tr_TR"))
+                               ,(if locale-file
+                                    `(call-with-input-file
+                                         (assoc-ref %build-inputs "locale-file")
+                                       read)
+                                    `',locales))
                      #t))))
     (native-inputs `(("glibc" ,glibc)
-                     ("gzip" ,gzip)))
+                     ("gzip" ,gzip)
+                     ,@(if locale-file
+                           `(("locale-file" ,locale-file))
+                           '())))
     (synopsis "Small sample of UTF-8 locales")
     (description
      "This package provides a small sample of UTF-8 locales mostly useful in
@@ -1145,6 +1161,40 @@ test environments.")
 (define-public glibc-locales-2.27
   (deprecated-package "glibc-locales-2.27" glibc-locales-2.28))
 
+(define (glibc-supported-locales libc)
+  ((module-ref (resolve-interface '(gnu system locale)) ;FIXME: hack
+               'glibc-supported-locales)
+   libc))
+
+(define* (make-glibc-utf8-locales/full #:optional (glibc glibc))
+  (define utf8-locales
+    (computed-file "glibc-supported-utf8-locales.scm"
+                   #~(begin
+                       (use-modules (srfi srfi-1)
+                                    (ice-9 match)
+                                    (ice-9 pretty-print))
+
+                       (define locales
+                         (call-with-input-file
+                             #+(glibc-supported-locales glibc)
+                           read))
+
+                       (define utf8-locales
+                         (filter-map (match-lambda
+                                       ((name . "UTF-8")
+                                        (if (string-suffix? ".UTF-8" name)
+                                            (string-drop-right name 6)
+                                            name))
+                                       (_ #f))
+                                     locales))
+
+                       (call-with-output-file #$output
+                         (lambda (port)
+                           (pretty-print utf8-locales port))))))
+
+  (make-glibc-utf8-locales glibc #:locale-file utf8-locales))
+
+\f
 (define-public which
   (package
     (name "which")

[-- Attachment #3: Type: text/plain, Size: 2962 bytes --]

("aa_DJ"
 "aa_ER"
 "aa_ER@saaho"
 "aa_ET"
 "af_ZA"
 "agr_PE"
 "ak_GH"
 "am_ET"
 "an_ES"
 "anp_IN"
 "ar_AE"
 "ar_BH"
 "ar_DZ"
 "ar_EG"
 "ar_IN"
 "ar_IQ"
 "ar_JO"
 "ar_KW"
 "ar_LB"
 "ar_LY"
 "ar_MA"
 "ar_OM"
 "ar_QA"
 "ar_SA"
 "ar_SD"
 "ar_SS"
 "ar_SY"
 "ar_TN"
 "ar_YE"
 "ayc_PE"
 "az_AZ"
 "az_IR"
 "as_IN"
 "ast_ES"
 "be_BY"
 "be_BY@latin"
 "bem_ZM"
 "ber_DZ"
 "ber_MA"
 "bg_BG"
 "bhb_IN"
 "bho_IN"
 "bho_NP"
 "bi_VU"
 "bn_BD"
 "bn_IN"
 "bo_CN"
 "bo_IN"
 "br_FR"
 "brx_IN"
 "bs_BA"
 "byn_ER"
 "ca_AD"
 "ca_ES"
 "ca_ES@valencia"
 "ca_FR"
 "ca_IT"
 "ce_RU"
 "chr_US"
 "cmn_TW"
 "crh_UA"
 "cs_CZ"
 "csb_PL"
 "cv_RU"
 "cy_GB"
 "da_DK"
 "de_AT"
 "de_BE"
 "de_CH"
 "de_DE"
 "de_IT"
 "de_LI"
 "de_LU"
 "doi_IN"
 "dsb_DE"
 "dv_MV"
 "dz_BT"
 "el_GR"
 "el_CY"
 "en_AG"
 "en_AU"
 "en_BW"
 "en_CA"
 "en_DK"
 "en_GB"
 "en_HK"
 "en_IE"
 "en_IL"
 "en_IN"
 "en_NG"
 "en_NZ"
 "en_PH"
 "en_SC"
 "en_SG"
 "en_US"
 "en_ZA"
 "en_ZM"
 "en_ZW"
 "eo"
 "es_AR"
 "es_BO"
 "es_CL"
 "es_CO"
 "es_CR"
 "es_CU"
 "es_DO"
 "es_EC"
 "es_ES"
 "es_GT"
 "es_HN"
 "es_MX"
 "es_NI"
 "es_PA"
 "es_PE"
 "es_PR"
 "es_PY"
 "es_SV"
 "es_US"
 "es_UY"
 "es_VE"
 "et_EE"
 "eu_ES"
 "fa_IR"
 "ff_SN"
 "fi_FI"
 "fil_PH"
 "fo_FO"
 "fr_BE"
 "fr_CA"
 "fr_CH"
 "fr_FR"
 "fr_LU"
 "fur_IT"
 "fy_NL"
 "fy_DE"
 "ga_IE"
 "gd_GB"
 "gez_ER"
 "gez_ER@abegede"
 "gez_ET"
 "gez_ET@abegede"
 "gl_ES"
 "gu_IN"
 "gv_GB"
 "ha_NG"
 "hak_TW"
 "he_IL"
 "hi_IN"
 "hif_FJ"
 "hne_IN"
 "hr_HR"
 "hsb_DE"
 "ht_HT"
 "hu_HU"
 "hy_AM"
 "ia_FR"
 "id_ID"
 "ig_NG"
 "ik_CA"
 "is_IS"
 "it_CH"
 "it_IT"
 "iu_CA"
 "ja_JP"
 "ka_GE"
 "kab_DZ"
 "kk_KZ"
 "kl_GL"
 "km_KH"
 "kn_IN"
 "ko_KR"
 "kok_IN"
 "ks_IN"
 "ks_IN@devanagari"
 "ku_TR"
 "kw_GB"
 "ky_KG"
 "lb_LU"
 "lg_UG"
 "li_BE"
 "li_NL"
 "lij_IT"
 "ln_CD"
 "lo_LA"
 "lt_LT"
 "lv_LV"
 "lzh_TW"
 "mag_IN"
 "mai_IN"
 "mai_NP"
 "mfe_MU"
 "mg_MG"
 "mhr_RU"
 "mi_NZ"
 "miq_NI"
 "mjw_IN"
 "mk_MK"
 "ml_IN"
 "mn_MN"
 "mni_IN"
 "mr_IN"
 "ms_MY"
 "mt_MT"
 "my_MM"
 "nan_TW"
 "nan_TW@latin"
 "nb_NO"
 "nds_DE"
 "nds_NL"
 "ne_NP"
 "nhn_MX"
 "niu_NU"
 "niu_NZ"
 "nl_AW"
 "nl_BE"
 "nl_NL"
 "nn_NO"
 "nr_ZA"
 "nso_ZA"
 "oc_FR"
 "om_ET"
 "om_KE"
 "or_IN"
 "os_RU"
 "pa_IN"
 "pa_PK"
 "pap_AW"
 "pap_CW"
 "pl_PL"
 "ps_AF"
 "pt_BR"
 "pt_PT"
 "quz_PE"
 "raj_IN"
 "ro_RO"
 "ru_RU"
 "ru_UA"
 "rw_RW"
 "sa_IN"
 "sah_RU"
 "sat_IN"
 "sc_IT"
 "sd_IN"
 "sd_IN@devanagari"
 "se_NO"
 "sgs_LT"
 "shn_MM"
 "shs_CA"
 "si_LK"
 "sid_ET"
 "sk_SK"
 "sl_SI"
 "sm_WS"
 "so_DJ"
 "so_ET"
 "so_KE"
 "so_SO"
 "sq_AL"
 "sq_MK"
 "sr_ME"
 "sr_RS"
 "sr_RS@latin"
 "ss_ZA"
 "st_ZA"
 "sv_FI"
 "sv_SE"
 "sw_KE"
 "sw_TZ"
 "szl_PL"
 "ta_IN"
 "ta_LK"
 "tcy_IN"
 "te_IN"
 "tg_TJ"
 "th_TH"
 "the_NP"
 "ti_ER"
 "ti_ET"
 "tig_ER"
 "tk_TM"
 "tl_PH"
 "tn_ZA"
 "to_TO"
 "tpi_PG"
 "tr_CY"
 "tr_TR"
 "ts_ZA"
 "tt_RU"
 "tt_RU@iqtelif"
 "ug_CN"
 "uk_UA"
 "unm_US"
 "ur_IN"
 "ur_PK"
 "uz_UZ"
 "uz_UZ@cyrillic"
 "ve_ZA"
 "vi_VN"
 "wa_BE"
 "wae_CH"
 "wal_ET"
 "wo_SN"
 "xh_ZA"
 "yi_US"
 "yo_NG"
 "yue_HK"
 "yuw_PG"
 "zh_CN"
 "zh_HK"
 "zh_SG"
 "zh_TW"
 "zu_ZA")
