* Frequent locales problems for new users
From: Leo Famulari @ 2020-03-17 20:28 UTC
To: guix-devel

Warning! Locales! New users seem to have trouble with Guix locales every
day.

I think we can improve the situation.

First, we can deprecate the glibc-utf8-locales package and not mention
it in the manual section Application Setup. I've seen users think they
had to install it in order to get UTF-8 support. Everyone should be
using glibc-locales. Eventually we can rename it to
'glibc-locales-for-tests', and hide the package too.

Second, we need to make sure that guix-install.sh is setting up
GUIX_LOCPATH correctly. I see that the binary tarball's store includes
glibc-utf8-locales, so it should be possible for things to "just work",
ignoring that it's the wrong locales package. Does anyone know any
particular issues with the installer that would cause trouble?
* Re: Frequent locales problems for new users
From: Efraim Flashner @ 2020-03-18 7:47 UTC
To: Leo Famulari; +Cc: guix-devel

On Tue, Mar 17, 2020 at 04:28:43PM -0400, Leo Famulari wrote:
> Warning! Locales! New users seem to have trouble with Guix locales every
> day.
>
> I think we can improve the situation.
>
> First, we can deprecate the glibc-utf8-locales package and not mention
> it in the manual section Application Setup. I've seen users think they
> had to install it in order to get UTF-8 support. Everyone should be
> using glibc-locales. Eventually we can rename it to
> 'glibc-locales-for-tests', and hide the package too.
>
> Second, we need to make sure that guix-install.sh is setting up
> GUIX_LOCPATH correctly. I see that the binary tarball's store includes
> glibc-utf8-locales, so it should be possible for things to "just work",
> ignoring that it's the wrong locales package. Does anyone know any
> particular issues with the installer that would cause trouble?

I haven't set up a new install or helped people with one in a while, so
bear with me. IIRC there are two places it's needed: once for the daemon,
where we already added the environment variable to the systemd unit, and
once for each user.

I think making it Just Work™ with the daemon would be really good at a
minimum.

--
Efraim Flashner <efraim@flashner.co.il> אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted
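The per-user half of this, pointing GUIX_LOCPATH at locale data in the
user's profile, could look roughly like the following in a shell startup
file. This is a minimal sketch, assuming the default per-user profile at
~/.guix-profile and that "guix install glibc-locales" has already been run;
the helper name set_guix_locpath is made up for illustration and is not part
of Guix or guix-install.sh:

```shell
# Hypothetical helper (not part of Guix): export GUIX_LOCPATH if the given
# profile contains locale data. Assumes "guix install glibc-locales" was run.
set_guix_locpath() {
    profile=${1:-$HOME/.guix-profile}    # default per-user profile (assumption)
    if [ -d "$profile/lib/locale" ]; then
        GUIX_LOCPATH=$profile/lib/locale
        export GUIX_LOCPATH
    else
        echo "no locale data under $profile/lib/locale" >&2
        return 1
    fi
}
```

Calling set_guix_locpath with no argument from ~/.profile would use the
default profile; the optional first argument (another profile directory) is
a convenience added by this sketch.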
* Re: Frequent locales problems for new users
From: Thorsten Wilms @ 2020-03-18 8:12 UTC
To: guix-devel

On Tue, 17 Mar 2020 16:28:43 -0400
Leo Famulari <leo@famulari.name> wrote:

> First, we can deprecate the glibc-utf8-locales package and not mention
> it in the manual section Application Setup. I've seen users think they
> had to install it in order to get UTF-8 support. Everyone should be
> using glibc-locales.

I seem to recall reading in the docs and/or in an example that
glibc-utf8-locales is smaller than glibc-locales, but still sufficient
for many cases. Is that wrong?

--
Thorsten Wilms <t_w_@freenet.de>
* Re: Frequent locales problems for new users
From: Tobias Geerinckx-Rice @ 2020-03-18 16:22 UTC
To: guix-devel; +Cc: Leo Famulari, Thorsten Wilms

Ludo', Thorsten, Leo,

Ludovic Courtès wrote:
> Well, we still need to be able to install locales somehow,
> right? :-)

This isn't about removing all locale packages, just the
poorly-named -utf8- variant.

Thorsten Wilms wrote:
> smaller than glibc-locales, but still sufficient for many cases.
> Is that wrong?

Yes and no. It's sufficient for many (but not most) humans, but
then that's true for ar_AE as well ;-)

Here's what it contains:

  de_DE.utf8
  el_GR.utf8
  en_US.utf8
  fr_FR.utf8
  tr_TR.utf8

Offering this as our only choice of ‘sufficient’ user locales has
some unpleasant cultural overtones to say the least.

Where it is useful, and apparently does cover the majority of use
cases, is in test suites &c. It's a good package for machines.
Hiding it would make that clear, as we already do with
tzdata-for-tests.

Ludovic Courtès wrote:
> As for ‘glibc-utf8-locales’ vs. ‘glibc-locales’: the reason for
> choosing the former by default over the latter is size (14 MiB vs.
> 917 MiB).
> Perhaps an improvement would be for ‘glibc-utf8-locales’ to be
> more true to its name: to include all the UTF-8 locales glibc
> supports rather than an arbitrary sample thereof.

That would make it well-named, so good by me!

Kind regards,

T G-R
* Re: Frequent locales problems for new users
From: Ludovic Courtès @ 2020-03-18 15:07 UTC
To: Leo Famulari; +Cc: guix-devel

Hello!

Leo Famulari <leo@famulari.name> wrote:

> Warning! Locales! New users seem to have trouble with Guix locales every
> day.
>
> I think we can improve the situation.
>
> First, we can deprecate the glibc-utf8-locales package and not mention
> it in the manual section Application Setup. I've seen users think they
> had to install it in order to get UTF-8 support. Everyone should be
> using glibc-locales. Eventually we can rename it to
> 'glibc-locales-for-tests', and hide the package too.

Well, we still need to be able to install locales somehow, right? :-)

> Second, we need to make sure that guix-install.sh is setting up
> GUIX_LOCPATH correctly. I see that the binary tarball's store includes
> glibc-utf8-locales, so it should be possible for things to "just work",
> ignoring that it's the wrong locales package. Does anyone know any
> particular issues with the installer that would cause trouble?

‘guix-command’ in (guix self) creates a ‘guix’ binary where
GUIX_LOCPATH points to ‘glibc-utf8-locales’, always. That means that
‘guix pull’ returns a ‘guix’ program that works fine, provided you use
one of the locales in ‘glibc-utf8-locales’ *or* you have installed
‘glibc-locales’ and set ‘GUIX_LOCPATH’.

The ‘guix’ binary of the ‘guix’ package does something similar.

These two should already eliminate most problems.

Now, we should investigate actual problems to see precisely why they
show up (for that we need to see the output of commands, the contents
of the .service file, and so on). That will allow us to determine the
best course of action.

As for ‘glibc-utf8-locales’ vs. ‘glibc-locales’: the reason for choosing
the former by default over the latter is size (14 MiB vs. 917 MiB).
Perhaps an improvement would be for ‘glibc-utf8-locales’ to be more true
to its name: to include all the UTF-8 locales glibc supports rather than
an arbitrary sample thereof.

Thoughts?

Ludo’.
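Including "all the UTF-8 locales glibc supports" comes down to filtering the
list of supported locales by charset. A rough sketch of that selection,
assuming a simplified input of one NAME/CHARSET entry per line (this input
format and the function name utf8_locales are assumptions made for the
example, not something Guix provides):

```shell
# Hypothetical helper: read NAME/CHARSET lines on stdin and print the names
# of the UTF-8 entries, with a trailing ".UTF-8" removed from the name.
utf8_locales() {
    awk -F/ '$2 == "UTF-8" { sub(/\.UTF-8$/, "", $1); print $1 }'
}

# Sample input: only the first and third entries use the UTF-8 charset.
printf '%s\n' 'en_US.UTF-8/UTF-8' 'en_US/ISO-8859-1' 'aa_ER@saaho/UTF-8' \
    | utf8_locales
```

Names such as aa_ER@saaho carry no ".UTF-8" suffix even though their charset
is UTF-8, which is why the suffix is stripped only when present.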
* Re: Frequent locales problems for new users
From: Leo Famulari @ 2020-03-18 18:36 UTC
To: Ludovic Courtès; +Cc: guix-devel

On Wed, Mar 18, 2020 at 04:07:22PM +0100, Ludovic Courtès wrote:
> As for ‘glibc-utf8-locales’ vs. ‘glibc-locales’: the reason for choosing
> the former by default over the latter is size (14 MiB vs. 917 MiB).

Oof! I was going by the manual, which says 110 MiB. That does change
things...
* Re: Frequent locales problems for new users
From: Ludovic Courtès @ 2020-03-21 15:37 UTC
To: Leo Famulari; +Cc: guix-devel

Hi Leo,

Leo Famulari <leo@famulari.name> wrote:

> On Wed, Mar 18, 2020 at 04:07:22PM +0100, Ludovic Courtès wrote:
>> As for ‘glibc-utf8-locales’ vs. ‘glibc-locales’: the reason for choosing
>> the former by default over the latter is size (14 MiB vs. 917 MiB).
>
> Oof! I was going by the manual, which says 110 MiB. That does change
> things...

Yes, I was also surprised.

The patch below produces a package that includes all the UTF-8 locales
(actually I had written that patch long ago, it feels like we’re running
in circles :-)).

It takes ages to build, and when it’s finally done:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix build -e '((@@ (gnu packages base) make-glibc-utf8-locales/full))'
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
substituting /gnu/store/jdfs3xvlnj272475yja6bjrprfsgnkdd-glibc-2.29...
downloading from https://ci.guix.gnu.org/nar/lzip/jdfs3xvlnj272475yja6bjrprfsgnkdd-glibc-2.29...
 glibc-2.29  8.2MiB  1.8MiB/s 00:05 [##################] 100.0%

building /gnu/store/w08zi9vnkd7bxpfvm5lgjyb30i7k7sw4-glibc-supported-utf8-locales.scm.drv...
successfully built /gnu/store/w08zi9vnkd7bxpfvm5lgjyb30i7k7sw4-glibc-supported-utf8-locales.scm.drv
building /gnu/store/ps6wh05pwjp5b0l9rh2yglv3sggpgcw4-glibc-utf8-locales-2.29.drv...
successfully built /gnu/store/ps6wh05pwjp5b0l9rh2yglv3sggpgcw4-glibc-utf8-locales-2.29.drv
/gnu/store/p0knl9ggxk91x87ww702g2x78jxy1vgf-glibc-utf8-locales-2.29
ludo@ribbon ~/src/guix$ guix size /gnu/store/p0knl9ggxk91x87ww702g2x78jxy1vgf-glibc-utf8-locales-2.29 | tail -1
total: 855.7 MiB
--8<---------------cut here---------------end--------------->8---

So I think that’s when we reached the conclusion that we needed
parameterized packages to allow users to choose the locale(s) they need
or special support in ‘guix package’.

:-/

Attached is the list of supported UTF-8 locales, 312 in total.

Thoughts? How do other distros deal with this? Are we missing some
trick to compress locale data?

Ludo’.

[-- Attachment #2: Type: text/x-patch, Size: 4435 bytes --]

diff --git a/gnu/packages/base.scm b/gnu/packages/base.scm
index e8150708c0..98b413da13 100644
--- a/gnu/packages/base.scm
+++ b/gnu/packages/base.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2014, 2019 Andreas Enge <andreas@enge.fr>
 ;;; Copyright © 2012 Nikita Karetnikov <nikita@karetnikov.org>
 ;;; Copyright © 2014, 2015, 2016, 2018 Mark H Weaver <mhw@netris.org>
@@ -52,6 +52,8 @@
   #:use-module (gnu packages python)
   #:use-module (gnu packages gettext)
   #:use-module (guix utils)
+  #:use-module (guix gexp)
+  #:use-module (guix modules)
   #:use-module (guix packages)
   #:use-module (guix download)
   #:use-module (guix git-download)
@@ -61,6 +63,8 @@
   #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-26)
   #:export (glibc
+            %default-utf8-locales
+            make-glibc-utf8-locales
             libiconv-if-needed))
 
 ;;; Commentary:
@@ -1076,7 +1080,12 @@ to the @code{share/locale} sub-directory of this package.")
                                         ,(version-major+minor
                                           (package-version glibc)))))))))))
 
-(define-public (make-glibc-utf8-locales glibc)
+(define %default-utf8-locales
+  '("de_DE" "el_GR" "en_US" "fr_FR" "tr_TR"))
+
+(define* (make-glibc-utf8-locales glibc #:optional
+                                  (locales %default-utf8-locales)
+                                  (locale-file #f))
   (package
     (name "glibc-utf8-locales")
     (version (package-version glibc))
@@ -1115,10 +1124,17 @@ to the @code{share/locale} sub-directory of this package.")
 
                                ;; These are the locales commonly used for
                                ;; tests---e.g., in Guile's i18n tests.
-                               '("de_DE" "el_GR" "en_US" "fr_FR" "tr_TR"))
+                               ,(if locale-file
+                                    `(call-with-input-file
+                                         (assoc-ref %build-inputs "locale-file")
+                                       read)
+                                    `',locales))
                      #t))))
     (native-inputs `(("glibc" ,glibc)
-                     ("gzip" ,gzip)))
+                     ("gzip" ,gzip)
+                     ,@(if locale-file
+                           `(("locale-file" ,locale-file))
+                           '())))
     (synopsis "Small sample of UTF-8 locales")
     (description
      "This package provides a small sample of UTF-8 locales mostly useful in
@@ -1145,6 +1161,40 @@ test environments.")
 (define-public glibc-locales-2.27
   (deprecated-package "glibc-locales-2.27" glibc-locales-2.28))
 
+(define (glibc-supported-locales libc)
+  ((module-ref (resolve-interface '(gnu system locale)) ;FIXME: hack
+               'glibc-supported-locales)
+   libc))
+
+(define* (make-glibc-utf8-locales/full #:optional (glibc glibc))
+  (define utf8-locales
+    (computed-file "glibc-supported-utf8-locales.scm"
+                   #~(begin
+                       (use-modules (srfi srfi-1)
+                                    (ice-9 match)
+                                    (ice-9 pretty-print))
+
+                       (define locales
+                         (call-with-input-file
+                             #+(glibc-supported-locales glibc)
+                           read))
+
+                       (define utf8-locales
+                         (filter-map (match-lambda
+                                       ((name . "UTF-8")
+                                        (if (string-suffix? ".UTF-8" name)
+                                            (string-drop-right name 6)
+                                            name))
+                                       (_ #f))
+                                     locales))
+
+                       (call-with-output-file #$output
+                         (lambda (port)
+                           (pretty-print utf8-locales port))))))
+
+  (make-glibc-utf8-locales glibc #:locale-file utf8-locales))
+
+\f
 (define-public which
   (package
     (name "which")

[-- Attachment #3: Type: text/plain, Size: 2962 bytes --]

("aa_DJ" "aa_ER" "aa_ER@saaho" "aa_ET" "af_ZA" "agr_PE" "ak_GH" "am_ET"
 "an_ES" "anp_IN" "ar_AE" "ar_BH" "ar_DZ" "ar_EG" "ar_IN" "ar_IQ"
 "ar_JO" "ar_KW" "ar_LB" "ar_LY" "ar_MA" "ar_OM" "ar_QA" "ar_SA"
 "ar_SD" "ar_SS" "ar_SY" "ar_TN" "ar_YE" "ayc_PE" "az_AZ" "az_IR"
 "as_IN" "ast_ES" "be_BY" "be_BY@latin" "bem_ZM" "ber_DZ" "ber_MA" "bg_BG"
 "bhb_IN" "bho_IN" "bho_NP" "bi_VU" "bn_BD" "bn_IN" "bo_CN" "bo_IN"
 "br_FR" "brx_IN" "bs_BA" "byn_ER" "ca_AD" "ca_ES" "ca_ES@valencia" "ca_FR"
 "ca_IT" "ce_RU" "chr_US" "cmn_TW" "crh_UA" "cs_CZ" "csb_PL" "cv_RU"
 "cy_GB" "da_DK" "de_AT" "de_BE" "de_CH" "de_DE" "de_IT" "de_LI"
 "de_LU" "doi_IN" "dsb_DE" "dv_MV" "dz_BT" "el_GR" "el_CY" "en_AG"
 "en_AU" "en_BW" "en_CA" "en_DK" "en_GB" "en_HK" "en_IE" "en_IL"
 "en_IN" "en_NG" "en_NZ" "en_PH" "en_SC" "en_SG" "en_US" "en_ZA"
 "en_ZM" "en_ZW" "eo" "es_AR" "es_BO" "es_CL" "es_CO" "es_CR"
 "es_CU" "es_DO" "es_EC" "es_ES" "es_GT" "es_HN" "es_MX" "es_NI"
 "es_PA" "es_PE" "es_PR" "es_PY" "es_SV" "es_US" "es_UY" "es_VE"
 "et_EE" "eu_ES" "fa_IR" "ff_SN" "fi_FI" "fil_PH" "fo_FO" "fr_BE"
 "fr_CA" "fr_CH" "fr_FR" "fr_LU" "fur_IT" "fy_NL" "fy_DE" "ga_IE"
 "gd_GB" "gez_ER" "gez_ER@abegede" "gez_ET" "gez_ET@abegede" "gl_ES" "gu_IN" "gv_GB"
 "ha_NG" "hak_TW" "he_IL" "hi_IN" "hif_FJ" "hne_IN" "hr_HR" "hsb_DE"
 "ht_HT" "hu_HU" "hy_AM" "ia_FR" "id_ID" "ig_NG" "ik_CA" "is_IS"
 "it_CH" "it_IT" "iu_CA" "ja_JP" "ka_GE" "kab_DZ" "kk_KZ" "kl_GL"
 "km_KH" "kn_IN" "ko_KR" "kok_IN" "ks_IN" "ks_IN@devanagari" "ku_TR" "kw_GB"
 "ky_KG" "lb_LU" "lg_UG" "li_BE" "li_NL" "lij_IT" "ln_CD" "lo_LA"
 "lt_LT" "lv_LV" "lzh_TW" "mag_IN" "mai_IN" "mai_NP" "mfe_MU" "mg_MG"
 "mhr_RU" "mi_NZ" "miq_NI" "mjw_IN" "mk_MK" "ml_IN" "mn_MN" "mni_IN"
 "mr_IN" "ms_MY" "mt_MT" "my_MM" "nan_TW" "nan_TW@latin" "nb_NO" "nds_DE"
 "nds_NL" "ne_NP" "nhn_MX" "niu_NU" "niu_NZ" "nl_AW" "nl_BE" "nl_NL"
 "nn_NO" "nr_ZA" "nso_ZA" "oc_FR" "om_ET" "om_KE" "or_IN" "os_RU"
 "pa_IN" "pa_PK" "pap_AW" "pap_CW" "pl_PL" "ps_AF" "pt_BR" "pt_PT"
 "quz_PE" "raj_IN" "ro_RO" "ru_RU" "ru_UA" "rw_RW" "sa_IN" "sah_RU"
 "sat_IN" "sc_IT" "sd_IN" "sd_IN@devanagari" "se_NO" "sgs_LT" "shn_MM" "shs_CA"
 "si_LK" "sid_ET" "sk_SK" "sl_SI" "sm_WS" "so_DJ" "so_ET" "so_KE"
 "so_SO" "sq_AL" "sq_MK" "sr_ME" "sr_RS" "sr_RS@latin" "ss_ZA" "st_ZA"
 "sv_FI" "sv_SE" "sw_KE" "sw_TZ" "szl_PL" "ta_IN" "ta_LK" "tcy_IN"
 "te_IN" "tg_TJ" "th_TH" "the_NP" "ti_ER" "ti_ET" "tig_ER" "tk_TM"
 "tl_PH" "tn_ZA" "to_TO" "tpi_PG" "tr_CY" "tr_TR" "ts_ZA" "tt_RU"
 "tt_RU@iqtelif" "ug_CN" "uk_UA" "unm_US" "ur_IN" "ur_PK" "uz_UZ" "uz_UZ@cyrillic"
 "ve_ZA" "vi_VN" "wa_BE" "wae_CH" "wal_ET" "wo_SN" "xh_ZA" "yi_US"
 "yo_NG" "yue_HK" "yuw_PG" "zh_CN" "zh_HK" "zh_SG" "zh_TW" "zu_ZA")
* Re: Frequent locales problems for new users
From: Gábor Boskovits @ 2020-03-21 18:02 UTC
To: Ludovic Courtès; +Cc: Guix-devel

Hello,

Ludovic Courtès <ludo@gnu.org> wrote (on Sat, 21 Mar 2020, 16:37):

> Hi Leo,
>
> Leo Famulari <leo@famulari.name> wrote:
>
> > On Wed, Mar 18, 2020 at 04:07:22PM +0100, Ludovic Courtès wrote:
> >> As for ‘glibc-utf8-locales’ vs. ‘glibc-locales’: the reason for choosing
> >> the former by default over the latter is size (14 MiB vs. 917 MiB).
> >
> > Oof! I was going by the manual, which says 110 MiB. That does change
> > things...
>
> Yes, I was also surprised.
>
> The patch below produces a package that includes all the UTF-8 locales
> (actually I had written that patch long ago, it feels like we’re running
> in circles :-)).
>
> It takes ages to build, and when it’s finally done:
>
> --8<---------------cut here---------------start------------->8---
> $ ./pre-inst-env guix build -e '((@@ (gnu packages base) make-glibc-utf8-locales/full))'
> substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
> substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
> substituting /gnu/store/jdfs3xvlnj272475yja6bjrprfsgnkdd-glibc-2.29...
> downloading from https://ci.guix.gnu.org/nar/lzip/jdfs3xvlnj272475yja6bjrprfsgnkdd-glibc-2.29...
>  glibc-2.29  8.2MiB  1.8MiB/s 00:05 [##################] 100.0%
>
> building /gnu/store/w08zi9vnkd7bxpfvm5lgjyb30i7k7sw4-glibc-supported-utf8-locales.scm.drv...
> successfully built /gnu/store/w08zi9vnkd7bxpfvm5lgjyb30i7k7sw4-glibc-supported-utf8-locales.scm.drv
> building /gnu/store/ps6wh05pwjp5b0l9rh2yglv3sggpgcw4-glibc-utf8-locales-2.29.drv...
> successfully built /gnu/store/ps6wh05pwjp5b0l9rh2yglv3sggpgcw4-glibc-utf8-locales-2.29.drv
> /gnu/store/p0knl9ggxk91x87ww702g2x78jxy1vgf-glibc-utf8-locales-2.29
> ludo@ribbon ~/src/guix$ guix size /gnu/store/p0knl9ggxk91x87ww702g2x78jxy1vgf-glibc-utf8-locales-2.29 | tail -1
> total: 855.7 MiB
> --8<---------------cut here---------------end--------------->8---
>
> So I think that’s when we reached the conclusion that we needed
> parameterized packages to allow users to choose the locale(s) they need
> or special support in ‘guix package’.

I believe we could also add individual locales as outputs. Then we just
have to make sure that they are included in the LOCPATH. I believe we
could do this for the frequently used locales, and direct users to
install ‘out’ only when they don't find an output with their locale.
Wdyt?

> :-/
>
> Attached is the list of supported UTF-8 locales, 312 in total.
>
> Thoughts? How do other distros deal with this? Are we missing some
> trick to compress locale data?
>
> Ludo’.

g_bor
* Re: Frequent locales problems for new users
From: Leo Famulari @ 2020-03-21 19:43 UTC
To: Ludovic Courtès; +Cc: guix-devel

On Sat, Mar 21, 2020 at 04:37:05PM +0100, Ludovic Courtès wrote:
> Thoughts? How do other distros deal with this? Are we missing some
> trick to compress locale data?

I noticed that the glibc-locales download is 10.8 MiB. On disk, the
store item is ~220 MiB. I'm not sure how guix size calculates 917 MiB.

Debian Buster's (stable) locales directory contains 357 entries, ours
contains 645. Debian's locales directory is ~13 MiB, ours ~220 MiB.

I poked around a bit. Debian achieves the smaller size by referring to
data rather than copying it around, and only including delta changes.
Hopefully we can copy their technique.

Our locales are collections of binary files in a directory, like this...

------
en_US
├── LC_ADDRESS
├── LC_COLLATE
├── LC_CTYPE
├── LC_IDENTIFICATION
├── LC_MEASUREMENT
├── LC_MESSAGES
│   └── SYS_LC_MESSAGES
├── LC_MONETARY
├── LC_NAME
├── LC_NUMERIC
├── LC_PAPER
├── LC_TELEPHONE
└── LC_TIME
------

... while Debian concatenates the files together as text. I compared the
en_US directory from both places, and it seems that LC_CTYPE is re-used
from en_GB with delta patching:

------
$ /gnu/store/03nvilh2x4z07dxv7h13gh986vvgpnsf-glibc-locales-2.29/lib/locale/2.29/en_US% du -sh *
4.0K    LC_ADDRESS
24K     LC_COLLATE
284K    LC_CTYPE <--- the big one
4.0K    LC_IDENTIFICATION
4.0K    LC_MEASUREMENT
8.0K    LC_MESSAGES
4.0K    LC_MONETARY
4.0K    LC_NAME
4.0K    LC_NUMERIC
4.0K    LC_PAPER
4.0K    LC_TELEPHONE
4.0K    LC_TIME
$ cat /usr/share/i18n/locales/en_US
[...]
LC_CTYPE
copy "en_GB"
END LC_CTYPE
[...]
$ cat /usr/share/i18n/locales/en_GB
[...]
LC_CTYPE
copy "i18n"

translit_start
include "translit_combining";""
translit_end
END LC_CTYPE
[...]
$ du -sh /usr/share/i18n/locales/i18n_ctype
160K    /usr/share/i18n/locales/i18n_ctype
------

Another example, more obscure:

------
$ /gnu/store/03nvilh2x4z07dxv7h13gh986vvgpnsf-glibc-locales-2.29/lib/locale/2.29/te_IN% du -sh *
4.0K    LC_ADDRESS
2.5M    LC_COLLATE <--- Yikes
332K    LC_CTYPE <--- Still big
4.0K    LC_IDENTIFICATION
4.0K    LC_MEASUREMENT
8.0K    LC_MESSAGES
4.0K    LC_MONETARY
4.0K    LC_NAME
4.0K    LC_NUMERIC
4.0K    LC_PAPER
4.0K    LC_TELEPHONE
8.0K    LC_TIME
$ cat /usr/share/i18n/locales/te_IN
[...]
LC_CTYPE
copy "i18n"

% Telugu uses the alternate digits U+0C66..U+0C6F
outdigit <U0C66>..<U0C6F>

% This is used in the scanf family of functions to read Telugu numbers
% using "%Id" and such.
map to_inpunct; /
  (<U0030>,<U0C66>); /
  (<U0031>,<U0C67>); /
  (<U0032>,<U0C68>); /
  (<U0033>,<U0C69>); /
  (<U0034>,<U0C6A>); /
  (<U0035>,<U0C6B>); /
  (<U0036>,<U0C6C>); /
  (<U0037>,<U0C6D>); /
  (<U0038>,<U0C6E>); /
  (<U0039>,<U0C6F>);

translit_start
include "translit_combining";""
translit_end
END LC_CTYPE

LC_COLLATE
% Copy the template from ISO/IEC 14651
copy "iso14651_t1"
END LC_COLLATE
------
* Re: Frequent locales problems for new users
From: Leo Famulari @ 2020-03-21 20:14 UTC
To: Ludovic Courtès; +Cc: guix-devel

On Sat, Mar 21, 2020 at 03:43:32PM -0400, Leo Famulari wrote:
> I poked around a bit. Debian achieves the smaller size by referring to
> data rather than copying it around, and only including delta changes.
> Hopefully we can copy their technique.

We are discussing it on #guix. I think that Debian just packages the
sources and then builds only what is requested by users, so my
comparison may be moot.
* Re: Frequent locales problems for new users
From: Ludovic Courtès @ 2020-03-26 12:06 UTC
To: Leo Famulari; +Cc: guix-devel

Hi,

Leo Famulari <leo@famulari.name> wrote:

> On Sat, Mar 21, 2020 at 04:37:05PM +0100, Ludovic Courtès wrote:
>> Thoughts? How do other distros deal with this? Are we missing some
>> trick to compress locale data?
>
> I noticed that downloading glibc-locales, it's 10.8 MiB. On disk, the
> store item is ~220 MiB. I'm not sure how guix size calculates 917 MiB.

Oh, this is due to hard links: nars don’t support hard links, so the
same thing is repeated several times.

--8<---------------cut here---------------start------------->8---
$ guix archive --export glibc-locales |wc -c
961328272
$ du -hsl $(guix build glibc-locales)
939M    /gnu/store/03nvilh2x4z07dxv7h13gh986vvgpnsf-glibc-locales-2.29
$ du -hs $(guix build glibc-locales)
220M    /gnu/store/03nvilh2x4z07dxv7h13gh986vvgpnsf-glibc-locales-2.29
--8<---------------cut here---------------end--------------->8---

(It does mean that we should replace hard links with symlinks, like we
do for ‘git’.)

Doing that with the full set of UTF-8 locales I mentioned in my previous
message, I see:

--8<---------------cut here---------------start------------->8---
$ du -hsl /gnu/store/p0knl9ggxk91x87ww702g2x78jxy1vgf-glibc-utf8-locales-2.29
870M    /gnu/store/p0knl9ggxk91x87ww702g2x78jxy1vgf-glibc-utf8-locales-2.29
$ du -hs /gnu/store/p0knl9ggxk91x87ww702g2x78jxy1vgf-glibc-utf8-locales-2.29
193M    /gnu/store/p0knl9ggxk91x87ww702g2x78jxy1vgf-glibc-utf8-locales-2.29
--8<---------------cut here---------------end--------------->8---

To compare to:

--8<---------------cut here---------------start------------->8---
$ du -hs $(guix build glibc-utf8-locales)
6.1M    /gnu/store/n79cf8bvy3k96gjk1rf18d36w40lkwlr-glibc-utf8-locales-2.29
$ du -hsl $(guix build glibc-utf8-locales)
15M     /gnu/store/n79cf8bvy3k96gjk1rf18d36w40lkwlr-glibc-utf8-locales-2.29
--8<---------------cut here---------------end--------------->8---

Thanks,
Ludo’.
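The gap between the "du -hs" and "du -hsl" figures can be reproduced without
Guix. A minimal sketch, assuming a POSIX shell and a scratch directory; the
file and directory names are made up, and this only illustrates the hard-link
effect, not how "guix size" or "guix archive" are implemented:

```shell
# Illustration only: three paths sharing one inode, as with locale files
# that are hard-linked between locale directories.
demo=$(mktemp -d)
printf 'locale data\n' > "$demo/LC_CTYPE"          # 12 bytes, stored once
mkdir "$demo/en_US.utf8" "$demo/en_GB.utf8"
ln "$demo/LC_CTYPE" "$demo/en_US.utf8/LC_CTYPE"    # hard link, not a copy
ln "$demo/LC_CTYPE" "$demo/en_GB.utf8/LC_CTYPE"

# Number of file paths versus number of distinct inodes behind them.
paths=$(find "$demo" -type f | awk 'END { print NR }')
inodes=$(find "$demo" -type f -exec ls -i {} + \
             | awk '{ print $1 }' | sort -u | awk 'END { print NR }')

# A format without hard-link support has to write every path's content.
serialized=$(find "$demo" -type f -exec cat {} + | wc -c | tr -d ' ')

echo "paths=$paths inodes=$inodes serialized_bytes=$serialized"
rm -rf "$demo"
```

The disk holds the 12 bytes once, while anything that walks the tree path by
path sees them three times; that is the same effect as the 220M versus 939M
figures above, on a much smaller scale.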
* Re: Frequent locales problems for new users
From: Vagrant Cascadian @ 2020-07-01 18:02 UTC
To: Leo Famulari, guix-devel

On 2020-03-17, Leo Famulari wrote:
> Warning! Locales! New users seem to have trouble with Guix locales every
> day.
>
> I think we can improve the situation.
>
> First, we can deprecate the glibc-utf8-locales package and not mention
> it in the manual section Application Setup. I've seen users think they
> had to install it in order to get UTF-8 support. Everyone should be
> using glibc-locales. Eventually we can rename it to
> 'glibc-locales-for-tests', and hide the package too.
>
> Second, we need to make sure that guix-install.sh is setting up
> GUIX_LOCPATH correctly. I see that the binary tarball's store includes
> glibc-utf8-locales, so it should be possible for things to "just work",
> ignoring that it's the wrong locales package. Does anyone know any
> particular issues with the installer that would cause trouble?

I neglected to chime in way back when, but on IRC the other day issues
around locales came up and I wondered ...

Any compelling reason not to put each locale into its own package
and/or output?

You could have meta-packages which pull in specific sets, such as
"glibc-locales-es", which pulls in all Spanish locales, or
"glibc-locales-all" or "glibc-locales-all-utf8", which pull in
everything. Or some other semi-logical splitting.

That way users could install exactly the locales they want. The locales
could be selected in the installer, which would then install only the
specific locales, or sets of locales, that users want.

live well,
  vagrant
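The per-language sets suggested here (for example "glibc-locales-es")
amount to selecting locale names by their language prefix. A small sketch
of that selection; the function name locales_for_language and the sample
list are made up for illustration and say nothing about how such
meta-packages would actually be defined in Guix:

```shell
# Hypothetical helper: print the locale names (remaining arguments) whose
# language part equals the first argument, e.g. "es" selects es_AR, es_ES.
locales_for_language() {
    lang=$1
    shift
    for name in "$@"; do
        case $name in
            "$lang"|"$lang"_*) printf '%s\n' "$name" ;;
        esac
    done
}

locales_for_language es en_US es_AR es_ES eo
```

Matching on the prefix up to the underscore keeps "sa" (sa_IN) from also
selecting "sah_RU" or "sat_IN"; a locale without a territory part, such as
"eo", is matched by its bare name.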