From: "pelzflorian (Florian Pelz)" <pelzflorian@pelzflorian.de>
To: sirgazil <sirgazil@zoho.com>
Cc: Guile User <guile-user@gnu.org>, guile-devel <guile-devel@gnu.org>
Subject: Re: Website translations with Haunt
Date: Sat, 16 Dec 2017 20:30:41 +0100
Message-ID: <20171216193041.s6hxyf7u2lb5ihso@floriannotebook>
In-Reply-To: <7e3a3109-25c5-ed60-9a7b-ee1a656de87e@zoho.com>
On Sat, Dec 16, 2017 at 10:26:12AM -0500, sirgazil wrote:
> I'm very interested in this subject because I help with Guile and Guix
> websites, and I usually work with multilingual websites. I have no idea of
> what would be the right way to do i18n of websites written in Scheme,
> though. So I will just join this conversation as a potential user of your
> solutions :)
>
:)
>
> > I did not want to use the ordinary gettext functions in order to not
> > call setlocale very often to switch languages. It seems the Gettext
> > system is not designed for rapidly changing locales, but maybe I am
> > wrong about this and very many setlocale calls would not be that bad.
>
>
> For what it's worth, I use ordinary gettext and `setlocale` in my website,
> which is not Haunt-based, but it is Guile Scheme and statically generated
> too. So far, it works ok.
>
Performance is what motivated me to avoid repeated setlocale calls.
I have now measured the impact of my approach. For my website,
repeatedly calling setlocale and gettext is actually slightly faster
than transforming a po file into an association list and assoc-ref’ing
the list. Transforming the po file only becomes faster when the same
msgid is used very many times.
So it is probably best *not* to add ffi-helper to Haunt after all and
to just use Gettext: while repeated setlocale is bad in theory, it
is faster in practice for normal websites, and the difference does not
really matter much anyway. Then again, for long-running applications,
not using setlocale is better.
If you want detailed timings, read on, otherwise feel free to skip to
the end of this e-mail.
For my German and English website with the code at
https://pelzflorian.de/git/pelzfloriande-website/ this is the result
of timing my current approach, which avoids repeated setlocale and
standard gettext calls but instead uses libgettextpo to create an
association list of msgids and msgstrs from the respective po files
(i.e. not from compiled mo files).
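The lookup side of this approach can be sketched as follows. This is only an illustration, not the code from my repository: the catalog contents and the name lookup-msg are made up, and in the real setup the pairs come from the po file via libgettextpo.

```scheme
;; Hypothetical catalog: msgid/msgstr pairs as they might look after
;; reading a German po file (the entries here are invented).
(define catalog-de
  '(("Home page" . "Startseite")
    ("About" . "Über mich")))

;; Hypothetical helper: look up MSGID in CATALOG.  assoc-ref returns
;; #f for a missing key, so fall back to the untranslated msgid.
(define (lookup-msg catalog msgid)
  (or (assoc-ref catalog msgid) msgid))

(lookup-msg catalog-de "Home page")
(lookup-msg catalog-de "Not in the catalog")
```

Each lookup walks the list from the front, so the cost grows with the number of entries in the po file.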
I put
#!/bin/sh
GUILE_LOAD_PATH=$GUILE_LOAD_PATH:$HOME/keep/projects/pelzfloriande-website:$HOME/build/nyacc/src/nyacc/examples LD_LIBRARY_PATH=/gnu/store/0jjgg2bk6qmx87sdksm7bd2b3z10yd6j-gettext-0.19.8.1/lib haunt build
inside a file called launch.sh. I then ran
$ time ./launch.sh
[…]
./launch.sh 2.43s user 0.33s system 83% cpu 3.317 total
./launch.sh 2.47s user 0.33s system 103% cpu 2.703 total
./launch.sh 2.43s user 0.36s system 103% cpu 2.700 total
./launch.sh 2.56s user 0.33s system 103% cpu 2.783 total
When I instead do not load the gettext-po, ffi-help-rt and Guile’s system
foreign modules, but run msgfmt to compile the po files to mo
files, move them to ./de/LC_MESSAGES/pelzfloriande.mo and just use
standard Gettext and setlocale with
(bindtextdomain "pelzfloriande" "/home/florian/keep/projects/pelzfloriande-website")
(bind-textdomain-codeset "pelzfloriande" "UTF-8")
(textdomain "pelzfloriande")
(define (locale-for-lingua lingua)
  (assoc-ref
   '(("de" . "de_DE.UTF-8")
     ("en" . "en_US.UTF-8"))
   lingua))
(define (translated-msg msgid lingua)
  (begin
    (setlocale LC_ALL (locale-for-lingua lingua))
    (gettext msgid)))
I got the following measurements and verified that the translation is
still working:
$ time ./launch.sh
building pages in 'site'...
copying asset 'css/common.css' → '/css/common.css'
[…]
./launch.sh 2.01s user 0.29s system 102% cpu 2.241 total
For multiple runs:
./launch.sh 2.01s user 0.29s system 102% cpu 2.241 total
./launch.sh 2.06s user 0.31s system 102% cpu 2.302 total
./launch.sh 2.15s user 0.33s system 104% cpu 2.387 total
./launch.sh 1.99s user 0.32s system 102% cpu 2.246 total
When calling setlocale only if the lingua has changed since the
last call to _:
(define old-lingua "")
(define (translated-msg msgid lingua)
  (begin
    (if (not (equal? old-lingua lingua))
        (begin
          (setlocale LC_ALL (locale-for-lingua lingua))
          (set! old-lingua lingua)))
    (gettext msgid)))
./launch.sh 2.10s user 0.32s system 103% cpu 2.332 total
./launch.sh 2.03s user 0.31s system 102% cpu 2.283 total
./launch.sh 2.11s user 0.36s system 102% cpu 2.408 total
./launch.sh 2.05s user 0.30s system 102% cpu 2.296 total
When adding the following in a div:
,@(let loop ((i 0))
    (if (< i 10000)
        (cons
         (_ "Home page")
         (loop (1+ i)))
        '()))
and verifying it is correctly translated,
this is the result for my implementation:
./launch.sh 4.48s user 0.33s system 89% cpu 5.356 total
./launch.sh 4.52s user 0.36s system 102% cpu 4.737 total
./launch.sh 4.44s user 0.38s system 104% cpu 4.619 total
./launch.sh 4.49s user 0.40s system 103% cpu 4.735 total
With a setlocale call for each _:
./launch.sh 4.46s user 0.36s system 101% cpu 4.736 total
./launch.sh 4.64s user 0.39s system 103% cpu 4.875 total
./launch.sh 4.65s user 0.33s system 103% cpu 4.838 total
./launch.sh 4.66s user 0.33s system 103% cpu 4.833 total
This is the result for a cached setlocale call:
./launch.sh 4.39s user 0.37s system 102% cpu 4.624 total
./launch.sh 4.17s user 0.32s system 88% cpu 5.086 total
./launch.sh 4.09s user 0.32s system 102% cpu 4.276 total
./launch.sh 4.16s user 0.35s system 103% cpu 4.345 total
When adding the following in the div instead
,@(let loop ((i 0))
    (if (< i 10000)
        (cons
         (let ((current-lingua "de"))
           (_ "Home page"))
         (cons
          (let ((current-lingua "en"))
            (_ "Home page"))
          (loop (1+ i))))
        '()))
this is the result for my current implementation:
./launch.sh 6.36s user 0.36s system 99% cpu 6.733 total
./launch.sh 6.34s user 0.34s system 103% cpu 6.470 total
./launch.sh 6.00s user 0.39s system 103% cpu 6.195 total
this is the result without caching setlocale:
./launch.sh 8.74s user 0.38s system 101% cpu 8.986 total
./launch.sh 8.70s user 0.36s system 102% cpu 8.872 total
./launch.sh 8.86s user 0.40s system 99% cpu 9.300 total
this is the result with caching setlocale:
./launch.sh 8.95s user 0.37s system 93% cpu 9.979 total
./launch.sh 8.60s user 0.39s system 103% cpu 8.712 total
./launch.sh 8.81s user 0.34s system 95% cpu 9.581 total
In this contrived example, my implementation is faster. Note that my
implementation may or may not be slower when not using the same
translation very often but instead using a longer PO file.
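One reason a longer PO file could hurt is that assoc-ref scans the association list linearly. A possible mitigation, which I have not measured, would be to copy the list into a hash table once. A rough sketch, with made-up names and catalog entries:

```scheme
;; Hypothetical helper: copy an alist of (msgid . msgstr) pairs into
;; a hash table.  hash-set! and hash-ref compare keys with equal?,
;; so string msgids work as keys.
(define (alist->catalog-table alist)
  (let ((table (make-hash-table)))
    (for-each (lambda (entry)
                (hash-set! table (car entry) (cdr entry)))
              alist)
    table))

;; Hypothetical helper: look up MSGID, returning MSGID itself when
;; there is no translation (third argument of hash-ref is the default).
(define (lookup-msg/table table msgid)
  (hash-ref table msgid msgid))

;; Invented example entry.
(define table-de
  (alist->catalog-table '(("Home page" . "Startseite"))))

(lookup-msg/table table-de "Home page")
```

Whether this is actually faster than Gettext for a realistic catalog would need its own timing run.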
> For internationalization, I know the convention is to use _, but I don't
> like that, so I use the alias l10n instead.
>
We should definitely let the user define the syntax like in the Guile
manual. If you want l10n, then use l10n, which is less confusing when
using _ for pattern matching. But I will stick to _ for my website.
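With standard Gettext, choosing the name is just a matter of binding an alias to Guile’s gettext procedure. A minimal sketch, assuming the text domain has already been set up with bindtextdomain and textdomain:

```scheme
;; Bind whichever alias you prefer to the standard gettext procedure.
;; Using l10n instead of _ leaves _ free as the wildcard in pattern
;; matching.
(define l10n gettext)

;; Returns the translation if the current catalog has one, otherwise
;; the msgid itself.
(l10n "Home page")
```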
> For internationalizing complex blocks that should not be translated in
> fragments, like:
>
>
> `(p "Hi! I play "
>     (a (@ (href ,sport-url)) ,(l10n "futsal"))
>     " in "
>     (a (@ (href ,place-url)) ,(l10n "Tokyo")))
>
>
> I had to write a procedure I call `interleave` that I use like this:
>
>
> `(p
>   ,@(interleave (l10n "Hi! I play ~SPORT~ in ~PLACE~.")
>                 `(a (@ (href ,sport-url)) ,(l10n "futsal"))
>                 `(a (@ (href ,place-url)) ,(l10n "Tokyo"))))
>
>
> So, in the translation catalogs, translators will see the strings:
>
>
> "Hi! I play ~SPORT~ in ~PLACE~."
> "futsal"
> "Tokyo"
>
>
This interleaving works like a format string and is common in
applications, but it separates the value of ~SPORT~ from the context
in which it should be translated. I prefer my approach of
multi-part translations:
,@(__ "This is a ||em_|multi-part translation||."
      `(("em_" .
         ,(lambda (text)
            `(em ,text)))))
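To illustrate the idea, here is a rough sketch of how such a marked-up string could be split into SXML parts. It is not the actual implementation of __ on my website: the name expand-multi-part is hypothetical, it uses a regular expression from (ice-9 regex), and it assumes the string has already been translated (e.g. by gettext) before it is split.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical helper: split STR at ||tag|text|| markers and replace
;; each marker with the result of calling the handler registered for
;; its tag in the alist HANDLERS.  Unknown tags keep their plain text.
(define (expand-multi-part str handlers)
  (let loop ((start 0) (acc '()))
    (let ((m (string-match "\\|\\|([^|]+)\\|([^|]*)\\|\\|" str start)))
      (if m
          (let ((handler (or (assoc-ref handlers (match:substring m 1))
                             (lambda (text) text))))
            (loop (match:end m)
                  (cons (handler (match:substring m 2))
                        (cons (substring str start (match:start m))
                              acc))))
          (reverse (cons (substring str start) acc))))))

(expand-multi-part "This is a ||em_|multi-part translation||."
                   `(("em_" . ,(lambda (text) `(em ,text)))))
;; → ("This is a " (em "multi-part translation") ".")
```

Because the translator sees the whole sentence including the marked part, the emphasized words are translated in context, and the resulting list can be spliced into a template with ,@.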
> Currently, I use xgettext manually and Poedit for working with translation
> catalogs, but I'd like to manage translations in the future like this
> (replace `site` with `haunt`):
>
>
> # Create new translation catalogs for Finnish and Japanese.
> $ site catalog-new fi ja
>
> # Update translation catalogs with new translation strings.
> $ site catalog-update
>
> # Compile translation catalogs (generate .mo files)
> $ site catalog-compile
>
>
Yes. This is a good user interface. Maybe this should be part of the
haunt command and not require a build system after all…
> To be fully localized, I also have to pass IETF Language Tags around in the
> website code, so that I get the right content when rendering the templates
> in a given language.
>
> My 2¢
>
Yes, me too. I wonder if this should be wrapped into custom syntax
maybe like the Guix store in G-expressions, but I’m not sure.
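Short of custom syntax, one way to avoid passing the language tag through every procedure might be a Guile parameter. This is only a sketch of the idea, not what my website does; the catalogs alist, its entries and this l10n procedure are made up for illustration.

```scheme
;; Hypothetical: the language currently being rendered.
(define current-lingua (make-parameter "en"))

;; Hypothetical per-language catalogs with invented entries.
(define catalogs
  '(("de" . (("Home page" . "Startseite")))
    ("en" . ())))

;; Look up MSGID in the catalog of the current lingua, falling back
;; to MSGID itself when there is no catalog or no translation.
(define (l10n msgid)
  (let ((catalog (or (assoc-ref catalogs (current-lingua)) '())))
    (or (assoc-ref catalog msgid) msgid)))

;; Everything rendered inside parameterize sees "de" without the tag
;; being passed explicitly.
(parameterize ((current-lingua "de"))
  (l10n "Home page"))
```

Outside the parameterize form, current-lingua is back to its default, so templates for different languages cannot leak into each other.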
Regards,
Florian