unofficial mirror of emacs-devel@gnu.org 
From: Maxim Nikulin <manikulin@gmail.com>
To: emacs-devel@gnu.org
Subject: Re: CSV parsing and other issues (Re: LC_NUMERIC)
Date: Fri, 11 Jun 2021 23:51:40 +0700
Message-ID: <4ad5bbe4-a5f0-37f8-2642-0cbdfcdfacc2@gmail.com>
In-Reply-To: <83v96kapog.fsf@gnu.org>

Eli, Boruch, you are both overreacting.

On 11/06/2021 13:19, Eli Zaretskii wrote:
> There's no need to
> introduce into Emacs features that are useful for a few people.

I think that the expectations of users and developers with respect to
locale support evolve over time. Proper formatting of numbers is useful
to far more than a few people.

Boruch, until your latest messages I believed you were convinced that
adding support for "'" and "I" is not so easy.

Supporting locale-dependent format specifiers through printf looks
attractive, but it cannot be used directly by `format' or other elisp
functions in a safe way.

Some code calling `format' implicitly expects that it generates 
locale-independent numbers, so changing its behavior is not backward 
compatible.
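
To illustrate the kind of code I mean, here is a minimal sketch (the
name `my-roundtrip' is made up for this example). It relies on `format'
producing plain digits that `string-to-number' can read back:

```elisp
;; `format' currently produces plain digits, so the result can be
;; parsed back without loss.
(defun my-roundtrip (n)
  "Format integer N with `format' and parse the string back."
  (string-to-number (format "%d" n)))

(my-roundtrip 1234567)            ; → 1234567

;; A grouped string, such as a locale-aware "%'d" would give in en_US,
;; no longer round-trips: parsing stops at the first comma.
(string-to-number "1,234,567")    ; → 1
```

Any caller that writes numbers with `format' and later reads them back
would silently get wrong values if grouping appeared in the output.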

setlocale in libc operates on a single process-wide locale at any
moment. I expect that any attempt to "temporarily" call
setlocale(LC_NUMERIC, "") will be a permanent source of bugs: a
forgotten reverting call, a call to a function that needs the universal
format in a locale-specific context, threads started at an
inappropriate moment, etc.

Another implementation of the locale functions is necessary, one able
to perform parsing and formatting without touching global variables.

Personally, I expect basic-level functions that take an explicit locale
context (the names are arbitrary):

     (locale-format-number-with-ctx
      (locale-get-current-context :group-separator 'suppress)
      1234567890)

or with an explicit locale instead of `locale-get-current-context'. It
would also be good to add convenience helpers that inspect text
properties and buffer-local and global settings to determine the
context:

     (locale-format-number 1234567890)

and maybe even `locale-format[-with-ctx]' that accepts a printf-like
format string.
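
A rough sketch of what such a context-taking function could look like
in pure elisp, without touching any global state. Everything here is
an assumption made for illustration: the name
`my-locale-format-integer', the plist keys, and the convention that
the context carries the separator and group sizes directly instead of
looking them up from a real locale database.

```elisp
(defun my-locale-format-integer (ctx n)
  "Format integer N using the grouping rules in plist CTX.
CTX may contain :group-separator (a string, default \"\") and
:group-sizes (a list of positive digit counts, rightmost group
first; the last size repeats; default (3))."
  (let* ((sep (or (plist-get ctx :group-separator) ""))
         (sizes (or (plist-get ctx :group-sizes) '(3)))
         (digits (number-to-string (abs n)))
         (groups nil))
    ;; Peel groups off the right end of the digit string.
    (while (> (length digits) (car sizes))
      (let ((cut (- (length digits) (car sizes))))
        (push (substring digits cut) groups)
        (setq digits (substring digits 0 cut)))
      ;; Move to the next size; keep repeating the last one.
      (when (cdr sizes)
        (setq sizes (cdr sizes))))
    (push digits groups)
    (concat (if (< n 0) "-" "")
            (mapconcat #'identity groups sep))))

(my-locale-format-integer '(:group-separator ",") 1234567890)
;; → "1,234,567,890"
(my-locale-format-integer '(:group-separator "," :group-sizes (3 2))
                          1234567890)
;; → "1,23,45,67,890"
(my-locale-format-integer nil 1234567890)
;; → "1234567890"
```

Since the context is an ordinary argument, two callers with different
expectations cannot interfere with each other, and code that needs the
universal format just passes an empty context.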

On 11/06/2021 03:20, Boruch Baum wrote:
 > Then don't make them locale specific. Implement the
 > single-quote specifier the same way you currently handle the
 > floating-point specifier '%f', a locale-specific format that
 > has existed in emacs without complaint since ...

You are confusing something. "%f" is not locale-specific inside Emacs:
it uses a "universal" format with a dot "." as the decimal separator,
even in locales where "," plays this role. By contrast, "'" is highly
locale-dependent in libc. Group sizes and the group separator vary
widely. I posted this example earlier:

LC_NUMERIC=C.UTF-8 /usr/bin/printf "%'d\n" 1234567890
1234567890
LC_NUMERIC=en_US.UTF-8 /usr/bin/printf "%'d\n" 1234567890
1,234,567,890
LC_NUMERIC=es_ES.UTF-8 /usr/bin/printf "%'d\n" 1234567890
1.234.567.890
LC_NUMERIC=ru_RU.UTF-8 /usr/bin/printf "%'d\n" 1234567890
1 234 567 890
LC_NUMERIC=en_IN.UTF-8 /usr/bin/printf "%'d\n" 1234567890
1,23,45,67,890

 > It's not your responsibility.
 >
 > I can say that in the use-case that prompted my request, I'm
 > confident it will *never* be an issue. I ask format to give
 > me a string and I display it. End of story. Whether just 99%
 > or 99.99%, the overwhelming majority of cases will be the
 > same. Your concerns are total non-issues.

I would prefer to avoid the inconsistency of "%'d" being
locale-dependent while "%f" is not.

P.S.

With some limitations (a printf binary must be available and you must
not need floating-point numbers), you can leverage the libc formatting
facilities with the following crutch:

(shell-command-to-string
 (format "/usr/bin/printf \"%%'d\" %d" 1234567890))
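
A slightly more careful variant of the same crutch, as a sketch only.
The name `my-printf-group-digits' is made up, and I assume a printf
binary on `exec-path' that understands the "'" flag (GNU coreutils
does). The locale is passed explicitly and affects only the child
process, and no shell is involved:

```elisp
(defun my-printf-group-digits (n locale)
  "Return integer N formatted by the external printf with \"%'d\".
LOCALE is a locale name such as \"en_US.UTF-8\".  It is applied only
to the child process, so the locale of the Emacs session is untouched."
  (let ((process-environment
         ;; Earlier entries win, so this overrides any inherited LC_ALL.
         (cons (concat "LC_ALL=" locale) process-environment)))
    (with-temp-buffer
      ;; Insert stdout into the temp buffer and discard stderr.
      ;; Arguments go to printf as-is, with no shell quoting needed.
      (call-process "printf" nil '(t nil) nil
                    "%'d" (number-to-string n))
      (buffer-string))))

(my-printf-group-digits 1234567890 "C")
;; → "1234567890"
```

If the requested locale is not installed on the system, printf falls
back to the "C" rules and no grouping is applied, so the result still
depends on the machine.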



