From: Maxim Nikulin <manikulin@gmail.com>
To: emacs-devel@gnu.org
Subject: Re: CSV parsing and other issues (Re: LC_NUMERIC)
Date: Mon, 14 Jun 2021 23:38:19 +0700
Message-ID: <sa80ls$iev$1@ciao.gmane.io>
In-Reply-To: <83h7i49t18.fsf@gnu.org>
On 12/06/2021 01:04, Eli Zaretskii wrote:
>> From: Maxim Nikulin Date: Fri, 11 Jun 2021 23:58:24 +0700
>> On 10/06/2021 23:57, Eli Zaretskii wrote:
>> >> From: Maxim Nikulin Date: Thu, 10 Jun 2021 23:28:59 +0700
>> >
>> > For processing CSV, if there's a need to know whether the
>> > locale uses the comma as a decimal separator, we could
>> > indeed extend locale-info. But such an extension is almost
>> > trivial and doesn't even touch on the significant problems
>> > in the rest of the discussion.
>>
>> You forgot `setlocale(LC_NUMERIC, "C")', didn't you?
>
> No, I didn't. Adding a call to setlocale to locale-info, even if we
> want to add an argument for the caller to control the locale, is
> trivial.
I would avoid such manipulations, and the reason is not the efficiency
of a particular implementation. The locale is not thread-local, so
changing it in a *getter* is a source of rare but really obscure, hardly
reproducible problems. I do not like output such as

    1234.567890
    1234,567890
    1234.567890

which the following program produces (when the environment locale uses a
comma as the decimal separator) because the main thread changes the
locale while a parallel thread is printing:
#include <locale.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define DELAY_NS 40000000

/* Prints the same number three times, at about 20, 60 and 100 ms. */
void* other_thread(void *arg) {
    (void) arg;
    struct timespec delay = { 0, DELAY_NS/2 };
    nanosleep(&delay, NULL);
    printf("%f\n", 1234.56789);   /* global locale is still "C" */
    delay.tv_nsec = DELAY_NS;
    nanosleep(&delay, NULL);
    printf("%f\n", 1234.56789);   /* main has switched to "" by now */
    nanosleep(&delay, NULL);
    printf("%f\n", 1234.56789);   /* main has switched back to "C" */
    return NULL;
}

int main(void) {
    setlocale(LC_NUMERIC, "C");
    pthread_t thread_id;
    pthread_create(&thread_id, NULL, &other_thread, NULL);
    struct timespec delay = { 0, DELAY_NS };
    nanosleep(&delay, NULL);
    setlocale(LC_NUMERIC, "");    /* at about 40 ms: environment locale */
    nanosleep(&delay, NULL);
    setlocale(LC_NUMERIC, "C");   /* at about 80 ms: back to "C" */
    void *res;
    pthread_join(thread_id, &res);
    return 0;
}
Explicit locale objects decoupled from application-wide global
preferences are safer and more flexible.
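To illustrate what I mean by an explicit locale object, here is a minimal
sketch assuming the POSIX.1-2008 functions newlocale() and uselocale().
The helper name format_double_in_locale is hypothetical, not an existing
API. uselocale() affects only the calling thread, so other threads never
observe the switch.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: format VALUE with "%f" under the locale object LOC.
   Only the calling thread is affected; the global locale is untouched.
   Returns 0 on success, -1 on failure or truncation. */
int format_double_in_locale(locale_t loc, double value,
                            char *buf, size_t size)
{
    /* uselocale() installs LOC for the current thread and returns the
       locale this thread was using before. */
    locale_t old = uselocale(loc);
    if (old == (locale_t) 0)
        return -1;

    int n = snprintf(buf, size, "%f", value);

    /* Restore whatever this thread had before (possibly the global one). */
    uselocale(old);
    return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
```

A caller would create the object once, e.g. with
newlocale(LC_NUMERIC_MASK, "C", (locale_t) 0), pass it to the helper as
often as needed, and release it with freelocale().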
>> > Here's a trivial example:
>> >
>> > (insert (downcase (buffer-substring POS1 POS2)))
>> >
>> > Contrast with
>> >
>> > (insert (downcase "FOO"))
>>
>> Either `set-text-properties' should be called on "FOO" before passing it
>> to `downcase'
>
> Which property will help here? We don't have such properties. They
> need to be designed and implemented.
Let's name it "locale". Its value is some object that represents either
a "solid" locale such as de_DE or a combined one such as
LC_NUMERIC=en_GB + LC_TIME=de_DE + default fr_FR. Data required for
particular operations may be loaded on demand.
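At the C level such a combined object can be sketched with POSIX
newlocale(), which replaces selected categories of an existing locale
object. The function name make_mixed_locale is hypothetical, and whether
names like "en_GB" resolve depends on the locales installed on the
system.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>

/* Hypothetical helper: build one locale object whose LC_NUMERIC and
   LC_TIME categories differ from the default used for everything else.
   Returns (locale_t) 0 if any of the names cannot be loaded. */
locale_t make_mixed_locale(const char *default_name,
                           const char *numeric_name,
                           const char *time_name)
{
    /* Start with every category taken from the default locale. */
    locale_t loc = newlocale(LC_ALL_MASK, default_name, (locale_t) 0);
    if (loc == (locale_t) 0)
        return (locale_t) 0;

    /* Override LC_NUMERIC.  On failure the base object stays valid,
       so it has to be released here. */
    locale_t tmp = newlocale(LC_NUMERIC_MASK, numeric_name, loc);
    if (tmp == (locale_t) 0) {
        freelocale(loc);
        return (locale_t) 0;
    }
    loc = tmp;

    /* Override LC_TIME in the same way. */
    tmp = newlocale(LC_TIME_MASK, time_name, loc);
    if (tmp == (locale_t) 0) {
        freelocale(loc);
        return (locale_t) 0;
    }
    return tmp;
}
```

With the example above the call would be
make_mixed_locale("fr_FR", "en_GB", "de_DE"), possibly with an encoding
suffix such as ".UTF-8" depending on the system.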
>> or `locale-downcase' with LOCALE first argument should be
>> added.
>
> How would you implement locale-downcase? Are you familiar with how
> Emacs case tables work?
No, I am not familiar with the Emacs internals dealing with case
conversion. I already wrote that I do not even know how to handle
Turkish properly. For the scripts I am familiar with, a default table is
enough for normalization and conversion. I admit that conversion may
sometimes depend on the language, and that the language cannot be
determined from the code point. In such cases I expect an additional
override table that has higher priority than the default one.
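A rough sketch of the override idea, not of how Emacs case tables
actually work: all names below are hypothetical, and the default mapping
is limited to ASCII to keep the example short. The only language-specific
fact assumed is that Turkish lowercases U+0049 (I) to U+0131 (dotless i).

```c
#include <stddef.h>
#include <stdint.h>

/* One language-specific exception to the default mapping. */
struct case_override {
    uint32_t from;
    uint32_t to;
};

/* Stand-in for the default table: ASCII letters only. */
static uint32_t default_downcase(uint32_t cp)
{
    return (cp >= 'A' && cp <= 'Z') ? cp + ('a' - 'A') : cp;
}

/* Consult the override table first; fall back to the default mapping
   when the code point has no language-specific entry. */
uint32_t downcase_with_overrides(uint32_t cp,
                                 const struct case_override *overrides,
                                 size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (overrides[i].from == cp)
            return overrides[i].to;
    return default_downcase(cp);
}
```

With a one-entry table { 0x0049, 0x0131 } the letter I maps to dotless i,
while every other letter still goes through the default mapping.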
> And even if we had locale-downcase, which locale would you
> pass to it in any given use case?
I already mentioned a chain of responsibility: an explicit value or set
of overrides passed by the user, a text property for a particular span
of characters, buffer-local variables, global environment variables. A
locale may be instantiated from its name, e.g. "it_IT". Convenience
functions to obtain the locale at point will likely be useful as well.
(Actually I have number parsing and formatting in mind rather than case
conversion.)
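The lookup order can be sketched as a first-match search over candidates
ordered from most to least specific. The function name and the final
fallback to "C" are my assumptions for illustration only.

```c
#include <stddef.h>

/* Hypothetical resolver.  CANDIDATES is ordered from most to least
   specific: explicit argument, text property, buffer-local value,
   environment.  A NULL or empty entry means "not set at this level". */
const char *resolve_locale_name(const char *const candidates[],
                                size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (candidates[i] != NULL && candidates[i][0] != '\0')
            return candidates[i];
    return "C";   /* assumed fallback when no level provides a value */
}
```

For example, with no explicit argument, a text property "de_DE", a
buffer-local "fr_FR" and an environment value "en_US", the result is
"de_DE".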