From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: CSV parsing and other issues (Re: LC_NUMERIC)
Date: Mon, 14 Jun 2021 20:19:31 +0300
Message-ID: <83h7i05pp8.fsf@gnu.org>
To: Maxim Nikulin
Cc: emacs-devel@gnu.org
In-Reply-To: (message from Maxim Nikulin on Mon, 14 Jun 2021 23:38:19 +0700)

> From: Maxim Nikulin
> Date: Mon, 14 Jun 2021 23:38:19 +0700
>
> >> You forgot `setlocale(LC_NUMERIC, "C")', didn't you?
> >
> > No, I didn't.  Adding a call to setlocale to locale-info, even if we
> > want to add an argument for the caller to control the locale, is
> > trivial.
>
> I would avoid such manipulations and the reason is not efficiency of
> particular implementation.

But we already do that in locale-info, for locale categories other
than LC_NUMERIC.

> >> > Here's a trivial example:
> >> >
> >> >    (insert (downcase (buffer-substring POS1 POS2)))
> >> >
> >> > Contrast with
> >> >
> >> >    (insert (downcase "FOO"))
> >>
> >> Either `set-text-properties' should be called on "FOO" before passing it
> >> to `downcase'
> >
> > Which property will help here?  we don't have such properties.  they
> > need to be designed and implemented.
> Let's name it "locale". Its value is some object that represents either
> a "solid" locale such as de_DE or combined LC_NUMERIC=en_GB +
> LC_TIME=de_DE + default fr_FR. Data required for particular operations
> may be loaded on demand.

How do you associate such an object with text of a buffer or a string
such that different parts of the text could have different "locales"
(as required for a multi-lingual editor such as Emacs)?

> > How would you implement locale-downcase?  Are you familiar with how
> > Emacs case tables work?
>
> No, I am not familiar with Emacs internals dealing with case conversion.
> I already wrote I am even unaware how to properly handle Turkish. For
> the scripts I am familiar with, it is enough to have default table for
> normalizing and conversion. I can admit that sometimes conversion may
> depend on language and the language can not be determined from code
> point. In such cases I expect additional override table that has higher
> priority than the default one.
>
> > And even if we had locale-downcase, which locale would you
> > pass to it in any given use case?
>
> I already mentioned responsibility chain: explicit value or set of
> overrides passed by user, text property for particular span of
> characters, buffer-local variables, global environment variables. Locale
> may be instantiated from its name "it_IT". Convenience functions to
> obtain locale at point likely will be useful as well. (Actually I am
> assuming number parsing-formatting rather than case conversion.)

What you describe doesn't exist, not even in its design stage.  We are
back where we started: I said at the very beginning that this
infrastructure is missing.  It is futile to discuss solutions which
rely on infrastructure that doesn't exist.
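
For reference, the setlocale manipulation discussed at the top of this message (switch one category temporarily, do the work, then restore it) can be sketched in plain C roughly as follows.  This is only an illustration under stated assumptions: parse_double_c_locale is a hypothetical helper, not an Emacs function, and it is not how locale-info is implemented.  It uses only the standard setlocale and strtod calls.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration, not Emacs code):
   parse S as a double using the "C" numeric conventions, regardless of
   what LC_NUMERIC is currently set to, then restore the previous value.  */
static double
parse_double_c_locale (const char *s)
{
  assert (s != NULL);

  /* A later setlocale call may overwrite the returned string,
     so copy the current locale name before changing anything.  */
  const char *current = setlocale (LC_NUMERIC, NULL);
  char *saved = NULL;
  if (current != NULL)
    {
      saved = malloc (strlen (current) + 1);
      if (saved != NULL)
        strcpy (saved, current);
    }

  /* Force '.' as the decimal separator for the duration of the parse.  */
  setlocale (LC_NUMERIC, "C");
  double value = strtod (s, NULL);

  /* Put the caller's LC_NUMERIC back, if we managed to save it.  */
  if (saved != NULL)
    {
      setlocale (LC_NUMERIC, saved);
      free (saved);
    }
  return value;
}
```

Note that setlocale changes process-wide state, so a sketch like this is only safe when no other thread depends on the locale while the category is switched.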