From: Maxim Nikulin <manikulin@gmail.com>
To: emacs-devel@gnu.org
Subject: Re: CSV parsing and other issues (Re: LC_NUMERIC)
Date: Sat, 12 Jun 2021 21:41:48 +0700
Message-ID: <sa2h3e$gqj$1@ciao.gmane.io>
In-Reply-To: <jwveed95syq.fsf-monnier+emacs@gnu.org>

On 11/06/2021 04:10, Stefan Monnier wrote:
 >> There are plenty of CSV dialects. If decimal separator is
 >> "," then office software uses ";" instead of comma as cell
 >> (field) separator.
 >
 > But there's no reason to presume that a given CSV file was
 > generated in the same locale as the one we're currently
 > using.
 >
 > So the locale could be one ingredient in the machinery used
 > to guess which separator was used, but I'm not sure it would
 > be of much help.

You are right. My expectation is still that ";" is mostly used in
locales with a comma as the decimal separator, and in such cases it
must be tried with higher priority, because a record may contain
plenty of both characters:

     1,2;3,45;56,789
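
To illustrate the priority idea, here is a minimal sketch in Python (not
Emacs Lisp, and not taken from the patches in the bug report). The
function name guess_field_separator and its decimal_point parameter are
hypothetical; the locale's decimal separator is treated only as a hint
that decides which candidate is tried first.

```python
def guess_field_separator(lines, decimal_point="."):
    """Guess ';' or ',' as the field separator from a few sample lines.

    decimal_point is the decimal separator of the assumed locale. It is
    only a hint, since the file may come from a different locale.
    """
    # Try ';' first when the assumed locale uses a decimal comma.
    candidates = [";", ","] if decimal_point == "," else [",", ";"]
    for sep in candidates:
        # Count occurrences of the candidate on every non-blank line.
        counts = {line.count(sep) for line in lines if line.strip()}
        # Accept a candidate that occurs the same non-zero number of
        # times on each line.
        if len(counts) == 1 and counts != {0}:
            return sep
    return None
```

For the record above, the sketch returns ";" when decimal_point is ","
and "," when it is ".", which is exactly the ambiguity: both characters
occur often enough, so only the order of the attempts decides.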

Originally the question was raised precisely in the context of an
attempt to improve separator guessing:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=47885 The patches,
however, have other problems. Advanced options for table import are
likely more suitable for, e.g., csv-mode and may become an unnecessary
burden in org-mode (especially if kill-yank worked well in both
directions).

Certainly users should have the opportunity to explicitly specify the
dialect of the files they are going to import.

 > [ BTW, I'll take the opportunity to advocate for the use of
 >   TSV instead, which is slightly less ill-defined.  ]

In the real world one often does not have full control of the file
formats one has to deal with. In simple cases I can use space-separated,
fixed-width columns of numbers. On the other hand, downloaded bank
statements are precisely CSV with ";" as the delimiter and in a legacy
Windows 8-bit encoding (and such files have a kind of header whose
column count varies and differs from that of the table that follows).
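
A rough sketch of how such a statement could be split into its header
and its table, again in Python and only as an illustration. The function
name read_statement is hypothetical, and the default encoding "cp1251"
is an assumption (one of several legacy Windows 8-bit code pages); the
real file may use another one. The heuristic assumes that the table is
the trailing run of rows with the same number of columns as the last
row.

```python
import csv
import io

def read_statement(raw_bytes, encoding="cp1251", delimiter=";"):
    """Split a bank-statement-like CSV into (header_rows, table_rows).

    encoding="cp1251" is an assumption; pass the actual code page.
    """
    text = raw_bytes.decode(encoding)
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    # Drop blank lines, which csv.reader yields as empty lists.
    rows = [row for row in reader if row]
    if not rows:
        return [], []
    # Take the column count of the last row as the table width.
    width = len(rows[-1])
    # Walk backwards while rows keep that width to find the table start.
    start = len(rows)
    while start > 0 and len(rows[start - 1]) == width:
        start -= 1
    return rows[:start], rows[start:]
```

A header line that happens to have the same column count as the table
and sits directly above it would be taken as part of the table, so an
explicit dialect setting is still preferable to guessing.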

So the ability to get the decimal separator for the current locale may
slightly improve the user experience of importing CSV files, at least
in Org mode. However, it is just one aspect of supporting locale-aware
number formats in Emacs.
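
For comparison, this is roughly what access to the locale's decimal
separator looks like in a language that already exposes it. The sketch
uses Python's standard locale.localeconv(); the helper name
parse_decimal is hypothetical, and nothing here describes an existing or
planned Emacs API.

```python
import locale

def parse_decimal(text, decimal_point=None):
    """Parse a number written with the given decimal separator.

    When decimal_point is None, use the one reported for the current
    LC_NUMERIC locale. Python starts in the "C" locale for LC_NUMERIC
    until locale.setlocale() is called, so the default is then ".".
    """
    if decimal_point is None:
        decimal_point = locale.localeconv()["decimal_point"]
    # Normalize the separator to "." and let float() do the parsing.
    return float(text.strip().replace(decimal_point, "."))
```

The sketch ignores thousands separators and other parts of a
locale-aware number format, which is why the decimal separator alone is
only a small part of the problem.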
