unofficial mirror of bug-gnu-emacs@gnu.org 
* [PATCH] org-table-import: Make it more smarter for interactive use
@ 2021-04-19  4:43 Utkarsh Singh
  2021-04-19  8:19 ` Nicolas Goaziou
  0 siblings, 1 reply; 2+ messages in thread
From: Utkarsh Singh @ 2021-04-19  4:43 UTC (permalink / raw)
  To: emacs-orgmode, bug-gnu-emacs

Hi,

My previous patch proposed adding support for importing files with
arbitrary names.  Building on that, this patch makes org-table-import
smarter by teaching it to recognize more separators (delimiters).

Currently org-table-import 'smartly' guesses only COMMA, TAB and SPACE
as separators, whereas this patch adds support for ';' (SEMICOLON)
and ':' (COLON).

Here is an example Org table generated by running =M-x org-table-import=
on /etc/passwd (which uses COLON as separator), with private information
removed.

| bin                    | x |     1 |     1 |                                | /                     | /usr/bin/nologin   |
| daemon                 | x |     2 |     2 |                                | /                     | /usr/bin/nologin   |
| mail                   | x |     8 |    12 |                                | /var/spool/mail       | /usr/bin/nologin   |
| ftp                    | x |    14 |    11 |                                | /srv/ftp              | /usr/bin/nologin   |
| http                   | x |    33 |    33 |                                | /srv/http             | /usr/bin/nologin   |
| nobody                 | x | 65534 | 65534 | Nobody                         | /                     | /usr/bin/nologin   |
| dbus                   | x |    81 |    81 | System Message Bus             | /                     | /usr/bin/nologin   |
| systemd-journal-remote | x |   981 |   981 | systemd Journal Remote         | /                     | /usr/bin/nologin   |
| systemd-network        | x |   980 |   980 | systemd Network Management     | /                     | /usr/bin/nologin   |
| systemd-oom            | x |   979 |   979 | systemd Userspace OOM Killer   | /                     | /usr/bin/nologin   |
| systemd-resolve        | x |   978 |   978 | systemd Resolver               | /                     | /usr/bin/nologin   |
| systemd-timesync       | x |   977 |   977 | systemd Time Synchronization   | /                     | /usr/bin/nologin   |
| systemd-coredump       | x |   976 |   976 | systemd Core Dumper            | /                     | /usr/bin/nologin   |
| avahi                  | x |   974 |   974 | Avahi mDNS/DNS-SD daemon       | /                     | /usr/bin/nologin   |
| colord                 | x |   973 |   973 | Color management daemon        | /var/lib/colord       | /usr/bin/nologin   |
| rtkit                  | x |   133 |   133 | RealtimeKit                    | /proc                 | /usr/bin/nologin   |
| transmission           | x |   169 |   169 | Transmission BitTorrent Daemon | /var/lib/transmission | /usr/bin/nologin   |
| geoclue                | x |   972 |   972 | Geoinformation service         | /var/lib/geoclue      | /usr/bin/nologin   |
| usbmux                 | x |   140 |   140 | usbmux user                    | /                     | /usr/bin/nologin   |
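
For comparison, a similar table can already be produced without any
guessing by passing the separator explicitly, since a string SEPARATOR
is treated as a regexp by `org-table-convert-region'.  The following is
only a minimal sketch: the helper name `my/colon-text-to-org-table' is
hypothetical and is not part of Org or of this patch.

```elisp
;; Sketch only: `my/colon-text-to-org-table' is a hypothetical helper,
;; not part of Org or of the patch below.
(require 'org)
(require 'org-table)

(defun my/colon-text-to-org-table (text)
  "Return TEXT, whose lines are colon-separated, as an Org table string."
  (with-temp-buffer
    (org-mode)
    (insert text)
    ;; A string SEPARATOR is used as a regexp matching the delimiter,
    ;; so the separator-guessing code is not exercised here.
    (org-table-convert-region (point-min) (point-max) ":")
    (buffer-string)))

;; Example call with two passwd-style lines.
(my/colon-text-to-org-table
 "bin:x:1:1::/:/usr/bin/nologin\ndaemon:x:2:2::/:/usr/bin/nologin\n")
```

The patch aims to make the explicit ":" argument unnecessary for input
like this.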


diff --git a/lisp/org/org-table.el b/lisp/org/org-table.el
index ab66859d6a..5ee4af612b 100644
--- a/lisp/org/org-table.el
+++ b/lisp/org/org-table.el
@@ -846,6 +846,35 @@ org-table-create
       (goto-char pos))
     (org-table-align)))
 
+
+(defun org-table-guess-separator (beg0 end0)
+  "Guess separator for `org-table-convert-region' for region BEG0 to END0.
+
+List of preferred separator:
+comma, TAB, ';', ':' or SPACE
+
+If region contains a line which doesn't contain the required
+separator then discard the separator and search again using next
+separator."
+  (let ((beg (save-excursion
+	       (goto-char (min beg0 end0))
+	       (beginning-of-line 1)
+	       (point)))
+	(end (save-excursion
+	       (goto-char (max beg0 end0))
+	       (end-of-line 1)
+	       (if (bolp) (backward-char 1) (end-of-line 1))
+	       (point))))
+    (save-excursion
+      (goto-char beg)
+      (cond
+       ((not (re-search-forward "^[^\n,]+$" end t)) '(4))
+       ((not (re-search-forward "^[^\n\t]+$" end t)) '(16))
+       ((not (re-search-forward "^[^\n;]+$" end t)) ";")
+       ((not (re-search-forward "^[^\n:]+$" end t)) ":")
+       ((not (re-search-forward "^\\([^'\"][^\n\s][^'\"]\\)+$" end t)) " ")
+       (t nil)))))
+
 ;;;###autoload
 (defun org-table-convert-region (beg0 end0 &optional separator)
   "Convert region to a table.
@@ -862,10 +891,7 @@ org-table-convert-region
 integer  When a number, use that many spaces, or a TAB, as field separator
 regexp   When a regular expression, use it to match the separator
 nil      When nil, the command tries to be smart and figure out the
-         separator in the following way:
-         - when each line contains a TAB, assume TAB-separated material
-         - when each line contains a comma, assume CSV material
-         - else, assume one or more SPACE characters as separator."
+         separator using `org-table-guess-separator'."
   (interactive "r\nP")
   (let* ((beg (min beg0 end0))
 	 (end (max beg0 end0))
@@ -881,14 +907,9 @@ org-table-convert-region
       (goto-char end)
       (if (bolp) (backward-char 1) (end-of-line 1))
       (setq end (point-marker))
-      ;; Get the right field separator
-      (unless separator
-	(goto-char beg)
-	(setq separator
-	      (cond
-	       ((not (re-search-forward "^[^\n\t]+$" end t)) '(16))
-	       ((not (re-search-forward "^[^\n,]+$" end t)) '(4))
-	       (t 1))))
+      (if (and (not separator)
+               (not (setq separator (org-table-guess-separator beg end))))
+          (error "Unable to guess a suitable separator"))
       (goto-char beg)
       (if (equal separator '(4))
 	  (while (< (point) end)
@@ -921,12 +942,8 @@ org-table-convert-region
 (defun org-table-import (file separator)
   "Import FILE as a table.
 
-The command tries to be smart and figure out the separator in the
-following way:
-
-- when each line contains a TAB, assume TAB-separated material;
-- when each line contains a comma, assume CSV material;
-- else, assume one or more SPACE characters as separator.
+The command tries to be smart and figure out the separator using
+`org-table-guess-separator'.
 
 When non-nil, SEPARATOR specifies the field separator in the
 lines.  It can have the following values:

-- 
Utkarsh Singh
http://utkarshsingh.xyz




* Re: [PATCH] org-table-import: Make it more smarter for interactive use
  2021-04-19  4:43 [PATCH] org-table-import: Make it more smarter for interactive use Utkarsh Singh
@ 2021-04-19  8:19 ` Nicolas Goaziou
  0 siblings, 0 replies; 2+ messages in thread
From: Nicolas Goaziou @ 2021-04-19  8:19 UTC (permalink / raw)
  To: Utkarsh Singh; +Cc: bug-gnu-emacs, emacs-orgmode

Hello,

Utkarsh Singh <utkarsh190601@gmail.com> writes:

> My previous patch proposed adding support for importing files with
> arbitrary names.  Building on that, this patch makes org-table-import
> smarter by teaching it to recognize more separators (delimiters).

Good idea, thank you. Some comments follow.

> +(defun org-table-guess-separator (beg0 end0)
> +  "Guess separator for `org-table-convert-region' for region BEG0 to END0.
> +
> +List of preferred separator:
> +comma, TAB, ';', ':' or SPACE

I suggest using full names everywhere: comma, TAB, semicolon, colon, or
SPACE.

> +If region contains a line which doesn't contain the required
> +separator then discard the separator and search again using next
> +separator."
> +  (let ((beg (save-excursion
> +	       (goto-char (min beg0 end0))
> +	       (beginning-of-line 1)
> +	       (point)))

  (beginning-of-line 1) + (point) -> (line-beginning-position)

since you don't intend to move point.

> +	(end (save-excursion
> +	       (goto-char (max beg0 end0))
> +	       (end-of-line 1)
> +	       (if (bolp) (backward-char 1) (end-of-line 1))

I'm not sure about what you mean above. First, the second call to
end-of-line is useless, since you're already at the end of the line.
Second, what is wrong if point is at an empty line? Why do you want to
move it back?

> +	       (point))))

You may want to use `line-end-position'.
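
To illustrate the two suggestions above, here is a minimal sketch of how
the bounds could be computed with `line-beginning-position' and
`line-end-position'.  The function name `my/region-line-bounds' is
hypothetical, and the sketch deliberately leaves out the `bolp'
adjustment questioned above.

```elisp
;; Sketch only: `my/region-line-bounds' is a hypothetical name.
(defun my/region-line-bounds (beg0 end0)
  "Return (BEG . END) spanning the whole lines touched by BEG0 and END0.
BEG is the start of the line containing the smaller position and
END is the end of the line containing the larger one."
  (let ((beg (save-excursion
               (goto-char (min beg0 end0))
               (line-beginning-position)))
        (end (save-excursion
               (goto-char (max beg0 end0))
               (line-end-position))))
    (cons beg end)))
```

Both functions return a position without moving point themselves, so the
only movement left is the `goto-char' inside each `save-excursion'.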

> +    (save-excursion
> +      (goto-char beg)
> +      (cond
> +       ((not (re-search-forward "^[^\n,]+$" end t)) '(4))
> +       ((not (re-search-forward "^[^\n\t]+$" end t)) '(16))
> +       ((not (re-search-forward "^[^\n;]+$" end t)) ";")
> +       ((not (re-search-forward "^[^\n:]+$" end t)) ":")
> +       ((not (re-search-forward "^\\([^'\"][^\n\s][^'\"]\\)+$" end t)) " ")
> +       (t nil)))))

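I think you need to wrap `save-excursion' around each
`re-search-forward' call. Otherwise, a successful search leaves point at
the end of the first line lacking the separator just tested, and the
next test starts from there instead of from the beginning of the region.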
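
A minimal sketch of that idea follows, under a few assumptions: the
function name `my/guess-separator' is hypothetical, the SPACE test is
left out, and nil is returned when no candidate fits.  Each test restarts
from BEG inside its own `save-excursion'.

```elisp
;; Sketch only: `my/guess-separator' is a hypothetical name, and the
;; SPACE test from the patch is intentionally omitted.
(defun my/guess-separator (beg end)
  "Guess the field separator used between BEG and END.
Return (4) for comma, (16) for TAB, \";\" or \":\", or nil when no
candidate appears on every non-empty line."
  (let ((line-lacking
         ;; Non-nil if some non-empty line contains none of CHARS.
         ;; Every call searches from BEG and restores point afterwards.
         (lambda (chars)
           (save-excursion
             (goto-char beg)
             (re-search-forward (format "^[^\n%s]+$" chars) end t)))))
    (cond
     ((not (funcall line-lacking ",")) '(4))
     ((not (funcall line-lacking "\t")) '(16))
     ((not (funcall line-lacking ";")) ";")
     ((not (funcall line-lacking ":")) ":")
     (t nil))))
```

With a two-line region such as "a,b" followed by "c:d", no candidate is
present on every line, so this sketch returns nil.  Without the restart,
the comma search would stop at the end of the second line and the later
searches would find nothing left to scan.
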
> +
>  ;;;###autoload
>  (defun org-table-convert-region (beg0 end0 &optional separator)
>    "Convert region to a table.
> @@ -862,10 +891,7 @@ org-table-convert-region
>  integer  When a number, use that many spaces, or a TAB, as field separator
>  regexp   When a regular expression, use it to match the separator
>  nil      When nil, the command tries to be smart and figure out the
> -         separator in the following way:
> -         - when each line contains a TAB, assume TAB-separated material
> -         - when each line contains a comma, assume CSV material
> -         - else, assume one or more SPACE characters as separator."
> +         separator using `org-table-guess-separator'."

I wonder if creating a new function is warranted here. You could add the
new checks after those already present in the code, couldn't you?
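
As a rough sketch of that alternative, the existing `cond' (TAB first,
then comma, with 1 as the SPACE fallback) could simply grow two more
clauses.  It is wrapped in a hypothetical function,
`my/guess-separator-inline', only so that it can be evaluated on its own;
the `goto-char' before each search is an addition and is not in the
current code.

```elisp
;; Sketch only: in `org-table-convert-region' these clauses would live
;; inline; `my/guess-separator-inline' is a hypothetical wrapper.
(defun my/guess-separator-inline (beg end)
  "Return a separator for the text between BEG and END.
Keep the existing order (TAB, then comma), add semicolon and colon,
and fall back to 1, meaning one or more SPACE characters."
  (save-excursion
    (cond
     ;; Existing checks, each restarted from BEG.
     ((progn (goto-char beg)
             (not (re-search-forward "^[^\n\t]+$" end t)))
      '(16))
     ((progn (goto-char beg)
             (not (re-search-forward "^[^\n,]+$" end t)))
      '(4))
     ;; New checks appended after the existing ones.
     ((progn (goto-char beg)
             (not (re-search-forward "^[^\n;]+$" end t)))
      ";")
     ((progn (goto-char beg)
             (not (re-search-forward "^[^\n:]+$" end t)))
      ":")
     (t 1))))
```

Keeping 1 as the fallback preserves the current behavior for input that
matches none of the new separators, so no error path is needed.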


Regards,
-- 
Nicolas Goaziou



