From: Utkarsh Singh <utkarsh190601@gmail.com>
To: emacs-orgmode@gnu.org, bug-gnu-emacs@gnu.org
Subject: [PATCH] org-table-import: Make it more smarter for interactive use
Date: Mon, 19 Apr 2021 10:13:31 +0530
Message-ID: <87czuq9958.fsf@gmail.com>

Hi,

My previous patch proposed adding support for importing files with
arbitrary names. Building on that, this patch makes org-table-import
smarter by teaching it to guess more separators (delimiters).
Currently org-table-import 'smartly' guesses only COMMA, TAB and SPACE
as the separator; this patch adds support for ';' (SEMICOLON) and
':' (COLON).

Here is an example Org table generated by running
=M-x org-table-import= on /etc/passwd (which uses COLON as its
separator), with private information removed.

| bin                    | x |     1 |     1 |                                | /                     | /usr/bin/nologin |
| daemon                 | x |     2 |     2 |                                | /                     | /usr/bin/nologin |
| mail                   | x |     8 |    12 |                                | /var/spool/mail       | /usr/bin/nologin |
| ftp                    | x |    14 |    11 |                                | /srv/ftp              | /usr/bin/nologin |
| http                   | x |    33 |    33 |                                | /srv/http             | /usr/bin/nologin |
| nobody                 | x | 65534 | 65534 | Nobody                         | /                     | /usr/bin/nologin |
| dbus                   | x |    81 |    81 | System Message Bus             | /                     | /usr/bin/nologin |
| systemd-journal-remote | x |   981 |   981 | systemd Journal Remote         | /                     | /usr/bin/nologin |
| systemd-network        | x |   980 |   980 | systemd Network Management     | /                     | /usr/bin/nologin |
| systemd-oom            | x |   979 |   979 | systemd Userspace OOM Killer   | /                     | /usr/bin/nologin |
| systemd-resolve        | x |   978 |   978 | systemd Resolver               | /                     | /usr/bin/nologin |
| systemd-timesync       | x |   977 |   977 | systemd Time Synchronization   | /                     | /usr/bin/nologin |
| systemd-coredump       | x |   976 |   976 | systemd Core Dumper            | /                     | /usr/bin/nologin |
| avahi                  | x |   974 |   974 | Avahi mDNS/DNS-SD daemon       | /                     | /usr/bin/nologin |
| colord                 | x |   973 |   973 | Color management daemon        | /var/lib/colord       | /usr/bin/nologin |
| rtkit                  | x |   133 |   133 | RealtimeKit                    | /proc                 | /usr/bin/nologin |
| transmission           | x |   169 |   169 | Transmission BitTorrent Daemon | /var/lib/transmission | /usr/bin/nologin |
| geoclue                | x |   972 |   972 | Geoinformation service         | /var/lib/geoclue      | /usr/bin/nologin |
| usbmux                 | x |   140 |   140 | usbmux user                    | /                     | /usr/bin/nologin |

diff --git a/lisp/org/org-table.el b/lisp/org/org-table.el
index ab66859d6a..5ee4af612b 100644
--- a/lisp/org/org-table.el
+++ b/lisp/org/org-table.el
@@ -846,6 +846,35 @@ org-table-create
         (goto-char pos))
       (org-table-align)))
 
+
+(defun org-table-guess-separator (beg0 end0)
+  "Guess separator for `org-table-convert-region' for region BEG0 to END0.
+
+List of preferred separator:
+comma, TAB, ';', ':' or SPACE
+
+If region contains a line which doesn't contain the required
+separator then discard the separator and search again using next
+separator."
+  (let ((beg (save-excursion
+               (goto-char (min beg0 end0))
+               (beginning-of-line 1)
+               (point)))
+        (end (save-excursion
+               (goto-char (max beg0 end0))
+               (end-of-line 1)
+               (if (bolp) (backward-char 1) (end-of-line 1))
+               (point))))
+    (save-excursion
+      (goto-char beg)
+      (cond
+       ((not (re-search-forward "^[^\n,]+$" end t)) '(4))
+       ((not (re-search-forward "^[^\n\t]+$" end t)) '(16))
+       ((not (re-search-forward "^[^\n;]+$" end t)) ";")
+       ((not (re-search-forward "^[^\n:]+$" end t)) ":")
+       ((not (re-search-forward "^\\([^'\"][^\n\s][^'\"]\\)+$" end t)) " ")
+       (t nil)))))
+
 ;;;###autoload
 (defun org-table-convert-region (beg0 end0 &optional separator)
   "Convert region to a table.
@@ -862,10 +891,7 @@ org-table-convert-region
 integer  When a number, use that many spaces, or a TAB, as field separator
 regexp   When a regular expression, use it to match the separator
 nil      When nil, the command tries to be smart and figure out the
-         separator in the following way:
-         - when each line contains a TAB, assume TAB-separated material
-         - when each line contains a comma, assume CSV material
-         - else, assume one or more SPACE characters as separator."
+         separator using `org-table-guess-separator'."
   (interactive "r\nP")
   (let* ((beg (min beg0 end0))
          (end (max beg0 end0))
@@ -881,14 +907,9 @@ org-table-convert-region
       (goto-char end)
       (if (bolp) (backward-char 1) (end-of-line 1))
       (setq end (point-marker))
-      ;; Get the right field separator
-      (unless separator
-        (goto-char beg)
-        (setq separator
-              (cond
-               ((not (re-search-forward "^[^\n\t]+$" end t)) '(16))
-               ((not (re-search-forward "^[^\n,]+$" end t)) '(4))
-               (t 1))))
+      (if (and (not separator)
+               (not (setq separator (org-table-guess-separator beg end))))
+          (error "Unable to guess suitable separator."))
       (goto-char beg)
       (if (equal separator '(4))
           (while (< (point) end)
@@ -921,12 +942,8 @@ org-table-convert-region
 (defun org-table-import (file separator)
   "Import FILE as a table.
 
-The command tries to be smart and figure out the separator in the
-following way:
-
-- when each line contains a TAB, assume TAB-separated material;
-- when each line contains a comma, assume CSV material;
-- else, assume one or more SPACE characters as separator.
+The command tries to be smart and figure out the separator using
+`org-table-guess-separator'.
 
 When non-nil, SEPARATOR specifies the field separator in the
 lines.  It can have the following values:

-- 
Utkarsh Singh
http://utkarshsingh.xyz