From: Stefan Monnier
Newsgroups: gmane.emacs.devel,gmane.comp.statistics.pspp.devel
Subject: Re: Fwd: [ELPA] New package: pspp-mode.el for PSPP/SPSS syntax highlighting
Date: Sat, 04 Jul 2020 18:45:56 -0400
References: <4836167B-33A9-46A3-B586-20768E333E1D@gmx.de> <20200704151535.GA31917@jocasta.intra>
In-Reply-To: <20200704151535.GA31917@jocasta.intra> (John Darrington's message of "Sat, 4 Jul 2020 17:15:35 +0200")
To: John Darrington
Cc: Friedrich Beckmann, emacs-devel@gnu.org, mail@vasilij.de, PSPP Development Mailing List
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Hi John,

So pspp-mode.el is now in GNU ELPA:

    https://elpa.gnu.org/packages/pspp-mode.html

> Probably it would be - but I don't think a decision has been made yet.

Fair enough. But to avoid divergence, it would be good to do it.

> Yes. That list doesn't change very often. Perhaps we should check
> that it is up to date now.

I was looking at the code a bit and am about to install some changes to
it, but I bumped into some questions about the PSPP syntax of comments.

The patch below does:

    * pspp-mode.el: Prefer `setq` to `set '`.
    (pspp-mode-hook): Let define-derived-mode declare it for us.
    (pspp-mode-map): Don't bind `C-j` since this is not specific to PSPP
    but is a user preference. Nowadays `electric-indent-mode` is used
    instead anyway. Also, use a local var for the temp.
    (pspp--downcase-list, pspp--upcase-list, pspp--updown-list):
    Use `mapcar` instead of an inefficient recursion.
    (pspp-indent-line): Comment out unused var `verbatim`.
    (pspp-comment-start-line-p): Fix incomplete escaping.
    (pspp-mode-syntax-table): Don't use `w` for non-word symbol
    constituents since it breaks the expected behavior of forward-word.
    Use a short non-prefixed name for the local var.
    Tweak the syntax-table for comments.
    (pspp--syntax-propertize): New var.
    (pspp-font-lock-keywords): Use [:alnum:] and [:alpha:].
    (pspp-mode): Use define-derived-mode.

But the main issue is the comment syntax. I'm trying to handle comments
correctly using `syntax-propertize`, but the patch can't handle the
t-test.sps I found in PSPP's Git. I don't understand what the real
syntax should be. The doc seems to suggest that comments start with `*`
or `comment` (in the position of the beginning of a command?)
and end with a `.` at an end of line or with an empty line, but in
t-test.sps I see:

    * Females have gender 0
    * Create 8 female cases
    loop #i = 1 to 8.

where the comment does not seem to be terminated (neither by a `.` nor
by an empty line). What am I missing?


        Stefan


diff --git a/pspp-mode.el b/pspp-mode.el
index ca9bbc931..157475234 100644
--- a/pspp-mode.el
+++ b/pspp-mode.el
@@ -1,4 +1,4 @@
-;;; pspp-mode.el --- Major mode for editing PSPP files
+;;; pspp-mode.el --- Major mode for editing PSPP files -*- lexical-binding: t; -*-

 ;; Copyright (C) 2005,2018,2020 Free Software Foundation, Inc.
 ;; Author: Scott Andrew Borton
@@ -25,13 +25,10 @@
 ;; along with this program. If not, see .

 ;;; Code:
-(defvar pspp-mode-hook nil)
-

 (defvar pspp-mode-map
-  (let ((pspp-mode-map (make-keymap)))
-    (define-key pspp-mode-map "\C-j" 'newline-and-indent)
-    pspp-mode-map)
+  (let ((map (make-sparse-keymap)))
+    map)
   "Keymap for PSPP major mode")


@@ -48,37 +45,33 @@
       (while (not (or
                    (or (bobp) pspp-end-of-block-found)
                    pspp-start-of-block-found))
-        (set 'pspp-end-of-block-found
-             (looking-at "^[ \t]*\\(END\\|end\\)[\t ]+\\(DATA\\|data\\)\."))
-        (set 'pspp-start-of-block-found
-             (looking-at "^[ \t]*\\(BEGIN\\|begin\\)[\t ]+\\(DATA\\|data\\)"))
+        (setq pspp-end-of-block-found
+              (looking-at "^[ \t]*\\(END\\|end\\)[\t ]+\\(DATA\\|data\\)\."))
+        (setq pspp-start-of-block-found
+              (looking-at "^[ \t]*\\(BEGIN\\|begin\\)[\t ]+\\(DATA\\|data\\)"))
         (forward-line -1))

       (and pspp-start-of-block-found (not pspp-end-of-block-found)))))


-(defconst pspp-indent-width
+(defconst pspp-indent-width ;FIXME: Should be a defcustom.
   2
   "size of indent")


 (defun pspp--downcase-list (l)
-  "Takes a list of strings and returns that list with all elements downcased"
-  (if l
-      (cons (downcase (car l)) (pspp--downcase-list (cdr l)))
-    nil))
+  "Take a list of strings and return that list with all elements downcased."
+  (mapcar #'downcase l))


 (defun pspp--upcase-list (l)
-  "Takes a list of strings and returns that list with all elements upcased"
-  (if l
-      (cons (upcase (car l)) (pspp--upcase-list (cdr l)))
-    nil))
+  "Take a list of strings and return that list with all elements upcased."
+  (mapcar #'upcase l))


 (defun pspp--updown-list (l)
-  "Takes a list of strings and returns that list with all elements upcased
-and downcased"
+  "Take a list of strings and return that list with all elements upcased
+and downcased."
   (append (pspp--upcase-list l) (pspp--downcase-list l)))


@@ -87,9 +80,10 @@ and downcased"
    (regexp-opt (pspp--updown-list '("DO"
                                     "BEGIN"
                                     "LOOP"
-                                    "INPUT")) t)
+                                    "INPUT"))
+               t)
    "[\t ]+")
-  "constructs which cause indentation")
+  "Constructs which cause indentation.")


 (defconst pspp-unindenters
@@ -98,16 +92,17 @@ and downcased"
                                     "DATA"
                                     "LOOP"
                                     "REPEAT"
-                                    "INPUT")) t)
+                                    "INPUT"))
+               t)
    "[\t ]*")
   ;; Note that "END CASE" and "END FILE" do not unindent.
-  "constructs which cause end of indentation")
+  "Constructs which cause end of indentation.")


 (defun pspp-indent-line ()
   "Indent current line as PSPP code."
   (beginning-of-line)
-  (let ((verbatim nil)
+  (let (;; (verbatim nil)
         (the-indent 0) ; Default indent to column 0
         (case-fold-search t))
     (if (bobp)
@@ -127,7 +122,7 @@ and downcased"
               (setq within-command t))))))
       ;; If we're not at the start of a new command, then add an indent.
       (if within-command
-          (set 'the-indent (+ 1 the-indent))))
+          (setq the-indent (+ 1 the-indent))))
     ;; Set the indentation according to the DO - END blocks
     (save-excursion
       (beginning-of-line)
@@ -137,37 +132,38 @@ and downcased"
           (cond ((save-excursion
                    (forward-line -1)
                    (looking-at pspp-indenters))
-                 (set 'the-indent (+ the-indent 1)))
+                 (setq the-indent (+ the-indent 1)))
                 ((looking-at pspp-unindenters)
-                 (set 'the-indent (- the-indent 1)))))
+                 (setq the-indent (- the-indent 1)))))
           (forward-line -1)))

     (save-excursion
       (beginning-of-line)
       (if (looking-at "^[\t ]*ELSE")
-          (set 'the-indent (- the-indent 1))))
+          (setq the-indent (- the-indent 1))))

     ;; Stuff in the data-blocks should be untouched
     (if (not (pspp-data-block-p))
         (indent-line-to (* pspp-indent-width the-indent)))))


 (defun pspp-comment-start-line-p ()
-  "Returns t if the current line is the first line of a comment, nil otherwise"
+  "Return t if the current line is the first line of a comment, nil otherwise"
   (beginning-of-line)
-  (or (looking-at "^\*")
+  (or (looking-at "^\\*")
       (looking-at "^[\t ]*comment[\t ]")
       (looking-at "^[\t ]*COMMENT[\t ]")))


 (defun pspp-comment-end-line-p ()
-  "Returns t if the current line is the candidate for the last line of a comment, nil otherwise"
+  "Return t if the current line is the candidate for the last line of a comment, nil otherwise"
   (beginning-of-line)
   (looking-at ".*\\.[\t ]*$"))


 (defun pspp-comment-p ()
-  "Returns t if point is in a comment. Nil otherwise."
+  "Return t if point is in a comment. Nil otherwise."
+  ;; FIXME: Use `syntax-ppss'?
   (if (pspp-data-block-p)
       nil
     (let ((pspp-comment-start-found nil)
@@ -180,16 +176,16 @@ and downcased"
                     (not pspp-comment-start-found)
                     (not pspp-comment-end-found))
           (beginning-of-line)
-          (if (pspp-comment-start-line-p) (set 'pspp-comment-start-found t))
+          (if (pspp-comment-start-line-p) (setq pspp-comment-start-found t))
           (if (bobp)
-              (set 'pspp-comment-end-found nil)
+              (setq pspp-comment-end-found nil)
             (save-excursion
               (forward-line -1)
-              (if (pspp-comment-end-line-p) (set 'pspp-comment-end-found t))))
-          (set 'lines (forward-line -1))))
+              (if (pspp-comment-end-line-p) (setq pspp-comment-end-found t))))
+          (setq lines (forward-line -1))))

       (save-excursion
-        (set 'pspp-single-line-comment (and
+        (setq pspp-single-line-comment (and
                                         (pspp-comment-start-line-p)
                                         (pspp-comment-end-line-p))))

@@ -198,30 +194,49 @@ and downcased"


 (defvar pspp-mode-syntax-table
-  (let ((x-pspp-mode-syntax-table (make-syntax-table)))
+  (let ((st (make-syntax-table)))

     ;; Special chars allowed in variables
-    (modify-syntax-entry ?# "w" x-pspp-mode-syntax-table)
-    (modify-syntax-entry ?@ "w" x-pspp-mode-syntax-table)
-    (modify-syntax-entry ?$ "w" x-pspp-mode-syntax-table)
+    (modify-syntax-entry ?# "_" st)
+    (modify-syntax-entry ?@ "_" st)
+    (modify-syntax-entry ?$ "_" st)

     ;; Comment syntax
-    ;; This is incomplete, because:
-    ;; a) Comments can also be given by COMMENT
-    ;; b) The sequence .\n* is interpreted incorrectly.
-
-    (modify-syntax-entry ?* ". 2" x-pspp-mode-syntax-table)
-    (modify-syntax-entry ?. ". 3" x-pspp-mode-syntax-table)
-    (modify-syntax-entry ?\n "- 41" x-pspp-mode-syntax-table)
+    ;; See `pspp--syntax-propertize' for the details.
+    (modify-syntax-entry ?* "<" st)
+    (modify-syntax-entry ?. ". 3" st)
+    (modify-syntax-entry ?\n " 34" st)

     ;; String delimiters
-    (modify-syntax-entry ?' "\"" x-pspp-mode-syntax-table)
-    (modify-syntax-entry ?\" "\"" x-pspp-mode-syntax-table)
-
-    x-pspp-mode-syntax-table)
-
-  "Syntax table for pspp-mode")
-
+    (modify-syntax-entry ?' "\"" st)
+    (modify-syntax-entry ?\" "\"" st)
+
+    st)
+
+  "Syntax table for pspp-mode.")
+
+(defconst pspp--syntax-propertize
+  (syntax-propertize-rules
+   ("\\*"
+    (0 (unless (save-excursion
+                 (goto-char (match-beginning 0))
+                 (skip-chars-backward " \t\n")
+                 (memq (char-before) '(nil ?\.)))
+         (string-to-syntax "."))))
+   ("\\_<\\([Cc]\\)[Oo][Mm][Mm][Ee][Nn][Tt]\\_>"
+    (1 (when (save-excursion
+               (goto-char (match-beginning 0))
+               (skip-chars-backward " \t\n")
+               (memq (char-before) '(nil ?\.)))
+         (string-to-syntax "<"))))
+   ;; PSPP, like Pascal, uses '' and "" rather than \' and \" to escape quotes.
+   ("''\\|\"\"" (0 (if (save-excursion
+                         (nth 3 (syntax-ppss (match-beginning 0))))
+                       (string-to-syntax ".")
+                     ;; In case of 3 or more quotes in a row, only advance
+                     ;; one quote at a time.
+                     (forward-char -1)
+                     nil)))))

 (defconst pspp-font-lock-keywords
   (list (cons
@@ -627,29 +642,28 @@ and downcased"
                        "YRMODA")) t)
           "\\>")
          'font-lock-function-name-face)
-   '( "\\<[#$@a-zA-Z][a-zA-Z0-9_]*\\>" . font-lock-variable-name-face))
+   ;; FIXME: The doc at
+   ;; https://www.gnu.org/software/pspp/manual/html_node/Tokens.html
+   ;; does not include `$' in the allowed first chars and it includes
+   ;; `. $ # @' additionally to `_' in the subsequent chars.
+   '( "\\<[#$@[:alpha:]][[:alnum:]_]*\\>" . font-lock-variable-name-face))
   "Highlighting expressions for PSPP mode.")


 ;;;###autoload
-(defun pspp-mode ()
-  (interactive)
-  (kill-all-local-variables)
-  (use-local-map pspp-mode-map)
-  (set-syntax-table pspp-mode-syntax-table)
-
+(define-derived-mode pspp-mode prog-mode "PSPP"
+  "Major mode to edit PSPP files."
   (set (make-local-variable 'font-lock-keywords-case-fold-search) t)
   (set (make-local-variable 'font-lock-defaults) '(pspp-font-lock-keywords))

-  (set (make-local-variable 'indent-line-function) 'pspp-indent-line)
+  (set (make-local-variable 'indent-line-function) #'pspp-indent-line)
   (set (make-local-variable 'comment-start) "* ")
   (set (make-local-variable 'compile-command)
        (concat "pspp "
                buffer-file-name))
-  (setq major-mode 'pspp-mode)
-  (setq mode-name "PSPP")
-  (run-hooks 'pspp-mode-hook))
+  (set (make-local-variable 'syntax-propertize-function)
+       pspp--syntax-propertize))

 (provide 'pspp-mode)
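
To make the comment question concrete, here is a minimal sketch of the rule as the message reads the manual: a comment runs until a `.` that is last on its line, or until an empty line. The function name `pspp-sketch-comment-end` is hypothetical and is not part of pspp-mode.el or of the patch, and the rule itself is only the reading quoted in the message, not confirmed PSPP behavior.

```elisp
;; Hypothetical helper, not part of pspp-mode.el or the patch above.
;; Assumption: a comment ends at a `.' that is last on its line
;; (ignoring trailing blanks) or just before an empty line.
(defun pspp-sketch-comment-end (start)
  "Return the position where a comment beginning at START would end.
The result is the position just after a line-final `.', or the end
of the last comment line before an empty line, or `point-max' when
neither terminator is found."
  (save-excursion
    (goto-char start)
    (cond
     ;; No terminator at all: the comment runs to the end of the buffer.
     ((not (re-search-forward "\\.[ \t]*$\\|\n[ \t]*$" nil t))
      (point-max))
     ;; The match starts with a dot: the comment ends right after it.
     ((eq (char-after (match-beginning 0)) ?.)
      (1+ (match-beginning 0)))
     ;; Otherwise an empty line follows: stop at the end of the
     ;; previous line.
     (t (match-beginning 0)))))

;; Usage on the excerpt from t-test.sps quoted in the message.
(with-temp-buffer
  (insert "* Females have gender 0\n"
          "* Create 8 female cases\n"
          "loop #i = 1 to 8.\n")
  (buffer-substring (point-min)
                    (pspp-sketch-comment-end (point-min))))
;; The returned text includes the `loop' line.
```

Under this reading the comment swallows `loop #i = 1 to 8.`, which is the mismatch the message asks about: either the rule is incomplete or the file relies on some other terminator.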