From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Masatake YAMATO
Newsgroups: gmane.emacs.devel
Subject: request for reviewing the updated version of cc-guess.el
Date: Fri, 11 Feb 2011 05:23:37 +0900 (JST)
Organization: Red Hat Japan, Inc.
Message-ID: <20110211.052337.141994890934352619.yamato@redhat.com>
NNTP-Posting-Host: lo.gmane.org
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Trace: dough.gmane.org 1297371705 17246 80.91.229.12 (10 Feb 2011 21:01:45 GMT)
X-Complaints-To: usenet@dough.gmane.org
NNTP-Posting-Date: Thu, 10 Feb 2011 21:01:45 +0000 (UTC)
Cc: emacs-devel@gnu.org
To: Alan Mackenzie
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Thu Feb 10 22:01:40 2011
Return-path:
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1Pnddy-0004IX-DY for ged-emacs-devel@m.gmane.org; Thu, 10 Feb 2011 22:01:39 +0100
Original-Received: from localhost ([127.0.0.1]:54098 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Pnddx-0007Vh-KH for ged-emacs-devel@m.gmane.org; Thu, 10 Feb 2011 16:01:37 -0500
Original-Received: from [140.186.70.92] (port=48988 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Pnd3I-0003Mm-OM for emacs-devel@gnu.org; Thu, 10 Feb 2011 15:23:46 -0500
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Pnd3G-0003fx-6o for emacs-devel@gnu.org; Thu, 10 Feb 2011 15:23:44 -0500
Original-Received: from mx1.redhat.com ([209.132.183.28]:1748) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Pnd3F-0003fj-Rk for emacs-devel@gnu.org; Thu, 10 Feb 2011 15:23:42 -0500
Original-Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p1AKNetp013131 (version=TLSv1/SSLv3
    cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 10 Feb 2011 15:23:40 -0500
Original-Received: from localhost (beach.nrt.redhat.com [10.64.200.71]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id p1AKNcWw005786; Thu, 10 Feb 2011 15:23:39 -0500
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.22
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 209.132.183.28
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:135863
Archived-At:

Hi,

After seven years, I have updated cc-guess.el as suggested by Martin
Stjernholm.

Currently cc-guess.el is not included in the official cc-mode release,
and as a result it is not included in GNU Emacs. Could you review the
updated version and consider including it in the release?

My original post:

    http://sourceforge.net/mailarchive/message.php?msg_id=8994118

Martin's suggestions and my answer:

    http://sourceforge.net/mailarchive/message.php?msg_id=8994526

Martin's suggestions:

S1. Thresholds for examining (sampling the indentation of) the buffer.
S2. Setting only c-basic-offset as the result of guessing.
S3. Key bindings for the commands defined in cc-guess.el.

My answers in the updated code:

A1. I introduced `cc-guess-offset-threshold'.
A2. In addition to the offsets-alist, the updated version of
    cc-guess.el guesses the basic-offset value.
A3. I removed the key bindings.

Masatake YAMATO

2011-02-11  Masatake YAMATO

        * cc-guess.el (cc-guess): Don't limit the region if
        `cc-guess-region-max' is nil.
        (cc-guess-no-install): Ditto.

2011-01-18  Masatake YAMATO

        * cc-guess.el (cc-guess-make-basic-offset): Don't use sort to find
        the majority.

2011-01-18  Masatake YAMATO

        * cc-guess.el (cc-guess-style-name-p): New function.

2011-01-16  Masatake YAMATO

        * cc-guess.el: Use `+', `-', `++', `--', `*', or `/'.
        (cc-guess-symbolize-offsets-alist): New function.
        (cc-guess-symbolize-integer): New function.
        (cc-guess-make-style): New function.
        (cc-guess-view-guessed-style): New function.

2011-01-16  Masatake YAMATO

        * cc-guess.el (cc-guess-no-install): New function.
        (cc-guess-buffer-no-install): New function.
        (cc-guess-region-no-install): New function.
        (cc-guess-make-basic-offset): Ignore `c' syntax-symbol.
        (cc-guess-style-name): New function.
        (cc-guess-install): Don't set `c-offsets-alist'.  Instead define a
        style and use it.
        (cc-guess-view-guessed-values): Renamed from
        `cc-guess-view-offsets-alist'.  Print also `basic-offset'.

2011-01-16  Masatake YAMATO

        * cc-guess.el: Use term `offset' instead of `delta'.
        Temporarily don't use the term `style'.  Instead use term
        `basic-offset' or `offset-alist'.
        (cc-guess-offset-threshold): Renamed from
        `cc-guess-delta-threshold'.
        (cc-guess-accumulate-offset): Renamed from
        `cc-guess-accumulate-delta'.
        (cc-guess-guessed-style): Removed.
        (cc-guess-make-offsets-alist): Renamed from `cc-guess-make-style'.
        (cc-guess-merge-offsets-alists): Renamed from
        `cc-guess-merge-styles'.
        (cc-guess-make-basic-offset): New function.
        (cc-guess-guessed-offsets-alist): New variable.
        (cc-guess-guessed-basic-offset): New variable.
        (cc-guess-view-accumulator): New function for debugging.
        (cc-guess-reset-accumulator): New function.
        (cc-guess-view-offsets-alist): Renamed from `cc-guess-view-style'.

2011-01-13  Masatake YAMATO

        * cc-guess.el (cc-guess-region-max): New option.
        (cc-guess-buffer): New function.
        (cc-guess): Limit the region for examining indentation by
        `cc-guess-region-max'.

2011-01-13  Masatake YAMATO

        * cc-guess.el (cc-guess-accumulate): New function.
        (cc-guess-region): Use `cc-guess-accumulate'.

2011-01-13  Masatake YAMATO

        * cc-guess.el: s/delta-accumulator/accumulator/g.

2011-01-13  Masatake YAMATO

        * cc-guess.el (cc-guess-empty-line-p): New subroutine.
        (cc-guess-region): Handle multiple symbols returned from
        `c-guess-basic-syntax'.  Use `cc-guess-empty-line-p'.

2011-01-13  Masatake YAMATO

        * cc-guess.el (cc-guess-guessed-style): Renamed from
        `cc-guessed-style'.

2011-01-13  Masatake YAMATO

        * cc-guess.el (cc-guess-delta-threshold): New option.
        (cc-guess-current-delta): New function derived from
        `cc-guess-region'.
        (cc-guess-region): Don't accumulate a sampled indentation if it is
        greater than `cc-guess-delta-threshold'.

Local Variables:
mode: change-log
End:

;;; cc-guess.el --- guess indentation values by scanning existing code

;; Copyright (C) 1985,1987,1992-2003, 2004, 2005, 2006 Free Software
;; Foundation, Inc.

;; Author: 1994-1995 Barry A. Warsaw
;; Maintainer: Unmaintained
;; Created: August 1994, split from cc-mode.el
;; Version: See cc-mode.el
;; Keywords: c languages oop

;; This file is not part of GNU Emacs.

;; This program is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 2 of the License, or
;; (at your option) any later version.
;;
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with this program; see the file COPYING.  If not, write to
;; the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
;; Boston, MA 02110-1301, USA.

;;; Commentary:
;;
;; This file contains routines that help guess the cc-mode style in a
;; particular region/buffer.  Here style means `offsets-alist' and
;; `basic-offset'.
;;
;; The main entry point of this program is the `cc-guess' command, but
;; there are some variants.
;;
;; Suppose the major mode of the current buffer is one of the modes
;; provided by cc-mode.  `cc-guess' guesses the indentation style by
;; examining the indentation in the region that runs from the
;; beginning of the buffer up to the position given by
;; `cc-guess-region-max', and installs the guessed style.  The name of
;; the installed style is given by `cc-guess-style-name'.
;;
;; `cc-guess-buffer' does the same for the whole buffer.
;; `cc-guess-region' does the same for the region between point and
;; mark.  `cc-guess-no-install', `cc-guess-buffer-no-install' and
;; `cc-guess-region-no-install' guess the indentation style but don't
;; install it.

;;; Code:

(eval-when-compile
  (let ((load-path
         (if (and (boundp 'byte-compile-dest-file)
                  (stringp byte-compile-dest-file))
             (cons (file-name-directory byte-compile-dest-file) load-path)
           load-path)))
    (load "cc-bytecomp" nil t)))

(cc-require 'cc-defs)
(cc-require 'cc-engine)

(defcustom cc-guess-offset-threshold 10
  "Threshold of acceptable offset when examining indent information.
Discard an examined offset if its absolute value is greater than or
equal to this.  The offset of a line is measured relative to the
anchor position included in the indent information returned by
`c-guess-basic-syntax'."
  :type 'integer
  :group 'c)

(defcustom cc-guess-region-max 50000
  "The maximum point of region for examining indent information with `cc-guess'.
It takes a long time to examine indent information from a large region.
This option helps you limit the examining time.  nil means no limit."
  :type '(choice (const :tag "No limit" nil) integer)
  :group 'c)

(defvar cc-guess-guessed-offsets-alist nil
  "Currently guessed offsets-alist.  Buffer local.")

(defvar cc-guess-guessed-basic-offset nil
  "Currently guessed basic-offset.  Buffer local.")

(defvar cc-guess-accumulator nil)
;; Accumulated examined indent information.  Information is represented
;; in a list.  Each element in it has the following structure:
;;
;;  (syntactic-symbol ((indentation-offset1 . number-of-times1)
;;                     (indentation-offset2 .
;;                      number-of-times2)
;;                     ...))
;;
;; This structure is built by `cc-guess-accumulate-offset'.
;;
;; Here we call the pair (indentation-offset1 . number-of-times1) a
;; counter.  `cc-guess-sort-accumulator' sorts the order of
;; counters by number-of-times.
;; Use `cc-guess-view-accumulator' to see the value.

(defconst cc-guess-conversions
  '((c . c-lineup-C-comments)
    (inher-cont . c-lineup-multi-inher)
    (string . -1000)
    (comment-intro . c-lineup-comment)
    (arglist-cont-nonempty . c-lineup-arglist)
    (arglist-close . c-lineup-close-paren)
    (cpp-macro . -1000)))

(defun cc-guess (&optional accumulate)
  "Apply `cc-guess-region' on the region limited by `cc-guess-region-max'.

If given a prefix argument (or if the optional argument ACCUMULATE is
non-nil) then the previous guess is extended, otherwise a new guess is
made from scratch."
  (interactive "P")
  (cc-guess-region (point-min)
                   (min (point-max) (or cc-guess-region-max
                                        (point-max)))
                   accumulate))

(defun cc-guess-no-install (&optional accumulate)
  "Apply `cc-guess-region-no-install' on the region limited by `cc-guess-region-max'.

If given a prefix argument (or if the optional argument ACCUMULATE is
non-nil) then the previous guess is extended, otherwise a new guess is
made from scratch."
  (interactive "P")
  (cc-guess-region-no-install (point-min)
                              (min (point-max) (or cc-guess-region-max
                                                   (point-max)))
                              accumulate))

(defun cc-guess-buffer (&optional accumulate)
  "Apply `cc-guess-region' on the whole current buffer.

If given a prefix argument (or if the optional argument ACCUMULATE is
non-nil) then the previous guess is extended, otherwise a new guess is
made from scratch."
  (interactive "P")
  (cc-guess-region (point-min)
                   (point-max)
                   accumulate))

(defun cc-guess-buffer-no-install (&optional accumulate)
  "Apply `cc-guess-region-no-install' on the whole current buffer.

If given a prefix argument (or if the optional argument ACCUMULATE is
non-nil) then the previous guess is extended, otherwise a new guess is
made from scratch."
  (interactive "P")
  (cc-guess-region-no-install (point-min)
                              (point-max)
                              accumulate))

(defun cc-guess-region (start end &optional accumulate)
  "Call `cc-guess-region-no-install' and install the guessed style."
  (interactive "r\nP")
  (cc-guess-region-no-install start end accumulate)
  (cc-guess-install))

(defun cc-guess-region-no-install (start end &optional accumulate)
  "Guess the indentation style by examining the indentation in a region of code.

Every line of code in the region is examined and the values for the
following two variables are guessed:

* `c-basic-offset', and
* the indentation values of the various syntactic symbols in
  `c-offsets-alist'.

The guessed values are put into `cc-guess-guessed-basic-offset' and
`cc-guess-guessed-offsets-alist'.

Frequencies of use are taken into account when guessing, so minor
inconsistencies in the indentation style shouldn't produce wrong
guesses.

If given a prefix argument (or if the optional argument ACCUMULATE is
non-nil) then the previous examination is extended, otherwise a new
guess is made from scratch.

Note that the larger the region to guess in, the slower the guessing.
So you can limit the region with `cc-guess-region-max'."
  (interactive "r\nP")
  ;;
  ;; Examining stage
  ;;
  (let ((accumulator (when accumulate cc-guess-accumulator))
        (reporter (when (fboundp 'make-progress-reporter)
                    (make-progress-reporter "Examining Indentation "
                                            start
                                            end))))
    (save-excursion
      (goto-char start)
      (while (< (point) end)
        (unless (cc-guess-empty-line-p)
          (mapc (lambda (s)
                  (setq accumulator (or (cc-guess-accumulate accumulator s)
                                        accumulator)))
                (c-save-buffer-state () (c-guess-basic-syntax))))
        (when reporter (progress-reporter-update reporter (point)))
        (forward-line 1)))
    (when reporter (progress-reporter-done reporter))
    (setq cc-guess-accumulator (cc-guess-sort-accumulator accumulator)))
  ;;
  ;; Guessing stage
  ;;
  (let* ((basic-offset (cc-guess-make-basic-offset cc-guess-accumulator))
         (typical-offsets-alist (cc-guess-make-offsets-alist
                                 cc-guess-accumulator))
         (symbolic-offsets-alist (cc-guess-symbolize-offsets-alist
                                  typical-offsets-alist
                                  basic-offset))
         ;; `copy-sequence' is built in, so no `cl' function is needed
         ;; at run time to make the shallow copy.
         (merged-offsets-alist (cc-guess-merge-offsets-alists
                                (copy-sequence cc-guess-conversions)
                                symbolic-offsets-alist)))
    (set (make-local-variable 'cc-guess-guessed-basic-offset) basic-offset)
    (set (make-local-variable 'cc-guess-guessed-offsets-alist)
         merged-offsets-alist)))

(defsubst cc-guess-empty-line-p ()
  (eq (line-beginning-position)
      (line-end-position)))

(defun cc-guess-current-offset (relpos)
  ;; Calculate the indentation of the current line relative to RELPOS.
  (- (progn (back-to-indentation)
            (current-column))
     (save-excursion
       (goto-char relpos)
       (current-column))))

(defun cc-guess-accumulate (accumulator syntax-element)
  ;; Add SYNTAX-ELEMENT to ACCUMULATOR.  Return nil if nothing was added.
  (let ((symbol (car syntax-element))
        (relpos (cadr syntax-element)))
    (when (numberp relpos)
      (let ((offset (cc-guess-current-offset relpos)))
        (when (< (abs offset) cc-guess-offset-threshold)
          (cc-guess-accumulate-offset accumulator symbol offset))))))

(defun cc-guess-accumulate-offset (accumulator symbol offset)
  ;; Add SYMBOL and OFFSET to ACCUMULATOR.  See
  ;; `cc-guess-accumulator' about the structure of ACCUMULATOR.
  (let* ((entry (assoc symbol accumulator))
         (counters (cdr entry))
         counter)
    (if entry
        (progn
          (setq counter (assoc offset counters))
          (if counter
              (setcdr counter (1+ (cdr counter)))
            (setq counters (cons (cons offset 1) counters))
            (setcdr entry counters))
          accumulator)
      (cons (cons symbol (cons (cons offset 1) nil)) accumulator))))

(defun cc-guess-sort-accumulator (accumulator)
  ;; Sort each element of ACCUMULATOR by number-of-times.  See
  ;; `cc-guess-accumulator' for more details.
  (mapcar
   (lambda (entry)
     (let ((symbol (car entry))
           (counters (cdr entry)))
       (cons symbol
             (sort counters
                   (lambda (a b)
                     (if (> (cdr a) (cdr b))
                         t
                       (and (eq (cdr a) (cdr b))
                            (< (car a) (car b)))))))))
   accumulator))

(defun cc-guess-make-offsets-alist (accumulator)
  ;; Throw away the rare cases in ACCUMULATOR and make an offsets-alist
  ;; structure.
  (mapcar
   (lambda (entry)
     (cons (car entry)
           (car (car (cdr entry)))))
   accumulator))

(defun cc-guess-merge-offsets-alists (strong weak)
  ;; Merge two offsets-alists into one.  When both offsets-alists have
  ;; an entry for the same symbol, STRONG takes priority over WEAK.
  (mapc
   (lambda (weak-elt)
     (unless (assoc (car weak-elt) strong)
       (setq strong (cons weak-elt strong))))
   weak)
  strong)

(defun cc-guess-make-basic-offset (accumulator)
  ;; As `basic-offset', find the most frequently appearing
  ;; indentation-offset in ACCUMULATOR.
  (let* (;; Drop the value related to the `c' syntactic-symbol.
         ;; (`c': Inside a multiline C style block comment.)
         ;; The impact of the values for `c' is too large for guessing
         ;; `basic-offset' if the target source file is small and its
         ;; license notice is at the top of the file.
         (accumulator (assq-delete-all 'c (copy-sequence accumulator)))
         ;; Drop syntactic-symbols from ACCUMULATOR.
         (alist (apply #'append
                       (mapcar (lambda (elts)
                                 (mapcar (lambda (elt)
                                           (cons (abs (car elt))
                                                 (cdr elt)))
                                         (cdr elts)))
                               accumulator)))
         ;; Gather all indentation-offsets other than 0.
         ;; 0 is meaningless as `basic-offset'.
         (offset-list (delete 0
                              (delete-dups
                               (mapcar (lambda (elt) (car elt))
                                       alist))))
         ;; Sum of number-of-times for offset:
         ;;  (offset . sum)
         (summed (mapcar (lambda (offset)
                           (cons offset
                                 (apply #'+
                                        (mapcar (lambda (a)
                                                  (if (eq (car a) offset)
                                                      (cdr a)
                                                    0))
                                                alist))))
                         offset-list)))
    ;;
    ;; Find the majority.
    ;;
    (let ((majority '(nil . 0)))
      (while summed
        (when (< (cdr majority) (cdr (car summed)))
          (setq majority (car summed)))
        (setq summed (cdr summed)))
      (car majority))))

(defun cc-guess-symbolize-offsets-alist (offsets-alist basic-offset)
  ;; Convert the representation of OFFSETS-ALIST to an alist using
  ;; `+', `-', `++', `--', `*', or `/'.  These symbols represent
  ;; a value relative to BASIC-OFFSET.  See the CC Mode Info manual
  ;; for details of these symbols.
  (mapcar
   (lambda (elt)
     (let ((s (car elt))
           (v (cdr elt)))
       (cond
        ((integerp v)
         (cons s (cc-guess-symbolize-integer v basic-offset)))
        (t elt))))
   offsets-alist))

(defun cc-guess-symbolize-integer (int basic-offset)
  ;; Convert INT to a symbol relative to BASIC-OFFSET when possible.
  (let ((aint (abs int)))
    (cond
     ;; BASIC-OFFSET is nil when no non-zero offset was sampled; there
     ;; is then nothing to relate INT to, so keep it as a number.
     ((null basic-offset) int)
     ((eq int basic-offset) '+)
     ((eq aint basic-offset) '-)
     ((eq int (* 2 basic-offset)) '++)
     ((eq aint (* 2 basic-offset)) '--)
     ((eq (* 2 int) basic-offset) '*)
     ;; Negative half of BASIC-OFFSET is `/', not `-'.
     ((eq (* 2 aint) basic-offset) '/)
     (t int))))

(defun cc-guess-style-name-p (name)
  "Return t if NAME is the name of a style created by cc-guess."
  (string-prefix-p "*cc-guess*:" name))

(defun cc-guess-style-name ()
  ;; Make a style name for the guessed style.
  (format "*cc-guess*:%s" (buffer-file-name)))

(defun cc-guess-make-style ()
  ;; Make a style from guessed values.
  (when cc-guess-guessed-offsets-alist
    (let* ((basic-offset cc-guess-guessed-basic-offset)
           (offsets-alist (cc-guess-merge-offsets-alists
                           cc-guess-guessed-offsets-alist
                           c-offsets-alist)))
      `((c-basic-offset . ,basic-offset)
        (c-offsets-alist . ,offsets-alist)))))

(defun cc-guess-install ()
  "Define the indentation style from the last guessed values and use it.
Here guessed values mean `cc-guess-guessed-basic-offset' and
`cc-guess-guessed-offsets-alist'.

When defining the style from `cc-guess-guessed-offsets-alist',
`c-offsets-alist' is also merged into the style.  However,
`cc-guess-guessed-offsets-alist' takes precedence over
`c-offsets-alist'.

The style name is given by `cc-guess-style-name'."
  (interactive)
  (let ((style (cc-guess-make-style)))
    (if style
        (c-add-style (cc-guess-style-name) style t)
      (error "Not yet guessed"))))

(defun cc-guess-view-accumulator ()
  "Show `cc-guess-accumulator'."
  (interactive)
  (with-output-to-temp-buffer "*Accumulated Examined Indent Information*"
    (pp cc-guess-accumulator)))

(defun cc-guess-reset-accumulator ()
  "Reset `cc-guess-accumulator'."
  (interactive)
  ;; Clear the accumulator variable itself, not a variable named after
  ;; this command.
  (setq cc-guess-accumulator nil))

(defun cc-guess-view-guessed-values ()
  "Show `cc-guess-guessed-basic-offset' and `cc-guess-guessed-offsets-alist'."
  (interactive)
  (with-output-to-temp-buffer "*Guessed Values*"
    (princ "basic-offset: \n\t")
    (pp cc-guess-guessed-basic-offset)
    (princ "\n\n")
    (princ "offsets-alist: \n")
    (pp cc-guess-guessed-offsets-alist)))

(defun cc-guess-view-guessed-style ()
  "Show the guessed style."
  (interactive)
  (let ((style (cc-guess-make-style)))
    (if style
        (with-output-to-temp-buffer "*Guessed Style*"
          (pp style))
      (error "Not yet guessed"))))

(cc-provide 'cc-guess)
;;; cc-guess.el ends here
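
For readers who want to see the accumulator layout and the majority vote in isolation, here is a minimal, self-contained sketch. It is not the code from cc-guess.el: the `my-guess-demo-' names and the sample data are made up for illustration, and it should run in plain Emacs without CC Mode loaded. It keeps the same two ideas as the file: count how often each offset is seen per syntactic symbol, then pick the non-zero absolute offset with the largest total as the basic offset.

```elisp
;; Illustrative sketch only.  The `my-guess-demo-' names are
;; hypothetical and are not defined by cc-guess.el.

(defun my-guess-demo-accumulate (accumulator symbol offset)
  "Return ACCUMULATOR with one more sighting of OFFSET recorded for SYMBOL.
Each element of ACCUMULATOR looks like (SYMBOL (OFFSET . COUNT) ...)."
  (let* ((entry (assq symbol accumulator))
         (counter (and entry (assoc offset (cdr entry)))))
    (cond
     ;; Known symbol and known offset: bump the counter.
     (counter
      (setcdr counter (1+ (cdr counter)))
      accumulator)
     ;; Known symbol, new offset: add a fresh counter.
     (entry
      (setcdr entry (cons (cons offset 1) (cdr entry)))
      accumulator)
     ;; New symbol: add a fresh entry.
     (t
      (cons (list symbol (cons offset 1)) accumulator)))))

(defun my-guess-demo-majority (accumulator)
  "Return the non-zero absolute offset seen most often in ACCUMULATOR.
Return nil if no non-zero offset was recorded."
  (let ((totals nil)
        (best nil))
    ;; Sum the counts per absolute offset across all symbols.
    (dolist (entry accumulator)
      (dolist (counter (cdr entry))
        (let* ((offset (abs (car counter)))
               (total (assoc offset totals)))
          (unless (zerop offset)
            (if total
                (setcdr total (+ (cdr total) (cdr counter)))
              (push (cons offset (cdr counter)) totals))))))
    ;; Pick the offset with the largest total.
    (dolist (total totals)
      (when (or (null best) (> (cdr total) (cdr best)))
        (setq best total)))
    (car best)))

(defvar my-guess-demo-result
  (let ((acc nil))
    (dolist (sample '((statement-block-intro . 4)
                      (statement-block-intro . 4)
                      (defun-block-intro . 4)
                      (statement-block-intro . 2)
                      (topmost-intro . 0)))
      (setq acc (my-guess-demo-accumulate acc (car sample) (cdr sample))))
    (my-guess-demo-majority acc))
  "Majority offset for the made-up samples above.")
;; my-guess-demo-result => 4
```

With the file itself loaded in a CC Mode buffer, the corresponding interactive workflow would be M-x cc-guess-no-install to sample the buffer, M-x cc-guess-view-guessed-style to inspect the result, and M-x cc-guess-install to apply it.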