From: Masatake YAMATO <yamato@redhat.com>
To: Alan Mackenzie <acm@muc.de>
Cc: emacs-devel@gnu.org
Subject: request for reviewing the updated version of cc-guess.el
Date: Fri, 11 Feb 2011 05:23:37 +0900 (JST)
Message-ID: <20110211.052337.141994890934352619.yamato@redhat.com>
Hi,
It has taken me 7 years, but I've updated cc-guess.el as suggested by
Martin Stjernholm.
Currently cc-guess.el is not included in the official cc-mode release
and, as a result, it is not included in GNU Emacs.
Could you review the updated version and consider including it in
the release?
My original post:
http://sourceforge.net/mailarchive/message.php?msg_id=8994118
Martin's suggestion and my answer:
http://sourceforge.net/mailarchive/message.php?msg_id=8994526
Martin's suggestions:
S1. Thresholds for examining (sampling the indentation of) the buffer.
S2. Setting only c-basic-offset as the result of guessing.
S3. Key bindings for the commands defined in cc-guess.el.
My answers in the updated code:
A1. I introduced `cc-guess-offset-threshold'.
A2. In addition to offset-alist, the updated version of cc-guess.el
guesses the basic-offset value.
A3. I removed the key bindings.
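For anyone who wants to try the updated file, here is a rough usage sketch. It assumes cc-guess.el from this message has been saved somewhere on `load-path'; `my-cc-guess-preview' is a hypothetical helper name, not something the file defines.

```elisp
;; Sketch only: assumes cc-guess.el from this message is on `load-path'.
(require 'cc-mode)
(require 'cc-guess)

;; Tighten the threshold for acceptable sampled offsets (default: 10).
(setq cc-guess-offset-threshold 8)
;; Let `cc-guess' examine the whole buffer instead of stopping at
;; buffer position 50000 (the default of `cc-guess-region-max').
(setq cc-guess-region-max nil)

(defun my-cc-guess-preview ()
  "Guess the style of the current buffer and show it without installing it.
Hypothetical helper for illustration; not part of cc-guess.el."
  (interactive)
  (cc-guess-no-install)
  (cc-guess-view-guessed-values))
```

Running `M-x my-cc-guess-preview' in a C buffer should then show the guessed values; `M-x cc-guess-install' afterwards defines and selects the guessed style.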
Masatake YAMATO
2011-02-11 Masatake YAMATO <yamato@redhat.com>
* cc-guess.el (cc-guess): Don't limit the region
if `cc-guess-region-max' is nil.
(cc-guess-no-install): Ditto.
2011-01-18 Masatake YAMATO <yamato@redhat.com>
* cc-guess.el (cc-guess-make-basic-offset): Don't use sort
to find the majority.
2011-01-18 Masatake YAMATO <yamato@redhat.com>
* cc-guess.el (cc-guess-style-name-p): New function.
2011-01-16 Masatake YAMATO <yamato@redhat.com>
* cc-guess.el: Use `+', `-', `++', `--', `*', or `/'.
(cc-guess-symbolize-offsets-alist): New function.
(cc-guess-symbolize-integer): New function.
(cc-guess-make-style): New function.
(cc-guess-view-guessed-style): New function.
2011-01-16 Masatake YAMATO <yamato@redhat.com>
* cc-guess.el (cc-guess-no-install): New function.
(cc-guess-buffer-no-install): New function.
(cc-guess-region-no-install): New function.
(cc-guess-make-basic-offset): Ignore `c' syntax-symbol.
(cc-guess-style-name): New function.
(cc-guess-install): Don't set `c-offsets-alist'. Instead
define a style and use it.
(cc-guess-view-guessed-values): Renamed from
`cc-guess-view-offsets-alist'.
Print also `basic-offset'.
2011-01-16 Masatake YAMATO <yamato@redhat.com>
* cc-guess.el: Use term `offset' instead of `delta'.
Temporarily don't use the term `style'. Instead use the terms
`basic-offset' or `offset-alist'.
(cc-guess-offset-threshold): Renamed from `cc-guess-delta-threshold'.
(cc-guess-accumulate-offset): Renamed from `cc-guess-accumulate-delta'.
(cc-guess-guessed-style): Removed.
(cc-guess-make-offsets-alist): Renamed from `cc-guess-make-style'.
(cc-guess-merge-offsets-alists): Renamed from `cc-guess-merge-styles'.
(cc-guess-make-basic-offset): New function.
(cc-guess-guessed-offsets-alist): New variable.
(cc-guess-guessed-basic-offset): New variable.
(cc-guess-view-accumulator): New function for debugging.
(cc-guess-reset-accumulator): New function.
(cc-guess-view-offsets-alist): Renamed from `cc-guess-view-style'.
2011-01-13 Masatake YAMATO <yamato@redhat.com>
* cc-guess.el (cc-guess-region-max): New option.
(cc-guess-buffer): New function.
(cc-guess): Limit the region for examining indentation
by `cc-guess-region-max'.
2011-01-13 Masatake YAMATO <yamato@redhat.com>
* cc-guess.el (cc-guess-accumulate): New function.
(cc-guess-region): Use `cc-guess-accumulate'.
2011-01-13 Masatake YAMATO <yamato@redhat.com>
* cc-guess.el: s/delta-accumulator/accumulator/g.
2011-01-13 Masatake YAMATO <yamato@redhat.com>
* cc-guess.el (cc-guess-empty-line-p): New subroutine.
(cc-guess-region): Handle multiple symbols returned from
`c-guess-basic-syntax'. Use `cc-guess-empty-line-p'.
2011-01-13 Masatake YAMATO <yamato@redhat.com>
* cc-guess.el (cc-guess-guessed-style): Renamed from
`cc-guessed-style'.
2011-01-13 Masatake YAMATO <yamato@redhat.com>
* cc-guess.el (cc-guess-delta-threshold): New option.
(cc-guess-current-delta): New function derived from
`cc-guess-region'.
(cc-guess-region): Don't accumulate a sampled indentation if
it is greater than `cc-guess-delta-threshold'.
Local Variables:
mode: change-log
End:
;;; cc-guess.el --- guess indentation values by scanning existing code
;; Copyright (C) 1985,1987,1992-2003, 2004, 2005, 2006 Free Software
;; Foundation, Inc.
;; Author: 1994-1995 Barry A. Warsaw
;; Maintainer: Unmaintained
;; Created: August 1994, split from cc-mode.el
;; Version: See cc-mode.el
;; Keywords: c languages oop
;; This file is not part of GNU Emacs.
;; This program is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 2 of the License, or
;; (at your option) any later version.
;;
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with this program; see the file COPYING. If not, write to
;; the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
;; Boston, MA 02110-1301, USA.
;;; Commentary:
;;
;; This file contains routines that help guess the cc-mode style in a
;; particular region/buffer. Here style means `offsets-alist' and
;; `basic-offset'.
;;
;; The main entry point of this program is the `cc-guess' command, but
;; there are some variants.
;;
;; Suppose the major mode of the current buffer is one of the modes
;; provided by cc-mode. `cc-guess' guesses the indentation style by
;; examining the indentation in the region that starts at the
;; beginning of the buffer and ends at the position given by
;; `cc-guess-region-max', and installs the guessed style. The name of
;; the installed style is given by `cc-guess-style-name'.
;; `cc-guess-buffer' does the same but on the whole buffer.
;; `cc-guess-region' does the same but on the region between point
;; and mark. `cc-guess-no-install', `cc-guess-buffer-no-install'
;; and `cc-guess-region-no-install' guess the indentation style but
;; don't install it.
;;; Code:
(eval-when-compile
(let ((load-path
(if (and (boundp 'byte-compile-dest-file)
(stringp byte-compile-dest-file))
(cons (file-name-directory byte-compile-dest-file) load-path)
load-path)))
(load "cc-bytecomp" nil t)))
(cc-require 'cc-defs)
(cc-require 'cc-engine)
(defcustom cc-guess-offset-threshold 10
"Threshold of acceptable offsets when examining indent information.
Discard an examined offset if its absolute value is greater than this.
The offset of a line is derived from the indent information returned
by `c-guess-basic-syntax'."
:type 'integer
:group 'c)
(defcustom cc-guess-region-max 50000
"The maximum buffer position of the region examined by `cc-guess'.
Examining indent information in a large region takes a long time.
This option lets you limit the examining time. nil means no limit."
:type '(choice (const :tag "No limit" nil) integer)
:group 'c)
(defvar cc-guess-guessed-offsets-alist nil
"Currently guessed offsets-alist. Buffer local.")
(defvar cc-guess-guessed-basic-offset nil
"Currently guessed basic-offset. Buffer local.")
(defvar cc-guess-accumulator nil)
;; Accumulated examined indent information. The information is
;; represented as a list. Each element has the following structure:
;;
;; (syntactic-symbol (indentation-offset1 . number-of-times1)
;;                   (indentation-offset2 . number-of-times2)
;;                   ...)
;;
;; This structure is built by `cc-guess-accumulate-offset'.
;;
;; Here we call the pair (indentation-offset1 . number-of-times1) a
;; counter. `cc-guess-sort-accumulator' sorts the counters of each
;; element by number-of-times, most frequent first.
;; Use `cc-guess-view-accumulator' to see the value.
(defconst cc-guess-conversions
'((c . c-lineup-C-comments)
(inher-cont . c-lineup-multi-inher)
(string . -1000)
(comment-intro . c-lineup-comment)
(arglist-cont-nonempty . c-lineup-arglist)
(arglist-close . c-lineup-close-paren)
(cpp-macro . -1000)))
(defun cc-guess (&optional accumulate)
"Apply `cc-guess-region' on the region limited by `cc-guess-region-max'.
If given a prefix argument (or if the optional argument ACCUMULATE is
non-nil) then the previous guess is extended, otherwise a new guess is
made from scratch."
(interactive "P")
(cc-guess-region (point-min)
(min (point-max) (or cc-guess-region-max
(point-max)))
accumulate))
(defun cc-guess-no-install (&optional accumulate)
"Apply `cc-guess-region-no-install' on the region limited by `cc-guess-region-max'.
If given a prefix argument (or if the optional argument ACCUMULATE is
non-nil) then the previous guess is extended, otherwise a new guess is
made from scratch."
(interactive "P")
(cc-guess-region-no-install (point-min)
(min (point-max) (or cc-guess-region-max
(point-max)))
accumulate))
(defun cc-guess-buffer (&optional accumulate)
"Apply `cc-guess-region' on the whole current buffer.
If given a prefix argument (or if the optional argument ACCUMULATE is
non-nil) then the previous guess is extended, otherwise a new guess is
made from scratch."
(interactive "P")
(cc-guess-region (point-min)
(point-max)
accumulate))
(defun cc-guess-buffer-no-install (&optional accumulate)
"Apply `cc-guess-region-no-install' on the whole current buffer.
If given a prefix argument (or if the optional argument ACCUMULATE is
non-nil) then the previous guess is extended, otherwise a new guess is
made from scratch."
(interactive "P")
(cc-guess-region-no-install (point-min)
(point-max)
accumulate))
(defun cc-guess-region (start end &optional accumulate)
"Call `cc-guess-region-no-install' and install the guessed style."
(interactive "r\nP")
(cc-guess-region-no-install start end accumulate)
(cc-guess-install))
(defun cc-guess-region-no-install (start end &optional accumulate)
"Guess the indentation style by examining the indentation in a region of code.
Every line of code in the region is examined and the values for the following
two variables are guessed:
* `c-basic-offset', and
* the indentation values of the various syntactic symbols in
`c-offsets-alist'.
The guessed values are put into `cc-guess-guessed-basic-offset' and
`cc-guess-guessed-offsets-alist'.
Frequencies of use are taken into account when guessing, so minor inconsistencies
in the indentation style shouldn't produce wrong guesses.
If given a prefix argument (or if the optional argument ACCUMULATE is
non-nil) then the previous examination is extended, otherwise a new
guess is made from scratch.
Note that the larger the region, the slower the guessing. You can limit
the region examined by `cc-guess' with `cc-guess-region-max'."
(interactive "r\nP")
;;
;; Examining stage
;;
(let ((accumulator (when accumulate cc-guess-accumulator))
(reporter (when (fboundp 'make-progress-reporter)
(make-progress-reporter "Examining Indentation " start end))))
(save-excursion
(goto-char start)
(while (< (point) end)
(unless (cc-guess-empty-line-p)
(mapc (lambda (s)
(setq accumulator (or (cc-guess-accumulate accumulator s)
accumulator)))
(c-save-buffer-state () (c-guess-basic-syntax))))
(when reporter (progress-reporter-update reporter (point)))
(forward-line 1)))
(when reporter (progress-reporter-done reporter))
(setq cc-guess-accumulator (cc-guess-sort-accumulator accumulator)))
;;
;; Guessing stage
;;
(let* ((basic-offset (cc-guess-make-basic-offset cc-guess-accumulator))
(typical-offsets-alist (cc-guess-make-offsets-alist cc-guess-accumulator))
(symbolic-offsets-alist (cc-guess-symbolize-offsets-alist
typical-offsets-alist
basic-offset))
(merged-offsets-alist (cc-guess-merge-offsets-alists
(copy-sequence cc-guess-conversions)
symbolic-offsets-alist)))
(set (make-local-variable 'cc-guess-guessed-basic-offset) basic-offset)
(set (make-local-variable 'cc-guess-guessed-offsets-alist) merged-offsets-alist)))
(defsubst cc-guess-empty-line-p ()
(eq (line-beginning-position)
(line-end-position)))
(defun cc-guess-current-offset (relpos)
;; Calculate the indentation of the current line relative to the
;; column of RELPOS.
(- (progn (back-to-indentation)
(current-column))
(save-excursion
(goto-char relpos)
(current-column))))
(defun cc-guess-accumulate (accumulator syntax-element)
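;; Add SYNTAX-ELEMENT to ACCUMULATOR. Return the updated accumulator,
;; or nil if the element has no anchor position or its offset is
;; rejected by `cc-guess-offset-threshold'.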
(let ((symbol (car syntax-element))
(relpos (cadr syntax-element)))
(when (numberp relpos)
(let ((offset (cc-guess-current-offset relpos)))
(when (< (abs offset) cc-guess-offset-threshold)
(cc-guess-accumulate-offset accumulator
symbol
offset))))))
(defun cc-guess-accumulate-offset (accumulator symbol offset)
;; Add SYMBOL and OFFSET to ACCUMULATOR. See `cc-guess-accumulator'
;; for the structure of ACCUMULATOR.
(let* ((entry (assoc symbol accumulator))
(counters (cdr entry))
counter)
(if entry
(progn
(setq counter (assoc offset counters))
(if counter
(setcdr counter (1+ (cdr counter)))
(setq counters (cons (cons offset 1) counters))
(setcdr entry counters))
accumulator)
(cons (cons symbol (cons (cons offset 1) nil)) accumulator))))
(defun cc-guess-sort-accumulator (accumulator)
;; Sort the counters of each element of ACCUMULATOR by number-of-times.
;; See `cc-guess-accumulator' for more details.
(mapcar
(lambda (entry)
(let ((symbol (car entry))
(counters (cdr entry)))
(cons symbol (sort counters
(lambda (a b)
(if (> (cdr a) (cdr b))
t
(and
(eq (cdr a) (cdr b))
(< (car a) (car b)))))))))
accumulator))
(defun cc-guess-make-offsets-alist (accumulator)
;; Throw away the rare cases in ACCUMULATOR and make an offsets-alist structure.
(mapcar
(lambda (entry)
(cons (car entry)
(car (car (cdr entry)))))
accumulator))
(defun cc-guess-merge-offsets-alists (strong weak)
;; Merge two offsets-alists into one. When two offsets-alists have the same symbol
;; entry, give STRONG priority over WEAK.
(mapc
(lambda (weak-elt)
(unless (assoc (car weak-elt) strong)
(setq strong (cons weak-elt strong))))
weak)
strong)
(defun cc-guess-make-basic-offset (accumulator)
;; Find the most frequently appearing indentation-offset in
;; ACCUMULATOR and return it as `basic-offset'.
(let* (;; Drop the value related to `c' syntactic-symbol.
;; (`c': Inside a multiline C style block comment.)
;; The impact for values of `c' is too large for guessing
;; `basic-offset' if the target source file is small and its license notice
;; is at top of the file.
(accumulator (assq-delete-all 'c (copy-sequence accumulator)))
;; Drop syntactic-symbols from ACCUMULATOR.
(alist (apply #'append (mapcar (lambda (elts)
(mapcar (lambda (elt)
(cons (abs (car elt))
(cdr elt)))
(cdr elts)))
accumulator)))
;; Gather all indentation-offsets other than 0.
;; 0 is meaningless as `basic-offset'.
(offset-list (delete 0
(delete-dups (mapcar (lambda (elt) (car elt)) alist))))
;; Sum of number-of-times for offset:
;; (offset . sum)
(summed (mapcar (lambda (offset)
(cons offset (apply #'+ (mapcar (lambda (a)
(if (eq (car a) offset)
(cdr a)
0))
alist))))
offset-list)))
;;
;; Find the majority.
;;
(let ((majority '(nil . 0)))
(while summed
(when (< (cdr majority) (cdr (car summed)))
(setq majority (car summed)))
(setq summed (cdr summed)))
(car majority))))
(defun cc-guess-symbolize-offsets-alist (offsets-alist basic-offset)
;; Convert the representation of OFFSETS-ALIST to an alist using
;; `+', `-', `++', `--', `*', or `/'. These symbols represent
;; values relative to BASIC-OFFSET. See the CC Mode info manual for
;; details of the symbols.
(mapcar
(lambda (elt)
(let ((s (car elt))
(v (cdr elt)))
(cond
((integerp v)
(cons s (cc-guess-symbolize-integer v
basic-offset)))
(t elt))))
offsets-alist))
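(defun cc-guess-symbolize-integer (int basic-offset)
  ;; Return the CC Mode offset symbol for INT relative to BASIC-OFFSET,
  ;; or INT itself if no symbol matches.
  (let ((aint (abs int)))
    (cond
     ((null basic-offset) int)             ; nothing to compare against
     ((eq int basic-offset) '+)            ; 1 * basic-offset
     ((eq aint basic-offset) '-)           ; -1 * basic-offset
     ((eq int (* 2 basic-offset)) '++)     ; 2 * basic-offset
     ((eq aint (* 2 basic-offset)) '--)    ; -2 * basic-offset
     ((eq (* 2 int) basic-offset) '*)      ; 0.5 * basic-offset
     ((eq (* 2 aint) basic-offset) '/)     ; -0.5 * basic-offset
     (t int))))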
(defun cc-guess-style-name-p (name)
"Return non-nil if NAME is the name of a style created by cc-guess."
(string-prefix-p "*cc-guess*:" name))
(defun cc-guess-style-name ()
;; Make a style name for the guessed style. Fall back to the buffer
;; name when the buffer is not visiting a file.
(format "*cc-guess*:%s" (or (buffer-file-name) (buffer-name))))
(defun cc-guess-make-style ()
;; Make a style from guessed values.
(when cc-guess-guessed-offsets-alist
(let* ((basic-offset cc-guess-guessed-basic-offset)
(offsets-alist (cc-guess-merge-offsets-alists
cc-guess-guessed-offsets-alist
c-offsets-alist)))
`((c-basic-offset . ,basic-offset)
(c-offsets-alist . ,offsets-alist)))))
(defun cc-guess-install ()
"Define the indentation style from the last guessed values and use it.
Here guessed values mean `cc-guess-guessed-basic-offset' and
`cc-guess-guessed-offsets-alist'.
When defining the style from `cc-guess-guessed-offsets-alist',
`c-offsets-alist' is also merged into the style. However,
`cc-guess-guessed-offsets-alist' takes precedence over
`c-offsets-alist'.
The style name is given by `cc-guess-style-name'."
(interactive)
(let ((style (cc-guess-make-style)))
(if style
(c-add-style (cc-guess-style-name)
style
t)
(error "Not yet guessed"))))
(defun cc-guess-view-accumulator ()
"Show `cc-guess-accumulator'."
(interactive)
(with-output-to-temp-buffer "*Accumulated Examined Indent Information*"
(pp cc-guess-accumulator)))
(defun cc-guess-reset-accumulator ()
"Reset `cc-guess-accumulator'."
(interactive)
(setq cc-guess-accumulator nil))
(defun cc-guess-view-guessed-values ()
"Show `cc-guess-guessed-basic-offset' and `cc-guess-guessed-offsets-alist'."
(interactive)
(with-output-to-temp-buffer "*Guessed Values*"
(princ "basic-offset: \n\t")
(pp cc-guess-guessed-basic-offset)
(princ "\n\n")
(princ "offsets-alist: \n")
(pp cc-guess-guessed-offsets-alist)))
(defun cc-guess-view-guessed-style ()
"Show the guessed style."
(interactive)
(let ((style (cc-guess-make-style)))
(if style
(with-output-to-temp-buffer "*Guessed Style*"
(pp style))
(error "Not yet guessed"))))
(cc-provide 'cc-guess)
;;; cc-guess.el ends here
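
To illustrate how a basic-offset falls out of the accumulated counters, here is a simplified stand-alone sketch. It is not the code from the file above: the sample data and the `my-' names are made up for illustration, and ties may be broken differently.

```elisp
;; Hypothetical sample data in the accumulator layout used above:
;; each entry is (SYNTACTIC-SYMBOL (OFFSET . COUNT) ...).
(defvar my-sample-accumulator
  '((statement (0 . 40))
    (defun-block-intro (4 . 12) (2 . 1))
    (statement-block-intro (4 . 9))
    (substatement (4 . 5) (8 . 2))
    (c (1 . 30))))

(defun my-guess-basic-offset (accumulator)
  "Return the nonzero absolute offset with the highest total count.
Entries for the `c' syntactic symbol (block comments) are ignored.
Return nil if ACCUMULATOR holds no nonzero offsets."
  (let (totals best)
    ;; Sum the counts per absolute offset across all syntactic symbols.
    (dolist (entry accumulator)
      (unless (eq (car entry) 'c)
        (dolist (counter (cdr entry))
          (let* ((offset (abs (car counter)))
                 (cell (assq offset totals)))
            (unless (zerop offset)
              (if cell
                  (setcdr cell (+ (cdr cell) (cdr counter)))
                (push (cons offset (cdr counter)) totals)))))))
    ;; Pick the offset with the largest total.
    (dolist (total totals)
      (when (or (null best) (> (cdr total) (cdr best)))
        (setq best total)))
    (car best)))

(my-guess-basic-offset my-sample-accumulator)
```

With this sample data the call returns 4: the counters for offset 4 add up to 26, against 2 for offset 8 and 1 for offset 2, while the 0 offsets and the `c' entry are ignored.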