unofficial mirror of emacs-devel@gnu.org 
* request for reviewing the updated version of cc-guess.el
@ 2011-02-10 20:23 Masatake YAMATO
From: Masatake YAMATO @ 2011-02-10 20:23 UTC
  To: Alan Mackenzie; +Cc: emacs-devel

Hi,

After seven years, I've updated cc-guess.el as suggested by
Martin Stjernholm.

Currently cc-guess.el is not included in the official cc-mode release
and, as a result, it is not included in GNU Emacs.

Could you review the updated version and consider including it in
the release?


My original post:
   http://sourceforge.net/mailarchive/message.php?msg_id=8994118


Martin's suggestion and my answer:
   http://sourceforge.net/mailarchive/message.php?msg_id=8994526

Martin's suggestions:
         S1. Thresholds for examining (sampling the indentation of) the buffer.
         S2. Setting only c-basic-offset as the result of guessing.
         S3. Key bindings for the commands defined in cc-guess.el.

My answers in the updated code:
         A1. I introduced `cc-guess-offset-threshold'.
         A2. In addition to the offsets-alist, the updated version of
             cc-guess.el guesses the basic-offset value.
         A3. I removed the key bindings.
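
Because A3 leaves the commands unbound, anyone who wants a key has to
add one in their own init file.  The following is only a sketch under
my own assumptions: the function name `my-cc-guess-setup', the key
`C-c g' and the threshold value 8 are arbitrary choices for
illustration, not something cc-guess.el provides or recommends.

```elisp
;; Sketch only: `my-cc-guess-setup', the key and the value 8 are
;; hypothetical choices, not part of cc-guess.el.
(require 'cc-guess nil t)   ; NOERROR: carry on if cc-guess.el is absent

;; Accept a sampled offset only while its absolute value is below 8.
(setq cc-guess-offset-threshold 8)

(defun my-cc-guess-setup ()
  "Bind `cc-guess' to a user-reserved key in CC Mode buffers."
  (local-set-key (kbd "C-c g") 'cc-guess))

(add-hook 'c-mode-common-hook 'my-cc-guess-setup)
```

With this in place, `C-c g' in a C buffer runs `cc-guess', and
`C-u C-c g' extends the previous guess instead of starting over.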
	     

Masatake YAMATO




2011-02-11  Masatake YAMATO  <yamato@redhat.com>

	* cc-guess.el (cc-guess): Don't limit the region
	if `cc-guess-region-max' is nil.
	(cc-guess-no-install): Ditto.

2011-01-18  Masatake YAMATO  <yamato@redhat.com>

	* cc-guess.el (cc-guess-make-basic-offset): Don't use sort
	to find the majority.

2011-01-18  Masatake YAMATO  <yamato@redhat.com>

	* cc-guess.el (cc-guess-style-name-p): New function.

2011-01-16  Masatake YAMATO  <yamato@redhat.com>

	* cc-guess.el: Use `+', `-', `++', `--', `*', or `/'.
	(cc-guess-symbolize-offsets-alist): New function.
	(cc-guess-symbolize-integer): New function.
	(cc-guess-make-style): New function.
	(cc-guess-view-guessed-style): New function.

2011-01-16  Masatake YAMATO  <yamato@redhat.com>

	* cc-guess.el (cc-guess-no-install): New function.
	(cc-guess-buffer-no-install): New function.
	(cc-guess-region-no-install): New function.
	(cc-guess-make-basic-offset): Ignore `c' syntax-symbol.
	(cc-guess-style-name): New function.
	(cc-guess-install): Don't set `c-offsets-alist'. Instead
	define a style and use it.
	(cc-guess-view-guessed-values): Renamed from
	`cc-guess-view-offsets-alist'.
	Print also `basic-offset'.

2011-01-16  Masatake YAMATO  <yamato@redhat.com>

	* cc-guess.el: Use term `offset' instead of `delta'.
	Temporarily don't use the term `style'.  Instead use the term
	`basic-offset' or `offset-alist'.
	(cc-guess-offset-threshold): Renamed from `cc-guess-delta-threshold'.
	(cc-guess-accumulate-offset): Renamed from `cc-guess-accumulate-delta'.
	(cc-guess-guessed-style): Removed.
	(cc-guess-make-offsets-alist): Renamed from `cc-guess-make-style'.
	(cc-guess-merge-offsets-alists): Renamed from `cc-guess-merge-styles'.
	(cc-guess-make-basic-offset): New function.
	(cc-guess-guessed-offsets-alist): New variable.
	(cc-guess-guessed-basic-offset): New variable.
	(cc-guess-view-accumulator): New function for debugging.
	(cc-guess-reset-accumulator): New function.
	(cc-guess-view-offsets-alist): Renamed from `cc-guess-view-style'.

2011-01-13  Masatake YAMATO  <yamato@redhat.com>

	* cc-guess.el (cc-guess-region-max): New option.
	(cc-guess-buffer): New function.
	(cc-guess): Limit the region for examining indentation
	by `cc-guess-region-max'.

2011-01-13  Masatake YAMATO  <yamato@redhat.com>

	* cc-guess.el (cc-guess-accumulate): New function.
	(cc-guess-region): Use `cc-guess-accumulate'.

2011-01-13  Masatake YAMATO  <yamato@redhat.com>

	* cc-guess.el: s/delta-accumulator/accumulator/g.

2011-01-13  Masatake YAMATO  <yamato@redhat.com>

	* cc-guess.el (cc-guess-empty-line-p): New subroutine.
	(cc-guess-region): Handle multiple symbols returned from
	`c-guess-basic-syntax'. Use `cc-guess-empty-line-p'.

2011-01-13  Masatake YAMATO  <yamato@redhat.com>

	* cc-guess.el (cc-guess-guessed-style): Renamed from
	`cc-guessed-style'.

2011-01-13  Masatake YAMATO  <yamato@redhat.com>

	* cc-guess.el (cc-guess-delta-threshold): New option.
	(cc-guess-current-delta): New function derived from
	`cc-guess-region'.
	(cc-guess-region): Don't accumulate a sampled indentation if
	it is greater than `cc-guess-delta-threshold'.

\f
Local Variables:
mode: change-log
End:
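
A note on the 2011-01-16 entry above that switches to `+', `-', `++',
`--', `*' and `/': in CC Mode these symbols stand for 1, -1, 2, -2,
0.5 and -0.5 times `c-basic-offset'.  The mapping can be sketched in
isolation as below.  `my-symbolize-offset' is a hypothetical name used
only for this illustration; it is not the function in cc-guess.el.

```elisp
;; Sketch only: `my-symbolize-offset' is a hypothetical helper.
(defun my-symbolize-offset (offset basic-offset)
  "Return CC Mode's symbolic form of OFFSET relative to BASIC-OFFSET.
Return OFFSET unchanged when no symbol matches or when BASIC-OFFSET
is not an integer."
  (cond
   ((not (integerp basic-offset)) offset)
   ((= offset basic-offset) '+)
   ((= offset (- basic-offset)) '-)
   ((= offset (* 2 basic-offset)) '++)
   ((= offset (* -2 basic-offset)) '--)
   ((= (* 2 offset) basic-offset) '*)
   ((= (* 2 offset) (- basic-offset)) '/)
   (t offset)))

(mapcar (lambda (n) (my-symbolize-offset n 4))
        '(4 -4 8 -8 2 -2 0 3))
;; => (+ - ++ -- * / 0 3)
```

Offsets that are not a simple multiple of the basic offset, such as 0
and 3 here, stay as plain integers.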

;;; cc-guess.el --- guess indentation values by scanning existing code

;; Copyright (C) 1985,1987,1992-2003, 2004, 2005, 2006 Free Software
;; Foundation, Inc.

;; Author:     1994-1995 Barry A. Warsaw
;; Maintainer: Unmaintained
;; Created:    August 1994, split from cc-mode.el
;; Version:    See cc-mode.el
;; Keywords:   c languages oop

;; This file is not part of GNU Emacs.

;; This program is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 2 of the License, or
;; (at your option) any later version.
;; 
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.
;; 
;; You should have received a copy of the GNU General Public License
;; along with this program; see the file COPYING.  If not, write to
;; the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
;; Boston, MA 02110-1301, USA.

;;; Commentary:
;;
;; This file contains routines that help guess the cc-mode style in a
;; particular region/buffer.  Here style means `offsets-alist' and 
;; `basic-offset'.
;;
;; The main entry point of this program is `cc-guess' command but there
;; are some variants.
;;
;; Suppose the major mode of the current buffer is one of the modes
;; provided by cc-mode.  `cc-guess' guesses the indentation style by
;; examining the indentation in the region from the beginning of the
;; buffer to the position limited by `cc-guess-region-max', and
;; installs the guessed style.  The name of the installed style is
;; given by `cc-guess-style-name'.
;; `cc-guess-buffer' does the same but on the whole buffer.
;; `cc-guess-region' does the same but on the region between the point
;; and the mark.  `cc-guess-no-install', `cc-guess-buffer-no-install'
;; and `cc-guess-region-no-install' guess the indentation style but
;; don't install it.

;;; Code:

(eval-when-compile
  (let ((load-path
	 (if (and (boundp 'byte-compile-dest-file)
		  (stringp byte-compile-dest-file))
	     (cons (file-name-directory byte-compile-dest-file) load-path)
	   load-path)))
    (load "cc-bytecomp" nil t)))

(cc-require 'cc-defs)
(cc-require 'cc-engine)

\f

(defcustom cc-guess-offset-threshold 10
  "Threshold of acceptable offset when examining indent information.
Discard an examined offset unless its absolute value is less than this.

The offset of a line is measured relative to the anchor position in the
indent information returned by `c-guess-basic-syntax'."
  :type 'integer
  :group 'c)

(defcustom cc-guess-region-max 50000
  "The maximum buffer position examined for indent information by `cc-guess'.
Examining indent information in a large region takes a long time.
This option lets you limit the examining time.  nil means no limit."
  :type '(choice (const :tag "No limit" nil) integer)
  :group 'c)

\f
(defvar cc-guess-guessed-offsets-alist nil
  "Currently guessed offsets-alist. Buffer local.")
(defvar cc-guess-guessed-basic-offset nil
  "Currently guessed basic-offset. Buffer local.")

(defvar cc-guess-accumulator nil)
;; Accumulated examined indent information.  Information is represented
;; in a list.  Each element in it has the following structure:
;; 
;;  (syntactic-symbol (indentation-offset1 . number-of-times1)
;;                    (indentation-offset2 . number-of-times2)
;;                    ...)
;; 
;; For example, with hypothetical counts, (statement (0 . 30) (4 . 2))
;; means that for `statement' offset 0 was seen 30 times and offset 4
;; twice.
;; 
;; This structure is built by `cc-guess-accumulate-offset'.
;; 
;; Here we call the pair (indentation-offset1 . number-of-times1) a
;; counter.  `cc-guess-sort-accumulator' sorts the counters by
;; number-of-times, most frequent first.
;; Use `cc-guess-view-accumulator' to see the value.

(defconst cc-guess-conversions
  '((c . c-lineup-C-comments)
    (inher-cont . c-lineup-multi-inher)
    (string . -1000)
    (comment-intro . c-lineup-comment)
    (arglist-cont-nonempty . c-lineup-arglist)
    (arglist-close . c-lineup-close-paren)
    (cpp-macro . -1000)))


(defun cc-guess (&optional accumulate)
  "Apply `cc-guess-region' on the region limited by `cc-guess-region-max'.

If given a prefix argument (or if the optional argument ACCUMULATE is
non-nil) then the previous guess is extended, otherwise a new guess is
made from scratch."
  (interactive "P")
  (cc-guess-region (point-min)
		   (min (point-max) (or cc-guess-region-max
					(point-max)))
		   accumulate))

(defun cc-guess-no-install (&optional accumulate)
  "Apply `cc-guess-region-no-install' on the region limited by `cc-guess-region-max'.

If given a prefix argument (or if the optional argument ACCUMULATE is
non-nil) then the previous guess is extended, otherwise a new guess is
made from scratch."
  (interactive "P")
  (cc-guess-region-no-install (point-min)
			      (min (point-max) (or cc-guess-region-max 
						   (point-max)))
			      accumulate))

(defun cc-guess-buffer (&optional accumulate)
  "Apply `cc-guess-region' on the whole current buffer.
 
 If given a prefix argument (or if the optional argument ACCUMULATE is
 non-nil) then the previous guess is extended, otherwise a new guess is
 made from scratch."
  (interactive "P")
  (cc-guess-region (point-min)
		   (point-max)
		   accumulate))

(defun cc-guess-buffer-no-install (&optional accumulate)
  "Apply `cc-guess-region-no-install' on the whole current buffer.
 
 If given a prefix argument (or if the optional argument ACCUMULATE is
 non-nil) then the previous guess is extended, otherwise a new guess is
 made from scratch."
  (interactive "P")
  (cc-guess-region-no-install (point-min)
			      (point-max)
			      accumulate))

(defun cc-guess-region (start end &optional accumulate)
  "Call `cc-guess-region-no-install' and install the guessed style."
  (interactive "r\nP")
  (cc-guess-region-no-install start end accumulate)
  (cc-guess-install))

(defun cc-guess-region-no-install (start end &optional accumulate)
  "Guess the indentation style by examining the indentation in a region of code.
Every line of code in the region is examined and the values for the following
two variables are guessed:

* `c-basic-offset', and
* the indentation values of the various syntactic symbols in 
  `c-offsets-alist'.

The guessed values are put into `cc-guess-guessed-basic-offset' and 
`cc-guess-guessed-offsets-alist'.

Frequencies of use are taken into account when guessing, so minor inconsistencies
in the indentation style shouldn't produce wrong guesses. 

If given a prefix argument (or if the optional argument ACCUMULATE is
non-nil) then the previous examination is extended, otherwise a new
guess is made from scratch.

Note that the larger the region to guess in, the slower the guessing.
So you can limit the region with `cc-guess-region-max'."
  (interactive "r\nP")
  ;;
  ;; Examining stage
  ;;
  (let ((accumulator (when accumulate cc-guess-accumulator))
	(reporter (when (fboundp 'make-progress-reporter)
		    (make-progress-reporter "Examining Indentation " start end))))
    (save-excursion
      (goto-char start)
      (while (< (point) end)
	(unless (cc-guess-empty-line-p)
	  (mapc (lambda (s)
		  (setq accumulator (or (cc-guess-accumulate accumulator s)
					accumulator)))
		(c-save-buffer-state () (c-guess-basic-syntax))))
	(when reporter (progress-reporter-update reporter (point)))
	(forward-line 1)))
    (when reporter (progress-reporter-done reporter))
    (setq cc-guess-accumulator (cc-guess-sort-accumulator accumulator)))
  ;;
  ;; Guessing stage
  ;;
  (let* ((basic-offset (cc-guess-make-basic-offset cc-guess-accumulator))
	 (typical-offsets-alist (cc-guess-make-offsets-alist cc-guess-accumulator))
	 (symbolic-offsets-alist (cc-guess-symbolize-offsets-alist 
				  typical-offsets-alist
				  basic-offset))
	 (merged-offsets-alist (cc-guess-merge-offsets-alists 
				(copy-sequence cc-guess-conversions)
				symbolic-offsets-alist)))
    (set (make-local-variable 'cc-guess-guessed-basic-offset) basic-offset)
    (set (make-local-variable 'cc-guess-guessed-offsets-alist) merged-offsets-alist)))


(defsubst cc-guess-empty-line-p ()
  (eq (line-beginning-position)
      (line-end-position)))

(defun cc-guess-current-offset (relpos)
  ;; Calculate the indentation of the current line relative to the
  ;; column of RELPOS.
  (- (progn (back-to-indentation)
	    (current-column))
     (save-excursion
       (goto-char relpos)
       (current-column))))

(defun cc-guess-accumulate (accumulator syntax-element)
  ;; Add SYNTAX-ELEMENT to ACCUMULATOR.  Return the updated accumulator,
  ;; or nil if nothing was added.
  (let ((symbol (car syntax-element))
	(relpos (cadr syntax-element)))
    (when (numberp relpos)
      (let ((offset (cc-guess-current-offset relpos)))
	(when (< (abs offset) cc-guess-offset-threshold)
	  (cc-guess-accumulate-offset accumulator
				      symbol
				      offset))))))

(defun cc-guess-accumulate-offset (accumulator symbol offset)
  ;; Add SYMBOL and OFFSET to ACCUMULATOR and return it.  See
  ;; `cc-guess-accumulator' about the structure of ACCUMULATOR.
  (let* ((entry    (assoc symbol accumulator))
	 (counters (cdr entry))
	 counter)
    (if entry
	(progn
	  (setq counter (assoc offset counters))
	  (if counter
	      (setcdr counter (1+ (cdr counter)))
	    (setq counters (cons (cons offset 1) counters))
	    (setcdr entry counters))
	  accumulator)
      (cons (cons symbol (cons (cons offset 1) nil)) accumulator))))

(defun cc-guess-sort-accumulator (accumulator)
  ;; Sort the counters of each element of ACCUMULATOR by number-of-times,
  ;; most frequent first.  See `cc-guess-accumulator' for more details.
  (mapcar
   (lambda (entry)
     (let ((symbol (car entry))
	   (counters (cdr entry)))
       (cons symbol (sort counters 
			  (lambda (a b)
			    (if (> (cdr a) (cdr b))
				t
			      (and 
			       (eq (cdr a) (cdr b))
			       (< (car a) (car b)))))))))
   accumulator))

(defun cc-guess-make-offsets-alist (accumulator)
  ;; Throw away the rare cases in ACCUMULATOR and make an offsets-alist
  ;; structure.  ACCUMULATOR must already be sorted.
  (mapcar 
   (lambda (entry)
     (cons (car entry) 
	   (car (car (cdr entry)))))
   accumulator))

(defun cc-guess-merge-offsets-alists (strong weak)
  ;; Merge two offsets-alists into one.  When two offsets-alists have the same symbol
  ;; entry, give STRONG priority over WEAK.
  (mapc
   (lambda (weak-elt)
     (unless (assoc (car weak-elt) strong)
       (setq strong (cons weak-elt strong))))
   weak)
  strong)

(defun cc-guess-make-basic-offset (accumulator)
  ;; Return the most frequent nonzero indentation-offset in ACCUMULATOR
  ;; as `basic-offset', or nil if there is none.
  (let* (;; Drop the value related to `c' syntactic-symbol.
	 ;; (`c': Inside a multiline C style block comment.)
	 ;; The impact for values of `c' is too large for guessing 
	 ;; `basic-offset' if the target source file is small and its license notice 
	 ;; is at top of the file.
	 (accumulator (assq-delete-all 'c (copy-sequence accumulator)))
	 ;; Drop syntactic-symbols from ACCUMULATOR.
	 (alist (apply #'append (mapcar (lambda (elts)
					  (mapcar (lambda (elt)
						    (cons (abs (car elt))
							  (cdr elt)))
						  (cdr elts)))
					accumulator)))
	 ;; Gather all indentation-offsets other than 0. 
	 ;; 0 is meaningless as `basic-offset'.
	 (offset-list (delete 0
			      (delete-dups (mapcar (lambda (elt) (car elt)) alist))))
	 ;; Sum of number-of-times for offset:
	 ;;  (offset . sum)
	 (summed (mapcar (lambda (offset)
			   (cons offset (apply #'+ (mapcar (lambda (a) 
							     (if (eq (car a) offset) 
								 (cdr a)
							       0))
							   alist))))
			 offset-list)))
    ;;
    ;; Find the majority.
    ;;
    (let ((majority '(nil . 0)))
      (while summed
	(when (< (cdr majority) (cdr (car summed)))
	  (setq majority (car summed)))
	(setq summed (cdr summed)))
      (car majority))))

(defun cc-guess-symbolize-offsets-alist (offsets-alist basic-offset)
  ;; Convert the representation of OFFSETS-ALIST to an alist using
  ;; `+', `-', `++', `--', `*', or `/'.  These symbols represent
  ;; a value relative to BASIC-OFFSET.  See the CC Mode manual for
  ;; details of the symbols.
  (mapcar 
   (lambda (elt)
     (let ((s (car elt))
	   (v (cdr elt)))
       (cond
	((integerp v)
	 (cons s (cc-guess-symbolize-integer v 
					     basic-offset)))
	(t elt))))
   offsets-alist))

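(defun cc-guess-symbolize-integer (int basic-offset)
  ;; Return the symbol that expresses INT relative to BASIC-OFFSET,
  ;; or INT itself if no symbol applies.
  (let ((aint (abs int)))
    (cond
     ;; Nothing to compare against when no basic-offset was guessed.
     ((not (integerp basic-offset)) int)
     ((eq int basic-offset) '+)
     ((eq aint basic-offset) '-)
     ((eq int (* 2 basic-offset)) '++)
     ((eq aint (* 2 basic-offset)) '--)
     ((eq (* 2 int) basic-offset) '*)
     ((eq (* 2 aint) basic-offset) '/)
     (t int))))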

(defun cc-guess-style-name-p (name)
  "Return t if NAME is the name of a style created by cc-guess."
  (string-prefix-p "*cc-guess*:" name))

(defun cc-guess-style-name ()
  ;; Make a style name for the guessed style.
  (format "*cc-guess*:%s" (buffer-file-name)))

(defun cc-guess-make-style ()
  ;; Make a style from guessed values.
  (when cc-guess-guessed-offsets-alist
    (let* ((basic-offset cc-guess-guessed-basic-offset)
	   (offsets-alist (cc-guess-merge-offsets-alists
			   cc-guess-guessed-offsets-alist
			   c-offsets-alist)))
      `((c-basic-offset . ,basic-offset)
	(c-offsets-alist . ,offsets-alist)))))

(defun cc-guess-install ()
  "Define the indentation style from the last guessed values and use it.
Here guessed values mean `cc-guess-guessed-basic-offset' and 
`cc-guess-guessed-offsets-alist'.

When defining the style from `cc-guess-guessed-offsets-alist',
`c-offsets-alist' is also merged into the style.  However,
`cc-guess-guessed-offsets-alist' takes precedence over
`c-offsets-alist'.

The style name is given by `cc-guess-style-name'."
  (interactive)
  (let ((style (cc-guess-make-style)))
    (if style
	(c-add-style (cc-guess-style-name)
		     style
		     t)
      (error "Not yet guessed"))))

(defun cc-guess-view-accumulator ()
  "Show `cc-guess-accumulator'."
  (interactive)
  (with-output-to-temp-buffer "*Accumulated Examined Indent Information*"
    (pp cc-guess-accumulator)))

(defun cc-guess-reset-accumulator ()
  "Reset `cc-guess-accumulator'."
  (interactive)
  (setq cc-guess-accumulator nil))

(defun cc-guess-view-guessed-values ()
  "Show `cc-guess-guessed-basic-offset' and `cc-guess-guessed-offsets-alist'."
  (interactive)
  (with-output-to-temp-buffer "*Guessed Values*"
    (princ "basic-offset: \n\t")
    (pp cc-guess-guessed-basic-offset)
    (princ "\n\n")
    (princ "offsets-alist: \n")
    (pp cc-guess-guessed-offsets-alist)))

(defun cc-guess-view-guessed-style ()
  "Show the guessed style."
  (interactive)
  (let ((style (cc-guess-make-style)))
    (if style
	(with-output-to-temp-buffer "*Guessed Style*"
	  (pp style))
      (error "Not yet guessed"))))

\f
(cc-provide 'cc-guess)
;;; cc-guess.el ends here
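
For reviewers who want to try the counting idea without loading
cc-mode, here is a standalone sketch.  It is not the code above: the
names `my-accumulate', `my-most-frequent', `my-basic-offset' and
`my-samples' are hypothetical, the sample data is made up, and ties are
broken by first occurrence rather than by the sort used in
`cc-guess-sort-accumulator'.  It only illustrates counting offsets per
syntactic symbol and picking the most frequent nonzero one.

```elisp
;; Sketch only: every `my-' name below is hypothetical.
(defun my-accumulate (accumulator symbol offset)
  "Return ACCUMULATOR with one more sighting of OFFSET for SYMBOL.
ACCUMULATOR is an alist of (SYMBOL (OFFSET . COUNT)...) entries."
  (let* ((entry (assq symbol accumulator))
         (counter (assoc offset (cdr entry))))
    (cond
     (counter (setcdr counter (1+ (cdr counter))) accumulator)
     (entry (setcdr entry (cons (cons offset 1) (cdr entry))) accumulator)
     (t (cons (list symbol (cons offset 1)) accumulator)))))

(defun my-most-frequent (counters)
  "Return the offset with the highest count in COUNTERS, or nil."
  (let ((best nil))
    (dolist (c counters (car best))
      (when (or (null best) (> (cdr c) (cdr best)))
        (setq best c)))))

(defun my-basic-offset (accumulator)
  "Return the most frequent nonzero absolute offset in ACCUMULATOR, or nil."
  (let ((totals nil))
    (dolist (entry accumulator)
      (dolist (c (cdr entry))
        (let* ((off (abs (car c)))
               (total (assoc off totals)))
          (cond ((zerop off))   ; 0 says nothing about the basic offset
                (total (setcdr total (+ (cdr total) (cdr c))))
                (t (push (cons off (cdr c)) totals))))))
    (my-most-frequent totals)))

;; Made-up samples: (syntactic-symbol . offset) pairs.
(defvar my-samples
  '((statement . 0) (defun-block-intro . 4) (defun-block-intro . 4)
    (statement . 0) (substatement . 4) (defun-block-intro . 2)
    (case-label . -4)))

(let ((acc nil))
  (dolist (s my-samples)
    (setq acc (my-accumulate acc (car s) (cdr s))))
  (list (my-basic-offset acc)
        (my-most-frequent (cdr (assq 'defun-block-intro acc)))))
;; => (4 4)
```

The single stray offset 2 for `defun-block-intro' is outvoted, which is
the behaviour the docstring of `cc-guess-region-no-install' describes
for minor inconsistencies.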

         


