unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* compare-windows - synchronize points
@ 2003-08-11 19:59 Juri Linkov
  2003-08-12 23:22 ` Richard Stallman
  0 siblings, 1 reply; 16+ messages in thread
From: Juri Linkov @ 2003-08-11 19:59 UTC (permalink / raw


Mostly, for comparing buffers Ediff is the best choice.  But in some cases
compare-windows is more useful.  For example, when buffers contain
very long lines with comma-separated fields, it's more convenient to
enable lines truncation and use compare-windows to interactively browse
the differences.  However, when compare-windows finds the difference,
it takes too much work to manually synchronize points between two windows.
This patch allows to automatically synchronize points according to
user-defined variable `compare-windows-sync'.  If the value of this
variable is a regexp, then the points in both windows are advanced
to the next occurrence of this regexp.  If the value of this variable
is a function, then this function is called to advance points.
This function is called only when points are already located on
the mismatch position.  The current behavior in such case is simply
to call `ding' to beep or flash the screen.

So successive calls of `compare-windows' will work in interlaced mode:
on first call it advances points to the next difference,
on second call it synchronizes points by skipping the difference,
on third call it again advances points to the next difference and so on.

The default value of the customizable variable `compare-windows-sync'
is nil.  So this patch will not change the current behavior
(i.e. beeping if points are unmatched).

The useful values of the variable `compare-windows-sync' could be functions
`forward-word', `forward-sentence', `forward-paragraph', `end-of-defun'
or regexp containing some field separator or a newline, depending on
the nature of the difference units separator.
The variable can be made buffer-local.

Note also that if the command `compare-windows' is bound to
a single key (for example, C-=), then this patch allows to browse
all differences repeatedly with a single keystroke.

This patch also fixes some minor documentation issues, such as moving
a key sequence substitute \\[compare-windows] to the part of docstring
where it actually talks about calling the `compare-windows', etc.

2003-08-11  Juri Linkov  <juri@jurta.org>

	* compare-w.el (compare-windows-sync): New variable.
	(compare-windows): Use compare-windows-sync.
	(compare-windows-sync-regexp): New function.
	(compare-windows-whitespace): Doc fix.

Index: emacs/lisp/compare-w.el
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/compare-w.el,v
retrieving revision 1.22
diff -c -r1.22 compare-w.el
*** emacs/lisp/compare-w.el	16 Jul 2002 13:33:13 -0000	1.22
--- emacs/lisp/compare-w.el	11 Aug 2003 19:55:15 -0000
***************
*** 1,6 ****
  ;;; compare-w.el --- compare text between windows for Emacs
  
! ;; Copyright (C) 1986, 1989, 1993, 1997 Free Software Foundation, Inc.
  
  ;; Maintainer: FSF
  ;; Keywords: convenience files
--- 1,6 ----
  ;;; compare-w.el --- compare text between windows for Emacs
  
! ;; Copyright (C) 1986, 1989, 1993, 1997, 2003 Free Software Foundation, Inc.
  
  ;; Maintainer: FSF
  ;; Keywords: convenience files
***************
*** 37,58 ****
    :group 'tools)
  
  (defcustom compare-windows-whitespace "\\(\\s-\\|\n\\)+"
!   "*Regexp that defines whitespace sequences for \\[compare-windows].
  That command optionally ignores changes in whitespace.
  
  The value of `compare-windows-whitespace' is normally a regexp, but it
  can also be a function.  The function's job is to categorize any
  whitespace around (including before) point; it should also advance
! past any whitespace.  The function is called in each buffer, with
  point at the current scanning point.  It gets one argument, the point
! where `compare-windows' was originally called; it should not look at
  any text before that point.
  
! If the function returns the same value for both buffers, then the
  whitespace is considered to match, and is skipped."
    :type '(choice regexp function)
    :group 'compare-w)
  
  (defcustom compare-ignore-case nil
    "*Non-nil means \\[compare-windows] ignores case differences."
    :type 'boolean
--- 37,78 ----
    :group 'tools)
  
  (defcustom compare-windows-whitespace "\\(\\s-\\|\n\\)+"
!   "*Regexp or function that defines whitespace sequences for `compare-windows'.
  That command optionally ignores changes in whitespace.
  
  The value of `compare-windows-whitespace' is normally a regexp, but it
  can also be a function.  The function's job is to categorize any
  whitespace around (including before) point; it should also advance
! past any whitespace.  The function is called in each window, with
  point at the current scanning point.  It gets one argument, the point
! where \\[compare-windows] was originally called; it should not look at
  any text before that point.
  
! If the function returns the same value for both windows, then the
  whitespace is considered to match, and is skipped."
    :type '(choice regexp function)
    :group 'compare-w)
  
+ (defcustom compare-windows-sync nil
+   "*Regexp or function that defines synchronization sequences for the
+ case when points are already located on the mismatch position after
+ calling `compare-windows'.
+ 
+ If the value of `compare-windows-sync' is a regexp, then the points
+ in both windows are advanced to the next occurrence of this regexp.
+ 
+ The value of `compare-windows-sync' can also be a function.  The
+ function's job is to advance points in both windows to the next
+ synchronization point.
+ 
+ The useful values of this variable could be such functions as
+ `forward-word', `forward-sentence', `forward-paragraph', `end-of-defun'
+ or regexp containing some field separator or a newline, depending on
+ the nature of the difference units separator.  The variable can be
+ made buffer-local."
+   :type '(choice regexp function)
+   :group 'compare-w)
+ 
  (defcustom compare-ignore-case nil
    "*Non-nil means \\[compare-windows] ignores case differences."
    :type 'boolean
***************
*** 72,78 ****
  
  A prefix arg means ignore changes in whitespace.
  The variable `compare-windows-whitespace' controls how whitespace is skipped.
! If `compare-ignore-case' is non-nil, changes in case are also ignored."
    (interactive "P")
    (let* (p1 p2 maxp1 maxp2 b1 b2 w2
  	    (progress 1)
--- 92,104 ----
  
  A prefix arg means ignore changes in whitespace.
  The variable `compare-windows-whitespace' controls how whitespace is skipped.
! If `compare-ignore-case' is non-nil, changes in case are also ignored.
! 
! If `compare-windows-sync' is non-nil, then successive calls of
! this command work in interlaced mode:
! on first call it advances points to the next difference,
! on second call it synchronizes points by skipping the difference,
! on third call it again advances points to the next difference and so on."
    (interactive "P")
    (let* (p1 p2 maxp1 maxp2 b1 b2 w2
  	    (progress 1)
***************
*** 81,87 ****
  	    (skip-func (if ignore-whitespace
  			 (if (stringp compare-windows-whitespace)
  			     'compare-windows-skip-whitespace
! 			   compare-windows-whitespace))))
      (setq p1 (point) b1 (current-buffer))
      (setq w2 (next-window (selected-window)))
      (if (eq w2 (selected-window))
--- 107,116 ----
  	    (skip-func (if ignore-whitespace
  			 (if (stringp compare-windows-whitespace)
  			     'compare-windows-skip-whitespace
! 			   compare-windows-whitespace)))
! 	    (sync-func (if (stringp compare-windows-sync)
!                            'compare-windows-sync-regexp
!                          compare-windows-sync)))
      (setq p1 (point) b1 (current-buffer))
      (setq w2 (next-window (selected-window)))
      (if (eq w2 (selected-window))
***************
*** 99,107 ****
      (push-mark)
  
      (while (> progress 0)
!       ;; If both buffers have whitespace next to point,
        ;; optionally skip over it.
- 
        (and skip-func
  	   (save-excursion
  	     (let (p1a p2a w1 w2 result1 result2)
--- 128,135 ----
      (push-mark)
  
      (while (> progress 0)
!       ;; If both windows have whitespace next to point,
        ;; optionally skip over it.
        (and skip-func
  	   (save-excursion
  	     (let (p1a p2a w1 w2 result1 result2)
***************
*** 127,133 ****
        (set-window-point w2 p2))
  
      (if (= (point) opoint1)
! 	(ding))))
  
  ;; Move forward over whatever might be called whitespace.
  ;; compare-windows-whitespace is a regexp that matches whitespace.
--- 155,176 ----
        (set-window-point w2 p2))
  
      (if (= (point) opoint1)
! 	(if (not sync-func)
!             (ding)
!           ;; If points are not advanced (i.e. already on mismatch position),
!           ;; then synchronize points between both windows
!           (save-excursion
!             (funcall sync-func)
!             (setq p1 (point))
!             (set-buffer b2)
!             (goto-char p2)
!             (funcall sync-func)
!             (setq p2 (point)))
!           (goto-char p1)
!           (set-window-point w2 p2)
!           ;; If points are still not synchronized, then ding
!           (if (= (point) opoint1)
!               (ding))))))
  
  ;; Move forward over whatever might be called whitespace.
  ;; compare-windows-whitespace is a regexp that matches whitespace.
***************
*** 135,141 ****
  ;; and find the latest point at which a match ends.
  ;; Don't try starting points before START, though.
  ;; Value is non-nil if whitespace is found.
- 
  ;; If there is whitespace before point, but none after,
  ;; then return t, but don't advance point.
  (defun compare-windows-skip-whitespace (start)
--- 178,183 ----
***************
*** 158,163 ****
--- 200,210 ----
      (goto-char end)
      (or (/= beg opoint)
  	(/= end opoint))))
+ 
+ ;; Move forward to the next synchronization regexp.
+ (defun compare-windows-sync-regexp ()
+   (if (stringp compare-windows-sync)
+       (re-search-forward compare-windows-sync nil t)))
  
  (provide 'compare-w)
===================================================================

PS: I have not signed legal papers yet, but I want to contribute more
to the development of Emacs.
-- 
http://www.jurta.org/emacs/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-11 19:59 compare-windows - synchronize points Juri Linkov
@ 2003-08-12 23:22 ` Richard Stallman
  2003-08-13  2:57   ` Juri Linkov
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Stallman @ 2003-08-12 23:22 UTC (permalink / raw
  Cc: emacs-devel

    This patch allows to automatically synchronize points according to
    user-defined variable `compare-windows-sync'.  If the value of this
    variable is a regexp, then the points in both windows are advanced
    to the next occurrence of this regexp.

Doing this automatically would be a nice feature, but I wonder
if that method really does the right job frequently.  Could you tell
us what kinds of text you have used the feature on, and what values
you have used, and what sort of results you got?

Reports from other people who try it would be useful too.

Could we possibly find good defaults to put in, so that this feature
can be available by default and won't need customization in order
to work?

Another idea I have for syncing up is that a program could use a
simple quadratic algorithm to find the first matching 4-character
string in the two buffers, and move there.  This might get painfully
slow when there are substantial insertions, though.

The variable should be called compare-windows-sync-function, I think.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-12 23:22 ` Richard Stallman
@ 2003-08-13  2:57   ` Juri Linkov
  2003-08-14  3:07     ` Richard Stallman
  2003-08-14 10:43     ` Alex Schroeder
  0 siblings, 2 replies; 16+ messages in thread
From: Juri Linkov @ 2003-08-13  2:57 UTC (permalink / raw
  Cc: emacs-devel

Richard Stallman <rms@gnu.org> writes:
> Doing this automatically would be a nice feature, but I wonder
> if that method really does the right job frequently.  Could you tell
> us what kinds of text you have used the feature on, and what values
> you have used, and what sort of results you got?

The most useful result it gives when I compare the different program
outputs which look like this:

word1:word2:word3:word4:word5:word6:word7:word8:word9
word1:word2:word3:word4:word5:word6:word7:word8:word9

and another output which has differences in some fields, e.g.:

word1:word2:word3:word14:word15:word6:word7:word8:word9
word1:word2:word3:word24:word25:word6:word7:word8:word9

The suitable value for compare-windows-sync in this case is the string
containing the field separator ":".  Another good value with the
equivalent result is the function `forward-word'.

I've used it also on human-language texts with differences in some
words and with the value of compare-windows-sync set to `forward-sentence',
and on Lisp programs with the value set to 'end-of-defun'.

The main thing is that the difference region (i.e. region between
current unmatched point and next matching point) should be small
enough for user to fully review it before syncing to the next
matching point.  So generally speaking, this method does the right
job on texts with small differences.

> Could we possibly find good defaults to put in, so that this feature
> can be available by default and won't need customization in order
> to work?

It's difficult to find a common default for all cases.  Seems, the value
of compare-windows-sync should be set by user individually in each case
depending on the nature of texts and differences.  The only general
default value could be a function discussed below.

> Another idea I have for syncing up is that a program could use a
> simple quadratic algorithm to find the first matching 4-character
> string in the two buffers, and move there.  This might get painfully
> slow when there are substantial insertions, though.

I also have thought about implementing a difference algorithm, for
example, such as used in the diff program.  But even such a big Emacs
package as Ediff don't even try to implement a difference algorithms
and fully relies on calling the external diff program.  However, doing
the same thing in compare-windows would be definitely overkill.

Anyhow, this patch could be applied now, since it is already useful enough.
And later, if somebody will implement a good and fast function with a
simple algorithm for finding next matching text, then this function
could be added to compare-w.el and used as the default value for
compare-windows-sync.

> The variable should be called compare-windows-sync-function, I think.

Currently the variable compare-windows-sync has no -function suffix,
because it can hold whether a function or a regexp.

-- 
http://www.jurta.org/emacs/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-13  2:57   ` Juri Linkov
@ 2003-08-14  3:07     ` Richard Stallman
  2003-08-14  5:33       ` Juri Linkov
  2003-08-14 10:43     ` Alex Schroeder
  1 sibling, 1 reply; 16+ messages in thread
From: Richard Stallman @ 2003-08-14  3:07 UTC (permalink / raw
  Cc: emacs-devel

    I've used it also on human-language texts with differences in some
    words and with the value of compare-windows-sync set to `forward-sentence',
    and on Lisp programs with the value set to 'end-of-defun'.

I think that end-of-defun would move too far--the end of the defun can
easily be 100 lines down.  This would be quite likely to overlook
further interesting differences.

    > Another idea I have for syncing up is that a program could use a
    > simple quadratic algorithm to find the first matching 4-character
    > string in the two buffers, and move there.  This might get painfully
    > slow when there are substantial insertions, though.

If it tried syncing up by comparing entire lines, it could handle
even quite large changes fairly fast.  Would you like to give that a try?

    Currently the variable compare-windows-sync has no -function suffix,
    because it can hold whether a function or a regexp.

Have you found the regexp case to be useful in any real situation?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-14  3:07     ` Richard Stallman
@ 2003-08-14  5:33       ` Juri Linkov
  2003-08-16 16:19         ` Richard Stallman
  0 siblings, 1 reply; 16+ messages in thread
From: Juri Linkov @ 2003-08-14  5:33 UTC (permalink / raw
  Cc: emacs-devel

Richard Stallman <rms@gnu.org> writes:
> I think that end-of-defun would move too far--the end of the defun can
> easily be 100 lines down.  This would be quite likely to overlook
> further interesting differences.

I agree.  The end-of-defun is useless on large functions.

>     Currently the variable compare-windows-sync has no -function suffix,
>     because it can hold whether a function or a regexp.
> 
> Have you found the regexp case to be useful in any real situation?

Yes, the regexp is useful.  For example, if different files contain
changes separated by some fixed string (as e.g. field separators
in comma-separated value lists, etc.)

>     > Another idea I have for syncing up is that a program could use a
>     > simple quadratic algorithm to find the first matching 4-character
>     > string in the two buffers, and move there.  This might get painfully
>     > slow when there are substantial insertions, though.
> 
> If it tried syncing up by comparing entire lines, it could handle
> even quite large changes fairly fast.  Would you like to give that a try?

OK, here is quick implementation of simple quadratic algorithm.
Seems, it works well on small differences, but fails on big ones.
Increasing the default numbers improves the situation,
but function starts to get more slow.

(defvar compare-windows-sync-string-size 4)
(defvar compare-windows-sync-region-size 200)
(defvar compare-windows-sync-point nil)
(defvar compare-windows-sync 'compare-windows-sync-default-function)

;; Function works in two passes: one call on each window.
;; On first call both matching points are computed,
;; and one of them is stored in compare-windows-sync-point
;; to be used when this function is called on second window.
(defun compare-windows-sync-default-function ()
  (if compare-windows-sync-point
      (progn
        (if (numberp compare-windows-sync-point)
            (goto-char compare-windows-sync-point))
        (setq compare-windows-sync-point nil))
    (let* ((case-fold-search compare-ignore-case)
           (w2 (next-window (selected-window)))
           (b2 (window-buffer w2))
           (op2 (1+ (window-point w2)))
           (op1 (point))
           (p1 (1+ op1))
           (bound1 (- (min (+ p1 compare-windows-sync-region-size)
                           (point-max))
                      compare-windows-sync-string-size))
           (bound2 (- (min (+ op2 compare-windows-sync-region-size)
                           (with-current-buffer b2 (point-max)))
                      compare-windows-sync-string-size))
           s1 p2 p2s pp)
      (while (< p1 bound1)
        (setq s1 (buffer-substring-no-properties
                  p1
                  (+ p1 compare-windows-sync-string-size)))
        (setq p2 (with-current-buffer b2
                   (goto-char op2)
                   (search-forward s1 bound2 t)))
        (when p2
          (setq p2 (- p2 compare-windows-sync-string-size))
          ;; calculate the distance between two matching points
          (setq p2s (cons (list (sqrt (+ (expt (- p1 op1) 2)
                                         (expt (- p2 op2) 2))) p1 p2)
                          p2s)))
        (setq p1 (1+ p1)))
      ;; use matching points with minimal distance
      (when p2s
        (setq pp (cdr (assoc (apply 'min (mapcar 'car p2s)) p2s)))
        (goto-char (car pp)))
      (setq compare-windows-sync-point (or (cadr pp) t)))))

-- 
http://www.jurta.org/emacs/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-13  2:57   ` Juri Linkov
  2003-08-14  3:07     ` Richard Stallman
@ 2003-08-14 10:43     ` Alex Schroeder
  2003-08-14 14:43       ` Juri Linkov
  1 sibling, 1 reply; 16+ messages in thread
From: Alex Schroeder @ 2003-08-14 10:43 UTC (permalink / raw
  Cc: emacs-devel

Juri Linkov <juri@jurta.org> writes:

> It's difficult to find a common default for all cases.  Seems, the value
> of compare-windows-sync should be set by user individually in each case
> depending on the nature of texts and differences.  The only general
> default value could be a function discussed below.

One good solution to this problem is to write interactive functions
for the most useful cases so that you can call them directly.  There
need not be many of them:

compare-windows-sync-word
compare-windows-sync-sentence
compare-windows-sync-defun
compare-windows-sync-regexp -- this one would query for a regexp

Alex
-- 
http://www.emacswiki.org/alex/
There is no substitute for experience.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-14 10:43     ` Alex Schroeder
@ 2003-08-14 14:43       ` Juri Linkov
  2003-08-14 23:59         ` Alex Schroeder
  0 siblings, 1 reply; 16+ messages in thread
From: Juri Linkov @ 2003-08-14 14:43 UTC (permalink / raw
  Cc: emacs-devel

Alex Schroeder <alex@emacswiki.org> writes:
> One good solution to this problem is to write interactive functions
> for the most useful cases so that you can call them directly.  There
> need not be many of them:
> 
> compare-windows-sync-word
> compare-windows-sync-sentence
> compare-windows-sync-defun
> compare-windows-sync-regexp -- this one would query for a regexp

Adding the interactive function for syncing seems to be a good thing.
But what troubles me the most is the namespace wasting.
Instead of creating n number of compare-windows-sync-* functions
it's better to create one interactive function
(e.g. compare-windows-sync-windows) which when called with argument C-u
will ask the user for a function name (e.g. forward-word,
forward-sentence) or a regexp (anyhow, one of your proposed functions
asks for a regexp) and will set the buffer-local compare-windows-sync
variable to this function or regexp.  If the compare-windows-sync-windows
is called without C-u, it could use the previously saved value of
compare-windows-sync.

-- 
http://www.jurta.org/emacs/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-14 14:43       ` Juri Linkov
@ 2003-08-14 23:59         ` Alex Schroeder
  2003-08-15  4:21           ` Juri Linkov
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Schroeder @ 2003-08-14 23:59 UTC (permalink / raw
  Cc: emacs-devel

Juri Linkov <juri@jurta.org> writes:

> Adding the interactive function for syncing seems to be a good thing.
> But what troubles me the most is the namespace wasting.

You think so?  What will you save?  It is just a few bytes in memory.
It is not unheard of, however.  Note, for example, two different
functions for search-forward and re-search-forward.  Now you can bind
both functions to a key, or use them in elisp easily.  In either case,
you might have said, "Well, why not let the user answer a prompt and
decide whether he wants string or regexp search?"  M-x is a nice user
interface because it gives you completion, and with one call, you have
answered all the questions.  The two-step interface was reduced to a
one-step interface.

Alex.
-- 
http://www.emacswiki.org/alex/
There is no substitute for experience.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-14 23:59         ` Alex Schroeder
@ 2003-08-15  4:21           ` Juri Linkov
  0 siblings, 0 replies; 16+ messages in thread
From: Juri Linkov @ 2003-08-15  4:21 UTC (permalink / raw
  Cc: emacs-devel

Alex Schroeder <alex@emacswiki.org> writes:
> Juri Linkov <juri@jurta.org> writes:
> > Adding the interactive function for syncing seems to be a good thing.
> > But what troubles me the most is the namespace wasting.
> 
> You think so?  What will you save?  It is just a few bytes in memory.
> It is not unheard of, however.  Note, for example, two different
> functions for search-forward and re-search-forward.  Now you can bind
> both functions to a key, or use them in elisp easily.  In either case,
> you might have said, "Well, why not let the user answer a prompt and
> decide whether he wants string or regexp search?"  M-x is a nice user
> interface because it gives you completion, and with one call, you have
> answered all the questions.  The two-step interface was reduced to a
> one-step interface.

You are right: adding more interactively callable commands simplifies
the user's work.  But is is useful only for the most frequently
used functions.  The functions search-forward and re-search-forward
are good examples of very frequently used functions.

On the other hand, the interactive commands compare-windows-sync-*
would be used very rarely, if at all.  Their only purpose would be
to call corresponding forward function (forward-word, forward-line,
forward-sentence, etc.) on two windows, which in itself has very
limited applicability.

Ideally, the command `compare-windows' should do all its work alone.
If `compare-windows' is called on the matching points, then it should
advance points until it finds next difference.  If `compare-windows'
is called on unmatched points then it should skip the difference and
advance points until next matching positions and so on.

I hope that patches I submitted allow to do it.
You can try them in practice and tell if you need something more.

-- 
http://www.jurta.org/emacs/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-14  5:33       ` Juri Linkov
@ 2003-08-16 16:19         ` Richard Stallman
  2003-08-18 21:35           ` Juri Linkov
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Stallman @ 2003-08-16 16:19 UTC (permalink / raw
  Cc: emacs-devel

    OK, here is quick implementation of simple quadratic algorithm.
    Seems, it works well on small differences, but fails on big ones.

How does it fail?  By taking forever, or something else?  It might be
faster if you tried using a small bound, then again with the bound
doubled, etc.

I think that computing the distance with a sum would probably be about
as good in practice, and somewhat faster.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-16 16:19         ` Richard Stallman
@ 2003-08-18 21:35           ` Juri Linkov
  2003-08-20  2:43             ` Richard Stallman
  0 siblings, 1 reply; 16+ messages in thread
From: Juri Linkov @ 2003-08-18 21:35 UTC (permalink / raw
  Cc: emacs-devel

Richard Stallman <rms@gnu.org> writes:
> How does it fail?  By taking forever, or something else?  It might be
> faster if you tried using a small bound, then again with the bound
> doubled, etc.

It failed by not finding the next matching points in a fixed region.
Now, I added the code to start with a small bound and to double it.
This eliminates the need to choose some fixed bound and also makes
the search fast, so speed is not a problem anymore.

The only parameter that should be customizable, I think, is the
size of a string from one window that is searched in a second window.

The small number makes the difference regions more fine-grained,
but often it fails by finding the wrong match (the same string which
is closer to the starting point).  The bigger number finds the correct
matches, but it makes regions more coarse-grained.

I found that default value 32 is good for most cases,
but sometimes on many small differences the value 4 is better.
Just one illustrative example: when the quotes in the string
`compare-windows' are changed to \\[compare-windows], then
value 4 finds two differences: one is ` to \\[, and second
is ' to ], while value 32 finds one difference as a whole region.

Here is the new implementation (I also added functions for highlighting
difference regions):

(defcustom compare-windows-sync-string-size 32
  "*Size of string from one window that is searched in second window.
The small number makes the difference regions more fine-grained,
but it may fail by finding the wrong match.  The bigger number
makes regions more coarse-grained."
  :type 'integer
  :group 'compare-w)

;; Function works in two passes: one call on each window.
;; On the first call both matching points are computed,
;; and one of them is stored in compare-windows-sync-point
;; to be used when this function is called on second window.
(defun compare-windows-sync-default-function ()
  (if (not compare-windows-sync-point)
      (let* ((case-fold-search compare-ignore-case)
             (w2 (next-window (selected-window)))
             (b2 (window-buffer w2))
             (point-max2 (with-current-buffer b2 (point-max)))
             (op2 (window-point w2))
             (op1 (point))
             (region-size compare-windows-sync-string-size)
             (string-size compare-windows-sync-string-size)
             in-bounds-p s1 p2 p12s p12)
        (while (and
                ;; until matching points are found
                (not p12s)
                ;; until size exceeds the maximum points of both buffers
                ;; (bounds below take care to not overdo in each of them)
                (or (setq in-bounds-p (< region-size (max (- (point-max) op1)
                                                          (- point-max2 op2))))
                    ;; until string size becomes smaller than 4
                    (> string-size 4)))
          (if in-bounds-p
              ;; make the next search in the double-sized region;
              ;; on first iteration it is 2*compare-windows-sync-string-size,
              ;; on last iterations it exceeds both buffers maximum points
              (setq region-size (* region-size 2))
            ;; if region size exceeds the maximum points of both buffers,
            ;; then start to halve the string size until 4;
            ;; this helps to find differences near the end of buffers
            (setq string-size (/ string-size 2)))
          (let ((p1 op1)
                (bound1 (- (min (+ op1 region-size) (point-max)) string-size))
                (bound2 (min (+ op2 region-size) point-max2)))
            (while (< p1 bound1)
              (setq s1 (buffer-substring-no-properties p1 (+ p1 string-size)))
              (setq p2 (with-current-buffer b2
                         (goto-char op2)
                         (search-forward s1 bound2 t)))
              (when p2
                (setq p2 (- p2 string-size))
                (setq p12s (cons (list (+ p1 p2) p1 p2) p12s)))
              (setq p1 (1+ p1)))))
        (when p12s
          ;; use closest matching points (i.e. points with minimal sum)
          (setq p12 (cdr (assq (apply 'min (mapcar 'car p12s)) p12s)))
          (goto-char (car p12))
          (compare-windows-highlight op1 (car p12) op2 (cadr p12) b2))
        (setq compare-windows-sync-point (or (cadr p12) t)))
    ;; else set point in the second window to the precalculated value
    (if (numberp compare-windows-sync-point)
        (goto-char compare-windows-sync-point))
    (setq compare-windows-sync-point nil)))

(defcustom compare-windows-highlight t
  "*Non-nil means compare-windows highlights the differences."
  :type 'boolean
  :group 'compare-w)

(defface compare-windows-face
  '((((type tty pc) (class color))
     (:background "turquoise3"))
    (((class color) (background light))
     (:background "paleturquoise"))
    (((class color) (background dark))
     (:background "paleturquoise4"))
    (t (:underline t)))
  "Face for highlighting of compare-windows difference regions."
  :group 'compare-w)

(defvar compare-windows-overlay1 nil)
(defvar compare-windows-overlay2 nil)
(defvar compare-windows-sync-point nil)

(defun compare-windows-highlight (beg1 end1 beg2 end2 buf2)
  (when compare-windows-highlight
    (if compare-windows-overlay1
        (move-overlay compare-windows-overlay1 beg1 end1 (current-buffer))
      (setq compare-windows-overlay1 (make-overlay beg1 end1 (current-buffer)))
      (overlay-put compare-windows-overlay1 'face 'compare-windows-face)
      (overlay-put compare-windows-overlay1 'priority 1))
    (if compare-windows-overlay2
        (move-overlay compare-windows-overlay2 beg2 end2 buf2)
      (setq compare-windows-overlay2 (make-overlay beg2 end2 buf2))
      (overlay-put compare-windows-overlay2 'face 'compare-windows-face)
      (overlay-put compare-windows-overlay2 'priority 1))))

(defun compare-windows-dehighlight ()
  "Remove highlighting created by `compare-windows-highlight'."
  (interactive)
  (and compare-windows-overlay1 (delete-overlay compare-windows-overlay1))
  (and compare-windows-overlay2 (delete-overlay compare-windows-overlay2)))

-- 
http://www.jurta.org/emacs/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-18 21:35           ` Juri Linkov
@ 2003-08-20  2:43             ` Richard Stallman
  2003-08-20  5:56               ` Juri Linkov
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Stallman @ 2003-08-20  2:43 UTC (permalink / raw
  Cc: emacs-devel

    The small number makes the difference regions more fine-grained,
    but often it fails by finding the wrong match (the same string which
    is closer to the starting point).

I don't follow that.  Could you explain it with an example?
When I understand the problem, maybe I can propose a solution.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-20  2:43             ` Richard Stallman
@ 2003-08-20  5:56               ` Juri Linkov
  2003-08-26  1:38                 ` Richard Stallman
  0 siblings, 1 reply; 16+ messages in thread
From: Juri Linkov @ 2003-08-20  5:56 UTC (permalink / raw
  Cc: emacs-devel

>     The small number makes the difference regions more fine-grained,
>     but often it fails by finding the wrong match (the same string which
>     is closer to the starting point).
> 
> I don't follow that.  Could you explain it with an example?
> When I understand the problem, maybe I can propose a solution.

I forgot to mention that with small number it fails when an insertion
contains the same substring which follows after an insertion.  In this
case it finds the closest substring (in inserted text) instead of
original text unchanged in both files.

Here is an example:

suppose that original file contains following lines:

(defvar a nil)
(defvar b nil)
(defvar c nil)
(defvar d nil)

and a new line "(defvar x nil)" was added between defvars "a" and "b":

(defvar a nil)
(defvar x nil)
(defvar b nil)
(defvar c nil)
(defvar d nil)

Now, if compare-windows-sync-string-size is 4 (or even 8), then after
matching the first 23 characters (until character "b" in first file and
"x" in second) it will find that difference regions are "b" and "x".
It's because it gets the substring with size 8: " nil)\n(d" and finds
its closest occurrence just after both "b" and "x".  And after that it
completely goes out of sync, claiming that "c" was replaced by "b",
"d" by "c", and so on.

If compare-windows-sync-string-size is larger (e.g. 16), then
it looks for a larger substring " nil)\n(defvar c ", which includes the
next variable name "c", and it finds the correct match.

-- 
http://www.jurta.org/emacs/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-20  5:56               ` Juri Linkov
@ 2003-08-26  1:38                 ` Richard Stallman
  2003-08-26  5:28                   ` Juri Linkov
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Stallman @ 2003-08-26  1:38 UTC (permalink / raw
  Cc: emacs-devel

    I forgot to mention that with small number it fails when an insertion
    contains the same substring which follows after an insertion.  In this
    case it finds the closest substring (in inserted text) instead of
    original text unchanged in both files.

I can see how that would happen.  It suggests an idea to me.  The idea
is to search for a larger matching substring first, and then try
smaller matching substrings.  It's the same kind of idea as first
trying smaller search ranges and then larger ones.

Anoter idea: it could search first for an entire line (with a certain
minimum length) of match between the two windows.  After finding that,
it could look within the mismatched areas for smaller partial matches.

Want to try those approaches?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-26  1:38                 ` Richard Stallman
@ 2003-08-26  5:28                   ` Juri Linkov
  2003-08-27 16:42                     ` Richard Stallman
  0 siblings, 1 reply; 16+ messages in thread
From: Juri Linkov @ 2003-08-26  5:28 UTC (permalink / raw
  Cc: emacs-devel

Richard Stallman <rms@gnu.org> writes:
> I can see how that would happen.  It suggests an idea to me.  The idea
> is to search for a larger matching substring first, and then try
> smaller matching substrings.  It's the same kind of idea as first
> trying smaller search ranges and then larger ones.

The latest code I posted 2003-08-19 has an implementation of exactly
same idea.  It first doubles search ranges, and if differences are
still not found, then it starts to halve a matching substring size.
So there are no problems with finding differences anymore.  It works
correctly in all cases I tested.

However, I think that the size of matching substring should be
customizable.  I found that current default value 32 (with which a
search is started) is optimal for most cases.  But sometimes user
may want to start with a lesser value to get more fine-grained results.

-- 
http://www.jurta.org/emacs/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: compare-windows - synchronize points
  2003-08-26  5:28                   ` Juri Linkov
@ 2003-08-27 16:42                     ` Richard Stallman
  0 siblings, 0 replies; 16+ messages in thread
From: Richard Stallman @ 2003-08-27 16:42 UTC (permalink / raw
  Cc: emacs-devel

    The latest code I posted 2003-08-19 has an implementation of exactly
    same idea.  It first doubles search ranges, and if differences are
    still not found, then it starts to halve a matching substring size.
    So there are no problems with finding differences anymore.  It works
    correctly in all cases I tested.

Good work!

    However, I think that the size of matching substring should be
    customizable.  I found that current default value 32 (with which a
    search is started) is optimal for most cases.  But sometimes user
    may want to start with a lesser value to get more fine-grained results.

I have nothing against having an option to customize.  My goal is,
rather, to have a default so good that few users who need to know
about it.  It looks like you have achieve that.  Congratulations.

When we get the legal papers, I will be glad to install this new
feature.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2003-08-27 16:42 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-08-11 19:59 compare-windows - synchronize points Juri Linkov
2003-08-12 23:22 ` Richard Stallman
2003-08-13  2:57   ` Juri Linkov
2003-08-14  3:07     ` Richard Stallman
2003-08-14  5:33       ` Juri Linkov
2003-08-16 16:19         ` Richard Stallman
2003-08-18 21:35           ` Juri Linkov
2003-08-20  2:43             ` Richard Stallman
2003-08-20  5:56               ` Juri Linkov
2003-08-26  1:38                 ` Richard Stallman
2003-08-26  5:28                   ` Juri Linkov
2003-08-27 16:42                     ` Richard Stallman
2003-08-14 10:43     ` Alex Schroeder
2003-08-14 14:43       ` Juri Linkov
2003-08-14 23:59         ` Alex Schroeder
2003-08-15  4:21           ` Juri Linkov

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).