Re: [mg: diff-next-line] - Stefan Monnier

From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Han Boetes <han@mijncomputer.nl>
Cc: emacs-devel@gnu.org
Subject: Re: [mg: diff-next-line]
Date: Mon, 13 Aug 2007 02:04:39 -0400
Message-ID: <jwvfy2oxa78.fsf-monnier+emacs@gnu.org>
In-Reply-To: <20070810070855.GK21516@boetes.org> (Han Boetes's message of "Fri, 10 Aug 2007 09:08:32 +0200")

> This diff adds a "diff-next-line" command to mg.
> Basically, it compares the current line (starting at dot) to the 
> line following, advancing the cursor to the first character that
> is different (or eol, if none).

> Note: character under the cursor is ignored, so the command can be
> used multiple times in a row to find successive differences

> Its purpose should be obvious.  If you are editing a diff, and you want
> to know exactly where two lines differ, this will do it.

[...]

> Doesn't this look like an interesting feature for diff-mode?

Yes and no.  It seems too simplistic: e.g. looking at the next line is often
not good enough (the corresponding other version of a line is often
further away).  I have some work-in-progress code which instead takes one pair of
"before/after" text from a unified context diff, diffs it char-by-char, and
then highlights the resulting differences.  This does wonders on patch hunks
caused by reformatting (like M-q) where the only highlighted chars are end
of lines plus the few places where text was really changed.
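
One way to get a char-by-char diff out of a line-based diff tool is to put
every char on a line of its own, which is what `diff-fine-chopup-region' in
the attached code does.  The following is only a minimal sketch of that
transformation on a string; `my-chopup-string' is a hypothetical name used
for illustration and is not part of the attached code.

```elisp
(defun my-chopup-string (text)
  "Return TEXT with one char per line, as a line-based diff would want it.
Hypothetical helper for illustration only: it works on a string
instead of a buffer region and does not write a file."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; Blank out the first char of each non-empty line (the -/+ prefix),
    ;; so that the prefix itself is never reported as a change.
    (while (re-search-forward "^." nil t)
      (replace-match " "))
    (goto-char (point-min))
    ;; Put every char on a line of its own; a \n in the input ends up
    ;; represented by an empty line.
    (while (not (eobp))
      (forward-char 1)
      (unless (eq (char-before) ?\n) (insert ?\n)))
    (buffer-string)))

;; Seven input chars give seven output lines:
;; (my-chopup-string "-ab\n-c\n")  returns " \na\nb\n\n \nc\n\n"
```

Because the prefix is replaced rather than dropped, line N of the result
still corresponds to the Nth char of the original text.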

For what it's worth, you can find my current code attached.  The comment
about "too fine diffs" is a bit too optimistic: to summarize, the problem is
that it'll find "unchanged" text between "hello" and "colour" (the first "l"
and the "o") which is OK when there are only two words, but on longer chunks
of "unrelated" text you end up with a highlighting that looks just "random"
(i.e. buggy).  I wish I could tweak diff's optimization function so that it
doesn't just look for the fewest number of lines (or chars) changed, but
also tries to minimize the number of hunks in the output.  Of course,
another option is to do what ediff does and do the diff at word-granularity
(or somesuch) rather than char-granularity.  Or maybe I could do
a post-processing step that joins nearby changes (e.g. if there's a single
unchanged char between two changed chars, then consider the unchanged char
as changed as well).
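
The post-processing idea could look roughly like the sketch below.  It
assumes the changed spans have already been collected as a sorted list of
(START . END) pairs with END exclusive; that representation, the name
`my-diff-fine-join-ranges' and the MAX-GAP parameter are all assumptions made
for illustration, since the attached code creates overlays directly and has
no such step.

```elisp
(defun my-diff-fine-join-ranges (ranges max-gap)
  "Return RANGES with neighbours at most MAX-GAP chars apart merged.
RANGES is a list of (START . END) pairs sorted by START, END exclusive.
Hypothetical helper, not part of the attached code."
  (let ((result nil))
    (dolist (r ranges)
      (let ((prev (car result)))
        (if (and prev (<= (- (car r) (cdr prev)) max-gap))
            ;; The unchanged gap is small enough: absorb it by
            ;; extending the previous range.
            (setcdr prev (max (cdr prev) (cdr r)))
          ;; Otherwise start a new range (a fresh cons, so the
          ;; caller's list is never modified).
          (push (cons (car r) (cdr r)) result))))
    (nreverse result)))

;; With MAX-GAP 1, a single unchanged char between two changes is absorbed:
;; (my-diff-fine-join-ranges '((1 . 2) (3 . 4) (10 . 12)) 1)
;;   returns ((1 . 4) (10 . 12))
```

A larger MAX-GAP trades precision for fewer, longer highlights, which may or
may not look less "random" on unrelated text.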


        Stefan


;;; Fine change highlighting.

(defface diff-fine-change
  '((t :background "yellow"))
  "Face used for char-based changes shown by `diff-fine-highlight'.")

(defun diff-fine-chopup-region (beg end file)
  "Chop up the region between BEG and END into one char per line.
Write the result to FILE."
  ;; FIXME: see smerge-refine-chopup-region which duplicates most of this.
  ;;
  ;; ediff chops up into words, where the definition of a word is
  ;; customizable.  Instead we here keep only one char per line.
  ;; The advantages are that there's nothing to configure, that we get very
  ;; fine results, and that it's trivial to map the line numbers in the
  ;; output of diff back into buffer positions.  The disadvantage is that it
  ;; can take more time to compute the diff and that the result is sometimes
  ;; too fine.  I'm not too concerned about the slowdown because conflicts
  ;; are usually significantly smaller than the whole file.  As for the
  ;; problem of too-fine-refinement, I have found it to be unimportant
  ;; especially when you consider the cases where the fine-grain is just
  ;; what you want.
  (let ((buf (current-buffer)))
    (with-temp-buffer
      (insert-buffer-substring buf beg end)
      (goto-char (point-min))
      (while (re-search-forward "^." nil t)
        ;; Replace the hunk's leading prefix on each line with something
        ;; constant, otherwise it'll be flagged as changes (since it's
        ;; typically "-" on one side and "+" on the other).  Note that we
        ;; keep the same number of chars: we treat the prefix as part of the
        ;; texts-to-diff, so that finding the right char afterwards will be
        ;; easier.  This only makes sense because we make diffs at
        ;; char-granularity.
        (replace-match " "))
      (goto-char (point-min))
      (while (not (eobp))
        (forward-char 1)
        ;; We add \n after each char except after \n, so we get one line per
        ;; text char, where each line contains just one char, except for \n
        ;; chars which are represented by the empty line.
        (unless (eq (char-before) ?\n) (insert ?\n)))
      (let ((coding-system-for-write 'emacs-mule))
        (write-region (point-min) (point-max) file nil 'nomessage)))))

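(defun diff-fine-highlight-change (buf beg match-num1 match-num2)
  "Highlight in BUF the chars named by the last matched hunk header.
BEG is the buffer position of the first char of the chopped-up region.
MATCH-NUM1 and MATCH-NUM2 are the match groups holding the first and
\(optional) last line number; each line stands for one char."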
  (let* ((startline (string-to-number (match-string match-num1)))
         (ol (make-overlay
              (+ beg startline -1)
              (+ beg (if (match-end match-num2)
                         (string-to-number (match-string match-num2))
                       startline))
              buf
              'front-advance nil)))
    (overlay-put ol 'diff-mode 'fine)
    (overlay-put ol 'evaporate t)
    (overlay-put ol 'face 'diff-fine-change)))


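(defun diff-fine-highlight ()
  "Highlight the chars that differ between the removed and added lines.
Works on the run of \"-\" lines followed by \"+\" lines around point."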
  ;; TODO:
  ;; - Share code with smerge-refine
  ;; - extend to context diffs (only the ! lines)
  ;; - clean up
  ;; - make more robust
  ;; - maybe two different faces should be used here
  ;; - provide a reasonable UI
  ;; - do it hunk-wide rather than on a single substitution change
  (interactive)
  (if (re-search-backward "^[^+-]" nil 'move) (forward-line 1))
  (let* ((buf (current-buffer))
         (beg1 (point))
         (end1 (if (re-search-forward "^[^-]" nil 'move)
                   (match-beginning 0) (point-max)))
         (beg2 end1)
         (end2 (if (re-search-forward "^[^+]" nil 'move)
                   (match-beginning 0) (point-max)))
         (file1 (make-temp-file "diff1"))
         (file2 (make-temp-file "diff2")))

    ;; If the user makes edits, this may not be enough because some
    ;; highlights may now be located outside of the change (e.g. the first
    ;; char has been turned into a SPC).  Maybe we should remove overlays on
    ;; the whole hunk?
    (remove-overlays beg1 end1 'diff-mode 'fine)
    (remove-overlays beg2 end2 'diff-mode 'fine)

    ;; Chop up regions into smaller elements and save into files.
    (diff-fine-chopup-region beg1 end1 file1)
    (diff-fine-chopup-region beg2 end2 file2)

    ;; Call diff on those files.
    (unwind-protect
        (with-temp-buffer
          (let ((coding-system-for-read 'emacs-mule))
            (call-process diff-command nil t nil file1 file2))
          ;; Process diff's output.
          (goto-char (point-min))
          (while (not (eobp))
            (if (not (looking-at "\\([0-9]+\\)\\(?:,\\([0-9]+\\)\\)?\\([acd]\\)\\([0-9]+\\)\\(?:,\\([0-9]+\\)\\)?$"))
                (error "Unexpected patch hunk header: %s"
                       (buffer-substring (point) (line-end-position)))
              (let ((op (char-after (match-beginning 3))))
                (when (memq op '(?d ?c))
                  (diff-fine-highlight-change buf beg1 1 2))
                (when (memq op '(?a ?c))
                  (diff-fine-highlight-change buf beg2 4 5)))
              (forward-line 1)                            ;Skip hunk header.
              (and (re-search-forward "^[0-9]" nil 'move) ;Skip hunk body.
                   (goto-char (match-beginning 0))))))
      (delete-file file1)
      (delete-file file2))))
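
To make the header handling in the loop above easier to follow, here is a
small standalone sketch that parses a header line of diff's normal output
format (e.g. "3,5c4") into line ranges.  The function name and the return
shape are hypothetical and chosen only for illustration; the regexp is the
one used in `diff-fine-highlight', anchored to the whole string.

```elisp
(defun my-diff-normal-header-ranges (header)
  "Parse HEADER, a normal-format diff hunk header such as \"3,5c4\".
Return (OP (START1 . END1) (START2 . END2)) with 1-based inclusive
line numbers.  Hypothetical helper, for illustration only."
  (if (string-match
       "\\`\\([0-9]+\\)\\(?:,\\([0-9]+\\)\\)?\\([acd]\\)\\([0-9]+\\)\\(?:,\\([0-9]+\\)\\)?\\'"
       header)
      (let* ((s1 (string-to-number (match-string 1 header)))
             ;; A missing \",N\" part means a one-line range.
             (e1 (if (match-end 2)
                     (string-to-number (match-string 2 header))
                   s1))
             (op (aref (match-string 3 header) 0))
             (s2 (string-to-number (match-string 4 header)))
             (e2 (if (match-end 5)
                     (string-to-number (match-string 5 header))
                   s2)))
        (list op (cons s1 e1) (cons s2 e2)))
    (error "Unexpected patch hunk header: %s" header)))

;; (my-diff-normal-header-ranges "3,5c4")  returns (?c (3 . 5) (4 . 4))
;; (my-diff-normal-header-ranges "7a8,9")  returns (?a (7 . 7) (8 . 9))
```

Since each line of the chopped-up files stands for one char, a range (3 . 5)
in a region starting at buffer position BEG covers positions BEG+2 up to (but
not including) BEG+5, which matches the arithmetic in
`diff-fine-highlight-change'.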
