unofficial mirror of emacs-devel@gnu.org 
* Re: ntemacs hangs when openning the attached file
       [not found] <42b562540805062101s3e79eecel5ddc5b19821deda2@mail.gmail.com>
@ 2008-05-07  8:48 ` Eli Zaretskii
  2008-05-20 16:13   ` Alan Mackenzie
  2008-05-23 21:01   ` Alan Mackenzie
  2008-05-07 17:34 ` Richard M Stallman
  1 sibling, 2 replies; 9+ messages in thread
From: Eli Zaretskii @ 2008-05-07  8:48 UTC (permalink / raw)
  To: yu jie; +Cc: emacs-devel

> Date: Wed, 7 May 2008 12:01:38 +0800
> From: "yu jie" <yujie052@gmail.com>
> 
>     The current CVS header version hangs when opening the attached file.

No, it doesn't hang, it just takes a lot of time to visit this file.
I measured 61 seconds on a 3GHz machine.  This file has 86406 lines,
and uses some pretty non-standard formatting, such as this one:

  static int simpleNext(
    sqlite3_tokenizer_cursor *pCursor,  /* Cursor returned by simpleOpen */
    const char **ppToken,               /* OUT: *ppToken is the token text */
    int *pnBytes,                       /* OUT: Number of bytes in token */
    int *piStartOffset,                 /* OUT: Starting offset of token */
    int *piEndOffset,                   /* OUT: Ending offset of token */
    int *piPosition                     /* OUT: Position integer of token */
  ){

It also uses long comments formatted like this:

  /*
  ** Additional bit values that can be ORed with an affinity without
  ** changing the affinity.
  */

I'm guessing that this formatting coupled with the sheer size of the
file somehow triggers an inefficiency.

I didn't dig into the problem deep enough to find the actual culprit,
but if I run Emacs under GDB and interrupt it during the long wait, I
always see the following Lisp backtrace:

    "parse-partial-sexp" (0x82e9c4)
    "c-literal-limits" (0x82eb14)
    "c-neutralize-syntax-in-CPP" (0x82ec64)
    "c-common-init" (0x82edb4)
    "c-mode" (0x82eef4)
    "set-auto-mode-0" (0x82f034)
    "set-auto-mode" (0x82f110)
    "normal-mode" (0x82f454)
    "after-find-file" (0x82f5a4)
    "find-file-noselect-1" (0x82f6e4)
    "find-file-noselect" (0x82f834)
    "find-file" (0x82f984)
    "call-interactively" (0x82fc04)

This seems not to be related to fontification, but rather to something
that c-neutralize-syntax-in-CPP does during C Mode initialization.





* Re: ntemacs hangs when openning the attached file
       [not found] <42b562540805062101s3e79eecel5ddc5b19821deda2@mail.gmail.com>
  2008-05-07  8:48 ` ntemacs hangs when openning the attached file Eli Zaretskii
@ 2008-05-07 17:34 ` Richard M Stallman
  1 sibling, 0 replies; 9+ messages in thread
From: Richard M Stallman @ 2008-05-07 17:34 UTC (permalink / raw)
  To: yu jie; +Cc: emacs-devel

When a test case is 4 meg, would you please offer to send it to those
who could use it, rather than mailing it directly to this list?

Thank you for contributing.






* Re: ntemacs hangs when openning the attached file
  2008-05-07  8:48 ` ntemacs hangs when openning the attached file Eli Zaretskii
@ 2008-05-20 16:13   ` Alan Mackenzie
  2008-05-22  4:42     ` Stefan Monnier
  2008-05-23 21:01   ` Alan Mackenzie
  1 sibling, 1 reply; 9+ messages in thread
From: Alan Mackenzie @ 2008-05-20 16:13 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, yu jie, emacs-devel

Stefan, Eli,

On Wed, May 07, 2008 at 11:48:47AM +0300, Eli Zaretskii wrote:
> > Date: Wed, 7 May 2008 12:01:38 +0800
> > From: "yu jie" <yujie052@gmail.com>

> >     The current CVS header version hangs when opening the attached file.

> No, it doesn't hang, it just takes a lot of time to visit this file.
> I measured 61 seconds on a 3GHz machine.  This file has 86406 lines,
> ....

and at 1 line per second would take just over a day to print.  This is a
big file.

> I'm guessing that this formatting coupled with the sheer size of the
> file somehow triggers an inefficiency.

[ .... ]

> This seems not to be related to fontification, but rather to something
> that c-neutralize-syntax-in-CPP does during C Mode initialization.

It's just that c-neutralize-syntax-in-CPP takes a long time in such a
large buffer.  However, it would take just as long if it were split into
several smaller buffers.

c-neutralize-syntax-in-CPP is a newish function which counteracts
problems caused by things like:

    #define RBRACE }

.  The answer would seem to be calling the function lazily or in the
background, a bit like jit-lock does for font locking.

Stefan, I think you mentioned a long time ago that jit could be used for
things other than font locking, but I can't find the thread.  Any clues?

-- 
Alan Mackenzie (Nuremberg, Germany).





* Re: ntemacs hangs when openning the attached file
  2008-05-20 16:13   ` Alan Mackenzie
@ 2008-05-22  4:42     ` Stefan Monnier
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Monnier @ 2008-05-22  4:42 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, yu jie, emacs-devel

>> This seems not to be related to fontification, but rather to something
>> that c-neutralize-syntax-in-CPP does during C Mode initialization.

> It's just that c-neutralize-syntax-in-CPP takes a long time in such a
> large buffer.  However, it would take just as long if it were split into
> several smaller buffers.

> c-neutralize-syntax-in-CPP is a newish function which counteracts
> problems caused by things like:

>     #define RBRACE }

> .  The answer would seem to be calling the function lazily or in the
> background, a bit like jit-lock does for font locking.

> Stefan, I think you mentioned a long time ago that jit could be used for
> things other than font locking, but I can't find the thread.  Any clues?

Not sure what you want to know.  You can use jit-lock-register, tho it's
not clear whether it does what you want.
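
As a rough illustration of what registering a non-fontification
function with jit-lock could look like, here is a minimal sketch.  The
names `my-lazy-processed', `my-lazy-process-region' and
`my-lazy-enable' are hypothetical, the worker only records the regions
it is handed, and nothing here is taken from CC Mode.  Whether this
mechanism would suit `c-neutralize-syntax-in-CPP' is exactly the open
question above.

```elisp
(require 'jit-lock)

(defvar-local my-lazy-processed nil
  "Hypothetical log of (START . END) regions handled so far.")

(defun my-lazy-process-region (start end)
  "Hypothetical worker: stand-in for expensive work on START..END.
Here it only records the region it was asked to handle."
  (push (cons start end) my-lazy-processed))

(defun my-lazy-enable ()
  "Ask jit-lock to call the worker on text about to be displayed.
`jit-lock-register' calls its function with two arguments, the
start and end of the region to process."
  (jit-lock-register #'my-lazy-process-region))
```

A mode would call `my-lazy-enable' from its setup code; jit-lock then
hands the worker only the stretches of buffer that are about to be
shown, instead of the whole buffer at visit time.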


        Stefan





* Re: ntemacs hangs when openning the attached file
  2008-05-07  8:48 ` ntemacs hangs when openning the attached file Eli Zaretskii
  2008-05-20 16:13   ` Alan Mackenzie
@ 2008-05-23 21:01   ` Alan Mackenzie
  2008-05-23 21:36     ` Stefan Monnier
  2008-05-24  8:18     ` Eli Zaretskii
  1 sibling, 2 replies; 9+ messages in thread
From: Alan Mackenzie @ 2008-05-23 21:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Monnier, yu jie, emacs-devel

Hi, Eli and yu!

On Wed, May 07, 2008 at 11:48:47AM +0300, Eli Zaretskii wrote:
> > Date: Wed, 7 May 2008 12:01:38 +0800
> > From: "yu jie" <yujie052@gmail.com>
> > 
> >     The current CVS header version hangs when opening the attached file.

> No, it doesn't hang, it just takes a lot of time to visit this file.
> I measured 61 seconds on a 3GHz machine.  This file has 86406 lines,
> and uses some pretty non-standard formatting, such as this one:

The problem was that c-neutralize-syntax-in-CPP was inefficiently coded.
I've optimised it using essentially only Emacs primitives in the defun's
main loop.  It now runs almost 2 orders of magnitude faster.

Eli, I'd appreciate it very much indeed if you could review this new
code, please - earlier versions of it were peculiarly troublesome.
Thanks!

Here's the patch:


*** cc-mode.el~	2008-05-23 20:42:45.653994480 +0000
--- cc-mode.el	2008-05-23 20:50:43.941283760 +0000
***************
*** 837,863 ****
    ;;
    ;; This function is the C/C++/ObjC value of `c-before-font-lock-function'.
    ;;
    ;; This function might do invisible changes.
!   (c-save-buffer-state (limits mbeg+1 beg end)
!     ;; First calculate the region, possibly to be extended.
!     (setq beg (min begg c-old-BOM))
      (goto-char endd)
!     (when (c-beginning-of-macro)
!       (c-end-of-macro))
      (setq end (max (+ (- c-old-EOM old-len) (- endd begg))
  		   (point)))
      ;; Clear all old punctuation properties
      (c-clear-char-property-with-value beg end 'syntax-table '(1))
  
      (goto-char beg)
      (while (and (< (point) end)
  		(search-forward-regexp c-anchored-cpp-prefix end t))
        ;; If we've found a "#" inside a string/comment, ignore it.
!       (if (setq limits (c-literal-limits))
! 	  (goto-char (cdr limits))
  	(setq mbeg+1 (point))
  	(c-end-of-macro)	  ; Do we need to go forward 1 char here?  No!
! 	(c-neutralize-CPP-line mbeg+1 (point))))))
  
  (defun c-before-change (beg end)
    ;; Function to be put on `before-change-function'.  Primarily, this calls
--- 837,881 ----
    ;;
    ;; This function is the C/C++/ObjC value of `c-before-font-lock-function'.
    ;;
+   ;; Note: SPEED _MATTERS_ IN THIS FUNCTION!!!
+   ;; 
    ;; This function might do invisible changes.
!   (c-save-buffer-state (limits mbeg+1 beg end pps-position pps-state)
!     ;; First determine the region, (beg end), which may need "neutralizing".
!     ;; This may not start inside a string or comment, or a macro.
!     (goto-char begg)
!     (if (setq limits (c-literal-limits))
! 	(goto-char (cdr limits)))   ; go forward out of any string or comment.
!     (c-beginning-of-macro)
!     (setq beg (min (point) c-old-BOM))
! 
      (goto-char endd)
!     (if (setq limits (c-literal-limits))
! 	(goto-char (car limits)))  ; go backward out of any string or comment.
!     (if (c-beginning-of-macro)
! 	(c-end-of-macro))
      (setq end (max (+ (- c-old-EOM old-len) (- endd begg))
  		   (point)))
+ 
      ;; Clear all old punctuation properties
      (c-clear-char-property-with-value beg end 'syntax-table '(1))
  
      (goto-char beg)
+     (setq pps-position beg  pps-state nil)
      (while (and (< (point) end)
  		(search-forward-regexp c-anchored-cpp-prefix end t))
        ;; If we've found a "#" inside a string/comment, ignore it.
!       (setq pps-state
! 	    (parse-partial-sexp pps-position (point) nil nil pps-state)
! 	    pps-position (point))
!       (unless (or (nth 3 pps-state)	; in a string?
! 		  (nth 4 pps-state))	; in a comment?
  	(setq mbeg+1 (point))
  	(c-end-of-macro)	  ; Do we need to go forward 1 char here?  No!
! 	(c-neutralize-CPP-line mbeg+1 (point))
! 	(setq pps-state
! 	      (parse-partial-sexp pps-position (point) nil nil pps-state)
! 	      pps-position (point))))))
  
  (defun c-before-change (beg end)
    ;; Function to be put on `before-change-function'.  Primarily, this calls


-- 
Alan Mackenzie (Nuremberg, Germany).





* Re: ntemacs hangs when openning the attached file
  2008-05-23 21:01   ` Alan Mackenzie
@ 2008-05-23 21:36     ` Stefan Monnier
  2008-05-24 13:16       ` Alan Mackenzie
  2008-05-24  8:18     ` Eli Zaretskii
  1 sibling, 1 reply; 9+ messages in thread
From: Stefan Monnier @ 2008-05-23 21:36 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, yu jie, emacs-devel

>> >     The current CVS header version hangs when opening the attached file.

>> No, it doesn't hang, it just takes a lot of time to visit this file.
>> I measured 61 seconds on a 3GHz machine.  This file has 86406 lines,
>> and uses some pretty non-standard formatting, such as this one:

> The problem was that c-neutralize-syntax-in-CPP was inefficiently coded.
> I've optimised it using essentially only Emacs primitives in the defun's
> main loop.  It now runs almost 2 orders of magnitude faster.

Sounds good.

> Eli, I'd appreciate it very much indeed if you could review this new
> code, please - earlier versions of it were peculiarly troublesome.

Don't know about Eli.  But here's some comments:

> +   ;; Note: SPEED _MATTERS_ IN THIS FUNCTION!!!
> +   ;; 
>     ;; This function might do invisible changes.
                             ^^
                            make

> +     (setq pps-position beg  pps-state nil)

It would be a lot more lispy to explicitly let-bind pps-position and
pps-state here, rather than declare them earlier without initializing
them and then initializing them here.

>       (while (and (< (point) end)
>   		(search-forward-regexp c-anchored-cpp-prefix end t))
>         ;; If we've found a "#" inside a string/comment, ignore it.
> !       (setq pps-state
> ! 	    (parse-partial-sexp pps-position (point) nil nil pps-state)
> ! 	    pps-position (point))
> !       (unless (or (nth 3 pps-state)	; in a string?
> ! 		  (nth 4 pps-state))	; in a comment?
>   	(setq mbeg+1 (point))
>   	(c-end-of-macro)	  ; Do we need to go forward 1 char here?  No!
> ! 	(c-neutralize-CPP-line mbeg+1 (point))
> ! 	(setq pps-state
> ! 	      (parse-partial-sexp pps-position (point) nil nil pps-state)
> ! 	      pps-position (point))))))

I have the impression that this second call to parse-partial-sexp
is unnecessary.


        Stefan





* Re: ntemacs hangs when openning the attached file
  2008-05-23 21:01   ` Alan Mackenzie
  2008-05-23 21:36     ` Stefan Monnier
@ 2008-05-24  8:18     ` Eli Zaretskii
  2008-05-24 13:17       ` Alan Mackenzie
  1 sibling, 1 reply; 9+ messages in thread
From: Eli Zaretskii @ 2008-05-24  8:18 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, yujie052, emacs-devel

> Date: Fri, 23 May 2008 21:01:23 +0000
> Cc: yu jie <yujie052@gmail.com>, Stefan Monnier <monnier@iro.umontreal.ca>,
>   emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> Eli, I'd appreciate it very much indeed if you could review this new
> code, please - earlier versions of it were peculiarly troublesome.

I have nothing to add to Stefan's comments, except thank you for
working on this.





* Re: ntemacs hangs when openning the attached file
  2008-05-23 21:36     ` Stefan Monnier
@ 2008-05-24 13:16       ` Alan Mackenzie
  0 siblings, 0 replies; 9+ messages in thread
From: Alan Mackenzie @ 2008-05-24 13:16 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, yu jie, emacs-devel

Hi, Stefan!

On Fri, May 23, 2008 at 05:36:34PM -0400, Stefan Monnier wrote:

> > The problem was that c-neutralize-syntax-in-CPP was inefficiently
> > coded.  I've optimised it using essentially only Emacs primitives in
> > the defun's main loop.  It now runs almost 2 orders of magnitude
> > faster.

> Sounds good.

> > Eli, I'd appreciate it very much indeed if you could review this new
> > code, please - earlier versions of it were peculiarly troublesome.

> Don't know about Eli.

That's OK.  You'll do instead.  ;-)  (Thanks!)

> But here's some comments:

> > +   ;; Note: SPEED _MATTERS_ IN THIS FUNCTION!!!
> > +   ;; 
> >     ;; This function might do invisible changes.
>                              ^^
>                             make

I hereby resign from my role as project linguistic pedant.  ;-)  That one
(and ~100 others) have been in the source for ~5 years, put there by
Martin.  How could I miss this??

> > +     (setq pps-position beg  pps-state nil)

> It would be a lot more lispy to explicitly let-bind pps-position and
> pps-state here, rather than declare them earlier without initializing
> them and then initializing them here.

I'd thought that that would just be an unnecessary extra `let'.  However,
having tried it, it does make the code a bit clearer.  So yes, thank you
- I'll be doing this more in the future.

> >       (while (and (< (point) end)
> >   		(search-forward-regexp c-anchored-cpp-prefix end t))
> >         ;; If we've found a "#" inside a string/comment, ignore it.
> > !       (setq pps-state
> > ! 	    (parse-partial-sexp pps-position (point) nil nil pps-state)
> > ! 	    pps-position (point))
> > !       (unless (or (nth 3 pps-state)	; in a string?
> > ! 		  (nth 4 pps-state))	; in a comment?
> >   	(setq mbeg+1 (point))
> >   	(c-end-of-macro)	  ; Do we need to go forward 1 char here?  No!
> > ! 	(c-neutralize-CPP-line mbeg+1 (point))
> > ! 	(setq pps-state
> > ! 	      (parse-partial-sexp pps-position (point) nil nil pps-state)
> > ! 	      pps-position (point))))))

> I have the impression that this second call to parse-partial-sexp is
> unnecessary.

DUH!!!  Of course!  Wake up, Alan!  That's the whole point of the call to
`c-neutralize-CPP-line', just above.  Thanks!

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).





* Re: ntemacs hangs when openning the attached file
  2008-05-24  8:18     ` Eli Zaretskii
@ 2008-05-24 13:17       ` Alan Mackenzie
  0 siblings, 0 replies; 9+ messages in thread
From: Alan Mackenzie @ 2008-05-24 13:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, yujie052, emacs-devel

Hi, Eli!

On Sat, May 24, 2008 at 11:18:20AM +0300, Eli Zaretskii wrote:
> > Date: Fri, 23 May 2008 21:01:23 +0000
> > Cc: yu jie <yujie052@gmail.com>, Stefan Monnier <monnier@iro.umontreal.ca>,
> >   emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>
> > 
> > Eli, I'd appreciate it very much indeed if you could review this new
> > code, please - earlier versions of it were peculiarly troublesome.

> I have nothing to add to Stefan's comments, except thank you for
> working on this.

OK, thanks!  I think it should be OK now; I've just committed the change
to the Emacs-22 branch and the trunk.

-- 
Alan Mackenzie (Nuremberg, Germany).



