* C++ mode and c-beginning-of-current-token
@ 2007-05-12 10:39 Herbert Euler
2007-05-12 13:19 ` Alan Mackenzie
` (3 more replies)
0 siblings, 4 replies; 41+ messages in thread
From: Herbert Euler @ 2007-05-12 10:39 UTC (permalink / raw)
To: acm; +Cc: emacs-devel
In the newest unicode 2 branch, `parse-sexp-lookup-properties'
defaults to `t' in C++ mode but not in C mode. I don't have the
Emacs 22 trunk, so I don't know whether the same holds there. But
when this variable is t, there is a bug in c++-mode:
1. Visit an empty, new C++ file.
2. Try to insert the following line, at the beginning of the buffer:
#include <iostream>
An error is signaled when typing the second character, "i":
"Point before start of properties". The error happens in the
function `c-beginning-of-current-token', when it invokes
`skip-syntax-backward':
(defun c-beginning-of-current-token (&optional back-limit)
  ;; Move to the beginning of the current token.  Do not move if not
  ;; in the middle of one.  BACK-LIMIT may be used to bound the
  ;; backward search; if given it's assumed to be at the boundary
  ;; between two tokens.  Return non-nil if the point is moved, nil
  ;; otherwise.
  ;;
  ;; This function might do hidden buffer changes.
  (let ((start (point)))
    (if (looking-at "\\w\\|\\s_")
        (skip-syntax-backward "w_" back-limit)
      (when (< (skip-syntax-backward ".()" back-limit) 0)
        ;; ... ...
If `parse-sexp-lookup-properties' is t, `skip-syntax-backward' does
not return -1 in some cases, but signals an error instead.
Regards,
Guanpeng Xu
* Re: C++ mode and c-beginning-of-current-token
2007-05-12 10:39 C++ mode and c-beginning-of-current-token Herbert Euler
@ 2007-05-12 13:19 ` Alan Mackenzie
2007-05-12 14:30 ` Herbert Euler
` (2 more replies)
2007-05-12 15:30 ` Herbert Euler
` (2 subsequent siblings)
3 siblings, 3 replies; 41+ messages in thread
From: Alan Mackenzie @ 2007-05-12 13:19 UTC (permalink / raw)
To: Herbert Euler; +Cc: emacs-devel
Hi, Guanpeng!
On Sat, May 12, 2007 at 06:39:12PM +0800, Herbert Euler wrote:
> In the newest unicode 2 branch, `parse-sexp-lookup-properties'
> defaults to `t' in C++ mode but not in C mode. I don't have the
> Emacs 22 trunk, so I don't know whether the same holds there. But
> when this variable is t, there is a bug in c++-mode:
parse-sexp-lookup-properties is t in C++ Mode so that text properties
can be set on pertinent <s and >s (in templates) to mark them as
parentheses.
> 1. Visit an empty, new C++ file.
> 2. Try to insert the following line, at the beginning of the buffer:
> #include <iostream>
> An error is signaled when typing the second character, "i":
> "Point before start of properties". The error happens in the
> function `c-beginning-of-current-token', when it invokes
> `skip-syntax-backward':
OK. This doesn't happen to me in the Emacs 22 release branch. But
c-beginning-of-current-token was changed recently. I suspect you might
have some option set which exposes a bug in that function.
Did you start your emacs with -Q? If not does the error still happen
when you do? If you didn't use -Q, could you please dump your CC Mode
configuration with C-c C-b and post it here.
[ .... ]
> Regards,
> Guanpeng Xu
--
Alan Mackenzie (Ittersbach, Germany).
* Re: C++ mode and c-beginning-of-current-token
2007-05-12 13:19 ` Alan Mackenzie
@ 2007-05-12 14:30 ` Herbert Euler
2007-05-12 14:33 ` Herbert Euler
2007-05-12 16:02 ` Herbert Euler
2 siblings, 0 replies; 41+ messages in thread
From: Herbert Euler @ 2007-05-12 14:30 UTC (permalink / raw)
To: acm; +Cc: emacs-devel
>Did you start your emacs with -Q? If not does the error still happen
>when you do? If you didn't use -Q, could you please dump your CC Mode
>configuration with C-c C-b and post it here.
Below is the result in Emacs started with the -Q option:
Emacs : GNU Emacs 23.0.0.1 (i386-mingw-nt5.1.2600)
of 2007-05-11 on XUGUANPENG4876
Package: CC Mode 5.31.4 (C++/l)
Buffer Style: gnu
c-emacs-features: (pps-extended-state col-0-paren posix-char-classes
gen-string-delim gen-comment-delim syntax-properties 1-bit)
current state:
==============
(setq
c-basic-offset 2
c-comment-only-line-offset '(0 . 0)
c-indent-comment-alist '((anchored-comment column . 0) (end-block space . 1)
(cpp-end-block space . 2))
c-indent-comments-syntactically-p nil
c-block-comment-prefix ""
c-comment-prefix-regexp '((pike-mode . "//+!?\\|\\**") (awk-mode . "#+")
(other . "//+\\|\\**"))
c-doc-comment-style '((java-mode . javadoc) (pike-mode . autodoc)
(c-mode . gtkdoc))
c-cleanup-list '(scope-operator)
c-hanging-braces-alist '((substatement-open before after))
c-hanging-colons-alist nil
c-hanging-semi&comma-criteria '(c-semi&comma-inside-parenlist)
c-backslash-column 48
c-backslash-max-column 72
c-special-indent-hook '(c-gnu-impose-minimum)
c-label-minimum-indentation 1
c-offsets-alist '((inexpr-class . +)
(inexpr-statement . +)
(lambda-intro-cont . +)
(inlambda . c-lineup-inexpr-block)
(template-args-cont c-lineup-template-args +)
(incomposition . +)
(inmodule . +)
(innamespace . +)
(inextern-lang . +)
(composition-close . 0)
(module-close . 0)
(namespace-close . 0)
(extern-lang-close . 0)
(composition-open . 0)
(module-open . 0)
(namespace-open . 0)
(extern-lang-open . 0)
(objc-method-call-cont . c-lineup-ObjC-method-call)
(objc-method-args-cont . c-lineup-ObjC-method-args)
(objc-method-intro . [0])
(friend . 0)
(cpp-define-intro c-lineup-cpp-define +)
(cpp-macro-cont . +)
(cpp-macro . [0])
(inclass . +)
(stream-op . c-lineup-streamop)
(arglist-cont-nonempty
c-lineup-gcc-asm-reg
c-lineup-arglist
)
(arglist-cont c-lineup-gcc-asm-reg 0)
(comment-intro
c-lineup-knr-region-comment
c-lineup-comment
)
(catch-clause . 0)
(else-clause . 0)
(do-while-closure . 0)
(access-label . -)
(case-label . 0)
(substatement . +)
(statement-case-intro . +)
(statement . 0)
(brace-entry-open . 0)
(brace-list-entry . 0)
(brace-list-intro . +)
(brace-list-close . 0)
(block-close . 0)
(block-open . 0)
(inher-cont . c-lineup-multi-inher)
(inher-intro . +)
(member-init-cont . c-lineup-multi-inher)
(member-init-intro . +)
(topmost-intro . 0)
(knr-argdecl . 0)
(func-decl-cont . +)
(inline-close . 0)
(class-close . 0)
(class-open . 0)
(defun-block-intro . +)
(defun-close . 0)
(defun-open . 0)
(c . c-lineup-C-comments)
(string . c-lineup-dont-change)
(topmost-intro-cont
first
c-lineup-topmost-intro-cont
c-lineup-gnu-DEFUN-intro-cont
)
(brace-list-open . +)
(inline-open . 0)
(arglist-close . c-lineup-arglist)
(arglist-intro . c-lineup-arglist-intro-after-paren)
(statement-cont . +)
(statement-case-open . +)
(label . 0)
(substatement-label . 0)
(substatement-open . +)
(knr-argdecl-intro . 5)
(statement-block-intro . +)
)
c-buffer-is-cc-mode 'c++-mode
c-tab-always-indent t
c-syntactic-indentation t
c-syntactic-indentation-in-macros t
c-ignore-auto-fill '(string cpp code)
c-auto-align-backslashes t
c-backspace-function 'backward-delete-char-untabify
c-delete-function 'delete-char
c-electric-pound-behavior nil
c-default-style '((java-mode . "java") (awk-mode . "awk") (other . "gnu"))
c-enable-xemacs-performance-kludge-p nil
c-old-style-variable-behavior nil
defun-prompt-regexp nil
tab-width 8
comment-column 32
parse-sexp-ignore-comments t
parse-sexp-lookup-properties t
auto-fill-function nil
comment-multi-line t
comment-start-skip "\\(//+\\|/\\*+\\)\\s *"
fill-prefix nil
fill-column 70
paragraph-start "[ ]*\\(//+\\|\\**\\)[ ]*$\\|^\f"
adaptive-fill-mode t
adaptive-fill-regexp "[ ]*\\(//+\\|\\**\\)[ ]*\\([
]*\\([-!|#%;>*·‣⁃◦]+[ ]*\\)*\\)"
)
Thank you very much for the quick reply! :-D
Regards,
Guanpeng Xu
* Re: C++ mode and c-beginning-of-current-token
2007-05-12 13:19 ` Alan Mackenzie
2007-05-12 14:30 ` Herbert Euler
@ 2007-05-12 14:33 ` Herbert Euler
2007-05-12 16:02 ` Herbert Euler
2 siblings, 0 replies; 41+ messages in thread
From: Herbert Euler @ 2007-05-12 14:33 UTC (permalink / raw)
To: acm; +Cc: emacs-devel
>Did you start your emacs with -Q? If not does the error still happen
>when you do? If you didn't use -Q, could you please dump your CC Mode
>configuration with C-c C-b and post it here.
And yes, the problem happens even with the -Q option.
Regards,
Guanpeng Xu
* RE: C++ mode and c-beginning-of-current-token
2007-05-12 10:39 C++ mode and c-beginning-of-current-token Herbert Euler
2007-05-12 13:19 ` Alan Mackenzie
@ 2007-05-12 15:30 ` Herbert Euler
2007-05-12 18:49 ` Alan Mackenzie
2007-05-13 10:01 ` Alan Mackenzie
2007-05-14 16:58 ` Stefan Monnier
3 siblings, 1 reply; 41+ messages in thread
From: Herbert Euler @ 2007-05-12 15:30 UTC (permalink / raw)
To: herberteuler, acm; +Cc: emacs-devel
I found some additional strange behaviors. First, the error in a
buffer is signaled only once. Once an error has been signaled,
typing again does not signal the same error again, but
font-lock is broken after that. Normally, characters are assigned
a face property as they are typed. For example, when I type
#include in a c++-mode buffer, the whole word is given
font-lock-preprocessor-face after the first letter, i, is typed (when
# is typed, no face is set). However, in a C++ buffer in which an
error has been signaled, font-lock does not work on the erroneous
region. The erroneous region starts at the beginning of the buffer
(where the original error was signaled) and ends at the end of a
statement (as defined by `c-beginning-of-statement' and
`c-end-of-statement'). For example, suppose the first lines of a
buffer are as below (-!- is the point position):
-!-#include <iostream>
#include <vector>
and no errors have been signaled. Now I type C-o, and the buffer
content becomes
-!-
#include <iostream>
#include <vector>
Now I type #, and an error is signaled (this differs slightly from
typing in an empty buffer, where the error is signaled when
typing i rather than #). If I then finish the first line as below:
#include <iostream>-!-
#include <iostream>
#include <vector>
The second #include <iostream> and the content following it are still
colored, while the first one is not. The text properties on the first
line are (c-in-sws t auto-composed t fontified t). If more content
is typed now, for example the following:
#include <iostream>
int
main ()
{
}
-!-
#include <iostream>
#include <vector>
none of it is colored; the text properties of the uncolored
content other than the first line are (c-is-sws t auto-composed t
fontified t), and those of the first line change to (c-is-sws t
c-in-sws t auto-composed t fontified t).
If I invoke `font-lock-fontify-region' on the uncolored region manually,
it is fontified correctly. Font-lock in other parts of the
buffer works, but does not behave as it usually does. There are other
strange behaviors, but they are too complicated for me to describe
here, so I will describe them only if you need them.
Hope the above information helps. Thanks.
Regards,
Guanpeng Xu
* Re: C++ mode and c-beginning-of-current-token
2007-05-12 13:19 ` Alan Mackenzie
2007-05-12 14:30 ` Herbert Euler
2007-05-12 14:33 ` Herbert Euler
@ 2007-05-12 16:02 ` Herbert Euler
2 siblings, 0 replies; 41+ messages in thread
From: Herbert Euler @ 2007-05-12 16:02 UTC (permalink / raw)
To: acm; +Cc: emacs-devel
>parse-sexp-lookup-properties is t in C++ Mode so that text properties
>can be set on pertinent <s and >s (in templates) to mark them as
>parentheses.
Setting `parse-sexp-lookup-properties' to t in a c-mode buffer reproduces
the error for c-mode, and setting `parse-sexp-lookup-properties' to nil in
c++-mode eliminates the error for c++-mode, although templates would
work.
Regards,
Guanpeng Xu
* Re: C++ mode and c-beginning-of-current-token
2007-05-12 15:30 ` Herbert Euler
@ 2007-05-12 18:49 ` Alan Mackenzie
2007-05-13 0:51 ` Herbert Euler
0 siblings, 1 reply; 41+ messages in thread
From: Alan Mackenzie @ 2007-05-12 18:49 UTC (permalink / raw)
To: Herbert Euler; +Cc: emacs-devel
Hi, Guanpeng!
On Sat, May 12, 2007 at 11:30:07PM +0800, Herbert Euler wrote:
> I found some additional strange behaviors. First, the error in a
> buffer is signaled only once. Once an error has been signaled,
> typing again does not signal the same error again, but
> font-lock is broken after that.
Font locking is done in an after-change hook. If a function in the
after-change-hook throws an error, Emacs deletes that function from the
hook, allowing Emacs to continue broken rather than hang up on continual
errors. I think that is what is happening to you here. You can check
this by examining after-change-functions before and after the error.
I think this explains all the anomalies you were seeing.
> Hope the above information helps. Thanks.
I'm going to have to think a bit about the main bug you reported.
Hopefully, I'll get back to you tomorrow about it. Sleep well!
> Regards,
> Guanpeng Xu
--
Alan Mackenzie (Ittersbach, Germany).
* Re: C++ mode and c-beginning-of-current-token
2007-05-12 18:49 ` Alan Mackenzie
@ 2007-05-13 0:51 ` Herbert Euler
0 siblings, 0 replies; 41+ messages in thread
From: Herbert Euler @ 2007-05-13 0:51 UTC (permalink / raw)
To: acm; +Cc: emacs-devel
> > I found some additional strange behaviors. First, the error in a
> > buffer is signaled only once. Once an error has been signaled,
> > typing again does not signal the same error again, but
> > font-lock is broken after that.
>
>Font locking is done in an after-change hook. If a function in the
>after-change-hook throws an error, Emacs deletes that function from the
>hook, allowing Emacs to continue broken rather than hang up on continual
>errors. I think that is what is happening to you here. You can check
>this by examining after-change-functions before and after the error.
>
>I think this explains all the anomalies you were seeing.
Yes, the reason the error is not signaled again is that the after-change
hook has changed. I do not know whether this is correct, though: all of
the functions in the hook are removed. The default value of the hook is
(c-after-change jit-lock-after-change auto-composition-after-change t),
and I tried another value, (jit-lock-after-change
auto-composition-after-change t c-after-change). In both cases,
`after-change-functions' is nil after the error has been signaled.
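The observation that the whole hook ends up empty, rather than losing
only the failing function, can be modelled with a small C sketch.
Everything in it (`hook', `run_hook', `failing_fn', and the use of
setjmp/longjmp in place of Lisp error handling) is invented for
illustration; it mirrors the behavior described here and is not the
Emacs implementation.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

#define MAX_HOOK 8

typedef void (*hook_fn) (void);

jmp_buf error_target;           /* stands in for the Lisp error handler */
hook_fn hook[MAX_HOOK];         /* stands in for after-change-functions */
size_t hook_len;                /* number of functions currently on it */

void ok_fn (void) { }
void failing_fn (void) { longjmp (error_target, 1); }

/* Run every function on the hook.  Return 0 if all of them returned
   normally; if one of them signals an error, empty the whole hook
   and return 1.  */
int
run_hook (void)
{
  volatile size_t i;

  assert (hook_len <= MAX_HOOK);
  if (setjmp (error_target) != 0)
    {
      hook_len = 0;
      return 1;
    }
  for (i = 0; i < hook_len; i++)
    hook[i] ();
  return 0;
}
```

With three functions on the hook and the failing one in the middle,
the first call to run_hook reports the error and leaves the hook
empty, so a second call has nothing left to run and reports success.
That matches the "error only once" symptom.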
Regards,
Guanpeng Xu
* Re: C++ mode and c-beginning-of-current-token
2007-05-12 10:39 C++ mode and c-beginning-of-current-token Herbert Euler
2007-05-12 13:19 ` Alan Mackenzie
2007-05-12 15:30 ` Herbert Euler
@ 2007-05-13 10:01 ` Alan Mackenzie
2007-05-14 2:00 ` Herbert Euler
2007-05-14 16:58 ` Stefan Monnier
3 siblings, 1 reply; 41+ messages in thread
From: Alan Mackenzie @ 2007-05-13 10:01 UTC (permalink / raw)
To: Stefan Monnier, Herbert Euler; +Cc: emacs-devel
Hi, Stefan and Guanpeng!
On Sat, May 12, 2007 at 06:39:12PM +0800, Herbert Euler wrote:
> In the newest unicode 2 branch, `parse-sexp-lookup-properties'
> defaults to `t' in C++ mode but not in C mode. I don't have the
> Emacs 22 trunk, so I don't know whether the same holds there. But
> when this variable is t, there is a bug in c++-mode:
0. Start emacs with -Q
> 1. Visit an empty, new C++ file.
> 2. Try to insert the following line, at the beginning of the buffer:
> #include <iostream>
> An error is signaled when typing the second character, "i":
> "Point before start of properties". The error happens in the
> function `c-beginning-of-current-token', when it invokes
> `skip-syntax-backward':
"Point before start of properties" is thrown by the C function
update_interval in ..../emacs/src/interval.c. update_interval is called
only from within syntax.c (I think).
Stefan: this is in your bailiwick. Could you have a look at it, please.
> (defun c-beginning-of-current-token (&optional back-limit)
>   ;; Move to the beginning of the current token.  Do not move if not
>   ;; in the middle of one.  BACK-LIMIT may be used to bound the
>   ;; backward search; if given it's assumed to be at the boundary
>   ;; between two tokens.  Return non-nil if the point is moved, nil
>   ;; otherwise.
>   ;;
>   ;; This function might do hidden buffer changes.
>   (let ((start (point)))
>     (if (looking-at "\\w\\|\\s_")
>         (skip-syntax-backward "w_" back-limit)
>       (when (< (skip-syntax-backward ".()" back-limit) 0)
>         ;; ... ...
> If `parse-sexp-lookup-properties' is t, `skip-syntax-backward' does
> not return -1 in some cases, but signals an error instead.
> Regards,
> Guanpeng Xu
--
Alan.
* Re: C++ mode and c-beginning-of-current-token
2007-05-13 10:01 ` Alan Mackenzie
@ 2007-05-14 2:00 ` Herbert Euler
2007-05-14 8:50 ` Alan Mackenzie
0 siblings, 1 reply; 41+ messages in thread
From: Herbert Euler @ 2007-05-14 2:00 UTC (permalink / raw)
To: acm, monnier; +Cc: emacs-devel
I looked at an older version of CC mode. There, the `c-after-change'
function does not call `c-trim-found-types', and the `c-before-change'
function does not exist. The error does not happen in that version. Both
of these changes result in `c-beginning-of-current-token' being invoked,
so I suspect these changes are the direct source of the error.
Regards,
Guanpeng Xu
* Re: C++ mode and c-beginning-of-current-token
2007-05-14 2:00 ` Herbert Euler
@ 2007-05-14 8:50 ` Alan Mackenzie
2007-05-14 9:24 ` Herbert Euler
0 siblings, 1 reply; 41+ messages in thread
From: Alan Mackenzie @ 2007-05-14 8:50 UTC (permalink / raw)
To: Herbert Euler; +Cc: monnier, emacs-devel
Hi, Guanpeng!
On Mon, May 14, 2007 at 10:00:59AM +0800, Herbert Euler wrote:
> I looked at an older version of CC mode, the `c-after-change' function
> does not call `c-trim-found-types', and the `c-before-change' function
> does not exist. The error does not happen in that version. Both these
> changes lead to the invocation of `c-beginning-of-current-token' as a
> result. So I suspect these changes are the direct source of the error.
As I said earlier, I think it more likely these changes in CC Mode have
triggered a bug rather than being buggy themselves. (OK, anybody, feel
free to demolish my hubris ;-) I haven't looked into it much, but it
seems likely that the syntax.c stuff for a Unicode branch will have been
enhanced to cope with the weirdnesses of that character set.
Could you please run edebug[*] on c-beginning-of-current-token, and check
that the arguments which are passed to skip-syntax-backward are valid.
If they're not, it's CC Mode's problem. If they are, let's wait and see
what Stefan says.
[*] If you're not familiar with Edebug, it's well described in the Elisp
manual. Send me a private email if you want help to use it.
> Regards,
> Guanpeng Xu
--
Alan Mackenzie (Ittersbach, Germany).
* Re: C++ mode and c-beginning-of-current-token
2007-05-14 8:50 ` Alan Mackenzie
@ 2007-05-14 9:24 ` Herbert Euler
0 siblings, 0 replies; 41+ messages in thread
From: Herbert Euler @ 2007-05-14 9:24 UTC (permalink / raw)
To: acm; +Cc: monnier, emacs-devel
> > I looked at an older version of CC mode, the `c-after-change' function
> > does not call `c-trim-found-types', and the `c-before-change' function
> > does not exist. The error does not happen in that version. Both these
> > changes lead to the invocation of `c-beginning-of-current-token' as a
> > result. So I suspect these changes are the direct source of the error.
>
>As I said earlier, I think it more likely these changes in CC Mode have
>triggered a bug rather than being buggy themselves. (OK, anybody, feel
>free to demolish my hubris ;-) I haven't looked into it much, but it
>seems likely that the syntax.c stuff for a Unicode branch will have been
>enhanced to cope with the weirdnesses of that character set.
>
>Could you please run edebug[*] on c-beginning-of-current-token, and check
>that the arguments which are passed to skip-syntax-backward are valid.
>If they're not, it's CC Mode's problem. If they are, let's wait and see
>what Stefan says.
The argument is certainly valid, but there are too many differences in
syntax.c between the main branch and the unicode 2 branch. As you said,
perhaps the source of the unicode 2 branch is too old. Let's wait for
Stefan's opinion.
Thank you very much.
Regards,
Guanpeng Xu
Btw, I thought that, except for the Unicode features, the unicode 2 branch
was the same as the main branch. So I was wrong.
* Re: C++ mode and c-beginning-of-current-token
2007-05-12 10:39 C++ mode and c-beginning-of-current-token Herbert Euler
` (2 preceding siblings ...)
2007-05-13 10:01 ` Alan Mackenzie
@ 2007-05-14 16:58 ` Stefan Monnier
2007-05-15 3:45 ` Herbert Euler
2007-05-15 13:30 ` Herbert Euler
3 siblings, 2 replies; 41+ messages in thread
From: Stefan Monnier @ 2007-05-14 16:58 UTC (permalink / raw)
To: Herbert Euler; +Cc: acm, emacs-devel
> If `parse-sexp-lookup-properties' is t, `skip-syntax-backward' does
> not return -1 in some cases, but signals an error instead.
Sounds like a bug in skip-syntax-backward.
Stefan
* Re: C++ mode and c-beginning-of-current-token
2007-05-14 16:58 ` Stefan Monnier
@ 2007-05-15 3:45 ` Herbert Euler
2007-05-15 6:39 ` martin rudalics
2007-05-16 16:15 ` Stefan Monnier
2007-05-15 13:30 ` Herbert Euler
1 sibling, 2 replies; 41+ messages in thread
From: Herbert Euler @ 2007-05-15 3:45 UTC (permalink / raw)
To: monnier; +Cc: acm, emacs-devel
> > If `parse-sexp-lookup-properties' is t, `skip-syntax-backward' does
> > not return -1 in some cases, but signals an error instead.
>
>Sounds like a bug in skip-syntax-backward.
Is it OK for `skip-syntax-backward' to signal an error if point is at the
beginning of the buffer when it is called?
Regards,
Guanpeng Xu
* Re: C++ mode and c-beginning-of-current-token
2007-05-15 3:45 ` Herbert Euler
@ 2007-05-15 6:39 ` martin rudalics
2007-05-16 16:15 ` Stefan Monnier
1 sibling, 0 replies; 41+ messages in thread
From: martin rudalics @ 2007-05-15 6:39 UTC (permalink / raw)
To: Herbert Euler; +Cc: acm, monnier, emacs-devel
> Is it OK for `skip-syntax-backward' to signal an error if point is at the
> beginning of the buffer when it is called?
No. (Unless you pass a wrong argument to it.)
Could you run skip_chars under gdb to see what goes on?
* Re: C++ mode and c-beginning-of-current-token
2007-05-14 16:58 ` Stefan Monnier
2007-05-15 3:45 ` Herbert Euler
@ 2007-05-15 13:30 ` Herbert Euler
2007-05-16 8:01 ` Herbert Euler
1 sibling, 1 reply; 41+ messages in thread
From: Herbert Euler @ 2007-05-15 13:30 UTC (permalink / raw)
To: monnier; +Cc: acm, emacs-devel
> > If `parse-sexp-lookup-properties' is t, `skip-syntax-backward' does
> > not return -1 in some cases, but signals an error instead.
>
>Sounds like a bug in skip-syntax-backward.
In the previous message, I described a way to reproduce the error:
0. Start Emacs with emacs -Q.
1. Visit a new, empty C++ file.
2. Insert the following characters:
#include <iostream>
The error happens when inserting the second character, i. Now let's
remember the following three facts:
a) `parse-sexp-lookup-properties' is t in a c++-mode buffer,
b) `after-change-functions' contains `c-after-change', which leads to
   `skip-syntax-backward' being invoked as
   (skip-syntax-backward ".()" nil),
c) the c++-mode buffer is not multibyte.
Given fact a), the code in syntax.c updates interval information; this
includes the function `skip_syntaxes', which is invoked by
`Fskip_syntax_backward'. Looking at how `skip_syntaxes' works shows
how the error happens.
The code that skips characters and updates interval information uses a
variable `pos', and below is how it updates interval information:
  while (1)
    {
      if (p <= stop)
        {
          if (p <= endp)
            break;
          p = GPT_ADDR;
          stop = endp;
        }
      if (! fastmap[(int) SYNTAX (p[-1])])
        break;
      p--, pos--, pos_byte--;
      UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
    }
As you see, it updates interval information at position (pos - 1). At
an earlier stage in this function, `pos' is initialized to the buffer
position:
int start_point = PT;
int pos = PT;
int pos_byte = PT_BYTE;
unsigned char *p = PT_ADDR, *endp, *stop;
Since we're inserting the second character, `PT' is now 2, and so the
initial value of `pos' is 2. Later, before the interval information is
updated, `pos' is decremented to 1, so (pos - 1) = (1 - 1) = 0.
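To make the arithmetic concrete, here is a minimal sketch in C. It is
a toy model of the loop above, not the code in syntax.c: the function
name `lowest_lookup_position', its `skippable' parameter and the use
of `BEGV' as a plain constant are made up for illustration, and the
condition that the real macro tests before doing the lookup is
ignored.

```c
#include <assert.h>

enum { BEGV = 1 };              /* first valid buffer position */

/* PT is the starting position; SKIPPABLE is how many characters
   immediately before PT belong to the syntax classes being skipped.
   Return the lowest position handed to the property lookup, or PT
   if the loop body never runs.  */
int
lowest_lookup_position (int pt, int skippable)
{
  int pos = pt;
  int lowest = pt;

  assert (pt >= BEGV && skippable >= 0);
  while (pos > BEGV && skippable > 0)
    {
      pos--, skippable--;
      /* Stands in for UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1).  */
      if (pos - 1 < lowest)
        lowest = pos - 1;
    }
  return lowest;
}
```

Calling lowest_lookup_position (2, 1) models the case in this report
(point at 2, with the single character "#" before it) and yields 0,
which is below the first valid position.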
Here is how `UPDATE_SYNTAX_TABLE_BACKWARD' is defined:
  #define UPDATE_SYNTAX_TABLE_BACKWARD(charpos)                   \
    (parse_sexp_lookup_properties                                 \
     && (charpos) < gl_state.b_property                           \
     ? (update_syntax_table ((charpos) + gl_state.offset, -1, 0,  \
                             gl_state.object),                    \
        1)                                                        \
     : 0)
It invokes `update_syntax_table' with `charpos' set to (charpos +
gl_state.offset). `charpos' is 0 there, and gl_state.offset is 0. So
`update_syntax_table' is invoked with `charpos' set to 0. Finally,
`update_interval' is invoked in `update_syntax_table':
  void
  update_syntax_table (charpos, count, init, object)
       int charpos, count, init;
       Lisp_Object object;
  {
    /* ... ... */
    i = update_interval (i, charpos);
So `update_interval' is invoked with `pos' set to 0:
  INTERVAL
  update_interval (i, pos)
       register INTERVAL i;
       int pos;
  {
    if (NULL_INTERVAL_P (i))
      return NULL_INTERVAL;
    while (1)
      {
        if (pos < i->position)
          {
            /* Move left. */
            if (pos >= i->position - TOTAL_LENGTH (i->left))
              {
                i->left->position = i->position - TOTAL_LENGTH (i->left)
                  + LEFT_TOTAL_LENGTH (i->left);
                i = i->left; /* Move to the left child */
              }
            else if (NULL_PARENT (i))
              error ("Point before start of properties");
            else
              i = INTERVAL_PARENT (i);
            continue;
          }
`pos' is 0, and `*i' is
(gdb) p *i
$14 = {
total_length = 1,
position = 1,
left = 0x0,
right = 0x0,
up = {
interval = 0x86a1204,
obj = 141169156
},
up_obj = 1,
gcmarkbit = 0,
write_protect = 0,
visible = 0,
front_sticky = 0,
rear_sticky = 0,
plist = 156480797
}
(gdb)
That's why the error is signaled, I think.
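A simplified C model of the "move left" branch shows why a lookup at
position 0 fails for an interval like the one in the gdb output.
`struct interval' and `can_reach' below are hypothetical stand-ins:
the real INTERVAL type, its macros and the rest of `update_interval'
are left out, and a missing parent is represented by a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

struct interval
{
  int total_length;             /* length of this node plus its subtrees */
  int position;                 /* buffer position where the node starts */
  struct interval *left, *parent;
};

/* Return 0 where the quoted code would signal "Point before start of
   properties", and 1 once an interval starting at or before POS has
   been reached.  */
int
can_reach (struct interval *i, int pos)
{
  assert (i != NULL);
  while (pos < i->position)
    {
      int left_len = i->left ? i->left->total_length : 0;

      if (pos >= i->position - left_len)
        {
          /* POS lies inside the left subtree: descend into it.  */
          struct interval *l = i->left;
          int l_left_len = l->left ? l->left->total_length : 0;

          l->position = i->position - l->total_length + l_left_len;
          i = l;
        }
      else if (i->parent == NULL)
        return 0;               /* nothing further to the left */
      else
        i = i->parent;
    }
  return 1;
}
```

For a single node with total_length 1 and position 1, as in the dump
above, can_reach returns 0 for pos 0 and 1 for pos 1: there is no left
child and no parent, so only the error branch remains for position 0.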
Regards,
Guanpeng Xu
* Re: C++ mode and c-beginning-of-current-token
2007-05-15 13:30 ` Herbert Euler
@ 2007-05-16 8:01 ` Herbert Euler
2007-05-16 8:05 ` Herbert Euler
2007-05-16 9:00 ` martin rudalics
0 siblings, 2 replies; 41+ messages in thread
From: Herbert Euler @ 2007-05-16 8:01 UTC (permalink / raw)
To: herberteuler, monnier; +Cc: acm, emacs-devel
>> > If `parse-sexp-lookup-properties' is t, `skip-syntax-backward' does
>> > not return -1 in some cases, but signals an error instead.
>>
>>Sounds like a bug in skip-syntax-backward.
>
>In the previous message, I described a way to reproduce the error:
>[ .... ]
>That's why the error is signaled, I think.
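To make the failure concrete, here is a minimal sketch of the "move left" branch quoted above, run against an interval like the one in the gdb dump (position 1, no left child, no parent interval). The struct and the name `find_leftward' are hypothetical simplifications, not the real intervals.c definitions, and child positions are assumed to be precomputed rather than updated during the walk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for an Emacs interval node. */
struct interval
{
  int position;             /* buffer position where the interval starts */
  int total_length;         /* characters covered by this subtree */
  struct interval *left;    /* subtree covering earlier positions */
  int left_total;           /* total length of the left subtree */
  struct interval *parent;  /* NULL when the node is the root */
};

/* Mirrors only the "move left" branch of update_interval.  Returns the
   interval reached, or NULL where Emacs would signal
   "Point before start of properties".  */
static struct interval *
find_leftward (struct interval *i, int pos)
{
  assert (i != NULL);
  while (pos < i->position)
    {
      if (i->left != NULL && pos >= i->position - i->left_total)
        i = i->left;            /* POS lies inside the left subtree */
      else if (i->parent == NULL)
        return NULL;            /* nothing earlier exists: the error case */
      else
        i = i->parent;
    }
  return i;
}
```

With a single root interval starting at position 1, a lookup at position 0 has nowhere to go and ends in the error branch, while a lookup at position 1 succeeds.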
OK, the function `skip_syntaxes' does not exist in the main branch;
syntax.c in the unicode 2 branch is too old.
Regards,
Guanpeng Xu
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-16 8:01 ` Herbert Euler
@ 2007-05-16 8:05 ` Herbert Euler
2007-05-17 2:12 ` Kenichi Handa
2007-05-16 9:00 ` martin rudalics
1 sibling, 1 reply; 41+ messages in thread
From: Herbert Euler @ 2007-05-16 8:05 UTC (permalink / raw)
To: herberteuler, monnier; +Cc: acm, emacs-devel
>Ok, the function `skip_syntaxes' does not exist in the main branch,
>syntax.c in the unicode 2 branch is too old.
Sorry, `skip_syntaxes' was added by Kenichi Handa, and has never existed
in the main branch.
Regards,
Guanpeng Xu
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-16 8:01 ` Herbert Euler
2007-05-16 8:05 ` Herbert Euler
@ 2007-05-16 9:00 ` martin rudalics
2007-05-16 11:12 ` Herbert Euler
1 sibling, 1 reply; 41+ messages in thread
From: martin rudalics @ 2007-05-16 9:00 UTC (permalink / raw)
To: Herbert Euler; +Cc: acm, monnier, emacs-devel
>> p--, pos--, pos_byte--;
>> UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
Possibly a silly suggestion: What happens if you use
UPDATE_SYNTAX_TABLE_BACKWARD (pos);
here?
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-16 9:00 ` martin rudalics
@ 2007-05-16 11:12 ` Herbert Euler
2007-05-16 12:21 ` martin rudalics
0 siblings, 1 reply; 41+ messages in thread
From: Herbert Euler @ 2007-05-16 11:12 UTC (permalink / raw)
To: rudalics; +Cc: acm, monnier, emacs-devel
>>> p--, pos--, pos_byte--;
>>> UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
>
>Possibly a silly suggestion: What happens if you use
>
> UPDATE_SYNTAX_TABLE_BACKWARD (pos);
>
>here?
I'm not very sure about the meaning of (pos - 1), so instead of trying
your proposal, I tried the following one:
while (1)
{
if (p <= stop)
{
if (p <= endp)
break;
p = GPT_ADDR;
stop = endp;
}
if (! fastmap[(int) SYNTAX (p[-1])])
break;
p--, pos--, pos_byte--;
if (pos <= 1)
break;
UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
}
And this change seems to fix the problem.
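A rough model of why the guard helps, for anyone who wants to play with it outside Emacs. This is only a sketch: `skip_punct_backward', `update_backward' and `lowest_lookup' are made-up names, the property lookup is a stub that merely records the position it was asked about, and `ispunct' stands in for the fastmap test.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

#define BEGV 1   /* first valid buffer position, as in Emacs */

/* Smallest position handed to the stubbed property lookup.  */
static int lowest_lookup;

/* Stub for UPDATE_SYNTAX_TABLE_BACKWARD: only records CHARPOS.  */
static void
update_backward (int charpos)
{
  if (charpos < lowest_lookup)
    lowest_lookup = charpos;
}

/* Skip backward over punctuation starting at POS.  Positions are
   1-based, so the character at position K is TEXT[K - 1].  A nonzero
   GUARD selects the variant with the extra `pos <= 1' test.
   Returns the final position.  */
static int
skip_punct_backward (const char *text, int pos, int guard)
{
  assert (pos >= BEGV);
  lowest_lookup = INT_MAX;
  while (pos > BEGV)
    {
      if (!ispunct ((unsigned char) text[pos - 2]))
        break;
      pos--;
      if (guard && pos <= BEGV)
        break;                  /* nothing before BEGV to look up */
      update_backward (pos - 1);
    }
  return pos;
}
```

For the buffer "#" with point at 2, the unguarded loop asks for position 0, which is below BEGV; the guarded loop stops at position 1 without any lookup.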
Regards,
Guanpeng Xu
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-16 11:12 ` Herbert Euler
@ 2007-05-16 12:21 ` martin rudalics
0 siblings, 0 replies; 41+ messages in thread
From: martin rudalics @ 2007-05-16 12:21 UTC (permalink / raw)
To: Herbert Euler; +Cc: acm, monnier, emacs-devel
> I'm not very sure about the meaning of (pos - 1), so instead of trying
> your proposal, I tried the following one:
To test you could try the following: Assign a syntax-table text-property
which differs from the standard syntax-table for the buffer to a few
characters and check whether `skip-syntax-backward' updates the
properties correctly when you are after the last character. This should
guarantee that
> p--, pos--, pos_byte--;
> if (pos <= 1)
> break;
> UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
indeed updates the property for each and every character - in particular
the very first ("rightmost") one - skipped. Better do this in an elisp
buffer to avoid that the major mode interferes with your settings.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-15 3:45 ` Herbert Euler
2007-05-15 6:39 ` martin rudalics
@ 2007-05-16 16:15 ` Stefan Monnier
1 sibling, 0 replies; 41+ messages in thread
From: Stefan Monnier @ 2007-05-16 16:15 UTC (permalink / raw)
To: Herbert Euler; +Cc: acm, emacs-devel
>> > If `parse-sexp-lookup-properties' is t, `skip-syntax-backward' would
>> > not return -1, but signaling an error in some cases.
>>
>> Sounds like a bug in skip-syntax-backward.
> Is it Ok of signaling error if point is at beginning of buffer when calling
> `skip-syntax-backward'?
No, it's not.
It might conceptually be OK to change it so as to signal
a `beginning-of-buffer' error (although it'd most likely introduce
compatibility bugs), but it has no reason to ever signal "Point before
start of properties".
Stefan
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-16 8:05 ` Herbert Euler
@ 2007-05-17 2:12 ` Kenichi Handa
2007-05-17 10:18 ` martin rudalics
0 siblings, 1 reply; 41+ messages in thread
From: Kenichi Handa @ 2007-05-17 2:12 UTC (permalink / raw)
To: Herbert Euler; +Cc: acm, herberteuler, monnier, emacs-devel
In article <BAY143-F2699DD79B76EFF38E365F0DA3C0@phx.gbl>, "Herbert Euler" <herberteuler@hotmail.com> writes:
> >Ok, the function `skip_syntaxes' does not exist in the main branch,
> >syntax.c in the unicode 2 branch is too old.
> Sorry, `skip_syntaxes' is added by Kenichi Handa, and never exists
> in the main branch.
I divided Emacs 22's skip_chars into skip_chars (only for
skipping chars) and skip_syntaxes (only for skipping
syntaxes) without changing (and understanding) the logic.
And I did an optimization in them to avoid multibyte
character checking if the region contains only ASCII chars.
The current problem in emacs-unicode-2 happens in code
that treats the buffer as unibyte because of the above
optimization, and the same problem also happens in Emacs 22
with a unibyte buffer.
Please try the same thing with Emacs 22 started with
"--unibyte" argument. The same error is signalled.
So, the bug has been in the original code of skip_chars
(where it handles a unibyte case) in Emacs 22. Could
someone please fix it in Emacs 22? Then, I'll do the same
fix on emacs-unicode-2.
---
Kenichi Handa
handa@m17n.org
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-17 2:12 ` Kenichi Handa
@ 2007-05-17 10:18 ` martin rudalics
2007-05-17 12:52 ` Herbert Euler
0 siblings, 1 reply; 41+ messages in thread
From: martin rudalics @ 2007-05-17 10:18 UTC (permalink / raw)
To: Kenichi Handa; +Cc: acm, Herbert Euler, monnier, emacs-devel
[-- Attachment #1: Type: text/plain, Size: 370 bytes --]
> Please try the same thing with Emacs 22 started with
> "--unibyte" argument. The same error is signalled.
>
> So, the bug has been in the original code of skip_chars
> (where it handles a unibyte case) in Emacs 22. Could
> someone please fix it in Emacs 22? Then, I'll do the same
> fix on emacs-unicode-2.
FWIW, I couldn't find problems with the attached patch.
[-- Attachment #2: syntax.patch --]
[-- Type: text/plain, Size: 416 bytes --]
*** syntax.c Wed Jan 17 09:31:10 2007
--- syntax.c Thu May 17 11:15:42 2007
***************
*** 1672,1678 ****
if (! fastmap[(int) SYNTAX (p[-1])])
break;
p--, pos--;
! UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
}
}
}
--- 1672,1678 ----
if (! fastmap[(int) SYNTAX (p[-1])])
break;
p--, pos--;
! UPDATE_SYNTAX_TABLE_BACKWARD (pos);
}
}
}
[-- Attachment #3: Type: text/plain, Size: 142 bytes --]
_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-17 10:18 ` martin rudalics
@ 2007-05-17 12:52 ` Herbert Euler
2007-05-17 13:51 ` martin rudalics
2007-05-17 14:32 ` Stefan Monnier
0 siblings, 2 replies; 41+ messages in thread
From: Herbert Euler @ 2007-05-17 12:52 UTC (permalink / raw)
To: rudalics, handa; +Cc: acm, monnier, emacs-devel
>>Please try the same thing with Emacs 22 started with
>>"--unibyte" argument. The same error is signalled.
>>
>>So, the bug has been in the original code of skip_chars
>>(where it handles a unibyte case) in Emacs 22. Could
>>someone please fix it in Emacs 22? Then, I'll do the same
>>fix on emacs-unicode-2.
>
>FWIW, I couldn't find problems with the attached patch.
>*** syntax.c Wed Jan 17 09:31:10 2007
>--- syntax.c Thu May 17 11:15:42 2007
>***************
>*** 1672,1678 ****
> if (! fastmap[(int) SYNTAX (p[-1])])
> break;
> p--, pos--;
>! UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
> }
> }
> }
>--- 1672,1678 ----
> if (! fastmap[(int) SYNTAX (p[-1])])
> break;
> p--, pos--;
>! UPDATE_SYNTAX_TABLE_BACKWARD (pos);
> }
> }
> }
Can I ask a silly question, too? There are two uses of
UPDATE_SYNTAX_TABLE_BACKWARD in `skip_chars', one with the argument
(pos), the other with (pos - 1). To me, the second one does not look
like a typo, but like code written that way on purpose. Yes, changing it
from (pos - 1) to (pos) works, but it seems dangerous to me.
UPDATE_SYNTAX_TABLE_BACKWARD also updates `gl_state' by setting its
members `forward_i' and `backward_i'. Perhaps (pos - 1) is there to
keep these two members at the right positions. Different arguments
passed to UPDATE_SYNTAX_TABLE_BACKWARD set the two intervals
differently. There are still some details I have not worked out, so I
am still confused.
Regards,
Guanpeng Xu
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-17 12:52 ` Herbert Euler
@ 2007-05-17 13:51 ` martin rudalics
2007-05-17 21:40 ` martin rudalics
2007-05-17 14:32 ` Stefan Monnier
1 sibling, 1 reply; 41+ messages in thread
From: martin rudalics @ 2007-05-17 13:51 UTC (permalink / raw)
To: Herbert Euler; +Cc: acm, emacs-devel, monnier, handa
> Can I ask a silly question, too? There are two "calling"s of
> UPDATE_SYNTAX_TABLE_BACKWARD in `skip_chars', one with the "argument"
> (pos), the other with (pos - 1). To me, the second one does not look
> like a typo, but a correct piece of code on purpose. Yes, changing it
> from (pos - 1) to (pos) works, but it seems to be dangerous to me.
I do have my own problems with UPDATE_SYNTAX_TABLE_BACKWARD. To
reproduce with Emacs -Q define foo as
(defun foo ()
(interactive)
(put-text-property (1- (point)) (point) 'syntax-table '(2))
(setq parse-sexp-lookup-properties t))
open a text-mode buffer, insert a couple of non-word chars in the
buffer, leave point after them, and type M-x foo followed by M-b. On my
system it goes back by _two_ characters instead of one. I'm yet too
silly to understand what's going on.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-17 12:52 ` Herbert Euler
2007-05-17 13:51 ` martin rudalics
@ 2007-05-17 14:32 ` Stefan Monnier
2007-05-17 14:45 ` martin rudalics
2007-05-18 13:00 ` Richard Stallman
1 sibling, 2 replies; 41+ messages in thread
From: Stefan Monnier @ 2007-05-17 14:32 UTC (permalink / raw)
To: Herbert Euler; +Cc: rudalics, acm, emacs-devel, handa
> Can I ask a silly question, too? There are two "calling"s of
> UPDATE_SYNTAX_TABLE_BACKWARD in `skip_chars', one with the "argument"
> (pos), the other with (pos - 1). To me, the second one does not look
> like a typo, but a correct piece of code on purpose. Yes, changing it
> from (pos - 1) to (pos) works, but it seems to be dangerous to me.
The important part is to make sure that the syntax-table is up-to-date for
the char being read, when the char is passed to `SYNTAX'.
Also, it is important to update the syntax-table only after making sure that
the new position is valid. So I believe the patch below is what we want.
Stefan
--- orig/src/syntax.c
+++ mod/src/syntax.c
@@ -1691,10 +1691,10 @@
p = GPT_ADDR;
stop = endp;
}
+ UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
if (! fastmap[(int) SYNTAX (p[-1])])
break;
p--, pos--;
- UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
}
}
}
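The effect of this ordering can be checked with a small stand-alone model. This is a sketch under simplifying assumptions, not the syntax.c code: `update_backward', `punct_at' and the counters are hypothetical names, the lookup only remembers which position it was asked about, and `ispunct' replaces the fastmap test.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

#define BEGV 1   /* first valid buffer position */

static int loaded_pos;     /* position whose properties are loaded */
static int lowest_lookup;  /* smallest position ever looked up */
static int stale_reads;    /* reads made with another position loaded */

/* Stub for UPDATE_SYNTAX_TABLE_BACKWARD.  */
static void
update_backward (int charpos)
{
  loaded_pos = charpos;
  if (charpos < lowest_lookup)
    lowest_lookup = charpos;
}

/* Stand-in for fastmap[SYNTAX (p[-1])]; CHARPOS is the position of the
   character being read (TEXT[CHARPOS - 1]).  */
static int
punct_at (const char *text, int charpos)
{
  if (loaded_pos != charpos)
    stale_reads++;
  return ispunct ((unsigned char) text[charpos - 1]);
}

/* Backward skip with the update moved in front of the read.  */
static int
skip_punct_backward_reordered (const char *text, int pos)
{
  assert (pos >= BEGV);
  lowest_lookup = INT_MAX;
  stale_reads = 0;
  while (pos > BEGV)
    {
      update_backward (pos - 1);   /* position of the char read next */
      if (!punct_at (text, pos - 1))
        break;
      pos--;
    }
  return pos;
}
```

In this model every read happens with the properties of the character's own position loaded, and no lookup goes below BEGV, because the update only runs once the loop condition has established that a character exists at pos - 1.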
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-17 14:32 ` Stefan Monnier
@ 2007-05-17 14:45 ` martin rudalics
2007-05-18 13:00 ` Richard Stallman
1 sibling, 0 replies; 41+ messages in thread
From: martin rudalics @ 2007-05-17 14:45 UTC (permalink / raw)
To: Stefan Monnier; +Cc: acm, Herbert Euler, handa, emacs-devel
[-- Attachment #1: Type: text/plain, Size: 642 bytes --]
> Also, it is important to update the syntax-table only after making sure that
> the new position is valid. So I believe the patch below is what we want.
>
>
> Stefan
>
>
> --- orig/src/syntax.c
> +++ mod/src/syntax.c
> @@ -1691,10 +1691,10 @@
> p = GPT_ADDR;
> stop = endp;
> }
> + UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
> if (! fastmap[(int) SYNTAX (p[-1])])
> break;
> p--, pos--;
> - UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
> }
> }
> }
Indeed, though I'd prefer the attached one.
BTW, did you look at the backward-word problem I mentioned in my previous mail?
[-- Attachment #2: syntax.patch --]
[-- Type: text/plain, Size: 571 bytes --]
*** syntax.c Wed Jan 17 09:31:10 2007
--- syntax.c Thu May 17 15:23:24 2007
***************
*** 1669,1678 ****
p = GPT_ADDR;
stop = endp;
}
- if (! fastmap[(int) SYNTAX (p[-1])])
- break;
p--, pos--;
! UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
}
}
}
--- 1669,1681 ----
p = GPT_ADDR;
stop = endp;
}
p--, pos--;
! UPDATE_SYNTAX_TABLE_BACKWARD (pos);
! if (! fastmap[(int) SYNTAX (*p)])
! {
! p++, pos++;
! break;
! }
}
}
}
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-17 13:51 ` martin rudalics
@ 2007-05-17 21:40 ` martin rudalics
0 siblings, 0 replies; 41+ messages in thread
From: martin rudalics @ 2007-05-17 21:40 UTC (permalink / raw)
To: martin rudalics; +Cc: acm, Herbert Euler, handa, monnier, emacs-devel
[-- Attachment #1: Type: text/plain, Size: 620 bytes --]
> I do have my own problems with UPDATE_SYNTAX_TABLE_BACKWARD. To
> reproduce with Emacs -Q define foo as
>
> (defun foo ()
> (interactive)
> (put-text-property (1- (point)) (point) 'syntax-table '(2))
> (setq parse-sexp-lookup-properties t))
>
> open a text-mode buffer, insert a couple of non-word chars in the
> buffer, leave point after them, and type M-x foo followed by M-b. On my
> system it goes back by _two_ characters instead of one. I'm yet too
> silly to understand what's going on.
The attached patch seems to fix both the syntax and word backward scanning
problems. Could someone please try?
[-- Attachment #2: syntax.patch --]
[-- Type: text/plain, Size: 1575 bytes --]
*** syntax.c Wed Jan 17 09:31:10 2007
--- syntax.c Thu May 17 23:30:50 2007
***************
*** 1276,1294 ****
position of it. */
while (1)
{
- int temp_byte;
-
if (from == beg)
break;
! temp_byte = dec_bytepos (from_byte);
UPDATE_SYNTAX_TABLE_BACKWARD (from);
! ch0 = FETCH_CHAR (temp_byte);
code = SYNTAX (ch0);
if (!(words_include_escapes
&& (code == Sescape || code == Scharquote)))
if (code != Sword || WORD_BOUNDARY_P (ch0, ch1))
! break;
! DEC_BOTH (from, from_byte);
ch1 = ch0;
}
count++;
--- 1276,1294 ----
position of it. */
while (1)
{
if (from == beg)
break;
! DEC_BOTH (from, from_byte);
UPDATE_SYNTAX_TABLE_BACKWARD (from);
! ch0 = FETCH_CHAR (from_byte);
code = SYNTAX (ch0);
if (!(words_include_escapes
&& (code == Sescape || code == Scharquote)))
if (code != Sword || WORD_BOUNDARY_P (ch0, ch1))
! {
! INC_BOTH (from, from_byte);
! break;
! }
ch1 = ch0;
}
count++;
***************
*** 1669,1678 ****
p = GPT_ADDR;
stop = endp;
}
- if (! fastmap[(int) SYNTAX (p[-1])])
- break;
p--, pos--;
! UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
}
}
}
--- 1669,1681 ----
p = GPT_ADDR;
stop = endp;
}
p--, pos--;
! UPDATE_SYNTAX_TABLE_BACKWARD (pos);
! if (! fastmap[(int) SYNTAX (*p)])
! {
! p++, pos++;
! break;
! }
}
}
}
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-17 14:32 ` Stefan Monnier
2007-05-17 14:45 ` martin rudalics
@ 2007-05-18 13:00 ` Richard Stallman
2007-05-18 23:39 ` Herbert Euler
2007-05-19 12:59 ` martin rudalics
1 sibling, 2 replies; 41+ messages in thread
From: Richard Stallman @ 2007-05-18 13:00 UTC (permalink / raw)
To: Stefan Monnier; +Cc: rudalics, herberteuler, emacs-devel, handa, acm
Also, it is important to update the syntax-table only after making sure that
the new position is valid. So I believe the patch below is what we want.
Is this bug only in unicode-2? Does it affect Emacs 22?
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-18 13:00 ` Richard Stallman
@ 2007-05-18 23:39 ` Herbert Euler
2007-05-19 22:31 ` Richard Stallman
2007-05-19 12:59 ` martin rudalics
1 sibling, 1 reply; 41+ messages in thread
From: Herbert Euler @ 2007-05-18 23:39 UTC (permalink / raw)
To: rms, monnier; +Cc: rudalics, handa, acm, emacs-devel
> Also, it is important to update the syntax-table only after making
>sure that
> the new position is valid. So I believe the patch below is what we
>want.
>
>Is this bug only in unicode-2? Does it affect Emacs 22?
As Kenichi said, it will happen if Emacs is started with --unibyte.
Regards,
Guanpeng Xu
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-18 13:00 ` Richard Stallman
2007-05-18 23:39 ` Herbert Euler
@ 2007-05-19 12:59 ` martin rudalics
2007-05-19 15:18 ` Stefan Monnier
2007-05-20 6:50 ` Richard Stallman
1 sibling, 2 replies; 41+ messages in thread
From: martin rudalics @ 2007-05-19 12:59 UTC (permalink / raw)
To: rms; +Cc: acm, herberteuler, handa, Stefan Monnier, emacs-devel
> Is this bug only in unicode-2? Does it affect Emacs 22?
There are two bugs with identical structure - one in `skip_chars' and
one in `scan_words'. The bug in `skip_chars' occurs when you invoke
Emacs with the unibyte option. IMO it's virulent in the unicode-2
branch only. The bug in `scan_words' occurs with Emacs -Q. That bug
will hardly ever be noticed since `backward-word' practically never
relies on syntax-table properties. Patching any of these is hairy
because neither Guanpeng nor I seem to understand the interval updating
code sufficiently well. Hence, unless we find someone with intimate
knowledge of the interval code, I'd propose to not touch it for the
release. We could try to fix them in the trunk and the Unicode branch
and see what happens.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-19 12:59 ` martin rudalics
@ 2007-05-19 15:18 ` Stefan Monnier
2007-05-19 17:48 ` martin rudalics
2007-05-21 13:01 ` Kenichi Handa
2007-05-20 6:50 ` Richard Stallman
1 sibling, 2 replies; 41+ messages in thread
From: Stefan Monnier @ 2007-05-19 15:18 UTC (permalink / raw)
To: martin rudalics; +Cc: acm, herberteuler, emacs-devel, rms, handa
> There are two bugs with identical structure - one in `skip_chars' and
> one in `scan_words'.
I have just installed a patch in EMACS_22_BASE for skip_chars.
As for scan_words, I don't know of such a bug. Also looking at the code,
I don't see it (not that it proves anything, of course). Could you tell me
where's the problem in scan_words?
Stefan
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-19 15:18 ` Stefan Monnier
@ 2007-05-19 17:48 ` martin rudalics
2007-05-21 13:01 ` Kenichi Handa
1 sibling, 0 replies; 41+ messages in thread
From: martin rudalics @ 2007-05-19 17:48 UTC (permalink / raw)
To: Stefan Monnier; +Cc: acm, herberteuler, handa, rms, emacs-devel
[-- Attachment #1: Type: text/plain, Size: 611 bytes --]
> As for scan_words, I don't know of such a bug. Also looking at the code,
> I don't see it (not that it proves anything, of course). Could you tell me
> where's the problem in scan_words?
With Emacs -Q define foo as
(defun foo ()
(interactive)
(put-text-property (1- (point)) (point) 'syntax-table '(2))
(setq parse-sexp-lookup-properties t))
open a text-mode buffer, insert a couple of non-word chars in the
buffer, leave point after them, and type M-x foo followed by M-b. It
goes back by _two_ characters instead of one. The attached patch was
supposed to fix this and the other problem.
[-- Attachment #2: syntax.patch --]
[-- Type: text/plain, Size: 1575 bytes --]
*** syntax.c Wed Jan 17 09:31:10 2007
--- syntax.c Thu May 17 23:30:50 2007
***************
*** 1276,1294 ****
position of it. */
while (1)
{
- int temp_byte;
-
if (from == beg)
break;
! temp_byte = dec_bytepos (from_byte);
UPDATE_SYNTAX_TABLE_BACKWARD (from);
! ch0 = FETCH_CHAR (temp_byte);
code = SYNTAX (ch0);
if (!(words_include_escapes
&& (code == Sescape || code == Scharquote)))
if (code != Sword || WORD_BOUNDARY_P (ch0, ch1))
! break;
! DEC_BOTH (from, from_byte);
ch1 = ch0;
}
count++;
--- 1276,1294 ----
position of it. */
while (1)
{
if (from == beg)
break;
! DEC_BOTH (from, from_byte);
UPDATE_SYNTAX_TABLE_BACKWARD (from);
! ch0 = FETCH_CHAR (from_byte);
code = SYNTAX (ch0);
if (!(words_include_escapes
&& (code == Sescape || code == Scharquote)))
if (code != Sword || WORD_BOUNDARY_P (ch0, ch1))
! {
! INC_BOTH (from, from_byte);
! break;
! }
ch1 = ch0;
}
count++;
***************
*** 1669,1678 ****
p = GPT_ADDR;
stop = endp;
}
- if (! fastmap[(int) SYNTAX (p[-1])])
- break;
p--, pos--;
! UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
}
}
}
--- 1669,1681 ----
p = GPT_ADDR;
stop = endp;
}
p--, pos--;
! UPDATE_SYNTAX_TABLE_BACKWARD (pos);
! if (! fastmap[(int) SYNTAX (*p)])
! {
! p++, pos++;
! break;
! }
}
}
}
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-18 23:39 ` Herbert Euler
@ 2007-05-19 22:31 ` Richard Stallman
0 siblings, 0 replies; 41+ messages in thread
From: Richard Stallman @ 2007-05-19 22:31 UTC (permalink / raw)
To: Herbert Euler; +Cc: rudalics, handa, monnier, acm, emacs-devel
As Kenichi said, it will happen if Emacs is started with --unibyte.
I think we can do without fixing that for now.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-19 12:59 ` martin rudalics
2007-05-19 15:18 ` Stefan Monnier
@ 2007-05-20 6:50 ` Richard Stallman
1 sibling, 0 replies; 41+ messages in thread
From: Richard Stallman @ 2007-05-20 6:50 UTC (permalink / raw)
To: martin rudalics; +Cc: acm, herberteuler, handa, monnier, emacs-devel
The bug in `skip_chars' occurs when you invoke
Emacs with the unibyte option. IMO it's virulent in the unicode-2
branch only. The bug in `scan_words' occurs with Emacs -Q. That bug
will be hardly noticed ever since `backward-word' practically never
relies on syntax-table properties.
Ok, we can leave it alone for now.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-19 15:18 ` Stefan Monnier
2007-05-19 17:48 ` martin rudalics
@ 2007-05-21 13:01 ` Kenichi Handa
2007-05-21 14:00 ` Stefan Monnier
1 sibling, 1 reply; 41+ messages in thread
From: Kenichi Handa @ 2007-05-21 13:01 UTC (permalink / raw)
To: Stefan Monnier; +Cc: rudalics, herberteuler, emacs-devel, rms, acm
In article <jwvk5v4c0bh.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:
> > There are two bugs with identical structure - one in `skip_chars' and
> > one in `scan_words'.
> I have just installed a patch in EMACS_22_BASE for skip_chars.
Thank you. I've just installed the corresponding change in
emacs-unicode-2.
> As for scan_words, I don't know of such a bug. Also looking at the code,
> I don't see it (not that it proves anything, of course). Could you tell me
> where's the problem in scan_words?
martin rudalics <rudalics@gmx.at> writes:
> With Emacs -Q define foo as
> (defun foo ()
> (interactive)
> (put-text-property (1- (point)) (point) 'syntax-table '(2))
> (setq parse-sexp-lookup-properties t))
> open a text-mode buffer, insert a couple of non-word chars in the
> buffer, leave point after them, and type M-x foo followed by M-b. It
> goes back by _two_ characters instead of one. The attached patch was
> supposed to fix this and the other problem.
I confirmed this bug. The following single-line change
makes it easier to understand what was wrong, but
Martin's change makes the resulting code easier to read.
*** syntax.c 21 May 2007 21:21:30 +0900 1.205
--- syntax.c 21 May 2007 21:53:50 +0900
***************
*** 1281,1287 ****
if (from == beg)
break;
temp_byte = dec_bytepos (from_byte);
! UPDATE_SYNTAX_TABLE_BACKWARD (from);
ch0 = FETCH_CHAR (temp_byte);
code = SYNTAX (ch0);
if (!(words_include_escapes
--- 1281,1287 ----
if (from == beg)
break;
temp_byte = dec_bytepos (from_byte);
! UPDATE_SYNTAX_TABLE_BACKWARD (from - 1);
ch0 = FETCH_CHAR (temp_byte);
code = SYNTAX (ch0);
if (!(words_include_escapes
---
Kenichi Handa
handa@m17n.org
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-21 13:01 ` Kenichi Handa
@ 2007-05-21 14:00 ` Stefan Monnier
2007-05-22 1:37 ` Kenichi Handa
0 siblings, 1 reply; 41+ messages in thread
From: Stefan Monnier @ 2007-05-21 14:00 UTC (permalink / raw)
To: Kenichi Handa; +Cc: rudalics, herberteuler, emacs-devel, rms, acm
> I confirmed this bug. It seems that the following single
> line change is easier to understand what was wrong, but
> Martin's change makes the resulting code easier to read.
Yes, thanks, it's the one I was planning on installing. My connection is
pretty poor (I'm on the road), so if you want to install it, please do,
Stefan
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-21 14:00 ` Stefan Monnier
@ 2007-05-22 1:37 ` Kenichi Handa
2007-05-22 10:26 ` Stefan Monnier
0 siblings, 1 reply; 41+ messages in thread
From: Kenichi Handa @ 2007-05-22 1:37 UTC (permalink / raw)
To: Stefan Monnier; +Cc: rudalics, herberteuler, acm, rms, emacs-devel
In article <jwv8xbi8egv.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:
> > I confirmed this bug. It seems that the following single
> > line change is easier to understand what was wrong, but
> > Martin's change makes the resulting code easier to read.
> Yes, thanks, it's the one I was planning on installing. My connection is
> pretty poor (I'm on the road), so if you want to install it, please do,
Which one, mine or Martin's?
---
Kenichi Handa
handa@m17n.org
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-22 1:37 ` Kenichi Handa
@ 2007-05-22 10:26 ` Stefan Monnier
2007-05-22 12:08 ` Kenichi Handa
0 siblings, 1 reply; 41+ messages in thread
From: Stefan Monnier @ 2007-05-22 10:26 UTC (permalink / raw)
To: Kenichi Handa; +Cc: rudalics, herberteuler, emacs-devel, rms, acm
>> > I confirmed this bug. It seems that the following single
>> > line change is easier to understand what was wrong, but
>> > Martin's change makes the resulting code easier to read.
>> Yes, thanks, it's the one I was planning on installing. My connection is
>> pretty poor (I'm on the road), so if you want to install it, please do,
> Which one, mine or Martin's?
Up to you. I find yours more "obviously correct", but it's just a question
of taste, so the committer gets to impose his own,
Stefan
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token
2007-05-22 10:26 ` Stefan Monnier
@ 2007-05-22 12:08 ` Kenichi Handa
0 siblings, 0 replies; 41+ messages in thread
From: Kenichi Handa @ 2007-05-22 12:08 UTC (permalink / raw)
To: Stefan Monnier; +Cc: rudalics, herberteuler, emacs-devel, rms, acm
In article <jwvfy5p40k2.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:
>>> > I confirmed this bug. It seems that the following single
>>> > line change is easier to understand what was wrong, but
>>> > Martin's change makes the resulting code easier to read.
>>> Yes, thanks, it's the one I was planning on installing. My connection is
>>> pretty poor (I'm on the road), so if you want to install it, please do,
> > Which one, mine or Martin's?
> Up to you. I find yours more "obviously correct", but it's just a question
> fo taste, so the committer gets to impose his own,
I've just installed Martin's fix because I thought that, for
the trunk, the readability of the resulting code is more
important than the obviousness of the fix.
---
Kenichi Handa
handa@m17n.org
^ permalink raw reply [flat|nested] 41+ messages in thread
end of thread, other threads:[~2007-05-22 12:08 UTC | newest]
Thread overview: 41+ messages
2007-05-12 10:39 C++ mode and c-beginning-of-current-token Herbert Euler
2007-05-12 13:19 ` Alan Mackenzie
2007-05-12 14:30 ` Herbert Euler
2007-05-12 14:33 ` Herbert Euler
2007-05-12 16:02 ` Herbert Euler
2007-05-12 15:30 ` Herbert Euler
2007-05-12 18:49 ` Alan Mackenzie
2007-05-13 0:51 ` Herbert Euler
2007-05-13 10:01 ` Alan Mackenzie
2007-05-14 2:00 ` Herbert Euler
2007-05-14 8:50 ` Alan Mackenzie
2007-05-14 9:24 ` Herbert Euler
2007-05-14 16:58 ` Stefan Monnier
2007-05-15 3:45 ` Herbert Euler
2007-05-15 6:39 ` martin rudalics
2007-05-16 16:15 ` Stefan Monnier
2007-05-15 13:30 ` Herbert Euler
2007-05-16 8:01 ` Herbert Euler
2007-05-16 8:05 ` Herbert Euler
2007-05-17 2:12 ` Kenichi Handa
2007-05-17 10:18 ` martin rudalics
2007-05-17 12:52 ` Herbert Euler
2007-05-17 13:51 ` martin rudalics
2007-05-17 21:40 ` martin rudalics
2007-05-17 14:32 ` Stefan Monnier
2007-05-17 14:45 ` martin rudalics
2007-05-18 13:00 ` Richard Stallman
2007-05-18 23:39 ` Herbert Euler
2007-05-19 22:31 ` Richard Stallman
2007-05-19 12:59 ` martin rudalics
2007-05-19 15:18 ` Stefan Monnier
2007-05-19 17:48 ` martin rudalics
2007-05-21 13:01 ` Kenichi Handa
2007-05-21 14:00 ` Stefan Monnier
2007-05-22 1:37 ` Kenichi Handa
2007-05-22 10:26 ` Stefan Monnier
2007-05-22 12:08 ` Kenichi Handa
2007-05-20 6:50 ` Richard Stallman
2007-05-16 9:00 ` martin rudalics
2007-05-16 11:12 ` Herbert Euler
2007-05-16 12:21 ` martin rudalics