* C++ mode and c-beginning-of-current-token
@ 2007-05-12 10:39 Herbert Euler
  2007-05-12 13:19 ` Alan Mackenzie
                   ` (3 more replies)
  0 siblings, 4 replies; 41+ messages in thread
From: Herbert Euler @ 2007-05-12 10:39 UTC (permalink / raw)
  To: acm; +Cc: emacs-devel

In the newest unicode 2 branch, `parse-sexp-lookup-properties' defaults
to `t' in C++ mode but not in C mode.  I don't have the Emacs 22 trunk,
so I don't know whether the same holds there.  But if this variable is
set to t by default, there is a bug in c++-mode:

1. Visit an empty, new C++ file.

2. Try to insert the following line at the beginning of the buffer:

    #include <iostream>

An error is signaled when typing the second character, "i", saying
"Point before start of properties".  This error happens in the
function `c-beginning-of-current-token', when invoking
`skip-syntax-backward':

    (defun c-beginning-of-current-token (&optional back-limit)
      ;; Move to the beginning of the current token.  Do not move if not
      ;; in the middle of one.  BACK-LIMIT may be used to bound the
      ;; backward search; if given it's assumed to be at the boundary
      ;; between two tokens.  Return non-nil if the point is moved, nil
      ;; otherwise.
      ;;
      ;; This function might do hidden buffer changes.
      (let ((start (point)))
        (if (looking-at "\\w\\|\\s_")
            (skip-syntax-backward "w_" back-limit)
          (when (< (skip-syntax-backward ".()" back-limit) 0)
            ;; ... ...

If `parse-sexp-lookup-properties' is t, `skip-syntax-backward' does not
return -1 in some cases, but signals an error instead.

Regards,
Guanpeng Xu

_________________________________________________________________
Express yourself instantly with MSN Messenger! Download today it's FREE!
http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/

^ permalink raw reply [flat|nested] 41+ messages in thread

* Re: C++ mode and c-beginning-of-current-token
From: Alan Mackenzie @ 2007-05-12 13:19 UTC
  To: Herbert Euler; +Cc: emacs-devel

Hi, Guanpeng!

On Sat, May 12, 2007 at 06:39:12PM +0800, Herbert Euler wrote:
> In the newest unicode 2 branch, `parse-sexp-lookup-properties' defaults
> to `t' in C++ mode but not in C mode.  I don't have the Emacs 22 trunk,
> so I don't know whether the same holds there.  But if this variable is
> set to t by default, there is a bug in c++-mode:

parse-sexp-lookup-properties is t in C++ Mode so that text properties
can be set on pertinent <s and >s (in templates) to mark them as
parentheses.

> 1. Visit an empty, new C++ file.

> 2. Try to insert the following line at the beginning of the buffer:

>     #include <iostream>

> An error is signaled when typing the second character, "i", saying
> "Point before start of properties".  This error happens in the
> function `c-beginning-of-current-token', when invoking
> `skip-syntax-backward':

OK.  This doesn't happen to me in the Emacs 22 release branch.  But
c-beginning-of-current-token was changed recently.  I suspect you might
have some option set which exposes a bug in that function.

Did you start your emacs with -Q?  If not does the error still happen
when you do?  If you didn't use -Q, could you please dump your CC Mode
configuration with C-c C-b and post it here.

[ .... ]

> Regards,
> Guanpeng Xu

-- 
Alan Mackenzie (Ittersbach, Germany).

* Re: C++ mode and c-beginning-of-current-token
From: Herbert Euler @ 2007-05-12 14:30 UTC
  To: acm; +Cc: emacs-devel

>Did you start your emacs with -Q?  If not does the error still happen
>when you do?  If you didn't use -Q, could you please dump your CC Mode
>configuration with C-c C-b and post it here.

Below is the result in Emacs started with the -Q option:

Emacs  : GNU Emacs 23.0.0.1 (i386-mingw-nt5.1.2600)
 of 2007-05-11 on XUGUANPENG4876
Package: CC Mode 5.31.4 (C++/l)
Buffer Style: gnu
c-emacs-features: (pps-extended-state col-0-paren posix-char-classes gen-string-delim gen-comment-delim syntax-properties 1-bit)

current state:
==============
(setq
 c-basic-offset 2
 c-comment-only-line-offset '(0 . 0)
 c-indent-comment-alist '((anchored-comment column . 0) (end-block space . 1)
                          (cpp-end-block space . 2))
 c-indent-comments-syntactically-p nil
 c-block-comment-prefix ""
 c-comment-prefix-regexp '((pike-mode . "//+!?\\|\\**") (awk-mode . "#+")
                           (other . "//+\\|\\**"))
 c-doc-comment-style '((java-mode . javadoc) (pike-mode . autodoc)
                       (c-mode . gtkdoc))
 c-cleanup-list '(scope-operator)
 c-hanging-braces-alist '((substatement-open before after))
 c-hanging-colons-alist nil
 c-hanging-semi&comma-criteria '(c-semi&comma-inside-parenlist)
 c-backslash-column 48
 c-backslash-max-column 72
 c-special-indent-hook '(c-gnu-impose-minimum)
 c-label-minimum-indentation 1
 c-offsets-alist '((inexpr-class . +)
                   (inexpr-statement . +)
                   (lambda-intro-cont . +)
                   (inlambda . c-lineup-inexpr-block)
                   (template-args-cont c-lineup-template-args +)
                   (incomposition . +)
                   (inmodule . +)
                   (innamespace . +)
                   (inextern-lang . +)
                   (composition-close . 0)
                   (module-close . 0)
                   (namespace-close . 0)
                   (extern-lang-close . 0)
                   (composition-open . 0)
                   (module-open . 0)
                   (namespace-open . 0)
                   (extern-lang-open . 0)
                   (objc-method-call-cont . c-lineup-ObjC-method-call)
                   (objc-method-args-cont . c-lineup-ObjC-method-args)
                   (objc-method-intro . [0])
                   (friend . 0)
                   (cpp-define-intro c-lineup-cpp-define +)
                   (cpp-macro-cont . +)
                   (cpp-macro . [0])
                   (inclass . +)
                   (stream-op . c-lineup-streamop)
                   (arglist-cont-nonempty c-lineup-gcc-asm-reg c-lineup-arglist)
                   (arglist-cont c-lineup-gcc-asm-reg 0)
                   (comment-intro c-lineup-knr-region-comment c-lineup-comment)
                   (catch-clause . 0)
                   (else-clause . 0)
                   (do-while-closure . 0)
                   (access-label . -)
                   (case-label . 0)
                   (substatement . +)
                   (statement-case-intro . +)
                   (statement . 0)
                   (brace-entry-open . 0)
                   (brace-list-entry . 0)
                   (brace-list-intro . +)
                   (brace-list-close . 0)
                   (block-close . 0)
                   (block-open . 0)
                   (inher-cont . c-lineup-multi-inher)
                   (inher-intro . +)
                   (member-init-cont . c-lineup-multi-inher)
                   (member-init-intro . +)
                   (topmost-intro . 0)
                   (knr-argdecl . 0)
                   (func-decl-cont . +)
                   (inline-close . 0)
                   (class-close . 0)
                   (class-open . 0)
                   (defun-block-intro . +)
                   (defun-close . 0)
                   (defun-open . 0)
                   (c . c-lineup-C-comments)
                   (string . c-lineup-dont-change)
                   (topmost-intro-cont first c-lineup-topmost-intro-cont
                    c-lineup-gnu-DEFUN-intro-cont)
                   (brace-list-open . +)
                   (inline-open . 0)
                   (arglist-close . c-lineup-arglist)
                   (arglist-intro . c-lineup-arglist-intro-after-paren)
                   (statement-cont . +)
                   (statement-case-open . +)
                   (label . 0)
                   (substatement-label . 0)
                   (substatement-open . +)
                   (knr-argdecl-intro . 5)
                   (statement-block-intro . +)
                   )
 c-buffer-is-cc-mode 'c++-mode
 c-tab-always-indent t
 c-syntactic-indentation t
 c-syntactic-indentation-in-macros t
 c-ignore-auto-fill '(string cpp code)
 c-auto-align-backslashes t
 c-backspace-function 'backward-delete-char-untabify
 c-delete-function 'delete-char
 c-electric-pound-behavior nil
 c-default-style '((java-mode . "java") (awk-mode . "awk") (other . "gnu"))
 c-enable-xemacs-performance-kludge-p nil
 c-old-style-variable-behavior nil
 defun-prompt-regexp nil
 tab-width 8
 comment-column 32
 parse-sexp-ignore-comments t
 parse-sexp-lookup-properties t
 auto-fill-function nil
 comment-multi-line t
 comment-start-skip "\\(//+\\|/\\*+\\)\\s *"
 fill-prefix nil
 fill-column 70
 paragraph-start "[ ]*\\(//+\\|\\**\\)[ ]*$\\|^\f"
 adaptive-fill-mode t
 adaptive-fill-regexp "[ ]*\\(//+\\|\\**\\)[ ]*\\([ ]*\\([-!|#%;>*·‣⁃◦]+[ ]*\\)*\\)"
 )

Thank you very much for the quick reply! :-D

Regards,
Guanpeng Xu

* Re: C++ mode and c-beginning-of-current-token
From: Herbert Euler @ 2007-05-12 14:33 UTC
  To: acm; +Cc: emacs-devel

>Did you start your emacs with -Q?  If not does the error still happen
>when you do?  If you didn't use -Q, could you please dump your CC Mode
>configuration with C-c C-b and post it here.

And yes, the problem happens even with the -Q option.

Regards,
Guanpeng Xu

* Re: C++ mode and c-beginning-of-current-token
From: Herbert Euler @ 2007-05-12 16:02 UTC
  To: acm; +Cc: emacs-devel

>parse-sexp-lookup-properties is t in C++ Mode so that text properties
>can be set on pertinent <s and >s (in templates) to mark them as
>parentheses.

Setting `parse-sexp-lookup-properties' to t in a c-mode buffer
reproduces the error for c-mode, and setting
`parse-sexp-lookup-properties' to nil in c++-mode eliminates the error
for c++-mode, although templates would work.

Regards,
Guanpeng Xu

* RE: C++ mode and c-beginning-of-current-token
From: Herbert Euler @ 2007-05-12 15:30 UTC
  To: herberteuler, acm; +Cc: emacs-devel

I found some additional strange behaviors.  First, the error in a
buffer is signaled only once.  Once an error has been signaled,
typing again does not signal the same error again, but font-lock
is broken after that.

Normally, characters are assigned the face property as they are typed.
For example, when I type

    #include

in a c++-mode buffer, the whole word is assigned
font-lock-preprocessor-face after the first letter, i, is typed (when #
is typed, no face is set).  However, in a C++ buffer in which an error
has been signaled, font-lock does not work on the erroneous region.
The erroneous region starts at the beginning of the buffer (where the
original error is signaled) and ends with a statement (as defined by
`c-beginning-of-statement' and `c-end-of-statement').

For example, suppose the first line of a buffer is as below (-!- is the
point position):

    -!-#include <iostream>
    #include <vector>

and no errors have been signaled.  Now I type C-o; the buffer content
is now

    -!-
    #include <iostream>
    #include <vector>

Now I type #, and an error is signaled (this is slightly different from
typing in an empty buffer; in that case the error is signaled when
typing i, rather than #).  Now if I finish the first line as below:

    #include <iostream>-!-
    #include <iostream>
    #include <vector>

the second #include <iostream> and the content that follows it are
still colored, while the first one is not.  The text properties on the
first line are (c-in-sws t auto-composed t fontified t).  If other
content is typed now, for example the following:

    #include <iostream>
    int main ()
    {
    }
    -!-
    #include <iostream>
    #include <vector>

none of it is colored; the text properties of the non-colored content
other than the first line are (c-is-sws t auto-composed t fontified t),
and those of the first line change to (c-is-sws t c-in-sws t
auto-composed t fontified t).  If I invoke `font-lock-fontify-region'
on the non-colored region manually, it is fontified correctly.
Font-lock in other parts of the buffer works, but not as normally as
usual.

And there are other strange behaviors, but they are too complicated for
me to describe, so I will describe them only if you need them.

Hope the above information helps.  Thanks.

Regards,
Guanpeng Xu

* Re: C++ mode and c-beginning-of-current-token
From: Alan Mackenzie @ 2007-05-12 18:49 UTC
  To: Herbert Euler; +Cc: emacs-devel

Hi, Guanpeng!

On Sat, May 12, 2007 at 11:30:07PM +0800, Herbert Euler wrote:
> I found some additional strange behaviors.  First, the error in a
> buffer is signaled only once.  Once an error has been signaled,
> typing again does not signal the same error again, but font-lock
> is broken after that.

Font locking is done in an after-change hook.  If a function in the
after-change-hook throws an error, Emacs deletes that function from the
hook, allowing Emacs to continue broken rather than hang up on continual
errors.  I think that is what is happening to you here.  You can check
this by examining after-change-functions before and after the error.

I think this explains all the anomalies you were seeing.

> Hope the above information helps.  Thanks.

I'm going to have to think a bit about the main bug you reported.
Hopefully, I'll get back to you tomorrow about it.  Sleep well!

> Regards,
> Guanpeng Xu

-- 
Alan Mackenzie (Ittersbach, Germany).

* Re: C++ mode and c-beginning-of-current-token
From: Herbert Euler @ 2007-05-13 0:51 UTC
  To: acm; +Cc: emacs-devel

> > I found some additional strange behaviors.  First, the error in a
> > buffer is signaled only once.  Once an error has been signaled,
> > typing again does not signal the same error again, but font-lock
> > is broken after that.
>
>Font locking is done in an after-change hook.  If a function in the
>after-change-hook throws an error, Emacs deletes that function from the
>hook, allowing Emacs to continue broken rather than hang up on continual
>errors.  I think that is what is happening to you here.  You can check
>this by examining after-change-functions before and after the error.
>
>I think this explains all the anomalies you were seeing.

Yes, the reason the error is not signaled again is that the
after-change hook is changed.  I do not know whether the behavior is
correct here, though: all of the functions in the hook are removed.
The default value of the hook is

    (c-after-change jit-lock-after-change auto-composition-after-change t),

and I tried another value,

    (jit-lock-after-change auto-composition-after-change t c-after-change).

In both cases, `after-change-functions' is nil after the error has been
signaled.

Regards,
Guanpeng Xu

* Re: C++ mode and c-beginning-of-current-token
From: Alan Mackenzie @ 2007-05-13 10:01 UTC
  To: Stefan Monnier, Herbert Euler; +Cc: emacs-devel

Hi, Stefan and Guanpeng!

On Sat, May 12, 2007 at 06:39:12PM +0800, Herbert Euler wrote:
> In the newest unicode 2 branch, `parse-sexp-lookup-properties' defaults
> to `t' in C++ mode but not in C mode.  I don't have the Emacs 22 trunk,
> so I don't know whether the same holds there.  But if this variable is
> set to t by default, there is a bug in c++-mode:

0. Start emacs with -Q

> 1. Visit an empty, new C++ file.

> 2. Try to insert the following line at the beginning of the buffer:

>     #include <iostream>

> An error is signaled when typing the second character, "i", saying
> "Point before start of properties".  This error happens in the
> function `c-beginning-of-current-token', when invoking
> `skip-syntax-backward':

"Point before start of properties" is thrown by the C function
update_interval in ..../emacs/src/interval.c.  update_interval is called
only from within syntax.c (I think).

Stefan: this is in your bailiwick.  Could you have a look at it, please.

> (defun c-beginning-of-current-token (&optional back-limit)
>   ;; Move to the beginning of the current token.  Do not move if not
>   ;; in the middle of one.  BACK-LIMIT may be used to bound the
>   ;; backward search; if given it's assumed to be at the boundary
>   ;; between two tokens.  Return non-nil if the point is moved, nil
>   ;; otherwise.
>   ;;
>   ;; This function might do hidden buffer changes.
>   (let ((start (point)))
>     (if (looking-at "\\w\\|\\s_")
>         (skip-syntax-backward "w_" back-limit)
>       (when (< (skip-syntax-backward ".()" back-limit) 0)
>         ;; ... ...

> If `parse-sexp-lookup-properties' is t, `skip-syntax-backward' does not
> return -1 in some cases, but signals an error instead.

> Regards,
> Guanpeng Xu

-- 
Alan.

* Re: C++ mode and c-beginning-of-current-token
From: Herbert Euler @ 2007-05-14 2:00 UTC
  To: acm, monnier; +Cc: emacs-devel

I looked at an older version of CC mode: there, the `c-after-change'
function does not call `c-trim-found-types', and the `c-before-change'
function does not exist.  The error does not happen in that version.
Both these changes lead to the invocation of
`c-beginning-of-current-token' as a result.  So I suspect these changes
are the direct source of the error.

Regards,
Guanpeng Xu

* Re: C++ mode and c-beginning-of-current-token
From: Alan Mackenzie @ 2007-05-14 8:50 UTC
  To: Herbert Euler; +Cc: monnier, emacs-devel

Hi, Guanpeng!

On Mon, May 14, 2007 at 10:00:59AM +0800, Herbert Euler wrote:
> I looked at an older version of CC mode: there, the `c-after-change'
> function does not call `c-trim-found-types', and the `c-before-change'
> function does not exist.  The error does not happen in that version.
> Both these changes lead to the invocation of
> `c-beginning-of-current-token' as a result.  So I suspect these changes
> are the direct source of the error.

As I said earlier, I think it more likely these changes in CC Mode have
triggered a bug rather than being buggy themselves.  (OK, anybody, feel
free to demolish my hubris ;-)  I haven't looked into it much, but it
seems likely that the syntax.c stuff for a Unicode branch will have been
enhanced to cope with the weirdnesses of that character set.

Could you please run edebug[*] on c-beginning-of-current-token, and check
that the arguments which are passed to skip-syntax-backward are valid.
If they're not, it's CC Mode's problem.  If they are, let's wait and see
what Stefan says.

[*] If you're not familiar with Edebug, it's well described in the Elisp
manual.  Send me a private email if you want help to use it.

> Regards,
> Guanpeng Xu

-- 
Alan Mackenzie (Ittersbach, Germany).

* Re: C++ mode and c-beginning-of-current-token
From: Herbert Euler @ 2007-05-14 9:24 UTC
  To: acm; +Cc: monnier, emacs-devel

> > I looked at an older version of CC mode: there, the `c-after-change'
> > function does not call `c-trim-found-types', and the `c-before-change'
> > function does not exist.  The error does not happen in that version.
> > Both these changes lead to the invocation of
> > `c-beginning-of-current-token' as a result.  So I suspect these changes
> > are the direct source of the error.
>
>As I said earlier, I think it more likely these changes in CC Mode have
>triggered a bug rather than being buggy themselves.  (OK, anybody, feel
>free to demolish my hubris ;-)  I haven't looked into it much, but it
>seems likely that the syntax.c stuff for a Unicode branch will have been
>enhanced to cope with the weirdnesses of that character set.
>
>Could you please run edebug[*] on c-beginning-of-current-token, and check
>that the arguments which are passed to skip-syntax-backward are valid.
>If they're not, it's CC Mode's problem.  If they are, let's wait and see
>what Stefan says.

The argument is certainly valid, but there are too many differences in
syntax.c between the main branch and the unicode 2 branch.  As you
said, perhaps the source of the unicode 2 branch is too old.  Let's
wait for Stefan's opinion.

Thank you very much.

Regards,
Guanpeng Xu

Btw, I thought that, except for the Unicode feature, the unicode 2
branch was the same as the main branch.  So I was wrong.

* Re: C++ mode and c-beginning-of-current-token
From: Stefan Monnier @ 2007-05-14 16:58 UTC
  To: Herbert Euler; +Cc: acm, emacs-devel

> If `parse-sexp-lookup-properties' is t, `skip-syntax-backward' does not
> return -1 in some cases, but signals an error instead.

Sounds like a bug in skip-syntax-backward.

        Stefan

* Re: C++ mode and c-beginning-of-current-token
From: Herbert Euler @ 2007-05-15 3:45 UTC
  To: monnier; +Cc: acm, emacs-devel

> > If `parse-sexp-lookup-properties' is t, `skip-syntax-backward' does not
> > return -1 in some cases, but signals an error instead.
>
>Sounds like a bug in skip-syntax-backward.

Is it OK to signal an error if point is at the beginning of the buffer
when calling `skip-syntax-backward'?

Regards,
Guanpeng Xu

* Re: C++ mode and c-beginning-of-current-token
From: martin rudalics @ 2007-05-15 6:39 UTC
  To: Herbert Euler; +Cc: acm, monnier, emacs-devel

> Is it OK to signal an error if point is at the beginning of the buffer
> when calling `skip-syntax-backward'?

No.  (Unless you pass a wrong argument to it.)  Could you run skip_chars
under gdb to see what goes on?

* Re: C++ mode and c-beginning-of-current-token
From: Stefan Monnier @ 2007-05-16 16:15 UTC
  To: Herbert Euler; +Cc: acm, emacs-devel

>> > If `parse-sexp-lookup-properties' is t, `skip-syntax-backward' does not
>> > return -1 in some cases, but signals an error instead.
>>
>> Sounds like a bug in skip-syntax-backward.

> Is it OK to signal an error if point is at the beginning of the buffer
> when calling `skip-syntax-backward'?

No, it's not.  It might conceptually be OK to change it so as to signal
a `beginning-of-buffer' error (although it'd most likely introduce
compatibility bugs), but it has no reason to ever signal "Point before
start of properties".

        Stefan

* Re: C++ mode and c-beginning-of-current-token
From: Herbert Euler @ 2007-05-15 13:30 UTC
  To: monnier; +Cc: acm, emacs-devel

> > If `parse-sexp-lookup-properties' is t, `skip-syntax-backward' does not
> > return -1 in some cases, but signals an error instead.
>
>Sounds like a bug in skip-syntax-backward.

In the previous message, I described a way to reproduce the error:

0. Start Emacs with emacs -Q.

1. Visit a new, empty C++ file.

2. Insert the following characters:

    #include <iostream>

The error happens when inserting the second character, i.  Now let's
remember the following three facts:

a) `parse-sexp-lookup-properties' is t in a c++-mode buffer,

b) `after-change-functions' contains `c-after-change', which leads to
   an invocation of `skip-syntax-backward' of the form
   (skip-syntax-backward ".()" nil),

c) the c++-mode buffer is not multibyte.

Now, with fact a), the code in syntax.c updates interval information;
this includes the function `skip_syntaxes', invoked by
`Fskip_syntax_backward'.  Looking at how `skip_syntaxes' works shows
how the error happens.

The code that skips characters and updates interval information uses a
variable `pos', and below is how it updates interval information:

    while (1)
      {
        if (p <= stop)
          {
            if (p <= endp)
              break;
            p = GPT_ADDR;
            stop = endp;
          }
        if (! fastmap[(int) SYNTAX (p[-1])])
          break;
        p--, pos--, pos_byte--;
        UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
      }

As you see, it updates interval information at position (pos - 1).  At
an earlier stage in this function, `pos' is initialized to the buffer
position:

    int start_point = PT;
    int pos = PT;
    int pos_byte = PT_BYTE;
    unsigned char *p = PT_ADDR, *endp, *stop;

Since we're inserting the second character, `PT' is now 2, and so the
initial value of `pos' is 2.  Later, before updating interval
information, `pos' is decremented to 1, so (pos - 1) = (1 - 1) = 0.

Here is how `UPDATE_SYNTAX_TABLE_BACKWARD' is defined:

    #define UPDATE_SYNTAX_TABLE_BACKWARD(charpos)                     \
      (parse_sexp_lookup_properties                                   \
       && (charpos) < gl_state.b_property                             \
       ? (update_syntax_table ((charpos) + gl_state.offset, -1, 0,    \
                               gl_state.object),                      \
          1)                                                          \
       : 0)

It invokes `update_syntax_table' with `charpos' set to (charpos +
gl_state.offset).  `charpos' is 0 there, and gl_state.offset is 0.  So
`update_syntax_table' is invoked with `charpos' set to 0.  Finally,
`update_interval' is invoked in `update_syntax_table':

    void
    update_syntax_table (charpos, count, init, object)
         int charpos, count, init;
         Lisp_Object object;
    {
      /* ... ... */
      i = update_interval (i, charpos);

So `update_interval' is invoked with `pos' set to 0:

    INTERVAL
    update_interval (i, pos)
         register INTERVAL i;
         int pos;
    {
      if (NULL_INTERVAL_P (i))
        return NULL_INTERVAL;

      while (1)
        {
          if (pos < i->position)
            {
              /* Move left. */
              if (pos >= i->position - TOTAL_LENGTH (i->left))
                {
                  i->left->position = i->position - TOTAL_LENGTH (i->left)
                    + LEFT_TOTAL_LENGTH (i->left);
                  i = i->left;          /* Move to the left child */
                }
              else if (NULL_PARENT (i))
                error ("Point before start of properties");
              else
                i = INTERVAL_PARENT (i);
              continue;
            }

`pos' is 0, and `*i' is

    (gdb) p *i
    $14 = {
      total_length = 1,
      position = 1,
      left = 0x0,
      right = 0x0,
      up = {
        interval = 0x86a1204,
        obj = 141169156
      },
      up_obj = 1,
      gcmarkbit = 0,
      write_protect = 0,
      visible = 0,
      front_sticky = 0,
      rear_sticky = 0,
      plist = 156480797
    }
    (gdb)

That's why the error is signaled, I think.

Regards,
Guanpeng Xu

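The boundary arithmetic traced in the message above can be modeled in a
few lines of standalone C.  This is only a toy sketch, not Emacs code:
the names model_lookup, model_skip_backward and MODEL_BEG are
hypothetical, the lookup stands in for update_interval, the
gl_state.b_property test from the real macro is left out, and the
`guard' flag shows just one conceivable way of avoiding the
out-of-range lookup, assumed here for illustration (it is not claimed
to be the fix that was eventually applied).

```c
#include <assert.h>
#include <string.h>

/* First valid buffer position; Emacs buffer positions start at 1. */
#define MODEL_BEG 1

/* Hypothetical stand-in for update_interval: returns 0 on success and
   -1 where the real function would signal
   "Point before start of properties". */
int model_lookup(int charpos)
{
    return charpos < MODEL_BEG ? -1 : 0;
}

/* Toy model of the backward skip loop.  TEXT[0] is the character
   between buffer positions 1 and 2; SET lists the characters to skip.
   If GUARD is nonzero, the lookup is skipped when it would fall before
   MODEL_BEG (an assumed guard, for illustration only).
   Returns the number of characters skipped, or -1 if a lookup failed. */
int model_skip_backward(const char *text, int point, const char *set, int guard)
{
    int pos = point;

    assert(point >= MODEL_BEG && (size_t) point <= strlen(text) + 1);

    while (pos > MODEL_BEG && strchr(set, text[pos - 2]) != NULL) {
        pos--;
        /* Mirrors UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1) in the quoted loop. */
        if (guard && pos - 1 < MODEL_BEG)
            continue;
        if (model_lookup(pos - 1) < 0)
            return -1;
    }
    return point - pos;
}
```

With the text "#" and point at 2 (the situation just after the first
character has been typed), the unguarded model reports a failed lookup
at position 0, while the guarded one skips the single character without
ever looking before MODEL_BEG.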
* Re: C++ mode and c-beginning-of-current-token 2007-05-15 13:30 ` Herbert Euler @ 2007-05-16 8:01 ` Herbert Euler 2007-05-16 8:05 ` Herbert Euler 2007-05-16 9:00 ` martin rudalics 0 siblings, 2 replies; 41+ messages in thread From: Herbert Euler @ 2007-05-16 8:01 UTC (permalink / raw) To: herberteuler, monnier; +Cc: acm, emacs-devel

>> > If `parse-sexp-lookup-properties' is t, `skip-syntax-backward' would
>> > not return -1, but signal an error in some cases.
>>
>>Sounds like a bug in skip-syntax-backward.
>
>In the previous message, I described a way to reproduce the error:
>
>0. Start Emacs with emacs -Q.
>
>1. Visit a new, empty C++ file.
>
>2. Insert the following characters:
>
>#include <iostream>
>
>The error happens when inserting the second character, i.  Now let's
>remember the following three facts:
>
>a) `parse-sexp-lookup-properties' is t in a c++-mode buffer,
>
>b) `after-change-functions' contains `c-after-change', which leads to
>   invocation of `skip-syntax-backward' in the way
>   (skip-syntax-backward ".()" nil),
>
>c) the c++-mode buffer is not multibyte.
>
>Now, with fact a), code in syntax.c will update interval information,
>including the function `skip_syntaxes', invoked by
>`Fskip_syntax_backward'.  If we look at how `skip_syntaxes' works, we
>can see how the error happens.
>
>The code that skips characters and updates interval information uses a
>variable `pos', and below is how it updates interval information:
>
>        while (1)
>          {
>            if (p <= stop)
>              {
>                if (p <= endp)
>                  break;
>                p = GPT_ADDR;
>                stop = endp;
>              }
>            if (! fastmap[(int) SYNTAX (p[-1])])
>              break;
>            p--, pos--, pos_byte--;
>            UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
>          }
>
>As you see, it updates interval information at position (pos - 1).  At
>an earlier stage in this function, `pos' is initialized to the buffer
>position:
>
>    int start_point = PT;
>    int pos = PT;
>    int pos_byte = PT_BYTE;
>    unsigned char *p = PT_ADDR, *endp, *stop;
>
>Since we're inserting the second character, `PT' is now 2, and so the
>initial value of `pos' is 2.  Later, before updating interval
>information, `pos' is decremented, so (pos - 1) = (1 - 1) = 0.
>
>Here is how `UPDATE_SYNTAX_TABLE_BACKWARD' is defined:
>
>    #define UPDATE_SYNTAX_TABLE_BACKWARD(charpos)                     \
>      (parse_sexp_lookup_properties                                   \
>       && (charpos) < gl_state.b_property                             \
>       ? (update_syntax_table ((charpos) + gl_state.offset, -1, 0,    \
>                               gl_state.object),                      \
>          1)                                                          \
>       : 0)
>
>It invokes `update_syntax_table' with `charpos' set to (charpos +
>gl_state.offset).  `charpos' is 0 there, and gl_state.offset is 0.  So
>`update_syntax_table' is invoked with `charpos' set to 0.  Finally,
>`update_interval' is invoked in `update_syntax_table':
>
>    void
>    update_syntax_table (charpos, count, init, object)
>         int charpos, count, init;
>         Lisp_Object object;
>    {
>      /* ... ... */
>      i = update_interval (i, charpos);
>
>So `update_interval' is invoked with `pos' set to 0:
>
>    INTERVAL
>    update_interval (i, pos)
>         register INTERVAL i;
>         int pos;
>    {
>      if (NULL_INTERVAL_P (i))
>        return NULL_INTERVAL;
>
>      while (1)
>        {
>          if (pos < i->position)
>            {
>              /* Move left. */
>              if (pos >= i->position - TOTAL_LENGTH (i->left))
>                {
>                  i->left->position = i->position - TOTAL_LENGTH (i->left)
>                    + LEFT_TOTAL_LENGTH (i->left);
>                  i = i->left;             /* Move to the left child */
>                }
>              else if (NULL_PARENT (i))
>                error ("Point before start of properties");
>              else
>                i = INTERVAL_PARENT (i);
>              continue;
>            }
>
>`pos' is 0, and `*i' is
>
>(gdb) p *i
>$14 = {
>  total_length = 1,
>  position = 1,
>  left = 0x0,
>  right = 0x0,
>  up = {
>    interval = 0x86a1204,
>    obj = 141169156
>  },
>  up_obj = 1,
>  gcmarkbit = 0,
>  write_protect = 0,
>  visible = 0,
>  front_sticky = 0,
>  rear_sticky = 0,
>  plist = 156480797
>}
>(gdb)
>
>That's why the error is signaled, I think.

Ok, the function `skip_syntaxes' does not exist in the main branch,
syntax.c in the unicode 2 branch is too old.

Regards,
Guanpeng Xu

^ permalink raw reply [flat|nested] 41+ messages in thread
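[Editorial sketch, not part of the original message.]  The position
arithmetic traced in this message can be checked with a small
stand-alone C model.  This is only a sketch under stated assumptions:
`lowest_update_old_order' and its parameters are hypothetical names,
buffer positions are taken to be 1-based, and the model records only
which position the loop would hand to UPDATE_SYNTAX_TABLE_BACKWARD.
It does not reproduce gl_state or the interval tree.

```c
#include <assert.h>

/* Hypothetical model, not Emacs code.  Buffer positions are 1-based,
   so a request for position 0 lies before the start of the buffer.  */

/* Return the lowest position that the loop order
       p--, pos--;  UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
   would request while skipping backward from START over N_MATCHING
   characters that all match the syntax spec.  If nothing is skipped,
   no update is requested and START itself is returned.  */
int
lowest_update_old_order (int start, int n_matching)
{
  assert (start >= 1 && n_matching >= 0);

  int pos = start;
  int lowest = start;

  /* "pos > 1" stands in for the pointer bound check: a character
     exists before pos only while pos is greater than 1.  */
  while (pos > 1 && n_matching > 0)
    {
      pos--, n_matching--;      /* step over the matching character */
      if (pos - 1 < lowest)     /* the update is requested afterwards */
        lowest = pos - 1;
    }
  return lowest;
}
```

Under these assumptions, a call such as lowest_update_old_order (2, 1)
corresponds to the reproduction (point at 2, one matching character
before it) and yields 0, the position that update_interval rejects.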
* Re: C++ mode and c-beginning-of-current-token 2007-05-16 8:01 ` Herbert Euler @ 2007-05-16 8:05 ` Herbert Euler 2007-05-17 2:12 ` Kenichi Handa 2007-05-16 9:00 ` martin rudalics 1 sibling, 1 reply; 41+ messages in thread From: Herbert Euler @ 2007-05-16 8:05 UTC (permalink / raw) To: herberteuler, monnier; +Cc: acm, emacs-devel

>Ok, the function `skip_syntaxes' does not exist in the main branch,
>syntax.c in the unicode 2 branch is too old.

Sorry, `skip_syntaxes' is added by Kenichi Handa, and never exists
in the main branch.

Regards,
Guanpeng Xu

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-16 8:05 ` Herbert Euler @ 2007-05-17 2:12 ` Kenichi Handa 2007-05-17 10:18 ` martin rudalics 0 siblings, 1 reply; 41+ messages in thread From: Kenichi Handa @ 2007-05-17 2:12 UTC (permalink / raw) To: Herbert Euler; +Cc: acm, herberteuler, monnier, emacs-devel

In article <BAY143-F2699DD79B76EFF38E365F0DA3C0@phx.gbl>, "Herbert Euler" <herberteuler@hotmail.com> writes:

> >Ok, the function `skip_syntaxes' does not exist in the main branch,
> >syntax.c in the unicode 2 branch is too old.

> Sorry, `skip_syntaxes' is added by Kenichi Handa, and never exists
> in the main branch.

I divided Emacs 22's skip_chars into skip_chars (only for
skipping chars) and skip_syntaxes (only for skipping
syntaxes) without changing (or understanding) the logic.
I also added an optimization to both of them that avoids
multibyte character checking if the region contains only
ASCII chars.

The current problem in emacs-unicode-2 happens in the code
that treats the buffer as unibyte because of the above
optimization, and the same problem happens also in Emacs 22
with a unibyte buffer.

Please try the same thing with Emacs 22 started with
"--unibyte" argument.  The same error is signalled.

So, the bug has been in the original code of skip_chars
(where it handles a unibyte case) in Emacs 22.  Could
someone please fix it in Emacs 22?  Then, I'll do the same
fix on emacs-unicode-2.

---
Kenichi Handa
handa@m17n.org

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-17 2:12 ` Kenichi Handa @ 2007-05-17 10:18 ` martin rudalics 2007-05-17 12:52 ` Herbert Euler 0 siblings, 1 reply; 41+ messages in thread From: martin rudalics @ 2007-05-17 10:18 UTC (permalink / raw) To: Kenichi Handa; +Cc: acm, Herbert Euler, monnier, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 370 bytes --]

> Please try the same thing with Emacs 22 started with
> "--unibyte" argument.  The same error is signalled.
>
> So, the bug has been in the original code of skip_chars
> (where it handles a unibyte case) in Emacs 22.  Could
> someone please fix it in Emacs 22?  Then, I'll do the same
> fix on emacs-unicode-2.

FWIW, I couldn't find problems with the attached patch.

[-- Attachment #2: syntax.patch --]
[-- Type: text/plain, Size: 416 bytes --]

*** syntax.c	Wed Jan 17 09:31:10 2007
--- syntax.c	Thu May 17 11:15:42 2007
***************
*** 1672,1678 ****
  	      if (! fastmap[(int) SYNTAX (p[-1])])
  		break;
  	      p--, pos--;
! 	      UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
  	    }
  	}
      }
--- 1672,1678 ----
  	      if (! fastmap[(int) SYNTAX (p[-1])])
  		break;
  	      p--, pos--;
! 	      UPDATE_SYNTAX_TABLE_BACKWARD (pos);
  	    }
  	}
      }

[-- Attachment #3: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-17 10:18 ` martin rudalics @ 2007-05-17 12:52 ` Herbert Euler 2007-05-17 13:51 ` martin rudalics 2007-05-17 14:32 ` Stefan Monnier 0 siblings, 2 replies; 41+ messages in thread From: Herbert Euler @ 2007-05-17 12:52 UTC (permalink / raw) To: rudalics, handa; +Cc: acm, monnier, emacs-devel

>>Please try the same thing with Emacs 22 started with
>>"--unibyte" argument.  The same error is signalled.
>>
>>So, the bug has been in the original code of skip_chars
>>(where it handles a unibyte case) in Emacs 22.  Could
>>someone please fix it in Emacs 22?  Then, I'll do the same
>>fix on emacs-unicode-2.
>
>FWIW, I couldn't find problems with the attached patch.

>*** syntax.c	Wed Jan 17 09:31:10 2007
>--- syntax.c	Thu May 17 11:15:42 2007
>***************
>*** 1672,1678 ****
>  	      if (! fastmap[(int) SYNTAX (p[-1])])
>  		break;
>  	      p--, pos--;
>! 	      UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
>  	    }
>  	}
>      }
>--- 1672,1678 ----
>  	      if (! fastmap[(int) SYNTAX (p[-1])])
>  		break;
>  	      p--, pos--;
>! 	      UPDATE_SYNTAX_TABLE_BACKWARD (pos);
>  	    }
>  	}
>      }

Can I ask a silly question, too?  There are two "calling"s of
UPDATE_SYNTAX_TABLE_BACKWARD in `skip_chars', one with the "argument"
(pos), the other with (pos - 1).  To me, the second one does not look
like a typo, but a correct piece of code on purpose.  Yes, changing it
from (pos - 1) to (pos) works, but it seems to be dangerous to me.

UPDATE_SYNTAX_TABLE_BACKWARD also updates `gl_state' by setting its
members `forward_i' and `backward_i'.  Perhaps (pos - 1) is there to
keep these two members at the right positions.  Different "argument"s
passed to UPDATE_SYNTAX_TABLE_BACKWARD set the two intervals
differently.

There are still some details I have not worked out, so I am still
confused.

Regards,
Guanpeng Xu

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-17 12:52 ` Herbert Euler @ 2007-05-17 13:51 ` martin rudalics 2007-05-17 21:40 ` martin rudalics 2007-05-17 14:32 ` Stefan Monnier 1 sibling, 1 reply; 41+ messages in thread From: martin rudalics @ 2007-05-17 13:51 UTC (permalink / raw) To: Herbert Euler; +Cc: acm, emacs-devel, monnier, handa

> Can I ask a silly question, too?  There are two "calling"s of
> UPDATE_SYNTAX_TABLE_BACKWARD in `skip_chars', one with the "argument"
> (pos), the other with (pos - 1).  To me, the second one does not look
> like a typo, but a correct piece of code on purpose.  Yes, changing it
> from (pos - 1) to (pos) works, but it seems to be dangerous to me.

I do have my own problems with UPDATE_SYNTAX_TABLE_BACKWARD.  To
reproduce with Emacs -Q define foo as

(defun foo ()
  (interactive)
  (put-text-property (1- (point)) (point) 'syntax-table '(2))
  (setq parse-sexp-lookup-properties t))

open a text-mode buffer, insert a couple of non-word chars in the
buffer, leave point after them, and type M-x foo followed by M-b.  On my
system it goes back by _two_ characters instead of one.  I'm yet too
silly to understand what's going on.

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-17 13:51 ` martin rudalics @ 2007-05-17 21:40 ` martin rudalics 0 siblings, 0 replies; 41+ messages in thread From: martin rudalics @ 2007-05-17 21:40 UTC (permalink / raw) To: martin rudalics; +Cc: acm, Herbert Euler, handa, monnier, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 620 bytes --]

> I do have my own problems with UPDATE_SYNTAX_TABLE_BACKWARD.  To
> reproduce with Emacs -Q define foo as
>
> (defun foo ()
>   (interactive)
>   (put-text-property (1- (point)) (point) 'syntax-table '(2))
>   (setq parse-sexp-lookup-properties t))
>
> open a text-mode buffer, insert a couple of non-word chars in the
> buffer, leave point after them, and type M-x foo followed by M-b.  On my
> system it goes back by _two_ characters instead of one.  I'm yet too
> silly to understand what's going on.

The attached patch seems to fix both the syntax and word backward
scanning problems.  Could someone please try?

[-- Attachment #2: syntax.patch --]
[-- Type: text/plain, Size: 1575 bytes --]

*** syntax.c	Wed Jan 17 09:31:10 2007
--- syntax.c	Thu May 17 23:30:50 2007
***************
*** 1276,1294 ****
  	     position of it.  */
  	  while (1)
  	    {
- 	      int temp_byte;
-
  	      if (from == beg)
  		break;
! 	      temp_byte = dec_bytepos (from_byte);
  	      UPDATE_SYNTAX_TABLE_BACKWARD (from);
! 	      ch0 = FETCH_CHAR (temp_byte);
  	      code = SYNTAX (ch0);
  	      if (!(words_include_escapes
  		    && (code == Sescape || code == Scharquote)))
  		if (code != Sword || WORD_BOUNDARY_P (ch0, ch1))
! 		  break;
! 	      DEC_BOTH (from, from_byte);
  	      ch1 = ch0;
  	    }
  	  count++;
--- 1276,1294 ----
  	     position of it.  */
  	  while (1)
  	    {
  	      if (from == beg)
  		break;
! 	      DEC_BOTH (from, from_byte);
  	      UPDATE_SYNTAX_TABLE_BACKWARD (from);
! 	      ch0 = FETCH_CHAR (from_byte);
  	      code = SYNTAX (ch0);
  	      if (!(words_include_escapes
  		    && (code == Sescape || code == Scharquote)))
  		if (code != Sword || WORD_BOUNDARY_P (ch0, ch1))
! 		  {
! 		    INC_BOTH (from, from_byte);
! 		    break;
! 		  }
  	      ch1 = ch0;
  	    }
  	  count++;
***************
*** 1669,1678 ****
  		  p = GPT_ADDR;
  		  stop = endp;
  		}
- 	      if (! fastmap[(int) SYNTAX (p[-1])])
- 		break;
  	      p--, pos--;
! 	      UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
  	    }
  	}
      }
--- 1669,1681 ----
  		  p = GPT_ADDR;
  		  stop = endp;
  		}
  	      p--, pos--;
! 	      UPDATE_SYNTAX_TABLE_BACKWARD (pos);
! 	      if (! fastmap[(int) SYNTAX (*p)])
! 		{
! 		  p++, pos++;
! 		  break;
! 		}
  	    }
  	}
      }

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-17 12:52 ` Herbert Euler 2007-05-17 13:51 ` martin rudalics @ 2007-05-17 14:32 ` Stefan Monnier 2007-05-17 14:45 ` martin rudalics 2007-05-18 13:00 ` Richard Stallman 1 sibling, 2 replies; 41+ messages in thread From: Stefan Monnier @ 2007-05-17 14:32 UTC (permalink / raw) To: Herbert Euler; +Cc: rudalics, acm, emacs-devel, handa

> Can I ask a silly question, too?  There are two "calling"s of
> UPDATE_SYNTAX_TABLE_BACKWARD in `skip_chars', one with the "argument"
> (pos), the other with (pos - 1).  To me, the second one does not look
> like a typo, but a correct piece of code on purpose.  Yes, changing it
> from (pos - 1) to (pos) works, but it seems to be dangerous to me.

The important part is to make sure that the syntax-table is up-to-date for
the char being read, when the char is passed to `SYNTAX'.

Also, it is important to update the syntax-table only after making sure that
the new position is valid.  So I believe the patch below is what we want.


        Stefan


--- orig/src/syntax.c
+++ mod/src/syntax.c
@@ -1691,10 +1691,10 @@
 		p = GPT_ADDR;
 		stop = endp;
 	      }
+	    UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
 	    if (! fastmap[(int) SYNTAX (p[-1])])
 	      break;
 	    p--, pos--;
-	    UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
 	  }
       }
   }

^ permalink raw reply [flat|nested] 41+ messages in thread
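[Editorial sketch, not part of the original message.]  The ordering
argued for in this message (check the bound, bring the syntax
information up to date for the character about to be read, then test
it, and only then step back) can be illustrated with a small
stand-alone C model.  Everything in it is an assumption made for
illustration: `skip_backward_model' and `struct skip_result' are
hypothetical names, membership of a character in SET stands in for the
fastmap test, and recording the looked-up position stands in for
UPDATE_SYNTAX_TABLE_BACKWARD.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model, not Emacs code.  Positions are 1-based; the
   character "before" position POS is the one at position POS - 1.  */
struct skip_result
{
  int end_pos;        /* position where the backward skip stops */
  int lowest_lookup;  /* lowest position whose syntax was looked up */
};

/* Skip backward from START over characters of TEXT that occur in SET.
   The lookup for a character happens before the character is tested,
   and only after the bound check has shown that it exists.  */
struct skip_result
skip_backward_model (const char *text, int start, const char *set)
{
  assert (start >= 1 && (size_t) start <= strlen (text) + 1);

  struct skip_result r = { start, start };
  int pos = start;

  while (pos > 1)                   /* bound check comes first */
    {
      int charpos = pos - 1;        /* character about to be examined */
      if (charpos < r.lowest_lookup)
        r.lowest_lookup = charpos;  /* "update" before reading it */
      if (strchr (set, text[charpos - 1]) == NULL)
        break;                      /* character does not match */
      pos--;
    }
  r.end_pos = pos;
  return r;
}
```

In this model the lowest position ever looked up is at least 1, because
a lookup happens only while pos is greater than 1.  For example,
skip_backward_model ("#", 2, ".()#") stops at position 1 having looked
up position 1 only.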
* Re: C++ mode and c-beginning-of-current-token 2007-05-17 14:32 ` Stefan Monnier @ 2007-05-17 14:45 ` martin rudalics 2007-05-18 13:00 ` Richard Stallman 1 sibling, 0 replies; 41+ messages in thread From: martin rudalics @ 2007-05-17 14:45 UTC (permalink / raw) To: Stefan Monnier; +Cc: acm, Herbert Euler, handa, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 642 bytes --]

> Also, it is important to update the syntax-table only after making sure that
> the new position is valid.  So I believe the patch below is what we want.
>
>
>         Stefan
>
>
> --- orig/src/syntax.c
> +++ mod/src/syntax.c
> @@ -1691,10 +1691,10 @@
>  		p = GPT_ADDR;
>  		stop = endp;
>  	      }
> +	    UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
>  	    if (! fastmap[(int) SYNTAX (p[-1])])
>  	      break;
>  	    p--, pos--;
> -	    UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
>  	  }
>        }
>    }

Indeed, though I'd prefer the attached one.  BTW, did you look at the
backward-word problem I mentioned in my previous mail?

[-- Attachment #2: syntax.patch --]
[-- Type: text/plain, Size: 571 bytes --]

*** syntax.c	Wed Jan 17 09:31:10 2007
--- syntax.c	Thu May 17 15:23:24 2007
***************
*** 1669,1678 ****
  		  p = GPT_ADDR;
  		  stop = endp;
  		}
- 	      if (! fastmap[(int) SYNTAX (p[-1])])
- 		break;
  	      p--, pos--;
! 	      UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
  	    }
  	}
      }
--- 1669,1681 ----
  		  p = GPT_ADDR;
  		  stop = endp;
  		}
  	      p--, pos--;
! 	      UPDATE_SYNTAX_TABLE_BACKWARD (pos);
! 	      if (! fastmap[(int) SYNTAX (*p)])
! 		{
! 		  p++, pos++;
! 		  break;
! 		}
  	    }
  	}
      }

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-17 14:32 ` Stefan Monnier 2007-05-17 14:45 ` martin rudalics @ 2007-05-18 13:00 ` Richard Stallman 2007-05-18 23:39 ` Herbert Euler 2007-05-19 12:59 ` martin rudalics 1 sibling, 2 replies; 41+ messages in thread From: Richard Stallman @ 2007-05-18 13:00 UTC (permalink / raw) To: Stefan Monnier; +Cc: rudalics, herberteuler, emacs-devel, handa, acm Also, it is important to update the syntax-table only after making sure that the new position is valid. So I believe the patch below is what we want. Is this bug only in unicode-2? Does it affect Emacs 22? ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-18 13:00 ` Richard Stallman @ 2007-05-18 23:39 ` Herbert Euler 2007-05-19 22:31 ` Richard Stallman 2007-05-19 12:59 ` martin rudalics 1 sibling, 1 reply; 41+ messages in thread From: Herbert Euler @ 2007-05-18 23:39 UTC (permalink / raw) To: rms, monnier; +Cc: rudalics, handa, acm, emacs-devel

>    Also, it is important to update the syntax-table only after making sure that
>    the new position is valid.  So I believe the patch below is what we want.
>
>Is this bug only in unicode-2?  Does it affect Emacs 22?

As Kenichi said, it will happen if Emacs is started with --unibyte.

Regards,
Guanpeng Xu

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-18 23:39 ` Herbert Euler @ 2007-05-19 22:31 ` Richard Stallman 0 siblings, 0 replies; 41+ messages in thread From: Richard Stallman @ 2007-05-19 22:31 UTC (permalink / raw) To: Herbert Euler; +Cc: rudalics, handa, monnier, acm, emacs-devel As Kenichi said, it will happen if Emacs is started with --unibyte. I think we can do without fixing that for now. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-18 13:00 ` Richard Stallman 2007-05-18 23:39 ` Herbert Euler @ 2007-05-19 12:59 ` martin rudalics 2007-05-19 15:18 ` Stefan Monnier 2007-05-20 6:50 ` Richard Stallman 1 sibling, 2 replies; 41+ messages in thread From: martin rudalics @ 2007-05-19 12:59 UTC (permalink / raw) To: rms; +Cc: acm, herberteuler, handa, Stefan Monnier, emacs-devel > Is this bug only in unicode-2? Does it affect Emacs 22? There are two bugs with identical structure - one in `skip_chars' and one in `scan_words'. The bug in `skip_chars' occurs when you invoke Emacs with the unibyte option. IMO it's virulent in the unicode-2 branch only. The bug in `scan_words' occurs with Emacs -Q. That bug will be hardly noticed ever since `backward-word' practically never relies on syntax-table properties. Patching any of these is hairy because neither Guanpeng nor I seem to understand the interval updating code sufficiently well. Hence, unless we find someone with intimate knowledge of the interval code, I'd propose to not touch it for the release. We could try to fix them in the trunk and the Unicode branch and see what happens. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-19 12:59 ` martin rudalics @ 2007-05-19 15:18 ` Stefan Monnier 2007-05-19 17:48 ` martin rudalics 2007-05-21 13:01 ` Kenichi Handa 2007-05-20 6:50 ` Richard Stallman 1 sibling, 2 replies; 41+ messages in thread From: Stefan Monnier @ 2007-05-19 15:18 UTC (permalink / raw) To: martin rudalics; +Cc: acm, herberteuler, emacs-devel, rms, handa > There are two bugs with identical structure - one in `skip_chars' and > one in `scan_words'. I have just installed a patch in EMACS_22_BASE for skip_chars. As for scan_words, I don't know of such a bug. Also looking at the code, I don't see it (not that it proves anything, of course). Could you tell me where's the problem in scan_words? Stefan ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-19 15:18 ` Stefan Monnier @ 2007-05-19 17:48 ` martin rudalics 2007-05-21 13:01 ` Kenichi Handa 1 sibling, 0 replies; 41+ messages in thread From: martin rudalics @ 2007-05-19 17:48 UTC (permalink / raw) To: Stefan Monnier; +Cc: acm, herberteuler, handa, rms, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 611 bytes --]

> As for scan_words, I don't know of such a bug.  Also looking at the code,
> I don't see it (not that it proves anything, of course).  Could you tell me
> where's the problem in scan_words?

With Emacs -Q define foo as

(defun foo ()
  (interactive)
  (put-text-property (1- (point)) (point) 'syntax-table '(2))
  (setq parse-sexp-lookup-properties t))

open a text-mode buffer, insert a couple of non-word chars in the
buffer, leave point after them, and type M-x foo followed by M-b.  It
goes back by _two_ characters instead of one.  The attached patch was
supposed to fix this and the other problem.

[-- Attachment #2: syntax.patch --]
[-- Type: text/plain, Size: 1575 bytes --]

*** syntax.c	Wed Jan 17 09:31:10 2007
--- syntax.c	Thu May 17 23:30:50 2007
***************
*** 1276,1294 ****
  	     position of it.  */
  	  while (1)
  	    {
- 	      int temp_byte;
-
  	      if (from == beg)
  		break;
! 	      temp_byte = dec_bytepos (from_byte);
  	      UPDATE_SYNTAX_TABLE_BACKWARD (from);
! 	      ch0 = FETCH_CHAR (temp_byte);
  	      code = SYNTAX (ch0);
  	      if (!(words_include_escapes
  		    && (code == Sescape || code == Scharquote)))
  		if (code != Sword || WORD_BOUNDARY_P (ch0, ch1))
! 		  break;
! 	      DEC_BOTH (from, from_byte);
  	      ch1 = ch0;
  	    }
  	  count++;
--- 1276,1294 ----
  	     position of it.  */
  	  while (1)
  	    {
  	      if (from == beg)
  		break;
! 	      DEC_BOTH (from, from_byte);
  	      UPDATE_SYNTAX_TABLE_BACKWARD (from);
! 	      ch0 = FETCH_CHAR (from_byte);
  	      code = SYNTAX (ch0);
  	      if (!(words_include_escapes
  		    && (code == Sescape || code == Scharquote)))
  		if (code != Sword || WORD_BOUNDARY_P (ch0, ch1))
! 		  {
! 		    INC_BOTH (from, from_byte);
! 		    break;
! 		  }
  	      ch1 = ch0;
  	    }
  	  count++;
***************
*** 1669,1678 ****
  		  p = GPT_ADDR;
  		  stop = endp;
  		}
- 	      if (! fastmap[(int) SYNTAX (p[-1])])
- 		break;
  	      p--, pos--;
! 	      UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
  	    }
  	}
      }
--- 1669,1681 ----
  		  p = GPT_ADDR;
  		  stop = endp;
  		}
  	      p--, pos--;
! 	      UPDATE_SYNTAX_TABLE_BACKWARD (pos);
! 	      if (! fastmap[(int) SYNTAX (*p)])
! 		{
! 		  p++, pos++;
! 		  break;
! 		}
  	    }
  	}
      }

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-19 15:18 ` Stefan Monnier 2007-05-19 17:48 ` martin rudalics @ 2007-05-21 13:01 ` Kenichi Handa 2007-05-21 14:00 ` Stefan Monnier 1 sibling, 1 reply; 41+ messages in thread From: Kenichi Handa @ 2007-05-21 13:01 UTC (permalink / raw) To: Stefan Monnier; +Cc: rudalics, herberteuler, emacs-devel, rms, acm

In article <jwvk5v4c0bh.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

> > There are two bugs with identical structure - one in `skip_chars' and
> > one in `scan_words'.

> I have just installed a patch in EMACS_22_BASE for skip_chars.

Thank you.  I've just installed the corresponding change in
emacs-unicode-2.

> As for scan_words, I don't know of such a bug.  Also looking at the code,
> I don't see it (not that it proves anything, of course).  Could you tell me
> where's the problem in scan_words?

martin rudalics <rudalics@gmx.at> writes:

> With Emacs -Q define foo as

> (defun foo ()
>   (interactive)
>   (put-text-property (1- (point)) (point) 'syntax-table '(2))
>   (setq parse-sexp-lookup-properties t))

> open a text-mode buffer, insert a couple of non-word chars in the
> buffer, leave point after them, and type M-x foo followed by M-b.  It
> goes back by _two_ characters instead of one.  The attached patch was
> supposed to fix this and the other problem.

I confirmed this bug.  It seems that the following single
line change is easier to understand what was wrong, but
Martin's change makes the resulting code easier to read.

*** syntax.c	21 May 2007 21:21:30 +0900	1.205
--- syntax.c	21 May 2007 21:53:50 +0900
***************
*** 1281,1287 ****
  	      if (from == beg)
  		break;
  	      temp_byte = dec_bytepos (from_byte);
! 	      UPDATE_SYNTAX_TABLE_BACKWARD (from);
  	      ch0 = FETCH_CHAR (temp_byte);
  	      code = SYNTAX (ch0);
  	      if (!(words_include_escapes
--- 1281,1287 ----
  	      if (from == beg)
  		break;
  	      temp_byte = dec_bytepos (from_byte);
! 	      UPDATE_SYNTAX_TABLE_BACKWARD (from - 1);
  	      ch0 = FETCH_CHAR (temp_byte);
  	      code = SYNTAX (ch0);
  	      if (!(words_include_escapes

---
Kenichi Handa
handa@m17n.org

^ permalink raw reply [flat|nested] 41+ messages in thread
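[Editorial sketch, not part of the original message.]  The off-by-one
that this one-line change addresses can be illustrated with a small
stand-alone C model of Martin's reproduction.  It is a sketch under
stated assumptions rather than the Emacs implementation: all names
(`backward_word_model', `classify', `enum syntax') are hypothetical,
the default syntax is reduced to "alphanumerics are words, everything
else is punctuation", a syntax-table property is modeled as a
per-position override, and the first loop (finding the last character
of the word) is assumed to be correct.  Only the second loop takes the
lookup offset: 0 models UPDATE_SYNTAX_TABLE_BACKWARD (from), -1 models
UPDATE_SYNTAX_TABLE_BACKWARD (from - 1).

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical model, not Emacs code.  Positions are 1-based.  */
enum syntax { NO_PROPERTY = -1, PUNCT = 0, WORD = 1 };

/* Classify the character at CHARPOS using the syntax in effect at
   LOOKUP_POS: a property there overrides the default syntax
   (alphanumerics are words, everything else is punctuation).  */
static enum syntax
classify (const char *text, const enum syntax *prop, int n,
          int charpos, int lookup_pos)
{
  if (lookup_pos >= 1 && lookup_pos <= n
      && prop[lookup_pos - 1] != NO_PROPERTY)
    return prop[lookup_pos - 1];
  return isalnum ((unsigned char) text[charpos - 1]) ? WORD : PUNCT;
}

/* Model one backward-word step from FROM and return the new position.
   PROP must have one entry per character of TEXT.  */
int
backward_word_model (const char *text, const enum syntax *prop,
                     int from, int lookup_offset)
{
  int n = (int) strlen (text);
  assert (from >= 1 && from <= n + 1);

  /* First loop (assumed correct): step back until a word character
     has been crossed; its syntax is looked up at its own position.  */
  for (;;)
    {
      if (from <= 1)
        return from;
      from--;
      if (classify (text, prop, n, from, from) == WORD)
        break;
    }

  /* Second loop: extend back over the rest of the word.  The character
     examined is the one at from - 1; the syntax used for it is the one
     in effect at from + lookup_offset.  */
  while (from > 1
         && classify (text, prop, n, from - 1, from + lookup_offset) == WORD)
    from--;
  return from;
}
```

With four punctuation characters, a word-syntax property on the last
one, and point after it (position 5), the model stops at position 3
for offset 0 (two characters back) and at position 4 for offset -1
(one character back), which matches the behavior described in the
reproduction.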
* Re: C++ mode and c-beginning-of-current-token 2007-05-21 13:01 ` Kenichi Handa @ 2007-05-21 14:00 ` Stefan Monnier 2007-05-22 1:37 ` Kenichi Handa 0 siblings, 1 reply; 41+ messages in thread From: Stefan Monnier @ 2007-05-21 14:00 UTC (permalink / raw) To: Kenichi Handa; +Cc: rudalics, herberteuler, emacs-devel, rms, acm > I confirmed this bug. It seems that the following single > line change is easier to understand what was wrong, but > Martin's change makes the resulting code easier to read. Yes, thanks, it's the one I was planning on installing. My connection is pretty poor (I'm on the road), so if you want to install it, please do, Stefan ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-21 14:00 ` Stefan Monnier @ 2007-05-22 1:37 ` Kenichi Handa 2007-05-22 10:26 ` Stefan Monnier 0 siblings, 1 reply; 41+ messages in thread From: Kenichi Handa @ 2007-05-22 1:37 UTC (permalink / raw) To: Stefan Monnier; +Cc: rudalics, herberteuler, acm, rms, emacs-devel In article <jwv8xbi8egv.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes: > > I confirmed this bug. It seems that the following single > > line change is easier to understand what was wrong, but > > Martin's change makes the resulting code easier to read. > Yes, thanks, it's the one I was planning on installing. My connection is > pretty poor (I'm on the road), so if you want to install it, please do, Which one, mine or Martin's? --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-22 1:37 ` Kenichi Handa @ 2007-05-22 10:26 ` Stefan Monnier 2007-05-22 12:08 ` Kenichi Handa 0 siblings, 1 reply; 41+ messages in thread From: Stefan Monnier @ 2007-05-22 10:26 UTC (permalink / raw) To: Kenichi Handa; +Cc: rudalics, herberteuler, emacs-devel, rms, acm

>> > I confirmed this bug.  It seems that the following single
>> > line change is easier to understand what was wrong, but
>> > Martin's change makes the resulting code easier to read.

>> Yes, thanks, it's the one I was planning on installing.  My connection is
>> pretty poor (I'm on the road), so if you want to install it, please do,

> Which one, mine or Martin's?

Up to you.  I find yours more "obviously correct", but it's just a question
of taste, so the committer gets to impose his own,


        Stefan

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-22 10:26 ` Stefan Monnier @ 2007-05-22 12:08 ` Kenichi Handa 0 siblings, 0 replies; 41+ messages in thread From: Kenichi Handa @ 2007-05-22 12:08 UTC (permalink / raw) To: Stefan Monnier; +Cc: rudalics, herberteuler, emacs-devel, rms, acm

In article <jwvfy5p40k2.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> > I confirmed this bug.  It seems that the following single
>>> > line change is easier to understand what was wrong, but
>>> > Martin's change makes the resulting code easier to read.

>>> Yes, thanks, it's the one I was planning on installing.  My connection is
>>> pretty poor (I'm on the road), so if you want to install it, please do,

> > Which one, mine or Martin's?

> Up to you.  I find yours more "obviously correct", but it's just a question
> fo taste, so the committer gets to impose his own,

I've just installed Martin's fix because I thought that, for
the trunk, the readability of the resulting code is more
important than the obviousness of the fix.

---
Kenichi Handa
handa@m17n.org

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-19 12:59 ` martin rudalics 2007-05-19 15:18 ` Stefan Monnier @ 2007-05-20 6:50 ` Richard Stallman 1 sibling, 0 replies; 41+ messages in thread From: Richard Stallman @ 2007-05-20 6:50 UTC (permalink / raw) To: martin rudalics; +Cc: acm, herberteuler, handa, monnier, emacs-devel The bug in `skip_chars' occurs when you invoke Emacs with the unibyte option. IMO it's virulent in the unicode-2 branch only. The bug in `scan_words' occurs with Emacs -Q. That bug will be hardly noticed ever since `backward-word' practically never relies on syntax-table properties. Ok, we can leave it alone for now. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-16 8:01 ` Herbert Euler 2007-05-16 8:05 ` Herbert Euler @ 2007-05-16 9:00 ` martin rudalics 2007-05-16 11:12 ` Herbert Euler 1 sibling, 1 reply; 41+ messages in thread From: martin rudalics @ 2007-05-16 9:00 UTC (permalink / raw) To: Herbert Euler; +Cc: acm, monnier, emacs-devel >> p--, pos--, pos_byte--; >> UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1); Possibly a silly suggestion: What happens if you use UPDATE_SYNTAX_TABLE_BACKWARD (pos); here? ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-16 9:00 ` martin rudalics @ 2007-05-16 11:12 ` Herbert Euler 2007-05-16 12:21 ` martin rudalics 0 siblings, 1 reply; 41+ messages in thread From: Herbert Euler @ 2007-05-16 11:12 UTC (permalink / raw) To: rudalics; +Cc: acm, monnier, emacs-devel

>>>            p--, pos--, pos_byte--;
>>>            UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
>
>Possibly a silly suggestion: What happens if you use
>
>             UPDATE_SYNTAX_TABLE_BACKWARD (pos);
>
>here?

I'm not very sure about the meaning of (pos - 1), so instead of trying
your proposal, I tried the following one:

        while (1)
          {
            if (p <= stop)
              {
                if (p <= endp)
                  break;
                p = GPT_ADDR;
                stop = endp;
              }
            if (! fastmap[(int) SYNTAX (p[-1])])
              break;
            p--, pos--, pos_byte--;
            if (pos <= 1)
              break;
            UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1);
          }

And this change seems to fix the problem.

Regards,
Guanpeng Xu

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: C++ mode and c-beginning-of-current-token 2007-05-16 11:12 ` Herbert Euler @ 2007-05-16 12:21 ` martin rudalics 0 siblings, 0 replies; 41+ messages in thread From: martin rudalics @ 2007-05-16 12:21 UTC (permalink / raw) To: Herbert Euler; +Cc: acm, monnier, emacs-devel > I'm not very sure about the meaning of (pos - 1), so instead of trying > your proposal, I tried the following one: To test you could try the following: Assign a syntax-table text-property which differs from the standard syntax-table for the buffer to a few characters and check whether `skip-syntax-backward' updates the properties correctly when you are after the last character. This should guarantee that > p--, pos--, pos_byte--; > if (pos <= 1) > break; > UPDATE_SYNTAX_TABLE_BACKWARD (pos - 1); indeed updates the property for each and every character - in particular the very first ("rightmost") one - skipped. Better do this in an elisp buffer to avoid that the major mode interferes with your settings. ^ permalink raw reply [flat|nested] 41+ messages in thread
end of thread, other threads:[~2007-05-22 12:08 UTC | newest]

Thread overview: 41+ messages
2007-05-12 10:39 C++ mode and c-beginning-of-current-token Herbert Euler
2007-05-12 13:19 ` Alan Mackenzie
2007-05-12 14:30 ` Herbert Euler
2007-05-12 14:33 ` Herbert Euler
2007-05-12 16:02 ` Herbert Euler
2007-05-12 15:30 ` Herbert Euler
2007-05-12 18:49 ` Alan Mackenzie
2007-05-13  0:51 ` Herbert Euler
2007-05-13 10:01 ` Alan Mackenzie
2007-05-14  2:00 ` Herbert Euler
2007-05-14  8:50 ` Alan Mackenzie
2007-05-14  9:24 ` Herbert Euler
2007-05-14 16:58 ` Stefan Monnier
2007-05-15  3:45 ` Herbert Euler
2007-05-15  6:39 ` martin rudalics
2007-05-16 16:15 ` Stefan Monnier
2007-05-15 13:30 ` Herbert Euler
2007-05-16  8:01 ` Herbert Euler
2007-05-16  8:05 ` Herbert Euler
2007-05-17  2:12 ` Kenichi Handa
2007-05-17 10:18 ` martin rudalics
2007-05-17 12:52 ` Herbert Euler
2007-05-17 13:51 ` martin rudalics
2007-05-17 21:40 ` martin rudalics
2007-05-17 14:32 ` Stefan Monnier
2007-05-17 14:45 ` martin rudalics
2007-05-18 13:00 ` Richard Stallman
2007-05-18 23:39 ` Herbert Euler
2007-05-19 22:31 ` Richard Stallman
2007-05-19 12:59 ` martin rudalics
2007-05-19 15:18 ` Stefan Monnier
2007-05-19 17:48 ` martin rudalics
2007-05-21 13:01 ` Kenichi Handa
2007-05-21 14:00 ` Stefan Monnier
2007-05-22  1:37 ` Kenichi Handa
2007-05-22 10:26 ` Stefan Monnier
2007-05-22 12:08 ` Kenichi Handa
2007-05-20  6:50 ` Richard Stallman
2007-05-16  9:00 ` martin rudalics
2007-05-16 11:12 ` Herbert Euler
2007-05-16 12:21 ` martin rudalics