From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dmitry Gutov
Newsgroups: gmane.emacs.bugs
Subject: bug#6286: General delimited literals in ruby-mode patch
Date: Wed, 25 Apr 2012 07:03:05 +0400
Message-ID: <4F976969.6050804@yandex.ru>
References: <8739ammd8l.fsf@yandex.ru> <87k43vecyt.fsf@yandex.ru> <87ehu3mga2.fsf@yandex.ru>
Cc: 6286@debbugs.gnu.org
To: Stefan Monnier
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------010000040107060300060006"

This is a multi-part message in MIME format.
--------------010000040107060300060006
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

So, the patch.

On 24.04.2012 21:09, Stefan Monnier wrote:
> Here is what it does:
> - Split large regexp into more manageable chunks.

I didn't think to use match groups this way. Very nice.

> - During the split I saw that gsub/sub/split/scan were matched (for
>   regexp) without regards to what precedes them, so "asub / a + bsub / b"
>   was taken for a regexp.

This fix has uncovered another problem: "gsub", "gsub!", "sub", "sub!",
"scan", "split", and "split!" are not special tokens; they are all
methods on class String: http://www.ruby-doc.org/core-1.9.3/String.html
The original author just collected the methods most often used with
regexps.
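
To illustrate the two points above, here is a minimal Ruby sketch. It is
only an illustration, not part of the patch. The constant and method names
are made up for this example, and "split!" is left out of the list because
String in current Ruby does not appear to define it.

```ruby
# Names the old ruby-mode rule treated as regexp-introducing tokens.
# Assumption: the list follows the message text, minus "split!".
REGEXP_FRIENDLY_METHODS = %i[gsub gsub! sub sub! scan split].freeze

# Each name is a plain instance method of String, not a keyword.
def all_string_methods?(names)
  names.all? { |name| String.method_defined?(name) }
end

# Local variables whose names merely end in "sub" take part in
# ordinary division; no regexp is involved here.
def division_with_sub_suffix
  asub = 12
  a = 3
  bsub = 10
  b = 5
  asub / a + bsub / b
end

puts all_string_methods?(REGEXP_FRIENDLY_METHODS)
puts division_with_sub_suffix
```

The second method mirrors the quoted "asub / a + bsub / b" case: Ruby
parses both slashes as division because "asub" and "bsub" are known local
variables.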

And now this is broken:

"abcdec".split /[be]/

One might argue that this isn't the most important use case, and that
methods with arity > 1 are covered by the second rule (comma after), but
5 of these 7 methods can be called with just 1 argument.
So that would mean backward incompatibility.

> - I found a problem in your approach to handling Cucumber code.

I'm assuming you mean this:

x = toto / foo if /do bar/ =~ "dobar" # shortened version

We can add a constraint that "do" is followed by (optionally) |a, d, c|
(block arguments), and then EOL, since do ... end syntax isn't usually
used with one-liner blocks, especially not after a regexp argument.

Or we can revert the change and do it the original way.

I looked into how other editors deal with regular expressions in Ruby.

Vim is whitespace-sensitive. In the example above, the highlighting
depends on whether you put a space before "foo" (so it highlights one or
the other regexp-looking expression).

TextMate favors the whitelisting approach, like ruby-mode had pre-patch:
http://www.ruby-forum.com/topic/170852

One benefit of that approach is that once you've typed the regexp, it's
already highlighted, before you type the block keyword. Might feel more
natural.

In this approach, we'd move the "hardcoded" list of special method names
to a variable, so that users might customize it, per project.

What do you think?

And here's a patch for another issue (attached).
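
For context on that other issue, a small Ruby sketch of the two roles "%"
can play. The literals are taken from test/indent/ruby.rb in the attached
patch; the method names are invented for this illustration.

```ruby
# A percent literal: %w(...) builds an array of words.
def percent_literal
  %w(foo baz)
end

# A percent literal with a single-character delimiter is a plain string.
def bang_delimited
  %!hello!
end

# Inside a string, "%s" is only text. The % after the closing quote is
# String#%, the format operator, applied to an array of arguments.
def format_with_percent
  "(%s, %s)" % [123, 456]
end

p percent_literal
puts bang_delimited
puts format_with_percent
```

Only the first two are percent literals. In the third, every "%" before
the closing quote sits inside an ordinary string, which is the case the
patch stops propertizing.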

-- 
Dmitry

--------------010000040107060300060006
Content-Type: text/plain; charset=windows-1251;
 name="0001-ruby-mode-Don-t-propertize-percent-literals-inside-s.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename*0="0001-ruby-mode-Don-t-propertize-percent-literals-inside-s.pa";
 filename*1="tch"

>From 05d10742e01cc3dceb4f465695daee7cc42215d6 Mon Sep 17 00:00:00 2001
From: Dmitry Gutov
Date: Wed, 25 Apr 2012 06:55:50 +0400
Subject: [PATCH] ruby-mode: Don't propertize percent literals inside strings

---
 lisp/progmodes/ruby-mode.el |   58 ++++++++++++++++++++++++-------------------
 test/indent/ruby.rb         |    3 +++
 2 files changed, 35 insertions(+), 26 deletions(-)

diff --git a/lisp/progmodes/ruby-mode.el b/lisp/progmodes/ruby-mode.el
index 5d79437..9ed3879 100644
--- a/lisp/progmodes/ruby-mode.el
+++ b/lisp/progmodes/ruby-mode.el
@@ -1162,7 +1162,7 @@ See `add-log-current-defun-function'."
          (7 (prog1 "\"" (ruby-syntax-propertize-heredoc end))))
         ;; Handle percent literals: %w(), %q{}, etc.
         ("\\(?:^\\|[[ \t\n<+(,=]\\)\\(%\\)[qQrswWx]?\\([[:punct:]]\\)"
-         (1 (prog1 "|" (ruby-syntax-propertize-general-delimiters end)))))
+         (1 (ruby-syntax-propertize-general-delimiters end))))
        (point) end))
 
     (defun ruby-syntax-propertize-heredoc (limit)
@@ -1198,31 +1198,37 @@ See `add-log-current-defun-function'."
             (beginning-of-line))))
 
     (defun ruby-syntax-propertize-general-delimiters (limit)
-      (goto-char (match-beginning 2))
-      (let* ((op (char-after))
-             (ops (char-to-string op))
-             (cl (or (cdr (aref (syntax-table) op))
-                     (cdr (assoc op '((?< . ?>))))))
-             parse-sexp-lookup-properties)
-        (ignore-errors
-          (if cl
-              (progn  ; Paired delimiters.
-                ;; Delimiter pairs of the same kind can be nested
-                ;; inside the literal, as long as they are balanced.
-                ;; Create syntax table that ignores other characters.
-                (with-syntax-table (make-char-table 'syntax-table nil)
-                  (modify-syntax-entry op (concat "(" (char-to-string cl)))
-                  (modify-syntax-entry cl (concat ")" ops))
-                  (modify-syntax-entry ?\\ "\\")
-                  (save-restriction
-                    (narrow-to-region (point) limit)
-                    (forward-list))))  ; skip to the paired character
-            ;; Single character delimiter.
-            (re-search-forward (concat "[^\\]\\(?:\\\\\\\\\\)*"
-                                       (regexp-quote ops)) limit nil))
-          ;; If we reached here, the closing delimiter was found.
-          (put-text-property (1- (point)) (point)
-                             'syntax-table (string-to-syntax "|")))))
+      (goto-char (match-beginning 1)) ; When multiline, the beginning
+      (let ((state (syntax-ppss))     ; may already be propertized.
+            (syntax-value (string-to-syntax "|")))
+        ;; Move forward either way, to escape inf loop.
+        (goto-char (match-beginning 2))
+        (unless (nth 3 state) ; not inside a string
+          (let* ((op (char-after))
+                 (ops (char-to-string op))
+                 (cl (or (cdr (aref (syntax-table) op))
+                         (cdr (assoc op '((?< . ?>))))))
+                 parse-sexp-lookup-properties)
+            (ignore-errors
+              (if cl
+                  (progn  ; Paired delimiters.
+                    ;; Delimiter pairs of the same kind can be nested
+                    ;; inside the literal, as long as they are balanced.
+                    ;; Create syntax table that ignores other characters.
+                    (with-syntax-table (make-char-table 'syntax-table nil)
+                      (modify-syntax-entry op (concat "(" (char-to-string cl)))
+                      (modify-syntax-entry cl (concat ")" ops))
+                      (modify-syntax-entry ?\\ "\\")
+                      (save-restriction
+                        (narrow-to-region (point) limit)
+                        (forward-list))))  ; skip to the paired character
+                ;; Single character delimiter.
+                (re-search-forward (concat "[^\\]\\(?:\\\\\\\\\\)*"
+                                           (regexp-quote ops)) limit nil))
+              ;; If we reached here, the closing delimiter was found.
+              (put-text-property (1- (point)) (point) 'syntax-table
+                                 syntax-value)))
+          syntax-value)))
   )
 
 ;; For Emacsen where syntax-propertize-rules is not (yet) available,
diff --git a/test/indent/ruby.rb b/test/indent/ruby.rb
index c4a747a..fe1986a 100644
--- a/test/indent/ruby.rb
+++ b/test/indent/ruby.rb
@@ -7,6 +7,9 @@ c = %w(foo
        baz)
 d = %!hello!
 
+# Don't propertize percent literals inside strings.
+"(%s, %s)" % [123, 456]
+
 # A "do" after a slash means that slash is not a division, but it doesn't imply
 # it's a regexp-ender, since it can be a regexp-starter instead!
 x = toto / foo; if /do bar/ then
-- 
1.7.10.msysgit.1

--------------010000040107060300060006--