From: Dmitry Gutov
Newsgroups: gmane.emacs.devel
Subject: Re: Escaping quotes in docstrings, Was: A simple solution to "Upcoming loss of usability ..."
Date: Thu, 2 Jul 2015 03:09:30 +0300
Message-ID: <5594813A.3000705@yandex.ru>
In-Reply-To: <559356D2.4000103@cs.ucla.edu>
To: Paul Eggert, emacs-devel@gnu.org

On 07/01/2015 05:56 AM, Paul Eggert wrote:

> I tried it out. One thing I noticed right away is odd behavior that
> results from storing grave accent and apostrophe in the buffer and
> displaying them as curved quotes. For example, run 'emacs -Q -nw', then
> read the documentation for the 'length' function and copy it to a new
> file /tmp/foo by typing "C-h f length RET C-x o C-x h M-w C-x 4 f
> /tmp/foo RET C-y C-s". On the screen you'll see a buffer 'foo'
> containing curved quotes. Now, revisit /tmp/foo by typing "C-x C-v
> RET". The buffer will contain the same contents as before, except the
> curved quotes are magically transformed to grave accent and apostrophe
> on the display. It's weird that copied text mysteriously changes its
> display representation at a seemingly unrelated moment.

It's the same if you copy some syntax-highlighted text (say, from src/doc.c) to /tmp/foo and save it. Kill the buffer, reopen - the highlighting is gone! :)

That may seem counter-intuitive, but it's something Emacs users are generally familiar with.

> How about the following idea instead. Instead of displaying grave
> accent and apostrophe specially, have with-help-window transliterate
> these characters in place before displaying itself as usual.

That should work, too. In help-mode-finish, before help-make-xrefs.

> with-help-window should not transliterate characters that are marked as
> being escaped, or as being user data (not clear that we need two kinds
> of marks here; one should do, no?).

They're semantically different. I think we currently apply linkification (help-make-xrefs) to the contents of user data, and we might continue doing that.
On the other hand, we should probably avoid linkifying symbols inside quotes when at least one of the quotes is escaped.

> That's the big picture. Here are a few more-minor remarks.
>
>> + (font-lock-add-keywords
>> + nil '(("\\(\\\\~\\)\\(?:\\\\~\\|.\\)"
>
> As already mentioned, the new \~ quoting syntax doesn't seem to be
> needed; we can get by with the existing \= quoting syntax. So let's go
> that direction.

For us to go there, could you please make substitute-command-keys add the `escaped' property to the escaped characters in its output? And push it to scratch/quote-escaping.

> Also, this regexp string matches either (1) \~\~ or (2) \~ followed by
> any non-newline character. But if \~ is supposed to escape the next
> character, the regexp string should simply implement that, i.e., the
> regexp string should be "\\\\~\\(.\\|\n\\)".

I guess so. Sorry, brain fart.

> I assume the extra complication is about escaping backslash itself,
> e.g., if the docstring is \~\ (which would look like "\\~\\" in the
> source code) this should stand for \ in the *Help* buffer. But the
> above regexp doesn't do that.

Doesn't it? Works for me. But anyway, let's try to reuse the quoting from substitute-command-keys first.

>> + (unless (get-text-property mbeg 'help-value)
>
> Suppose the matched string is partly help-value, and partly not. E.g.,
> mbeg has help-value but mbeg+1 does not but mbeg+2 does. Shouldn't this
> test that all the matched characters are not help-value characters?

Why? I'm assuming the value is separated from the other contents by whitespace or newlines. IMHO, that would be too defensive. We're throwing that code out anyway.

On the other hand, we could encounter a value between ` and ', in help--translate-quotes, if the value is shown at the end of the buffer. Should be taken care of in f9f3aa5 (as well as another nearby bug).

>> + ;; If we use "" as the third argument, cursor
>> + ;; stumbles once when moving over its position.
>
> I don't understand this comment. Can you explain? For example, does
> the comment apply to just the compose-region call, or to the rest of
> the 'unless'?

Only to the compose-region call. If we pass "" to it as COMPOSITION, the result will still be considered a valid position for the cursor, even though it has zero width.

Anyway, we're going a different route: no compose-region calls.

>> + (buffer-substring-no-properties
>> + mend (1+ mend)))
>
> This may go haywire if it returns "\t", because a TAB is special to
> compose-region. Also, what if the buffer has some properties other than
> help-value that should be preserved?

Err, I don't think the current code deletes any existing properties; it only changes how the buffer looks.

On the other hand, if help-mode-finish performs the translations (destructively, I'm assuming), then we indeed might lose some properties. I wouldn't worry too much about that, though. If help-mode doesn't know about them, no other code is likely to use them either.
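
To make the in-place variant concrete, here is a rough sketch of what a destructive translation pass could look like. It is only an illustration, not the code on the branch: the function name `my-help-translate-quotes' is made up, and it assumes that earlier processing has already put an `escaped' or `help-value' text property on the characters that must be left alone. It also copies the replaced character's properties back onto the new one, so properties that help-mode doesn't know about would survive.

```elisp
;; Hypothetical sketch: `my-help-translate-quotes' is not an Emacs function.
;; Assumption: characters to be left alone already carry an `escaped' or
;; `help-value' text property, as discussed in this thread.
(defun my-help-translate-quotes (&optional beg end)
  "Replace ` and ' with curved quotes between BEG and END, in place.
Characters carrying an `escaped' or `help-value' text property are
skipped.  Text properties of replaced characters are preserved."
  (save-excursion
    (goto-char (or beg (point-min)))
    (let ((inhibit-read-only t))
      (while (re-search-forward "[`']" end t)
        (let ((pos (match-beginning 0)))
          (unless (or (get-text-property pos 'escaped)
                      (get-text-property pos 'help-value))
            ;; Remember the properties before replacing the character.
            (let ((props (text-properties-at pos))
                  (new (if (eq (char-after pos) ?`) "‘" "’")))
              (replace-match new t t)
              ;; One character replaced one character, so the new
              ;; character occupies exactly POS..POS+1.
              (set-text-properties pos (1+ pos) props))))))))
```

Something like this would be called from help-mode-finish before help-make-xrefs. Whether escaped quotes should also stop linkification is a separate question that the sketch does not address.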