From: Dmitry Gutov
Newsgroups: gmane.emacs.devel
Subject: Re: Bug #25608 and the comment-cache branch
Date: Wed, 22 Feb 2017 04:25:53 +0200
Message-ID: <195629e9-11d6-2fb6-4c9d-39c8a244e2ec@yandex.ru>
To: Stefan Monnier, emacs-devel@gnu.org

On 14.02.2017 18:38, Stefan Monnier wrote:

> Like all the sexp movement functions, `forward-comment` is allowed to
> assume that the starting position is outside of comments/strings, so it
> doesn't need to consult the cache to see if it's within a string.

I see, thanks. And I think that means that, ideally, it would work without the caller having to adjust the syntax visibility bounds, or the like, as long as the syntax table is correct and the beginning (or the end) of the currently navigated comment is within view.

> In the case we do scan forward (e.g. the case where we end up using
> parse-partial-sexp (or syntax-ppss in my patch)), we actually manually
> re-introduce that behavior: if the forward parse says that the
> end-comment-marker is inside a string (or inside another comment), we
> re-parse from the beginning of that string (or comment) to try and see
> if that end-comment-marker could be considered to close a comment nested
> within the string (or the other comment).

That indeed sounds complex.

> Calling syntax-ppss every time back_comment is invoked would probably
> result in bad performance currently: when parsing backward
> (e.g. backward-sexp), the syntax-ppss-last optimization is ineffective,
> so we'd fall back on syntax-ppss-cache which ends up scanning on the
> average syntax-ppss-max-span/2 (i.e. 10K) chars. When \n is a comment
> ender (i.e. in most programming language modes), it would imply
> a forward scan of 10K for every line.

You're probably right, but I wonder what the benchmarks would say. (parse-partial-sexp 1 10000) takes 0.0005 seconds here, so it would still take some intensive usage to show up on the user's radar.

Previously, we started from the beginning of the current defun, as delineated by an open paren in the first column, right? I've seen function definitions longer than 10000 chars.

> IOW, for such an approach to work, we'd have to rework syntax-ppss to be
> faster when scanning backward (e.g. reduce syntax-ppss-max-span, which
> would have other repercussions).

Perhaps we could use the "generic comment bounds" syntax-table property to delineate such difficult comments. If that idea sounds similar to comment-cache, that is no accident. But we should try to limit the incompatibility with mixed modes by only caching the beginnings of comments which contain strings, nested comments, etc.

Better suggestions welcome (use a tree data structure instead of in-buffer text properties?).

I've only recently come to the realization that our usage of the syntax-table text property has the same general incompatibility with mixed-mode buffers as comment-cache does. The only reason it doesn't show as much is that we use it relatively rarely. But we couldn't, for instance, apply a "generic string" syntax to some literal in a subregion that is inside a "generic string" belonging to the primary major mode.

Not sure what to do about that.
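
For anyone who wants to try the two points above (the cost of a 10K-char forward scan, and marking a difficult comment with generic comment delimiters), here is a minimal sketch. The names `my/time-forward-scan` and `my/mark-generic-comment` are hypothetical helpers made up for illustration; they are not part of Emacs or of the comment-cache branch. The sketch assumes only the standard `benchmark-run`, `parse-partial-sexp`, `string-to-syntax` and `syntax-ppss-flush-cache`.

```elisp
(require 'benchmark)

(defun my/time-forward-scan (&optional span reps)
  "Return the average time in seconds of one forward scan over SPAN chars.
Hypothetical helper: SPAN defaults to 10000 and REPS to 100.  The scan
starts at `point-min' and is clamped to the size of the buffer."
  (let* ((span (or span 10000))
         (reps (or reps 100))
         (end (min (point-max) (+ (point-min) span))))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-TIME); ELAPSED covers
    ;; all REPS repetitions, so divide to get the cost of a single scan.
    (/ (car (benchmark-run reps
              (parse-partial-sexp (point-min) end)))
       (float reps))))

(defun my/mark-generic-comment (beg end)
  "Mark the text between BEG and END as one generic comment.
Hypothetical helper: puts the generic comment delimiter syntax (\"!\")
on the first and last character of the region via the `syntax-table'
text property, so whatever lies in between is parsed as comment body."
  ;; The parser only honors `syntax-table' properties when this is set.
  (setq-local parse-sexp-lookup-properties t)
  (with-silent-modifications
    (put-text-property beg (1+ beg) 'syntax-table (string-to-syntax "!"))
    (put-text-property (1- end) end 'syntax-table (string-to-syntax "!")))
  ;; Modification hooks were inhibited above, so drop the stale
  ;; `syntax-ppss' cache entries by hand.
  (syntax-ppss-flush-cache beg))
```

Evaluating `(my/time-forward-scan)` in a large source buffer gives a per-scan figure to set against the one quoted above. After `(my/mark-generic-comment BEG END)`, `(nth 4 (syntax-ppss POS))` should be non-nil for any POS strictly inside the region, even when the region contains string quotes. The mixed-mode caveat still applies: the properties live in the buffer text, so every mode sharing that text sees them.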