From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ergus
Newsgroups: gmane.emacs.devel
Subject: Re: How to add pseudo vector types
Date: Tue, 27 Jul 2021 01:40:53 +0200
Message-ID: <20210726234053.za5axe3m646ps7wr@Ergus>
References: <8335s64v10.fsf@gnu.org> <5380C92B-6C15-4490-A1E0-1C3132DBB16A@gmail.com> <83k0li2shw.fsf@gnu.org> <86wnpg82v3.fsf@stephe-leake.org> <83lf5wyn0z.fsf@gnu.org> <86pmv66yqg.fsf@stephe-leake.org> <83a6maw705.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Cc: Eli Zaretskii, Clément Pit-Claudel, Stephen Leake, Stefan Monnier, emacs-devel@gnu.org
To: Yuan Fu
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:271677

On Mon, Jul 26, 2021 at 12:40:31PM -0400, Yuan Fu wrote:
>>
>>>>> unless the narrowing is for multi-major-mode.
>>>>
>>>> And what would you do in that case, if you allow TS to look beyond the
>>>> restriction?
>>>
>>> In the multi-major-mode case, there is a separate parser for each
>>> language, and each sub-mode region in the text would get its own parser
>>> tree (ie, it acts like a separate file), and that parser tree is only
>>> told about changes to those regions. So the parser will never try to
>>> look outside the region; it doesn't need to know about narrowing.
>>
>> Once again, we are talking about the function used by TS to read
>> buffer text. Not about the parser or its caller.
>> Low-level code, which knows nothing about the context, should never
>> look beyond the restriction.
>
>It doesn’t harm for tree-sitter to see the rest of the buffer, it
>doesn’t modify anything, all it does it reading the text. OTOH,
>restricting tree-sitter to the bounds of narrows adds complexity for no
>benefit (as far as I can see). Maybe narrowing is the context that low
>level code should ignore, or at least tree-sitter should ignore. The
>only benefit that I can think of is “we firmly adhere to the ‘contract’
>that no one can look beyond the narrowed region”, but is it a good
>contract? Is there really a contract in the first place? IMO, narrowing
>acts like masking tapes over the rest of the buffer, so that user edits
>like re-replace wouldn’t spill out. Demanding everything in Emacs to
>not have access to the rest of the buffer is dogmatic (in the sense
>that it is too rigid and is simply following the doctrine blindly).
>

Hi Yuan:

Speaking from near-total ignorance of tree-sitter and of your changes:
there is a function, ts_parser_set_included_ranges, that I once used to
restrict the region being parsed, and it noticeably improved performance
in a test API. Couldn't narrowed regions use that? I think it is the
same idea, but I am probably wrong.

Limiting the parse to the modified region (which Emacs may already know
thanks to the gap and maybe the undo-tree) and passing the tree from the
previous parse as the `old_tree` parameter of ts_parser_parse_string
made tree-sitter incredibly fast in my case (fast enough to run it on
every key press). Using old_tree reduced the time by a factor of 10 on a
big source file, and limiting the parser to only the "changed" region
made it almost instantaneous in more than 80% of the runs with small
modifications.
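
To make the "changed region" idea concrete, here is a minimal sketch,
assuming only the old and the new text are available (Emacs would
presumably get these offsets from its change hooks instead of diffing).
EditSpan and compute_edit_span are hypothetical names, not part of
tree-sitter; the three fields only mirror the byte offsets found in
tree-sitter's TSInputEdit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the byte offsets of tree-sitter's
   TSInputEdit (the real struct also carries row/column points). */
typedef struct {
    size_t start_byte;    /* first byte that differs */
    size_t old_end_byte;  /* end of the changed span in the old text */
    size_t new_end_byte;  /* end of the changed span in the new text */
} EditSpan;

/* Find the smallest span that differs between old_text and new_text
   by trimming their common prefix and their common suffix. */
static EditSpan compute_edit_span(const char *old_text, const char *new_text)
{
    size_t old_len = strlen(old_text);
    size_t new_len = strlen(new_text);

    /* Length of the common prefix. */
    size_t prefix = 0;
    while (prefix < old_len && prefix < new_len
           && old_text[prefix] == new_text[prefix])
        prefix++;

    /* Length of the common suffix, never overlapping the prefix. */
    size_t suffix = 0;
    while (suffix < old_len - prefix && suffix < new_len - prefix
           && old_text[old_len - 1 - suffix] == new_text[new_len - 1 - suffix])
        suffix++;

    EditSpan span = { prefix, old_len - suffix, new_len - suffix };
    return span;
}
```

In real use, as far as I understand the API, these offsets (plus the
matching row/column points) would be copied into a TSInputEdit, applied
to the previous tree with ts_tree_edit, and that tree then passed as
old_tree to ts_parser_parse_string.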
(I repeat: it was a much simpler use case.)

>And about language definitions and font-locking, I just realized that
>tree-sitter language definitions provides highlighting patterns, and we
>only need to minimally modify them to use them for Emacs, so there
>aren’t much manual effort involved.
>

I think tree-sitter's language definitions cover much more than Emacs
does for some languages, and we will probably want to support them
properly. So maybe, instead of just modifying what tree-sitter provides
to make it resemble what Emacs currently has, we could use the nodes'
syntactic information directly and let Emacs add more faces where
needed. Does that make sense?

The idea is to have real syntactic information attached to the text
itself, because that may help in the future to implement indentation
and navigation commands on top of tree-sitter's information: commands
like up-list or forward-sexp would be the equivalent of
ts_tree_cursor_goto_parent or ts_tree_cursor_goto_next_sibling.

>Also, anyone have thoughts on how should tree-sitter intergrate with
>font-lock beyond the current simple interface?
>

No idea, but in my experience the most efficient way to traverse a
tree-sitter tree is with ts_tree_cursor; for font-lock, though, it may
be best to just use ts_tree_get_changed_ranges.

>Yuan

Best,
Ergus