From: HaiJun Zhang
Newsgroups: gmane.emacs.devel
Subject: Re: Using incremental parsing in Emacs
Date: Tue, 7 Jan 2020 00:36:30 +0800
To: Stefan Monnier
Cc: Eli Zaretskii, alan@idiocy.org, arthur miller, emacs-devel@gnu.org

Thanks for your great explanation.
On 6 Jan 2020 at 9:47 PM +0800, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
I see the buffer is fontified correctly. Does it parse the whole buffer?

We have different levels of parsing. At the bottom we have
`syntax-ppss` (whose workhorse, implemented in C, is
`parse-partial-sexp`) which only counts parentheses and looks for
comment and string markers. In the above case, `syntax-ppss` indeed
parses the whole buffer, but given its limited scope this parsing is
usually fast (it can be slow in some cases, because `parse-partial-sexp`
is supplemented by `syntax-propertize-function` to handle the "unusual"
cases of "strings/comments" (a typical example would be here-documents
in shell scripts) and this is all implemented in Elisp using regexp
searches).
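
To illustrate the kind of state such a low-level scan tracks, here is a minimal Python sketch. It is a toy model, not Emacs's implementation: the function name `scan_state` and the simplified Lisp-like syntax (double-quoted strings with backslash escapes, `;` comments running to end of line) are assumptions made for illustration only.

```python
def scan_state(text, stop):
    """Toy model of a parse-partial-sexp style scan over text[:stop].

    Returns (depth, in_string, in_comment). Hypothetical helper, assuming
    Lisp-like syntax: "..." strings with backslash escapes, ';' comments.
    """
    depth, in_string, in_comment = 0, False, False
    i = 0
    while i < stop:
        ch = text[i]
        if in_comment:
            if ch == "\n":          # a comment ends at the end of the line
                in_comment = False
        elif in_string:
            if ch == "\\":          # skip the escaped character
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == ";":
            in_comment = True
        elif ch == "(":             # parentheses only count outside
            depth += 1              # strings and comments
        elif ch == ")":
            depth -= 1
        i += 1
    return depth, in_string, in_comment


src = '(defun f ()\n  "doc (string" ; note (\n  (list 1'
print(scan_state(src, len(src)))
```

The state at any position depends on everything before it, which is why a scan like this has to start from the beginning of the buffer (or from a position whose state is already known), even though each step does very little work.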

After this parsing is done, font-lock looks at the few lines actually
displayed using its Elisp/regexps rules to apply the actual highlighting.
This may look at more parts of the buffer, tho, depending on the actual
font-lock rules.
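
The second stage can be sketched in the same spirit: regexp rules applied only to the range of lines being displayed. This is again a toy Python model rather than font-lock itself; the `RULES` table, the face names, and the function `fontify_visible` are hypothetical.

```python
import re

# Hypothetical highlighting rules: (compiled regexp, face name).
RULES = [
    (re.compile(r"\b(defun|let|if)\b"), "keyword"),
    (re.compile(r"\b\d+\b"), "number"),
]


def fontify_visible(lines, first, last):
    """Apply the regexp rules to lines[first:last] only.

    Returns a list of (line_number, start, end, face) spans.
    """
    spans = []
    for lineno in range(first, min(last, len(lines))):
        for pattern, face in RULES:
            for m in pattern.finditer(lines[lineno]):
                spans.append((lineno, m.start(), m.end(), face))
    return spans


buffer_lines = ["(defun f ()", "  (let ((x 1))", "    (if x 2 3)))"]
# Only line 1 is "on screen" here, so only that line is matched.
print(fontify_visible(buffer_lines, 1, 2))
```

In this sketch the cost depends on the number of visible lines, not on the size of the buffer. Rules that need context from other lines would widen the range examined.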


Stefan
