From: Eric Ludlam
Newsgroups: gmane.emacs.devel
Subject: Re: Why tree-sitter instead of Semantic? (was Re: CC Mode with font-lock-maximum-decoration 2)
Date: Tue, 16 Aug 2022 21:41:09 -0400
To: Lynn Winebarger, Stefan Monnier
Cc: Eli Zaretskii, luangruo@yahoo.com, jostein@secure.kjonigsen.net, jostein@kjonigsen.net, acm@muc.de, emacs-devel@gnu.org, casouri@gmail.com

On 8/16/22 1:40 PM, Lynn Winebarger wrote:
> On Tue, Aug 16, 2022 at 1:19 PM Stefan Monnier wrote:
>>
>>> I'm only saying there's a disconnect between Jostein's report and Po's
>>> response. It's probably a UI issue. There's a checkbox in a dropdown
>>> menu that says "Source Code Parsers (Semantic)".
>>
>> FWIW, I've used (semantic-mode 1) to enable CEDET in Emacs's C source
>> files and that was all that was needed to get TAB completion of struct
>> field's names working.
>> I haven't used it for much more than that, admittedly.
>
> It also works for me, but I also have been mostly looking at Emacs
> source with it, and Semantic knows how to use the TAGS file for
> context-sensitive completion in C. And something is working
> gangbusters in Elisp, but unfortunately I can't really identify which
> package is doing the work.
>
>>> * "${" and "{" could both open a block closed by "}"
>>
>> Why do you think it's a problem?
> If you want the lexer to tokenize the ${ as a symbol while still
> recognizing the text in between as delimited, it seems like a problem.
> I mean, I already deal with that in ordinary font-lock, I was hoping
> the parser/lexer generation would address the issue independently of
> syntax tables.

Lexers are built per-language from a set of analyzers. You call
(define-lex ...) and list a bunch of analyzers, which are created with
`define-lex-analyzer' or one of its variants.

The analyzers mostly use regular expressions and, when possible, use
expressions that rely on the syntax table, because those are quite fast.

The built-in named lexer analyzers, like `semantic-lex-whitespace', all
work that way. You are not restricted to them, though: you can use
`define-lex-analyzer' or `define-lex-regex-analyzer' and write any code
you want to do a match, push a token, and find the end point. The C
lexer/parser does this a lot.

For a very simple case like matching ${:

    ;; "{" is literal in an Emacs regexp ("\\{" would start an interval),
    ;; so only the "$" needs escaping.
    (define-lex-simple-regex-analyzer my-dollar-curly
      "doc string"
      "\\${" 'dollar-curly)

and then put this in front of the { } block analyzer when you build up
your lexer.

Eric