From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Eli Zaretskii Newsgroups: gmane.emacs.devel Subject: Re: Some issues with the tree-sitter branch Date: Mon, 17 Oct 2022 09:48:49 +0300 Message-ID: <83y1tf2be6.fsf@gnu.org> References: <83o7ub51yd.fsf@gnu.org> <6F240F48-E29C-4E75-B5D5-30C7CF7956C9@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="9128"; mail-complaints-to="usenet@ciao.gmane.io" Cc: emacs-devel@gnu.org To: Yuan Fu Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Mon Oct 17 08:59:08 2022 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1okK5c-0002Dy-15 for ged-emacs-devel@m.gmane-mx.org; Mon, 17 Oct 2022 08:59:08 +0200 Original-Received: from localhost ([::1]:35538 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1okK5a-0007zf-Lg for ged-emacs-devel@m.gmane-mx.org; Mon, 17 Oct 2022 02:59:06 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:48600) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1okJvs-0007Um-S4 for emacs-devel@gnu.org; Mon, 17 Oct 2022 02:49:07 -0400 Original-Received: from fencepost.gnu.org ([2001:470:142:3::e]:38220) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1okJvs-0003qy-IB; Mon, 17 Oct 2022 02:49:04 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-version:References:Subject:In-Reply-To:To:From: Date; bh=DOYKR6cNj1vt+Sq7z1W12yvdoPz1lWf4wKkvHM6hGYo=; b=X0eoU6bboQAwfIIwFmu4 TbjA62WLhfbpPtXndc33qiAvheY8mRDqUfYybT9Gb8HO/Zf5t0Syii72wK5vX1PF/xifKsz9QrQtQ nZi2n9pgyeGEOR25PLRh/Jbsv306YjaCdJ5lK1/mkApaqB/6E8nwXowOgr48BJJZpABi8khBpVmBX KTGnbm1yVg3j0ofjWz+MrVmJOiwr79o9O3shb++PdeDQ+ap7uFKV+T9shldzUUd799bJ7D+us7lcK oEZFnyyYrehLwSU+tzaUtErwnGwsnNvMd6F5Xweem0bi6N1uM7C2gaiyF45EdAgcXt15zuFQL968T eeMPdu3Ga7pcbw==; Original-Received: from [87.69.77.57] (port=4256 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1okJvq-0004rq-BM; Mon, 17 Oct 2022 02:49:04 -0400 In-Reply-To: <6F240F48-E29C-4E75-B5D5-30C7CF7956C9@gmail.com> (message from Yuan Fu on Sun, 16 Oct 2022 21:53:06 -0700) X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:297923 Archived-At: > From: Yuan Fu > Date: Sun, 16 Oct 2022 21:53:06 -0700 > Cc: emacs-devel@gnu.org > > > . Several places assign EMACS_INT values to uint32_t variables with > > an explicit range check (and error signal in case of overflow). > > I think you meant the lack of a range check? Like mentioned here: > > + /* FIXME: We should signal an error below if START_BYTE > + etc. overflow the 32-bit unsigned data type. */ Yes, I added FIXME comments to make it easier to find those places. > I added buffer size check at parser creation time, and used casts to uint32_t liberally, assuming the values never overflows and, so we don’t need to handle the error at a million places. But I should have added checks in ts_after_chang and other places where buffer size could change. I’ll add checks in ts_after_change and other places, and if the argumetns overflows uint32, it will set a flag (say, buffer_too_large) in the parser object, and next time any lisp function tries to use that parser, an buffer-too-large error will be signaled. WDYT? That sounds fine to me, but I think we also should do something when some value that can be larger than UINT_MAX is passed to tree-sitter functions, because doing so might cause tree-sitter do something for a completely unrelated portion of the buffer. At the very least add eassert there, so that at least a build with --enable-checking will detect such cases. > I see that you fixed them, I’ll keep those in mind in the future. That’s a lot of lines you need to change, sorry about that :-( That was a low-hanging fruit that was easy to pluck ;-)