From: Christopher Baines
Newsgroups: gmane.lisp.guile.bugs
Subject: bug#62290: [PATCH] Fix some invalid unicode handling issues with suspendable ports.
Date: Mon, 20 Mar 2023 09:15:13 +0000
Message-ID: <20230320091513.10817-1-mail@cbaines.net>
References: <874jqf6b35.fsf@cbaines.net>
In-Reply-To: <874jqf6b35.fsf@cbaines.net>
To: 62290@debbugs.gnu.org

Based on the implementation in ports.c.  I don't understand what this
code is really doing, but the suspendable ports implementation differs
from the similar C code for a couple of inequalities.

* module/ice-9/suspendable-ports.scm (decode-utf8, bad-utf8-len): Flip a
couple of inequalities.
* test-suite/tests/ports.test ("string ports"): Add additional invalid
UTF-8 test case.
---
 module/ice-9/suspendable-ports.scm | 8 ++++----
 test-suite/tests/ports.test        | 7 +++++++
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/module/ice-9/suspendable-ports.scm b/module/ice-9/suspendable-ports.scm
index a823f1d37..9fac1df62 100644
--- a/module/ice-9/suspendable-ports.scm
+++ b/module/ice-9/suspendable-ports.scm
@@ -419,7 +419,7 @@
                    (= (logand u8_2 #xc0) #x80)
                    (case u8_0
                      ((#xe0) (>= u8_1 #xa0))
-                     ((#xed) (>= u8_1 #x9f))
+                     ((#xed) (<= u8_1 #x9f))
                      (else #t)))
               (kt (integer->char
                    (logior (ash (logand u8_0 #x0f) 12)
@@ -436,7 +436,7 @@
                    (= (logand u8_3 #xc0) #x80)
                    (case u8_0
                      ((#xf0) (>= u8_1 #x90))
-                     ((#xf4) (>= u8_1 #x8f))
+                     ((#xf4) (<= u8_1 #x8f))
                      (else #t)))
               (kt (integer->char
                    (logior (ash (logand u8_0 #x07) 18)
@@ -462,7 +462,7 @@
       ((< buffering 2) 1)
       ((not (= (logand (ref 1) #xc0) #x80)) 1)
       ((and (eq? first-byte #xe0) (< (ref 1) #xa0)) 1)
-      ((and (eq? first-byte #xed) (< (ref 1) #x9f)) 1)
+      ((and (eq? first-byte #xed) (> (ref 1) #x9f)) 1)
       ((< buffering 3) 2)
       ((not (= (logand (ref 2) #xc0) #x80)) 2)
       (else 0)))
@@ -471,7 +471,7 @@
       ((< buffering 2) 1)
       ((not (= (logand (ref 1) #xc0) #x80)) 1)
       ((and (eq? first-byte #xf0) (< (ref 1) #x90)) 1)
-      ((and (eq? first-byte #xf4) (< (ref 1) #x8f)) 1)
+      ((and (eq? first-byte #xf4) (> (ref 1) #x8f)) 1)
       ((< buffering 3) 2)
       ((not (= (logand (ref 2) #xc0) #x80)) 2)
       ((< buffering 4) 3)
diff --git a/test-suite/tests/ports.test b/test-suite/tests/ports.test
index 66e10e3dd..1b30e1a68 100644
--- a/test-suite/tests/ports.test
+++ b/test-suite/tests/ports.test
@@ -1059,6 +1059,13 @@
             eof))
 
   (test-decoding-error (#xf0 #x88 #x88 #x88) "UTF-8"
+    (error                ;; 2nd byte should be in the 90..BF range
+     error                ;; 88: not a valid starting byte
+     error                ;; 88: not a valid starting byte
+     error                ;; 88: not a valid starting byte
+     eof))
+
+  (test-decoding-error (#xf4 #xa4 #xbd #xa4) "UTF-8"
     (error                ;; 2nd byte should be in the 90..BF range
      error                ;; 88: not a valid starting byte
      error                ;; 88: not a valid starting byte
-- 
2.39.1
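
For anyone else puzzling over what the flipped inequalities are for: as far as I can tell they encode the second-byte ranges that well-formed UTF-8 requires for a handful of special lead bytes. The sketch below is only an illustration of that rule, not code from Guile; `valid-second-byte?` is a hypothetical helper name, and it assumes the lead byte is already known to start a 3- or 4-byte sequence (#xe0..#xf4).

```scheme
;; Hypothetical helper, not part of Guile: is SECOND an acceptable
;; second byte after LEAD in a 3- or 4-byte UTF-8 sequence?
;; Assumes LEAD is in the range #xe0..#xf4.
(define (valid-second-byte? lead second)
  (and (= (logand second #xc0) #x80)   ; must be a continuation byte (80..BF)
       (case lead
         ((#xe0) (>= second #xa0))     ; below A0 would be an overlong 3-byte form
         ((#xed) (<= second #x9f))     ; above 9F would encode a UTF-16 surrogate
         ((#xf0) (>= second #x90))     ; below 90 would be an overlong 4-byte form
         ((#xf4) (<= second #x8f))     ; above 8F would exceed U+10FFFF
         (else #t))))

;; The two byte sequences exercised in ports.test are both rejected
;; at the second byte:
(valid-second-byte? #xf0 #x88)   ; existing case, below the 90..BF range
(valid-second-byte? #xf4 #xa4)   ; new case, above the 80..8F range
```

Read this way, the lower-bound checks for #xe0 and #xf0 were already right, and the patch turns the #xed and #xf4 checks into upper bounds, which is what the C code in ports.c appears to do.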