From: Rob Browning
To: Ludovic Courtès
Cc: 56413@debbugs.gnu.org
Newsgroups: gmane.lisp.guile.bugs
Subject: bug#56413: [PATCH 1/1] scm_i_utf8_string_hash: compute u8 chars not bytes
Date: Mon, 07 Nov 2022 23:05:34 -0600
Message-ID: <87mt92ujc1.fsf@trouble.defaultvalue.org>
References: <20220706012323.1024763-1-rlb@defaultvalue.org> <87zgd5gi4t.fsf@gnu.org> <87zgd3vpb7.fsf@trouble.defaultvalue.org>

Rob Browning writes:

> OK, so unfortunately I don't actually recall how I came up with that
> number, but I can start over with some canonical approach to compute the
> value if we like.

I hacked up hash.c to let me call wide_string_hash() directly and
printed the hash for wchar_t {0x3A0, 0x3B5, 0x3C1, 0x3AF}, which should
be what the optimized utf-8 code is consuming. I saw
4029223418961680680.

I double-checked via (symbol-hash 'Περί) from the terminal, and that
returned the same value.

Oh, and unless I'm missing something, I remembered why we may need to
keep the standalone C test program -- there's no straightforward way to
call scm_from_utf8_symbol() from scheme?

Thanks
-- 
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4
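
For reference, the correspondence between the symbol's UTF-8 bytes (CE A0 CE B5 CF 81 CE AF) and the four code points passed to wide_string_hash() in the message can be checked with a small standalone decoder. This is only a minimal sketch, not code from Guile's hash.c: the names utf8_to_code_points and check_peri_example are hypothetical, the decoder assumes well-formed input and does not reject overlong or surrogate encodings, and it does not compute the hash itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Guile): decode UTF-8 bytes into
   Unicode code points.  Returns the number of code points written, or
   (size_t) -1 if the input is truncated, has a bad lead or continuation
   byte, or does not fit in OUT.  Overlong forms are not rejected.  */
size_t
utf8_to_code_points (const unsigned char *s, size_t len,
                     uint32_t *out, size_t out_cap)
{
  size_t i = 0, n = 0;

  while (i < len)
    {
      unsigned char b = s[i];
      uint32_t cp;
      size_t extra;   /* number of continuation bytes expected */

      if (b < 0x80)
        { cp = b; extra = 0; }
      else if ((b & 0xE0) == 0xC0)
        { cp = b & 0x1F; extra = 1; }
      else if ((b & 0xF0) == 0xE0)
        { cp = b & 0x0F; extra = 2; }
      else if ((b & 0xF8) == 0xF0)
        { cp = b & 0x07; extra = 3; }
      else
        return (size_t) -1;

      if (extra > len - i - 1 || n == out_cap)
        return (size_t) -1;

      for (size_t k = 1; k <= extra; k++)
        {
          unsigned char c = s[i + k];
          if ((c & 0xC0) != 0x80)
            return (size_t) -1;
          cp = (cp << 6) | (c & 0x3F);   /* append 6 payload bits */
        }

      out[n++] = cp;
      i += extra + 1;
    }

  return n;
}

/* The eight bytes of the symbol from the message decode to the four
   code points that were hashed as wide characters.  */
void
check_peri_example (void)
{
  static const unsigned char peri[] =
    { 0xCE, 0xA0, 0xCE, 0xB5, 0xCF, 0x81, 0xCE, 0xAF };
  uint32_t cp[8];
  size_t n = utf8_to_code_points (peri, sizeof peri, cp, 8);

  assert (n == 4);   /* 4 characters, although the string is 8 bytes */
  assert (cp[0] == 0x3A0 && cp[1] == 0x3B5);
  assert (cp[2] == 0x3C1 && cp[3] == 0x3AF);
}
```

Calling check_peri_example() from a test main() confirms that the string is four characters long even though it occupies eight bytes, which is the distinction the patch subject refers to.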