From: Jean Abou Samra
Newsgroups: gmane.lisp.guile.bugs
Subject: bug#57507: Regular expression matching depends on locale encoding
Date: Wed, 31 Aug 2022 18:54:50 +0200
To: 57507@debbugs.gnu.org

Regular expressions do funky things with Unicode if a non-Unicode-aware
locale is set. Yet, they're purely string operations, so I don't think
it's expected that they depend on the locale encoding.

$ LC_ALL=C guile3.0
GNU Guile 3.0.7
Copyright (C) 1995-2021 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (use-modules (ice-9 regex))
scheme@(guile-user)> (match:substring (string-match "\u203f" "\u3091"))
ice-9/boot-9.scm:1685:16: In procedure raise-exception:
In procedure make-regexp: Invalid preceding regular expression

Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]> ,q
scheme@(guile-user)> (match:substring (string-match "[\u203f]" "\u3091"))
$1 = "\u3091"
scheme@(guile-user)>
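
For anyone who wants to reproduce this without typing into the REPL, here is a minimal sketch of the same two calls as a script. It assumes that calling `setlocale` with "C" inside the script has the same effect as starting Guile with LC_ALL=C; the file name and the `try-match` helper are made up for illustration and are not part of Guile.

```scheme
;; repro.scm: a sketch of the session above as a script
;; (the file name is hypothetical).
(use-modules (ice-9 regex))

;; Assumption: forcing the "C" locale here is equivalent to running
;; Guile under LC_ALL=C, as in the session above.
(setlocale LC_ALL "C")

;; Hypothetical helper: return the matched substring, #f when there is
;; no match, or the exception key when string-match signals an error.
(define (try-match pattern str)
  (catch #t
    (lambda ()
      (let ((m (string-match pattern str)))
        (and m (match:substring m))))
    (lambda (key . args) key)))

;; Try both patterns from the report against the same subject string.
(for-each
 (lambda (pattern)
   (write (try-match pattern "\u3091"))
   (newline))
 '("\u203f" "[\u203f]"))
```

Run it with `guile3.0 repro.scm`. If the behaviour matches the session above, the first line printed should be an error key rather than #f, and the second should be a match even though the two characters differ.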