From: uzibalqa
Newsgroups: gmane.emacs.help
Subject: Re: Regexp capturing unicode characters
Date: Fri, 02 Aug 2024 08:03:01 +0000
To: Eli Zaretskii
Cc: help-gnu-emacs@gnu.org
In-Reply-To: <86ed77wkax.fsf@gnu.org>
Xref: news.gmane.io gmane.emacs.help:147498

On Friday, August 2nd, 2024 at 5:44 PM, Eli Zaretskii wrote:

> > Date: Thu, 01 Aug 2024 19:44:18 +0000
> > From: Heime heimeborgia@protonmail.com
> > Cc: help-gnu-emacs@gnu.org
> >
> > On Friday, August 2nd, 2024 at 5:46 AM, Eli Zaretskii eliz@gnu.org wrote:
> >
> > > Once again, [:alpha:] and [:alnum:] will match letters and digits in
> > > any language, not just in English.
> > >
> > > > > The useful information is already there (including a cross-reference
> > > > > to a detailed description of what "multibyte" means). I just
> > > > > translated it into simpler terms, based on what you told about the job
> > > > > you want to do, to save you from the need to read that if you don't
> > > > > want to.
> > > >
> > > > A mention that [:multibyte:] is not used much nowadays.
> > >
> > > That's not what I said. I said it is almost never the right thing
> > > nowadays, especially in your case.
> > >
> > > I'm trying to help you by saying simplified things. The manual
> > > doesn't simplify, because it's a reference.
> >
> > Would graph [:graph:] be the most powerful ?
>
>
> [:graph:] includes punctuation and other symbols, which AFAIU you
> don't want to match (since you thought [:word:] is what you need).
>
> > In "34.2 Disabling Multibyte Characters", it is stated
> >
> > "Multibyte mode allows you to use all the supported languages
> > and scripts without limitations."
>
>
> That's not really relevant to the issue at hand. Yes, multibyte
> characters are needed to support all the languages. No, that doesn't
> mean you need to use [:multibyte:], because that will match
> punctuation, symbols, non-ASCII control and whitespace characters,
> etc., and you don't want that. OTOH, [:multibyte:] doesn't match
> ASCII letters and digits, and you certainly do want to match them.
>
> > Yet you say that it is never the right thing especially in my case.
> > Where in my case I want to support languages without limitations.
>
>
> Yes, and [:alpha:] and [:alnum:] support languages without
> limitations. As I already told you several times.
>
> > I did not find the reference is enough to decide what is appropriate
> > to use for languages without limitations, or for specific languages.
> > Mainly because I would not know what the classes include exactly.
>
>
> I tried to help you with specific advice, but you insist on not
> listening. So this will be my last message in this thread.

I did listen, but I also wanted the reasoning, so that I could reach the
same conclusion myself. I accept the elaboration, which I could not have
worked out from the manual descriptions alone. That was all it was about.

Then there is ".*", which is very broad and flexible. It can match spaces,
punctuation, and special characters. Would this be the broadest pattern?
I am using it to pick up anything.