all messages for Emacs-related lists mirrored at yhetil.org
From: Heime <heimeborgia@protonmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: help-gnu-emacs@gnu.org
Subject: Regexp capturing unicode characters
Date: Thu, 01 Aug 2024 19:44:18 +0000
Message-ID: <aE-4pSaOfROw5n0ORaY5TGox9gnY-KIeMAVN-y8_YV-jDy10TTKnUgysTF0oiM1M09m2Kt1V4Fn5HIEo3Lb5gb2_pb53q0STD1IlPvYBlUk=@protonmail.com>
In-Reply-To: <86frrow2z1.fsf@gnu.org>

On Friday, August 2nd, 2024 at 5:46 AM, Eli Zaretskii <eliz@gnu.org> wrote:

> > Date: Thu, 01 Aug 2024 17:06:26 +0000
> > From: Heime heimeborgia@protonmail.com
> > Cc: help-gnu-emacs@gnu.org
> > 
> > On Friday, August 2nd, 2024 at 3:34 AM, Eli Zaretskii eliz@gnu.org wrote:
> > 
> > I want to include in the regexp the possibility that the user wrote some
> > comment in a foreign language other than English. Otherwise the regexp
> > would simply skip it. And your suggestion has been [:alpha:] and [:alnum:].
> 
> 
> Once again, [:alpha:] and [:alnum:] will match letters and digits in
> any language, not just in English.
> 
> > > The useful information is already there (including a cross-reference
> > > to a detailed description of what "multibyte" means). I just
> > > translated it into simpler terms, based on what you told about the job
> > > you want to do, to save you from the need to read that if you don't
> > > want to.
> > 
> > A mention that [:multibyte:] is not used much nowadays.
> 
> 
> That's not what I said. I said it is almost never the right thing
> nowadays, especially in your case.
> 
> I'm trying to help you by saying simplified things. The manual
> doesn't simplify, because it's a reference.

Would [:graph:] be the most powerful class?
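
For comparison, here is a minimal sketch of how [:alpha:] and [:graph:] differ on mixed-script text. The helper name `my-all-matches' and the sample string are made up for illustration. As far as the manual describes it, [:graph:] matches every graphic character (punctuation and digits included), so it is broader than [:alpha:] rather than more precise.

```emacs-lisp
;; Hypothetical helper, not part of Emacs: collect every match of REGEXP.
(defun my-all-matches (regexp text)
  "Return a list of all non-overlapping matches of REGEXP in TEXT."
  (let ((start 0)
        (matches '()))
    (while (string-match regexp text start)
      (push (match-string 0 text) matches)
      (setq start (match-end 0)))
    (nreverse matches)))

;; Sample line mixing Cyrillic, ASCII punctuation, digits and Greek.
(defvar my-sample-text "имя = 42; // σχόλιο")

;; Runs of letters only, in any script.
(my-all-matches "[[:alpha:]]+" my-sample-text)

;; Runs of anything that is not whitespace or a control character.
(my-all-matches "[[:graph:]]+" my-sample-text)
```

Evaluating the two calls side by side shows whether the extra characters that [:graph:] picks up are wanted for the job at hand.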

In "34.2 Disabling Multibyte Characters", it is stated:

"Multibyte mode allows you to use all the supported languages
and scripts without limitations."

Yet you say that [:multibyte:] is almost never the right thing,
especially in my case, where I want to support languages without
limitations.

I did not find the reference sufficient to decide what is appropriate
to use for languages without limitations, or for specific languages,
mainly because I do not know exactly what the classes include.
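
One way to see what a class includes is to test it directly on sample characters. The sketch below assumes a hypothetical helper, `my-classes-matching', and an arbitrary selection of classes and sample characters. It only reports what the running Emacs does, so results can differ between Emacs versions.

```emacs-lisp
(require 'seq)

;; Hypothetical helper, not part of Emacs: probe several classes at once.
(defun my-classes-matching (char)
  "Return the names of the regexp character classes that match CHAR."
  (let ((classes '("alpha" "alnum" "digit" "graph" "punct"
                   "space" "multibyte" "nonascii"))
        (str (string char)))
    (seq-filter
     (lambda (class)
       ;; Build e.g. "\\`[[:alpha:]]\\'" and match it against the
       ;; one-character string.
       (string-match-p (format "\\`[[:%s:]]\\'" class) str))
     classes)))

;; Probe a few characters from different scripts.
(mapcar (lambda (c) (cons (string c) (my-classes-matching c)))
        (list ?a ?7 ?! ?\s ?я ?日))
```

Evaluating the last form (for example with C-x C-e) returns one entry per character, listing the classes that matched it.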

I have read

34.1 Text Representations

34.7 Character Sets

36.2.1 Table of Syntax Classes

and

35.3.1.1 Special Characters in Regular Expressions

35.3.1.2 Character Classes

35.3.1.3 Backslash Constructs in Regular Expressions

Have I missed anything else important to the discussion?



