From: Heime <heimeborgia@protonmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: help-gnu-emacs@gnu.org
Subject: Regexp capturing unicode characters
Date: Thu, 01 Aug 2024 19:44:18 +0000
Message-ID: <aE-4pSaOfROw5n0ORaY5TGox9gnY-KIeMAVN-y8_YV-jDy10TTKnUgysTF0oiM1M09m2Kt1V4Fn5HIEo3Lb5gb2_pb53q0STD1IlPvYBlUk=@protonmail.com>
In-Reply-To: <86frrow2z1.fsf@gnu.org>
On Friday, August 2nd, 2024 at 5:46 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> > Date: Thu, 01 Aug 2024 17:06:26 +0000
> > From: Heime heimeborgia@protonmail.com
> > Cc: help-gnu-emacs@gnu.org
> >
> > On Friday, August 2nd, 2024 at 3:34 AM, Eli Zaretskii eliz@gnu.org wrote:
> >
> > I want to include in the regexp the possibility that the user wrote some
> > comment in a language other than English. Otherwise the regexp
> > would simply skip it. And your suggestion has been [:alpha:] and [:alnum:].
>
>
> Once again, [:alpha:] and [:alnum:] will match letters and digits in
> any language, not just in English.
>
> > > The useful information is already there (including a cross-reference
> > > to a detailed description of what "multibyte" means). I just
> > > translated it into simpler terms, based on what you told about the job
> > > you want to do, to save you from the need to read that if you don't
> > > want to.
> >
> > A mention that [:multibyte:] is not used much nowadays.
>
>
> That's not what I said. I said it is almost never the right thing
> nowadays, especially in your case.
>
> I'm trying to help you by saying simplified things. The manual
> doesn't simplify, because it's a reference.
Would [:graph:] be the most powerful class?
Section "34.2 Disabling Multibyte Characters" states:
"Multibyte mode allows you to use all the supported languages
and scripts without limitations."
Yet you say that [:multibyte:] is almost never the right thing,
especially in my case, where I want to support languages without limitations.
I did not find the reference sufficient to decide what is appropriate
for supporting all languages without limitations, or for specific languages,
mainly because I do not know exactly what each class includes.
I have read
34.1 Text Representations
34.7 Character Sets
36.2.1 Table of Syntax Classes
and
35.3.1.1 Special Characters in Regular Expressions
35.3.1.2 Character Classes
35.3.1.3 Backslash Constructs in Regular Expressions
Have I missed anything else important to the discussion?
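One way to see what each class includes is to try it on sample strings. The following is only a minimal sketch: `class-matches-p' is a hypothetical helper written for this illustration (it is not an Emacs function), and the sample strings are arbitrary. It assumes the code is evaluated in a normal multibyte session, for example in *scratch*.

```elisp
;; Sketch only: `class-matches-p' is a hypothetical helper, not part of Emacs.
(defun class-matches-p (class string)
  "Return t if every character of STRING matches character class CLASS.
CLASS is a class name such as \"alpha\", without brackets or colons."
  (and (string-match-p (concat "\\`[[:" class ":]]+\\'") string) t))

(class-matches-p "alpha" "hello")       ; ASCII letters
(class-matches-p "alpha" "héllo")       ; Latin letters with a diacritic
(class-matches-p "alpha" "привет")      ; Cyrillic letters
(class-matches-p "alnum" "日本語123")   ; CJK characters plus ASCII digits
(class-matches-p "multibyte" "привет")  ; non-ASCII characters only
(class-matches-p "multibyte" "hello")   ; nil: ASCII is not multibyte
(class-matches-p "graph" "a-b!")        ; letters and punctuation
(class-matches-p "graph" "a b")         ; nil: the space is not graphic
```

All calls return t except the two marked nil. This is consistent with the point quoted above: [:alpha:] and [:alnum:] are not limited to English, whereas [:multibyte:] rejects plain ASCII text and so would skip English comments.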