From: Hongyi Zhao <hongyi.zhao@gmail.com>
To: tomas@tuxteam.de
Cc: help-gnu-emacs <help-gnu-emacs@gnu.org>
Subject: Re: [External] : Re: Regexp for matching control character, say, FORM FEED. (Was: Re: The `^L' appeared in built-in help.)
Date: Thu, 22 Jul 2021 17:45:36 +0800
Message-ID: <CAGP6POJdNh4i1TkOGACGD7081-Q6JOuBeqbQUduVDHvYM6ESFw@mail.gmail.com>
In-Reply-To: <20210722080643.GC11096@tuxteam.de>

On Thu, Jul 22, 2021 at 4:06 PM <tomas@tuxteam.de> wrote:
>
> On Thu, Jul 22, 2021 at 09:13:31AM +0800, Hongyi Zhao wrote:
>
> [...]
>
> > I want to know whether Emacs has regexp patterns similar to the
> > ones used with grep, say, $'\014' or $'\f'.
>
> To offer some other perspective on the (correct) answers by Emanuel and
> Drew, remember that a regular expression is, basically, a string
> where each character is interpreted as "itself", unless it is a "regexp
> special" character [1]. So, for example searching for the regular expression
> "a" will find all "a"s in your text, because the character a isn't a
> "regexp special".
>
> Now ASCII control characters are all *not* "regexp special", so you only
> have to find a way to express them within a string. How to do that is stated
> in the Emacs Lisp manual where it talks about "string type" [2] (especially
> the subnode "Non-ASCII Characters in Strings"), which leads you to "character
> type" [3]. The special forms "\f", "\^L" or "\C-L" (all of them equivalent),
> which were all discussed here, are treated in a subnode of the above [4].
> This notation carries some historical baggage, so don't expect too much
> logic from it.
>
> For example, why ^L? Because form feed is at point 12 (in decimal) in the
> ASCII table, and L at point 76, the difference being 64.

$ man ascii |egrep  ' L$'
       014   12    0C    FF  '\f' (form feed)        114   76    4C    L

> What happens is that the "^" "subtracts 64 from the character code", or
> more precisely masks out bit 6 of its binary representation (the bit
> with value 64, counting from bit 0).

$ man ascii |egrep  ' \^$'
       036   30    1E    RS  (record separator)      136   94    5E    ^

If so, then to be self-consistent, RS should be written as ^^ :-)

> So ^M would be "carriage return" and so on. Just have a look at the ASCII table.

$ man ascii |egrep  ' M$'
       015   13    0D    CR  '\r' (carriage ret)     115   77    4D    M
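
As a rough check of the arithmetic, here is a minimal shell sketch that
clears the bit with value 64 for the three characters looked up above.
The name ctrl_code is made up for illustration; it is not part of Emacs
or of anything else in this thread.

```shell
# ctrl_code: hypothetical helper. Prints the decimal code obtained by
# clearing bit 6 (value 64) of the given ASCII character.
ctrl_code() {
  # printf '%d' "'X" prints the numeric code of the character X.
  echo $(( $(printf '%d' "'$1") & ~64 ))
}

ctrl_code L     # L is 76 in the table above
ctrl_code M     # M is 77
ctrl_code '^'   # ^ is 94
```

The three results can be compared against the decimal column of the
"man ascii" lines for FF, CR and RS quoted above.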


> Then "\f" comes from the C string literal representation. It's meant to
> be mnemonic ("f" for "form feed" -- similarly "\n" for "line feed", aka
> "new line", "\a" for "alert" (the bell), "\b" for "backspace" and so on).
>
> The references below lead you to more alternative representations, like
> short hex "\x0C", short Unicode hex "\u000C", long Unicode hex "\U0000000C";
> there are also (mostly historical) octals, etc.
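
On the shell side, a similar equivalence can be checked for the two
spellings from my original question. This is only a sketch: octal_bytes
is a made-up helper name, and it relies on the standard od, tr and grep
utilities rather than anything in Emacs.

```shell
# octal_bytes: hypothetical helper. Dumps stdin as octal byte values,
# with od's padding and line breaks stripped.
octal_bytes() {
  od -An -to1 | tr -d ' \n'
  echo
}

printf '\f'   | octal_bytes   # mnemonic escape
printf '\014' | octal_bytes   # octal escape

# A literal form feed is an ordinary character in a grep pattern, so
# this counts the input lines that contain one.
printf 'a\fb\nab\n' | grep -c "$(printf '\f')"
```

Both printf calls should dump the same single byte, which is the point
made above: once the control character is in the string, the regexp
engine treats it as itself.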
>
> You can even put the Unicode /names/ in there, using the "\N{...}"
> notation, so your ^L can be named "\N{FORM FEED (FF)}" (yes, the (FF)
> in parentheses is part of it: the Unicode Consortium put it in there.
> Life is like that).
>
> If you want to explore those Unicode names, type C-x 8 <RET>; you
> can then autocomplete your way among them.
>
> Hope this gives some rough map for that landscape :-)

Thank you for your systematic and informative comments and explanations.

> Cheers
>
> [1] Emacs Lisp reference manual "Syntax of Regular Expressions"
>     or https://www.gnu.org/software/emacs/manual/html_node/elisp/Syntax-of-Regexps.html
>
>
> [2] Emacs Lisp reference manual "String Type" and its subnodes
>     or https://www.gnu.org/software/emacs/manual/html_node/elisp/String-Type.html
>
> [3] Emacs Lisp reference manual "Character Type"
>     https://www.gnu.org/software/emacs/manual/html_node/elisp/Character-Type.html
>
> [4] Emacs Lisp reference manual "Control-Character Syntax"
>     https://www.gnu.org/software/emacs/manual/html_node/elisp/Ctl_002dChar-Syntax.html
>
>  - tomás

Best,
HY


