From: Eli Zaretskii <eliz@gnu.org>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: emacs-devel@gnu.org
Subject: Re: commit-msg hook
Date: Tue, 14 Apr 2015 18:08:48 +0300
Message-ID: <83pp76bvcv.fsf@gnu.org>
In-Reply-To: <552C32F7.5010206@cs.ucla.edu>
> Date: Mon, 13 Apr 2015 14:19:51 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: emacs-devel@gnu.org
>
> On 04/13/2015 01:18 PM, Eli Zaretskii wrote:
> > Gawk has the --characters-as-bytes option since v4.0.0, which should
> > countermand that, I think.
>
> Sure, although the code should work even with plain POSIX awk, as there
> should be no need to assume such a GNU extension when bootstrapping.
> That is, the script could support either:
>
> 1. POSIX awk with multibyte OS support, with proper UTF-8 checking from
> OS libraries; or
>
> 2. GNU awk 4 (2012) or later, with nearly-as-good UTF-8 checking
> hand-coded into the script; or
>
> 3. Traditional awk without UTF-8 checking.
>
> Currently the script supports (1) and (3) but someone could add support
> for (2).

How about the following change? It improves on (3), and worked for me
both on MS-Windows and on GNU/Linux.

--- ./.git/hooks/commit-msg.~5~ 2015-04-12 19:11:27.481125000 +0300
+++ ./.git/hooks/commit-msg 2015-04-14 11:11:02.000000000 +0300
@@ -45,10 +45,13 @@
BEGIN {
# These regular expressions assume traditional Unix unibyte behavior.
# They are needed for old or broken versions of awk, e.g.,
- # mawk 1.3.3 (1996), or gawk on MSYS (2015).
+ # mawk 1.3.3 (1996), or gawk on MSYS (2015), and/or for systems that
+ # cannot use UTF-8 as the codeset for the locale.
space = "[ \f\n\r\t\v]"
non_space = "[^ \f\n\r\t\v]"
- non_print = "[\1-\37\177]"
+ # The non_print below rejects control characters and surrogates
+ # UTF-8 for: 0x01-0x1f 0x7f 0x80-0x9f 0xd800-0xdbff 0xdc00-0xdfff
+ non_print = "[\1-\37\177]|\302[\200-\237]|\355([\240-\257]|[\260-\277])[\200-\277]"
# Prefer POSIX regular expressions if available, as they do a
# better job of checking. Similarly, prefer POSIX negated
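
For readers who want to check the byte ranges in the new non_print pattern without an awk at hand, here is a minimal sketch in Python. It is not part of the patch: the names NON_PRINT and has_non_print are made up for illustration, and the translation of the awk octal escapes into Python hex escapes is my own. It assumes the input is examined as raw bytes, which is what the unibyte regular expressions in the hook assume.

```python
import re

# Byte-level translation of the proposed awk pattern (illustrative only):
#   [\1-\37\177] | \302[\200-\237] | \355([\240-\257]|[\260-\277])[\200-\277]
NON_PRINT = re.compile(
    rb"[\x01-\x1f\x7f]"                              # 0x01-0x1f and 0x7f (DEL)
    rb"|\xc2[\x80-\x9f]"                             # UTF-8 for U+0080..U+009F
    rb"|\xed(?:[\xa0-\xaf]|[\xb0-\xbf])[\x80-\xbf]"  # UTF-8 for U+D800..U+DFFF
)


def has_non_print(line: bytes) -> bool:
    """Return True if the byte string contains a sequence the pattern rejects."""
    return NON_PRINT.search(line) is not None


if __name__ == "__main__":
    samples = [
        b"Fix typo in commit-msg",        # plain ASCII
        "caf\u00e9".encode("utf-8"),      # non-ASCII letter, bytes C3 A9
        "\u0085".encode("utf-8"),         # C1 control, bytes C2 85
        b"\xed\xa0\x80",                  # encoded surrogate U+D800
    ]
    for sample in samples:
        print(sample, has_non_print(sample))
```

Note that the first alternative also matches TAB (0x09), since it lies inside 0x01-0x1f; that is the same as in the original non_print before this change.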