unofficial mirror of emacs-devel@gnu.org 
* miscue in latest commit of Arsen Arsenović patch
@ 2024-12-23 21:47 Paul Eggert
  2024-12-24  0:16 ` Stefan Kangas
  2024-12-24 13:55 ` Alan Mackenzie
  0 siblings, 2 replies; 3+ messages in thread
From: Paul Eggert @ 2024-12-23 21:47 UTC
  To: Alan Mackenzie; +Cc: Arsen Arsenović, Emacs Development

[-- Attachment #1: Type: text/plain, Size: 1032 bytes --]

The recent Emacs commit 39380e1bd3bfc26e355445590e243fcfa940fc9f should 
look like this in 'git log' output:

   commit 39380e1bd3bfc26e355445590e243fcfa940fc9f
   Author:     Arsen Arsenović <arsen@aarsen.me>
   AuthorDate: Sun Dec 22 19:33:29 2024 +0000
   Commit:     Alan Mackenzie <acm@muc.de>
   CommitDate: Sun Dec 22 19:33:29 2024 +0000

       Java Mode: introduce the keyword `assert'.
       ...

However, there was a glitch and instead I see a blotch "�" (U+FFFD 
REPLACEMENT CHARACTER) where there should be "ć" (U+0107 LATIN SMALL 
LETTER C WITH ACUTE).
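
One way to look for the blotch mechanically, as a minimal sketch: U+FFFD is the three bytes EF BF BD in UTF-8, so a byte-wise fixed-string search finds it. The helper name has_replacement_char is hypothetical and is not part of Emacs or Git.

```shell
# Hypothetical helper (not part of Emacs or Git):
# succeed if standard input contains U+FFFD REPLACEMENT CHARACTER.
has_replacement_char () {
  # U+FFFD is EF BF BD in UTF-8, i.e. octal 357 277 275.
  fffd=`printf '\357\277\275'`
  # Search byte-wise so the result does not depend on the current locale.
  LC_ALL=C grep -F -q -e "$fffd"
}
```

For instance, piping the output of 'git log -1 --format=%an 39380e1bd3bfc26e355445590e243fcfa940fc9f' into has_replacement_char should succeed if the author name stored in that commit contains the replacement character.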

Since Arsen has another patch committed (by Dmitry) without the glitch, 
I expect the problem is on Alan's end. Alan, are you running Emacs in a 
single-byte locale to commit patches? If so, I suggest using a UTF-8 
locale instead, whenever the commit contains non-ASCII characters in the 
author's name or commit message.
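
A rough way to check which encoding the current locale uses before committing, assuming a POSIX 'locale' utility is available. The helper name is_utf8_charmap and the locale names in the message are illustrative assumptions; which UTF-8 locales are installed varies by system.

```shell
# Hypothetical helper: succeed if the given charmap name denotes UTF-8.
is_utf8_charmap () {
  case $1 in
    UTF-8 | utf-8 | UTF8 | utf8) return 0 ;;
    *) return 1 ;;
  esac
}

# 'locale charmap' prints the codeset name of the current locale.
if is_utf8_charmap "`locale charmap 2>/dev/null`"; then
  echo "locale uses UTF-8"
else
  echo "locale is not UTF-8; try e.g. LC_ALL=en_US.UTF-8 or C.UTF-8 if installed"
fi
```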

To help avoid similar glitches in the future I installed the attached 
patch, which makes the commit-msg hook reject U+FFFD in commit messages.

Apologies to Arsen for the misspelling.

[-- Attachment #2: 0001-Avoid-U-FFFD-in-commit-messages.patch --]
[-- Type: text/x-patch, Size: 1715 bytes --]

From 28c420afab6a0944a192c30ff2d5d9e40c88f14f Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 23 Dec 2024 13:38:51 -0800
Subject: [PATCH] Avoid U+FFFD in commit messages

* build-aux/git-hooks/commit-msg:
Also check against U+FFFD REPLACEMENT CHARACTER in commit messages.
---
 build-aux/git-hooks/commit-msg | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/build-aux/git-hooks/commit-msg b/build-aux/git-hooks/commit-msg
index 1eb2560bba2..dace4c7fb66 100755
--- a/build-aux/git-hooks/commit-msg
+++ b/build-aux/git-hooks/commit-msg
@@ -31,6 +31,8 @@
 # Use U+00A2 CENT SIGN to test whether the locale works.
 cent_sign_utf8_format='\302\242\n'
 cent_sign=`printf "$cent_sign_utf8_format"`
+replacement_character_utf8_format='\357\277\275\n'
+replacement_character=`printf "$replacement_character_utf8_format"`
 print_at_sign='BEGIN {print substr("'$cent_sign'@", 2)}'
 at_sign=`$awk "$print_at_sign" </dev/null 2>/dev/null`
 if test "$at_sign" != @; then
@@ -44,7 +46,12 @@ at_sign=
 fi
 
 # Check the log entry.
-exec $awk -v at_sign="$at_sign" -v cent_sign="$cent_sign" -v file="$1" '
+exec $awk \
+     -v at_sign="$at_sign" \
+     -v cent_sign="$cent_sign" \
+     -v file="$1" \
+     -v replacement_character="$replacement_character" \
+'
   BEGIN {
     # These regular expressions assume traditional Unix unibyte behavior.
     # They are needed for old or broken versions of awk, e.g.,
@@ -137,6 +144,10 @@ at_sign=
     print "Unprintable character in commit message"
     status = 1
   }
+  $0 ~ replacement_character {
+    print "Replacement character in commit message"
+    status = 1
+  }
 
   END {
     if (nlines == 0) {
-- 
2.45.2
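
The new rule can be tried outside the hook with a reduced sketch like the following. It keeps only the U+FFFD check from the patch above; the function name check_msg is hypothetical, and it assumes an 'awk' on PATH rather than the hook's $awk variable.

```shell
# Same construction as in the patch: the UTF-8 bytes of U+FFFD.
# Command substitution strips the trailing newline.
replacement_character=`printf '\357\277\275\n'`

# Hypothetical reduced check: read a commit message on standard input,
# fail if any line contains U+FFFD. The real hook runs many more checks.
check_msg () {
  awk -v replacement_character="$replacement_character" '
    $0 ~ replacement_character {
      print "Replacement character in commit message"
      status = 1
    }
    END { exit status }
  '
}
```

Feeding it a message whose author name was mangled to U+FFFD should print the diagnostic and exit with nonzero status, while a message containing a correctly encoded "ć" should pass silently.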



Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
