* bug#27978: Detection of section name in man.el
@ 2017-08-05 23:44 Grégory Mounié
  2017-08-18  8:49 ` Eli Zaretskii
  0 siblings, 1 reply; 3+ messages in thread
From: Grégory Mounié @ 2017-08-05 23:44 UTC
  To: 27978

[-- Attachment #1: Type: text/plain, Size: 604 bytes --]


  When parsing manual pages in languages with non-ASCII letters, section 
names that contain non-ASCII letters are not added to the table of contents.

  I noticed the bug while reading the French bash manual: the quite useful 
"COMMANDES INTERNES DE l'INTERPRÉTEUR" section (SHELL BUILTIN COMMANDS in 
English) does not appear because of the letter É.

  I propose using character classes instead of ASCII ranges in the 
relevant regexp defvars. This should not change anything for English 
manuals, and it should work for many other languages.
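
  As a minimal sketch of the difference (not part of the patch), the old 
and proposed forms of Man-heading-regexp can be compared on a sample 
heading. The helper name my-heading-match-p and the sample heading 
"DÉFINITIONS" are assumptions made for illustration only; case-fold-search 
is bound to nil so that [A-Z] does not also match lower-case letters.

```elisp
;; Hypothetical helper, not part of man.el.
(defun my-heading-match-p (regexp heading)
  "Return t if REGEXP matches HEADING with case folding disabled."
  (let ((case-fold-search nil))
    (and (string-match regexp heading) t)))

;; Old value of `Man-heading-regexp': ASCII ranges only.
(defvar my-old-heading-regexp "^\\([A-Z][A-Z0-9 /-]+\\)$")

;; Proposed value: character classes instead of ASCII ranges.
(defvar my-new-heading-regexp
  "^\\([[:upper:]][[:upper:][:digit:] /-]+\\)$")

;; "É" lies outside the range A-Z, so the old regexp rejects the heading.
(my-heading-match-p my-old-heading-regexp "DÉFINITIONS")

;; [:upper:] consults the case table, so the new regexp accepts it.
(my-heading-match-p my-new-heading-regexp "DÉFINITIONS")

;; An ASCII-only heading is accepted by both forms.
(my-heading-match-p my-old-heading-regexp "SEE ALSO")
(my-heading-match-p my-new-heading-regexp "SEE ALSO")
```

  Evaluating the four calls in *scratch* (or with emacs --batch) should 
show that only the first one fails to match.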

  It works great for the bash manual in French.
  Grégory Mounié

[-- Attachment #2: 0001-Unicode-support-for-man-section-name-detection.patch --]
[-- Type: text/x-patch; name="0001-Unicode-support-for-man-section-name-detection.patch", Size: 1814 bytes --]

From f9f8b027bcec6fe7aec2c0009eecdcd7e8880292 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Gr=C3=A9gory=20Mouni=C3=A9?= <Gregory.Mounie@imag.fr>
Date: Sun, 6 Aug 2017 01:22:58 +0200
Subject: [PATCH] Unicode support for man section name detection

* lisp/man.el: Replace ASCII ranges with character classes in
order to correctly detect section names for the table of
contents (e.g., in the French version of the bash manual).

Copyright-paperwork-exempt: yes
---
 lisp/man.el | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lisp/man.el b/lisp/man.el
index 0e1c92956b..97a4758e7e 100644
--- a/lisp/man.el
+++ b/lisp/man.el
@@ -278,21 +278,21 @@ Man-cooked-hook
   :type 'hook
   :group 'man)
 
-(defvar Man-name-regexp "[-a-zA-Z0-9_­+][-a-zA-Z0-9_.:­+]*"
+(defvar Man-name-regexp "[-[:alnum:]_­+][-[:alnum:]_.:­+]*"
   "Regular expression describing the name of a manpage (without section).")
 
-(defvar Man-section-regexp "[0-9][a-zA-Z0-9+]*\\|[LNln]"
+(defvar Man-section-regexp "[[:digit:]][[:alnum:]+]*\\|[LNln]"
   "Regular expression describing a manpage section within parentheses.")
 
 (defvar Man-page-header-regexp
   (if (string-match "-solaris2\\." system-configuration)
-      (concat "^[-A-Za-z0-9_].*[ \t]\\(" Man-name-regexp
+      (concat "^[-[:alnum:]_].*[ \t]\\(" Man-name-regexp
 	      "(\\(" Man-section-regexp "\\))\\)$")
     (concat "^[ \t]*\\(" Man-name-regexp
 	    "(\\(" Man-section-regexp "\\))\\).*\\1"))
   "Regular expression describing the heading of a page.")
 
-(defvar Man-heading-regexp "^\\([A-Z][A-Z0-9 /-]+\\)$"
+(defvar Man-heading-regexp "^\\([[:upper:]][[:upper:][:digit:] /-]+\\)$"
   "Regular expression describing a manpage heading entry.")
 
 (defvar Man-see-also-regexp "SEE ALSO"
-- 
2.13.3


Thread overview: 3+ messages
2017-08-05 23:44 bug#27978: Detection of section name in man.el Grégory Mounié
2017-08-18  8:49 ` Eli Zaretskii
     [not found]   ` <4f29a934-24db-6d10-db27-fd3a3a0c1269@imag.fr>
2017-08-18 19:23     ` Eli Zaretskii