bug#30513: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)

unofficial mirror of bug-gnu-emacs@gnu.org 

* bug#30513: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)
@ 2018-02-18 14:58 Michael Grünewald
  2018-02-18 15:04 ` Noam Postavsky
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Grünewald @ 2018-02-18 14:58 UTC (permalink / raw)
  To: 30513

The Unicode character 𝜆 has its name misspelled, it should be

  MATHEMATICAL ITALIC SMALL LAMBDA

instead of 

  MATHEMATICAL ITALIC SMALL LAMDA

(notice a B is missing in the second variant).


When using C-u C-x = on that character, this information is displayed:

             position: 6184 of 10702 (58%), column: 11
            character: 𝜆 (displayed as 𝜆) (codepoint 120582, #o353406, #x1d706)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x1D706
               script: mathematical
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong)
             to input: type "C-x 8 RET 1d706" or "C-x 8 RET MATHEMATICAL ITALIC SMALL LAMDA"
          buffer code: #xF0 #x9D #x9C #x86
            file code: #xF0 #x9D #x9C #x86 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    mac-ct:-*-STIXGeneral-normal-normal-normal-*-16-*-*-*-p-0-iso10646-1 (#xBEF)

Character code properties: customize what to show
  name: MATHEMATICAL ITALIC SMALL LAMDA
  general-category: Ll (Letter, Lowercase)
  decomposition: (font 955) (font 'λ')

There are text properties here:
  fontified            t

[back]


In GNU Emacs 25.3.1 (x86_64-apple-darwin17.2.0, NS appkit-1561.10 Version 10.13.1 (Build 17B48))
 of 2017-11-01 built on MacBook-Pro.localdomain
Windowing system distributor 'Apple', version 10.3.1561
Configured using:
 'configure --prefix=/opt/local --without-ns --without-dbus
 --without-gconf --without-libotf --without-m17n-flt --without-gpm
 --without-gnutls --with-xml2 --with-modules --infodir
 /opt/local/share/info/emacs --with-ns CC=/usr/bin/clang 'CFLAGS=-pipe
 -Os -arch x86_64' 'LDFLAGS=-L/opt/local/lib
 -Wl,-headerpad_max_install_names -arch x86_64'
 CPPFLAGS=-I/opt/local/include'

Configured features:
NOTIFY ACL LIBXML2 ZLIB TOOLKIT_SCROLL_BARS NS MODULES

Important settings:
  value of $LC_CTYPE: UTF-8
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Lisp

Minor modes in effect:
  diff-auto-refine-mode: t
  slime-trace-dialog-minor-mode: t
  slime-autodoc-mode: t
  slime-mode: t
  shell-dirtrack-mode: t
  global-whitespace-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  auto-fill-function: do-auto-fill
  transient-mark-mode: t

Recent messages:
Saving file /Users/michael/AbleBaker/waermondt/src/bgm.lisp...
Wrote /Users/michael/AbleBaker/waermondt/src/bgm.lisp
Quit
Char: 𝜏 (120591, #o353417, #x1d70f, file ...) point=5964 of 10701 (56%) column=11
Type C-x 1 to delete the help window.
Char: 𝜏 (120591, #o353417, #x1d70f, file ...) point=5964 of 10701 (56%) column=11
Saving file /Users/michael/AbleBaker/waermondt/src/bgm.lisp...
Wrote /Users/michael/AbleBaker/waermondt/src/bgm.lisp
Quit
Auto-saving...
Quit [2 times]

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug sendmail wid-edit descr-text tmm
log-edit message idna format-spec rfc822 mml mml-sec epg mm-decode
mm-bodies mm-encode mailabbrev mail-utils gmm-utils mailheader add-log
log-view pcvs-util vc vc-dispatcher rect two-column sgml-mode dired-aux
iso-transl cus-start cus-load quail latexenc vc-git diff-mode misearch
multi-isearch network-stream nsm starttls dired finder-inf package
epg-config seq merlin-cap merlin crm ocp-index ocp-indent php-mode
cc-langs cc-mode cc-fonts cc-guess cc-menus cc-cmds cc-styles cc-align
cc-engine cc-vars cc-defs slime-indentation slime-cl-indent cl-indent
slime-hyperdoc url-http tls gnutls url url-proxy url-privacy url-expand
url-methods url-history mailcap url-auth mail-parse rfc2231 rfc2047
rfc2045 ietf-drums url-cookie url-domsuf url-util url-parse auth-source
gnus-util mm-util help-fns mail-prsvr password-cache url-gw url-vars
slime-sprof slime-asdf grep slime-fancy slime-trace-dialog
slime-fontifying-fu slime-package-fu slime-references
slime-compiler-notes-tree slime-scratch slime-presentations bridge
slime-macrostep macrostep slime-mdot-fu slime-enclosing-context
slime-fuzzy derived slime-fancy-trace slime-fancy-inspector slime-c-p-c
slime-editing-commands slime-autodoc slime-repl edmacro kmacro elp cl
slime-parse slime etags xref cl-seq project eieio byte-opt bytecomp
byte-compile cl-extra help-mode cconv eieio-core cl-macs gv arc-mode
archive-mode noutline outline easy-mmode pp hyperspec thingatpt
browse-url cl-loaddefs pcase cl-lib slime-autoloads tex-mode shell
pcomplete css-mode smie caml tuareg speedbar sb-image ezimage dframe
advice compile comint ansi-color ring easymenu windmove whitespace
time-date mule-util tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel ns-win ucs-normalize term/common-win tool-bar dnd
fontset image regexp-opt fringe tabulated-list newcomment elisp-mode
lisp-mode prog-mode register page menu-bar rfn-eshadow timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core frame
cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan thai
tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian
slovak czech european ethiopic indian cyrillic chinese charscript
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote kqueue cocoa ns
multi-tty make-network-process emacs)

Memory information:
((conses 16 527508 118142)
 (symbols 48 46383 0)
 (miscs 40 1618 2299)
 (strings 32 123349 17918)
 (string-bytes 1 3265119)
 (vectors 16 59363)
 (vector-slots 8 1787488 231565)
 (floats 8 354 782)
 (intervals 56 8771 2523)
 (buffers 976 56))







* bug#30513: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)
  2018-02-18 14:58 bug#30513: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA) Michael Grünewald
@ 2018-02-18 15:04 ` Noam Postavsky
  2018-02-18 15:23   ` Michael Grünewald
  2018-02-18 16:31   ` Eli Zaretskii
  0 siblings, 2 replies; 12+ messages in thread
From: Noam Postavsky @ 2018-02-18 15:04 UTC (permalink / raw)
  To: Michael Grünewald; +Cc: 30513

Michael Grünewald <michipili@gmail.com> writes:

> The Unicode character 𝜆 has its name misspelled, it should be
>
>   MATHEMATICAL ITALIC SMALL LAMBDA
>
> instead of 
>
>   MATHEMATICAL ITALIC SMALL LAMDA
>
> (notice a B is missing in the second variant).

https://en.wikipedia.org/wiki/Lambda says:

    Unicode uses the spelling "lamda" in character names, instead of
    "lambda", due to "preferences expressed by the Greek National Body"[14]

[14]: http://www.unicode.org/mail-arch/unicode-ml/y2010-m06/0063.html






* bug#30513: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)
  2018-02-18 15:04 ` Noam Postavsky
@ 2018-02-18 15:23   ` Michael Grünewald
  2018-02-18 15:54     ` Noam Postavsky
  2018-02-18 16:31     ` Eli Zaretskii
  2018-02-18 16:31   ` Eli Zaretskii
  1 sibling, 2 replies; 12+ messages in thread
From: Michael Grünewald @ 2018-02-18 15:23 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: 30513

> On 18. Feb 2018, at 16:04, Noam Postavsky <npostavs@gmail.com> wrote:
> 
> Michael Grünewald <michipili@gmail.com> writes:
> 
>> The Unicode character 𝜆 has its name misspelled, it should be
>> 
>>  MATHEMATICAL ITALIC SMALL LAMBDA
>> 
>> instead of 
>> 
>>  MATHEMATICAL ITALIC SMALL LAMDA
>> 
>> (notice a B is missing in the second variant).
> 
> https://en.wikipedia.org/wiki/Lambda says:
> 
>    Unicode uses the spelling "lamda" in character names, instead of
>    "lambda", due to "preferences expressed by the Greek National Body"[14]
> 
> [14]: http://www.unicode.org/mail-arch/unicode-ml/y2010-m06/0063.html

I see, thank you for the very quick reply!

I sometimes use C-x 8 RET to enter characters using their Unicode names.  Since
these names are usually very long I rely heavily on auto-completion to reduce typing.
So if I need to enter a

  MATHEMATICAL ITALIC SMALL TAU

I start with just the word TAU and hit TAB to display possible alternatives and
quickly reach the desired name.

What would be the preferred way to enter math symbols without fumbling over such small
oddities? Is there any possibility of adjusting the auto-completion method so that it
is not so picky about the difference between LAMDA and LAMBDA?  Are there other better
approaches?

(I think I am well aware of the LAMBDA vs. LAMDA detail now but what about the huge crowd
of Emacs users entering MATHEMATICAL ITALIC SMALL letters using C-x 8 RET? ;) )

I would like to avoid using the TeX input method, because it does not interact so
nicely with programming (because of the special meaning that quotes and the underscore take on).

Best,
Michael


* bug#30513: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)
  2018-02-18 15:23   ` Michael Grünewald
@ 2018-02-18 15:54     ` Noam Postavsky
       [not found]       ` <83woza9q0s.fsf@gnu.org>
  2020-09-04  4:45       ` Lars Ingebrigtsen
  2018-02-18 16:31     ` Eli Zaretskii
  1 sibling, 2 replies; 12+ messages in thread
From: Noam Postavsky @ 2018-02-18 15:54 UTC (permalink / raw)
  To: Michael Grünewald; +Cc: 30513


tags 30513 + patch
quit

Michael Grünewald <michipili@gmail.com> writes:

> What would be the preferred way to enter math symbols without fumbling over such small
> oddities? Is there any possibility of adjusting the auto-completion method so that it
> is not so picky about the difference between LAMDA and LAMBDA?  Are there other better
> approaches?
>
> (I think I am well aware of the LAMBDA vs. LAMDA detail now but what about the huge crowd
> of Emacs users entering MATHEMATICAL ITALIC SMALL letters using C-x 8 RET? ;) )

Yeah, this is fresh in my memory since I was recently playing with ucs-insert
and was a bit surprised to discover "LAMBDA" under the *old* name:

  name: GREEK SMALL LETTER LAMDA
  old-name: GREEK SMALL LETTER LAMBDA

Anyway, I think the following patch should smooth things over:


[-- Attachment #2: patch --]
[-- Type: text/x-diff, Size: 1553 bytes --]

From a7b40afd7fad41d55e7a43168c7febc9722ee3ac Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sun, 18 Feb 2018 10:43:42 -0500
Subject: [PATCH v1] Allow "lambda" spelling for ucs-insert (Bug#30513)

* lisp/international/mule-cmds.el (ucs-names): Add a "LAMBDA"
completion variant for every "LAMDA" name.
---
 lisp/international/mule-cmds.el | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/lisp/international/mule-cmds.el b/lisp/international/mule-cmds.el
index 3468166263..2a995121dd 100644
--- a/lisp/international/mule-cmds.el
+++ b/lisp/international/mule-cmds.el
@@ -2949,6 +2949,14 @@ ucs-names
 	        ;; higher code, so it gets pushed later!
 	        (if new-name (puthash new-name c names))
 	        (if old-name (puthash old-name c names))
+                ;; Unicode uses the spelling "lamda" in character
+                ;; names, instead of "lambda", due to "preferences
+                ;; expressed by the Greek National Body" (Bug#30513).
+                ;; Some characters have an old-name with the "lambda"
+                ;; spelling, but others don't.  Add the traditional
+                ;; spelling for more convenient completion.
+                (if (and (not old-name) new-name (string-match "LAMDA" new-name))
+                    (puthash (replace-match "LAMBDA" t t new-name) c names))
 	        (setq c (1+ c))))))
         ;; Special case for "BELL" which is apparently the only char which
         ;; doesn't have a new name and whose old-name is shadowed by a newer
-- 
2.11.0



* bug#30513: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)
  2018-02-18 15:04 ` Noam Postavsky
  2018-02-18 15:23   ` Michael Grünewald
@ 2018-02-18 16:31   ` Eli Zaretskii
  1 sibling, 0 replies; 12+ messages in thread
From: Eli Zaretskii @ 2018-02-18 16:31 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: michipili, 30513

> From: Noam Postavsky <npostavs@gmail.com>
> Date: Sun, 18 Feb 2018 10:04:24 -0500
> Cc: 30513@debbugs.gnu.org
> 
> Michael Grünewald <michipili@gmail.com> writes:
> 
> > The Unicode character 𝜆 has its name misspelled, it should be
> >
> >   MATHEMATICAL ITALIC SMALL LAMBDA
> >
> > instead of 
> >
> >   MATHEMATICAL ITALIC SMALL LAMDA
> >
> > (notice a B is missing in the second variant).
> 
> https://en.wikipedia.org/wiki/Lambda says:
> 
>     Unicode uses the spelling "lamda" in character names, instead of
>     "lambda", due to "preferences expressed by the Greek National Body"[14]
> 
> [14]: http://www.unicode.org/mail-arch/unicode-ml/y2010-m06/0063.html

Indeed.  And Emacs takes the names from the Unicode character database
anyway, so we cannot misspell the names.

Note that some LAMDA letters have the "old name" property that uses
LAMBDA, but this specific character doesn't (probably because it was
added after the above-mentioned preference was adopted by the Unicode
consortium).
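
The difference between the two characters can be checked with
`get-char-code-property'.  This is only a minimal sketch: the values
shown for U+03BB are the ones quoted elsewhere in this thread, and
results depend on the Unicode data shipped with a given Emacs.

```elisp
;; U+03BB has a Unicode 1.0 "old name" that uses the LAMBDA spelling.
(get-char-code-property #x3bb 'name)       ; "GREEK SMALL LETTER LAMDA"
(get-char-code-property #x3bb 'old-name)   ; "GREEK SMALL LETTER LAMBDA"

;; U+1D706 has no old name, so completion has no LAMBDA spelling
;; to fall back on.
(get-char-code-property #x1d706 'name)     ; "MATHEMATICAL ITALIC SMALL LAMDA"
(get-char-code-property #x1d706 'old-name) ; nil
```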






* bug#30513: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)
  2018-02-18 15:23   ` Michael Grünewald
  2018-02-18 15:54     ` Noam Postavsky
@ 2018-02-18 16:31     ` Eli Zaretskii
  1 sibling, 0 replies; 12+ messages in thread
From: Eli Zaretskii @ 2018-02-18 16:31 UTC (permalink / raw)
  To: Michael Grünewald; +Cc: npostavs, 30513

> From: Michael Grünewald <michipili@gmail.com>
> Date: Sun, 18 Feb 2018 16:23:03 +0100
> Cc: 30513@debbugs.gnu.org
> 
> (I think I am well aware of the LAMBDA vs. LAMDA detail now but what about the huge crowd
> of Emacs users entering MATHEMATICAL ITALIC SMALL letters using C-x 8 RET? ;) )

Patches to support such loose matching are welcome, I think.






* bug#30513: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)
       [not found]       ` <83woza9q0s.fsf@gnu.org>
@ 2018-02-18 18:29         ` Noam Postavsky
  2018-02-18 19:49           ` Drew Adams
  0 siblings, 1 reply; 12+ messages in thread
From: Noam Postavsky @ 2018-02-18 18:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: michipili, 30513

Eli Zaretskii <eliz@gnu.org> writes:

>> +                (if (and (not old-name) new-name (string-match "LAMDA" new-name))
>> +                    (puthash (replace-match "LAMBDA" t t new-name) c names))
>
> Won't this make ucs-names even larger and more redundant?

It will make ucs-names slightly larger and more redundant.  I think the
trade-off is worth it.  To give precise numbers, it adds 12 entries for
a total of 42857, which is 0.029%.

    (length
     ;; Entries satisfying (and (not old-name) new-name (string-match "LAMDA" new-name))
     '("MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL LAMDA"
       "MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL LAMDA"
       "MATHEMATICAL SANS-SERIF BOLD SMALL LAMDA"
       "MATHEMATICAL SANS-SERIF BOLD CAPITAL LAMDA"
       "MATHEMATICAL BOLD ITALIC SMALL LAMDA"
       "MATHEMATICAL BOLD ITALIC CAPITAL LAMDA"
       "MATHEMATICAL ITALIC SMALL LAMDA"
       "MATHEMATICAL ITALIC CAPITAL LAMDA"
       "MATHEMATICAL BOLD SMALL LAMDA"
       "MATHEMATICAL BOLD CAPITAL LAMDA"
       "UGARITIC LETTER LAMDA"
       "GREEK LETTER SMALL CAPITAL LAMDA")) ;=> 12

    (hash-table-count ucs-names) ;=> 42857
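
A list like the one above could be regenerated roughly as follows.
This is a sketch, not the code used for the numbers in this message:
it assumes an Emacs (26 or later) where the function `ucs-names'
returns a hash table, and `my-lamda-names-without-old-name' is a
hypothetical helper name.  The count it gives will vary with the
Unicode version.

```elisp
(defun my-lamda-names-without-old-name ()
  "Return official names containing \"LAMDA\" that lack an old-name.
Hypothetical helper, assuming `ucs-names' returns a hash table."
  (let (names)
    (maphash
     (lambda (name char)
       (when (and (string-match-p "LAMDA" name)
                  ;; Keep only the character's current official name,
                  ;; skipping old names and other aliases in the table.
                  (equal name (get-char-code-property char 'name))
                  (not (get-char-code-property char 'old-name)))
         (push name names)))
     (ucs-names))
    (sort names #'string<)))

;; Number of entries the patch would duplicate on this Emacs.
(length (my-lamda-names-without-old-name))
```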






* bug#30513: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)
  2018-02-18 18:29         ` Noam Postavsky
@ 2018-02-18 19:49           ` Drew Adams
  2018-02-23 19:50             ` Marcin Borkowski
  0 siblings, 1 reply; 12+ messages in thread
From: Drew Adams @ 2018-02-18 19:49 UTC (permalink / raw)
  To: Noam Postavsky, Eli Zaretskii; +Cc: michipili, 30513

> >> +                (if (and (not old-name) new-name (string-match "LAMDA" new-name))
> >> +                    (puthash (replace-match "LAMBDA" t t new-name) c names))
> >
> > Won't this make ucs-names even larger and more redundant?
> 
> It will make ucs-names slightly larger and more redundant.  I think the
> trade-off is worth it.  To give precise numbers, it adds 12 entries for
> a total of 42857, which is 0.029%.

I would not make the point that this adds too many chars
for `ucs-names' or for `C-x 8 RET'.

I would make the point that we should not be inventing
character names and then associating such inventions with
what has heretofore been a pretty faithful reflection of
the Unicode standard.

There are many different possible uses of `ucs-names'.
It should not be assumed that the only use is to complete
`C-x 8 RET' or that every use of that command or
`ucs-names' will be improved by "loose" matching that
allows names that are not defined by Unicode.

If someone wants a command (or a hash table or other
mapping similar to `ucs-names') that provides and uses
handy non-Unicode names, s?he can easily define it.

And if Emacs itself wants to provide such a command or
such a map-producing function it can do it.  But please
do not do this to `ucs-names' or the default behavior
for character insertion (i.e., `C-x 8 RET').  Every such
possible "improvement" of character names for one person
is sure to be a detriment to someone else and other use
cases.

Unicode itself does not define additional LAMDA versions
of character names.  `ucs-names' and `C-x 8 RET' should
respect that and faithfully reflect the Unicode standard.


* bug#30513: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)
  2018-02-18 19:49           ` Drew Adams
@ 2018-02-23 19:50             ` Marcin Borkowski
  2018-02-23 20:15               ` Drew Adams
  0 siblings, 1 reply; 12+ messages in thread
From: Marcin Borkowski @ 2018-02-23 19:50 UTC (permalink / raw)
  To: Drew Adams; +Cc: michipili, Noam Postavsky, 30513


On 2018-02-18, at 20:49, Drew Adams <drew.adams@oracle.com> wrote:

>> >> +                (if (and (not old-name) new-name (string-match "LAMDA" new-name))
>> >> +                    (puthash (replace-match "LAMBDA" t t new-name) c names))
>> >
>> > Won't this make ucs-names even larger and more redundant?
>>
>> It will make ucs-names slightly larger and more redundant.  I think the
>> trade-off is worth it.  To give precise numbers, it adds 12 entries for
>> a total of 42857, which is 0.029%.
>
> I would not make the point that this adds too many chars
> for `ucs-names' or for `C-x 8 RET'.
>
> I would make the point that we should not be inventing
> character names and then associating such inventions with
> what has heretofore been a pretty faithful reflection of
> the Unicode standard.

How about this one?

︘
PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET
                                                         ^^

And what about this one?  Press C-x 8 RET MATHEMATICAL ITALIC SMALL TAB
and try to answer the puzzle: where has the "MATHEMATICAL ITALIC SMALL H"
gone?

(The answer, rot'd-13 so that I don't spoil it for Unicode wannabe
detectives;-): CYNAPX PBAFGNAG, U+210E.)

IOW, I would argue that _some_ kind of system to help the user overcome
the inherent Unicode problems might be a good idea.

Best,

--
Marcin Borkowski






* bug#30513: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)
  2018-02-23 19:50             ` Marcin Borkowski
@ 2018-02-23 20:15               ` Drew Adams
  2018-02-24 20:41                 ` Marcin Borkowski
  0 siblings, 1 reply; 12+ messages in thread
From: Drew Adams @ 2018-02-23 20:15 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: michipili, Noam Postavsky, 30513

> > I would not make the point that this adds too many chars
> > for `ucs-names' or for `C-x 8 RET'.
> >
> > I would make the point that we should not be inventing
> > character names and then associating such inventions with
> > what has heretofore been a pretty faithful reflection of
> > the Unicode standard.
> 
> How about this one?
> 
> ︘
> PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET
>                                                          ^^
> And what about this one?  Press C-x 8 RET MATHEMATICAL ITALIC SMALL TAB
> and try to answer the puzzle: where has the "MATHEMATICAL ITALIC SMALL H"
> gone?
> 
> (The answer, rot'd-13 so that I don't spoil it for Unicode wannabe
> detectives;-): CYNAPX PBAFGNAG, U+210E.)
> 
> IOW, I would argue that _some_ kind of system to help the user
> overcome the inherent Unicode problems might be a good idea.

Agreed: some help would help. ;-)  But not at the cost of
changing `ucs-names'.

You snipped most of my post, including the part that said
that although we should leave the set of Unicode names as
Unicode defines them, so that `ucs-names' remains faithful
to the standard, we can certainly add Emacs constructs (e.g.
commands, completion functions, whatever), to help users
use alternate names of our own invention, including spelling
corrections.

The fault is not with `ucs-names'.  The fault, if there
be any, is with the ways we currently _make use of it_
for users.

We could offer additional or alternative ways for users to
make use of it.  We could, for example, change `insert-char'
to respect a user option that expresses just how much such
help to provide, e.g., the degree of spelling help,
correction, abbreviation, or whatever.

If we do that then we should at least allow one of the
option values to mean that no such help is to be offered,
in which case `insert-char' would offer only the official
names.

There are other uses of `ucs-names', beyond `insert-char',
at least in 3rd-party libraries.  We should definitely not
assume that all uses of `ucs-names' should benefit or be
troubled by any Emacs-specific "improvements" we might
want to offer for the available char names.
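
One way to get alternate spellings without touching `ucs-names' might
look like the sketch below.  It is not an existing Emacs facility:
`my-loose-char-names' and `my-insert-char-loose' are hypothetical
names, and the code assumes an Emacs where the function `ucs-names'
returns a hash table.  The aliases go into a copy of the table, so
other users of `ucs-names' still see only what Emacs itself provides.

```elisp
(defun my-loose-char-names ()
  "Return a fresh name-to-char table with extra LAMBDA spellings.
Hypothetical helper; the table returned by `ucs-names' is not modified."
  (let ((table (copy-hash-table (ucs-names))))
    (maphash
     (lambda (name char)
       (when (string-match "LAMDA" name)
         (let ((alias (replace-match "LAMBDA" t t name)))
           ;; Never shadow a name that already exists.
           (unless (gethash alias table)
             (puthash alias char table)))))
     (ucs-names))
    table))

(defun my-insert-char-loose (name)
  "Insert the character called NAME, also accepting LAMBDA for LAMDA.
Hypothetical command, not part of Emacs."
  (interactive
   (list (let ((completion-ignore-case t))
           (completing-read "Character name: "
                            (my-loose-char-names) nil t))))
  ;; Unicode names are upper case, so normalize before the lookup.
  (let ((char (gethash (upcase name) (my-loose-char-names))))
    (unless char
      (user-error "No character named %s" name))
    (insert char)))
```

This rebuilds the table on every call to keep the sketch short; a real
command would presumably cache it, and a user option could decide
whether the aliases are offered at all.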


* bug#30513: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)
  2018-02-23 20:15               ` Drew Adams
@ 2018-02-24 20:41                 ` Marcin Borkowski
  0 siblings, 0 replies; 12+ messages in thread
From: Marcin Borkowski @ 2018-02-24 20:41 UTC (permalink / raw)
  To: Drew Adams; +Cc: michipili, Noam Postavsky, 30513


On 2018-02-23, at 21:15, Drew Adams <drew.adams@oracle.com> wrote:

>> > I would not make the point that this adds too many chars
>> > for `ucs-names' or for `C-x 8 RET'.
>> >
>> > I would make the point that we should not be inventing
>> > character names and then associating such inventions with
>> > what has heretofore been a pretty faithful reflection of
>> > the Unicode standard.
>> 
>> How about this one?
>> 
>> ︘
>> PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET
>>                                                          ^^
>> And what about this one?  Press C-x 8 RET MATHEMATICAL ITALIC SMALL TAB
>> and try to answer the puzzle: where has the "MATHEMATICAL ITALIC SMALL H"
>> gone?
>> 
>> (The answer, rot'd-13 so that I don't spoil it for Unicode wannabe
>> detectives;-): CYNAPX PBAFGNAG, U+210E.)
>> 
>> IOW, I would argue that _some_ kind of system to help the user
>> overcome the inherent Unicode problems might be a good idea.
>
> Agreed: some help would help. ;-)  But not at the cost of
> changing `ucs-names'.
>
> You snipped most of my post, including the part that said
> that although we should leave the set of Unicode names as
> Unicode defines them, so that `ucs-names' remains faithful
> to the standard, we can certainly add Emacs constructs (e.g.
> commands, completion functions, whatever), to help users
> use alternate names of our own invention, including spelling
> corrections.
>
> The fault is not with `ucs-names'.  The fault, if there
> be any, is with the ways we currently _make use of it_
> for users.

I agree.

> We could offer additional or alternative ways for users to
> make use of it.  We could, for example, change `insert-char'
> to respect a user option that expresses just how much such
> help to provide, e.g., the degree of spelling help,
> correction, abbreviation, or whatever.
>
> If we do that then we should at least allow one of the
> option values to mean that no such help is to be offered,
> in which case `insert-char' would offer only the official
> names.
>
> There are other uses of `ucs-names', beyond `insert-char',
> at least in 3rd-party libraries.  We should definitely not
> assume that all uses of `ucs-names' should benefit or be
> troubled by any Emacs-specific "improvements" we might
> want to offer for the available char names.

+1.  For one interesting use of ucs-names, see my blog post here:
http://mbork.pl/2017-10-02_Converting_TeX_sequences_to_Unicode_characters

Best,

--
Marcin Borkowski






* bug#30513: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)
  2018-02-18 15:54     ` Noam Postavsky
       [not found]       ` <83woza9q0s.fsf@gnu.org>
@ 2020-09-04  4:45       ` Lars Ingebrigtsen
  1 sibling, 0 replies; 12+ messages in thread
From: Lars Ingebrigtsen @ 2020-09-04  4:45 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Michael Grünewald, 30513

Noam Postavsky <npostavs@gmail.com> writes:

> Yeah, this is fresh in my memory since I was recently playing with ucs-insert
> and was a bit surprised to discover "LAMBDA" under the *old* name:
>
>   name: GREEK SMALL LETTER LAMDA
>   old-name: GREEK SMALL LETTER LAMBDA
>
> Anyway, I think the following patch should smooth things over:

The discussion then turned to whether this would add a lot of redundant
entries, but it's just 12, so I think that's fine.  It was also
suggested that there should be a more general mechanism to provide
alternative/fixed spellings for any kind of oddly-spelled Unicode
character, and that's true.

But in isolation, and because Emacs is a Lisp system, I think Noam's
patch makes sense, and I've applied it (with some cosmetic changes) to
Emacs 28.

If anybody disagrees, feel free to revert.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
