unofficial mirror of help-gnu-emacs@gnu.org
* Different names for Unicode codepoint
@ 2016-04-21 19:04 Lele Gaifax
  2016-04-21 19:40 ` Eli Zaretskii
  2016-04-21 19:40 ` tomas
  0 siblings, 2 replies; 4+ messages in thread
From: Lele Gaifax @ 2016-04-21 19:04 UTC
  To: help-gnu-emacs; +Cc: python-list

Hi,

is there a particular reason for the slightly different names that Emacs
(version 25.0.92) and Python (version 3.6.0a0) give to a single Unicode entity?

Just to mention one codepoint, ⋖ is called "LESS THAN WITH DOT" according to
Emacs' C-x 8 RET TAB menu, while in Python:

    >>> import unicodedata
    >>> unicodedata.name('⋖')
    'LESS-THAN WITH DOT'
    >>> print("\N{LESS THAN WITH DOT}")
      File "<stdin>", line 1
    SyntaxError: (unicode error) ...: unknown Unicode character name

ciao, lele.
-- 
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@metapensiero.it  |                 -- Fortunato Depero, 1929.





* Re: Different names for Unicode codepoint
  2016-04-21 19:04 Different names for Unicode codepoint Lele Gaifax
@ 2016-04-21 19:40 ` Eli Zaretskii
  2016-04-21 19:56   ` Lele Gaifax
  2016-04-21 19:40 ` tomas
  1 sibling, 1 reply; 4+ messages in thread
From: Eli Zaretskii @ 2016-04-21 19:40 UTC
  To: help-gnu-emacs; +Cc: python-list

> From: Lele Gaifax <lele@metapensiero.it>
> Date: Thu, 21 Apr 2016 21:04:32 +0200
> Cc: python-list@python.org
> 
> is there a particular reason for the slightly different names that Emacs
> (version 25.0.92) and Python (version 3.6.0a0) give to a single Unicode entity?

They don't.

> Just to mention one codepoint, ⋖ is called "LESS THAN WITH DOT" according to
> Emacs' C-x 8 RET TAB menu, while in Python:
> 
>     >>> import unicodedata
>     >>> unicodedata.name('⋖')
>     'LESS-THAN WITH DOT'
>     >>> print("\N{LESS THAN WITH DOT}")
>       File "<stdin>", line 1
>     SyntaxError: (unicode error) ...: unknown Unicode character name

Emacs shows both the "Name" and the "Old Name" properties of
characters as completion candidates, while Python evidently supports
only "Name".  If you type "C-x 8 RET LESS TAB", then you will see
among the completion candidates both "LESS THAN WITH DOT" and
"LESS-THAN WITH DOT".  The former is the "old name" of this character,
according to the Unicode Character Database (which is where Emacs
obtains the names and other properties of characters).
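
A minimal Python sketch of the difference described above. It assumes
only the standard unicodedata module; the helper name resolve is
hypothetical and is used here just to turn the lookup failure into a
None result instead of an exception.

```python
import unicodedata


def resolve(name):
    """Return the character for a Unicode name, or None if Python rejects it."""
    try:
        return unicodedata.lookup(name)
    except KeyError:
        # unicodedata.lookup raises KeyError for names it does not know.
        return None


ch = "\u22d6"  # the character from the original question
current = unicodedata.name(ch)

print(current)                        # the "Name" property
print(resolve(current) == ch)         # the current name round-trips
print(resolve("LESS THAN WITH DOT"))  # the old name, as discussed above
```

The same lookup rules apply to \N{...} escapes in string literals, which
is why the unhyphenated spelling fails at compile time in the session
quoted above.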




* Re: Different names for Unicode codepoint
  2016-04-21 19:04 Different names for Unicode codepoint Lele Gaifax
  2016-04-21 19:40 ` Eli Zaretskii
@ 2016-04-21 19:40 ` tomas
  1 sibling, 0 replies; 4+ messages in thread
From: tomas @ 2016-04-21 19:40 UTC
  To: help-gnu-emacs

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, Apr 21, 2016 at 09:04:32PM +0200, Lele Gaifax wrote:
> Hi,
> 
> is there a particular reason for the slightly different names that Emacs
> (version 25.0.92) and Python (version 3.6.0a0) give to a single Unicode entity?
> 
> Just to mention one codepoint, ⋖ is called "LESS THAN WITH DOT" according to
> Emacs' C-x 8 RET TAB menu, while in Python:
> 
>     >>> import unicodedata
>     >>> unicodedata.name('⋖')
>     'LESS-THAN WITH DOT'
>     >>> print("\N{LESS THAN WITH DOT}")
>       File "<stdin>", line 1
>     SyntaxError: (unicode error) ...: unknown Unicode character name

FWIW, "my" Emacs [1] says:

  Character code properties: customize what to show
    name: LESS-THAN WITH DOT
    old-name: LESS THAN WITH DOT

So the spelling without the hyphen appears to be the older name of this character.

[1] GNU Emacs 25.1.50.1 (x86_64-unknown-linux-gnu, GTK+ Version 2.24.29)
    of 2016-03-07
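
For illustration, here is a rough Python sketch of where those two
properties live in the Unicode Character Database. In UnicodeData.txt
each record is a semicolon-separated line; field 1 holds the Name and
field 10 holds Unicode_1_Name, which Emacs reports as old-name. The
sample record below is an assumption, written out by hand in that
layout, so check it against the real file; names_from_record is a
hypothetical helper.

```python
# Sample record in the UnicodeData.txt layout (semicolon-separated fields).
# Assumption: written out by hand for illustration, not read from the file.
SAMPLE = "22D6;LESS-THAN WITH DOT;Sm;0;ON;;;;;Y;LESS THAN WITH DOT;;;;"

NAME_FIELD = 1       # "Name" property
OLD_NAME_FIELD = 10  # "Unicode_1_Name" property (old-name in Emacs)


def names_from_record(record):
    """Split one record and return (codepoint, name, old_name)."""
    fields = record.split(";")
    codepoint = int(fields[0], 16)  # first field is the hex code point
    return codepoint, fields[NAME_FIELD], fields[OLD_NAME_FIELD]


codepoint, name, old_name = names_from_record(SAMPLE)
print(f"U+{codepoint:04X} name={name!r} old-name={old_name!r}")
```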

regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlcZLKMACgkQBcgs9XrR2kbW9wCfbXrqFKi0q8H4PZihI4hyObyg
SHkAn3zur28ELYGDnnOmdSJcEEAy4a2b
=im17
-----END PGP SIGNATURE-----




* Re: Different names for Unicode codepoint
  2016-04-21 19:40 ` Eli Zaretskii
@ 2016-04-21 19:56   ` Lele Gaifax
  0 siblings, 0 replies; 4+ messages in thread
From: Lele Gaifax @ 2016-04-21 19:56 UTC
  To: help-gnu-emacs; +Cc: python-list

Eli Zaretskii <eliz@gnu.org> writes:

> Emacs shows both the "Name" and the "Old Name" properties of
> characters as completion candidates, while Python evidently supports
> only "Name".  If you type "C-x 8 RET LESS TAB", then you will see
> among the completion candidates both "LESS THAN WITH DOT" and
> "LESS-THAN WITH DOT".  The former is the "old name" of this character,
> according to the Unicode Character Database (which is where Emacs
> obtains the names and other properties of characters).

Thank you Eli, didn't notice!

ciao, lele.
-- 
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@metapensiero.it  |                 -- Fortunato Depero, 1929.




