* bug#32599: 25.2; Feature request: input PUA characters by name
       [not found] <86sh30fg4q.fsf@mimuw.edu.pl>
@ 2018-08-31 6:05 ` Janusz S. Bień
  2018-08-31 8:05   ` Robert Pluim
  0 siblings, 1 reply; 39+ messages in thread
From: Janusz S. Bień @ 2018-08-31 6:05 UTC (permalink / raw)
To: 32599

The second attempt to post.

Regards

JSB

On Mon, Aug 27 2018 at 9:00 +0200, jsbien@mimuw.edu.pl writes:
> Please extend 'insert-char' to allow using the PUA character names and
> codes provided in the format of Unicode named sequences
>
> http://www.unicode.org/reports/tr34/tr34-23.html
>
> but containing only single characters.
>
> Such data are already available for the Medieval Unicode Font Initiative
> specification at
>
> https://bitbucket.org/jsbien/unihistext/src/master/example/
>
> A more radical extension would allow also to input real named sequences,
> but from my point of view this is of much lower importance.
>
> Best regards
>
> Janusz
>
>
> In GNU Emacs 25.2.2 (x86_64-pc-linux-gnu, GTK+ Version 3.22.30)
> of 2018-07-11, modified by Debian built on x86-ubc-01
> Windowing system distributor 'The X.Org Foundation', version 11.0.12000000
> System Description: Debian GNU/Linux testing (buster)
>
> Configured using:
> 'configure --build x86_64-linux-gnu --prefix=/usr
> --sharedstatedir=/var/lib --libexecdir=/usr/lib
> --localstatedir=/var/lib --infodir=/usr/share/info
> --mandir=/usr/share/man --with-pop=yes
> --enable-locallisppath=/etc/emacs25:/etc/emacs:/usr/local/share/emacs/25.2/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/25.2/site-lisp:/usr/share/emacs/site-lisp
> --with-sound=alsa --without-gconf --build x86_64-linux-gnu
> --prefix=/usr --sharedstatedir=/var/lib --libexecdir=/usr/lib
> --localstatedir=/var/lib --infodir=/usr/share/info
> --mandir=/usr/share/man --with-pop=yes
> --enable-locallisppath=/etc/emacs25:/etc/emacs:/usr/local/share/emacs/25.2/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/25.2/site-lisp:/usr/share/emacs/site-lisp
> --with-sound=alsa --without-gconf --with-x=yes --with-x-toolkit=gtk3
> --with-toolkit-scroll-bars 'CFLAGS=-g -O2
> -fdebug-prefix-map=/build/emacs25-cfFROJ/emacs25-25.2+1=. -fstack-protector-strong
> -Wformat -Werror=format-security -Wall' 'CPPFLAGS=-Wdate-time
> -D_FORTIFY_SOURCE=2' LDFLAGS=-Wl,-z,relro'
>
> Configured features:
> XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GSETTINGS NOTIFY
> ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB
> TOOLKIT_SCROLL_BARS GTK3 X11
>
> Important settings:
> value of $LANG: en_US.UTF-8
> locale-coding-system: utf-8-unix
>
> Major mode: Group
>
> Minor modes in effect:
> cursor-sensor-mode: t
> gnus-undo-mode: t
> tooltip-mode: t
> global-eldoc-mode: t
> electric-indent-mode: t
> mouse-wheel-mode: t
> tool-bar-mode: t
> menu-bar-mode: t
> file-name-shadow-mode: t
> global-font-lock-mode: t
> font-lock-mode: t
> blink-cursor-mode: t
> auto-composition-mode: t
> auto-encryption-mode: t
> auto-compression-mode: t
> buffer-read-only: t
> line-number-mode: t
> transient-mark-mode: t
>
> Recent messages:
> Sending...
> Mark set [2 times]
> Sending via mail...
> Sending email
> Sending email done
> Mark set
> Saving file /home/jsbien/Mail/archive/sent/2018-08...
> Wrote /home/jsbien/Mail/archive/sent/2018-08
> Sending...done
> Auto-saving...done
>
> Load-path shadows:
> /usr/share/emacs/25.2/site-lisp/debian-startup hides /usr/share/emacs/site-lisp/debian-startup
> /usr/share/emacs25/site-lisp/cmake-data/cmake-mode hides /usr/share/emacs/site-lisp/cmake-mode
> /usr/share/emacs/site-lisp/rst hides /usr/share/emacs/25.2/lisp/textmodes/rst
> /usr/share/emacs25/site-lisp/latex-cjk-thai/thai-word hides /usr/share/emacs/25.2/lisp/language/thai-word
>
> Features:
> (shadow warnings emacsbug smtpmail mailalias bbdb-pgp bbdb-message
> sendmail nnir sort gnus-cite smiley ansi-color shr-color color url-util
> url-parse url-vars shr dom subr-x browse-url mm-archive mail-extr
> gnus-async gnus-bcklg qp gnus-ml disp-table cursor-sensor nndraft nnmh
> nndoc nnfolder utf-7 bbdb-gnus bbdb-mua bbdb-com crm network-stream nsm
> auth-source cl-seq eieio eieio-core cl-macs starttls gnus-agent
> gnus-srvr gnus-score score-mode nnvirtual gnus-msg gnus-art mm-uu
> mml2015 mm-view mml-smime smime dig mailcap nntp gnus-cache gnus-sum
> gnus-group gnus-undo gnus-start gnus-cloud nnimap nnmail mail-source tls
> gnutls utf7 netrc nnoo parse-time gnus-spec gnus-int gnus-range message
> dired-x dired format-spec rfc822 mml mml-sec password-cache epg
> mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045
> ietf-drums mailabbrev gmm-utils mailheader gnus-win gnus gnus-ems
> nnheader gnus-util mail-utils mm-util help-fns mail-prsvr windmove quail
> bbdb bbdb-site timezone server edmacro kmacro mairix cus-edit cus-start
> cus-load wid-edit finder-inf tex-site info debian-el package epg-config
> seq byte-opt gv bytecomp byte-compile cl-extra help-mode easymenu cconv
> cl-loaddefs pcase cl-lib bbdb-loaddefs time-date mule-util tooltip eldoc
> electric uniquify ediff-hook vc-hooks lisp-float-type mwheel x-win
> term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
> tabulated-list newcomment elisp-mode lisp-mode prog-mode register page
> menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock
> syntax facemenu font-core frame cl-generic cham georgian utf-8-lang
> misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
> cp51932 hebrew greek romanian slovak czech european ethiopic indian
> cyrillic chinese charscript case-table epa-hook jka-cmpr-hook help
> simple abbrev minibuffer cl-preloaded nadvice loaddefs button faces
> cus-face macroexp files text-properties overlay sha1 md5 base64 format
> env code-pages mule custom widget hashtable-print-readable backquote
> dbusbind inotify dynamic-setting system-font-setting font-render-setting
> move-toolbar gtk x-toolkit x multi-tty make-network-process emacs)
>
> Memory information:
> ((conses 16 260949 39014)
> (symbols 48 33546 0)
> (miscs 40 251 358)
> (strings 32 65611 13129)
> (string-bytes 1 2142874)
> (vectors 16 34262)
> (vector-slots 8 1334294 209768)
> (floats 8 577 397)
> (intervals 56 725 127)
> (buffers 976 34))

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien

^ permalink raw reply [flat|nested] 39+ messages in thread
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Robert Pluim @ 2018-08-31 8:05 UTC
To: Janusz S. Bień; +Cc: 32599

jsbien@mimuw.edu.pl (Janusz S. Bień) writes:

> On Mon, Aug 27 2018 at 9:00 +0200, jsbien@mimuw.edu.pl writes:
>> Please extend 'insert-char' to allow using the PUA character names and
>> codes provided in the format of Unicode named sequences
>>
>> http://www.unicode.org/reports/tr34/tr34-23.html
>>
>> but containing only single characters.

Iʼm not sure I understand the request: insert-char already supports
using character names or code points to specify the character. What's
missing?

Thanks

Robert
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Janusz S. Bień @ 2018-08-31 9:09 UTC
To: 32599

On Fri, Aug 31 2018 at 10:05 +0200, rpluim@gmail.com writes:
> jsbien@mimuw.edu.pl (Janusz S. Bień) writes:
>
>> On Mon, Aug 27 2018 at 9:00 +0200, jsbien@mimuw.edu.pl writes:
>>> Please extend 'insert-char' to allow using the PUA character names and
>>> codes provided in the format of Unicode named sequences
>>>
>>> http://www.unicode.org/reports/tr34/tr34-23.html
>>>
>>> but containing only single characters.
>
> Iʼm not sure I understand the request: insert-char already supports
> using character names or code points to specify the character. What's
> missing?

You are missing "PUA" in the topic and in the text of my feature
request. PUA is Private Use Area. Cf. e.g.

https://folk.uib.no/hnooh/mufi/

and try to insert a character using a name such as 'COMBINING
ABBREVIATION MARK SUPERSCRIPT UR ROUND R FORM'.

Cf. also

https://www.unicode.org/mail-arch/unicode-ml/y2018-m08/0096.html

Best regards

Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Robert Pluim @ 2018-08-31 12:34 UTC
To: Janusz S. Bień; +Cc: 32599

jsbien@mimuw.edu.pl (Janusz S. Bień) writes:

> On Fri, Aug 31 2018 at 10:05 +0200, rpluim@gmail.com writes:
>> jsbien@mimuw.edu.pl (Janusz S. Bień) writes:
>>
>>> On Mon, Aug 27 2018 at 9:00 +0200, jsbien@mimuw.edu.pl writes:
>>>> Please extend 'insert-char' to allow using the PUA character names and
>>>> codes provided in the format of Unicode named sequences
>>>>
>>>> http://www.unicode.org/reports/tr34/tr34-23.html
>>>>
>>>> but containing only single characters.
>>
>> Iʼm not sure I understand the request: insert-char already supports
>> using character names or code points to specify the character. What's
>> missing?
>
> You are missing "PUA" in the topic and in the text of my feature
> request. PUA is Private Use Area. Cf. e.g.
>

I missed that indeed. So what you want is a way of dynamically
extending which named entities can be used with insert-char based on
the type of information here:

https://folk.uib.no/hnooh/mufi/

Robert
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Janusz S. Bień @ 2018-08-31 12:54 UTC
To: 32599

On Fri, Aug 31 2018 at 14:34 +0200, rpluim@gmail.com writes:
> jsbien@mimuw.edu.pl (Janusz S. Bień) writes:
>
>> On Fri, Aug 31 2018 at 10:05 +0200, rpluim@gmail.com writes:
>>> jsbien@mimuw.edu.pl (Janusz S. Bień) writes:
>>>
>>>> On Mon, Aug 27 2018 at 9:00 +0200, jsbien@mimuw.edu.pl writes:
>>>>> Please extend 'insert-char' to allow using the PUA character names and
>>>>> codes provided in the format of Unicode named sequences
>>>>>
>>>>> http://www.unicode.org/reports/tr34/tr34-23.html
>>>>>
>>>>> but containing only single characters.
>>>
>>> Iʼm not sure I understand the request: insert-char already supports
>>> using character names or code points to specify the character. What's
>>> missing?
>>
>> You are missing "PUA" in the topic and in the text of my feature
>> request. PUA is Private Use Area. Cf. e.g.
>>
>
> I missed that indeed. So what you want is a way of dynamically
> extending which named entities can be used with insert-char based on
> the type of information here:
>
> https://folk.uib.no/hnooh/mufi/

Yes, although it is a slight oversimplification.

First, the MUFI data in a more convenient form are available here:

On Mon, Aug 27 2018 at 9:00 +0200, jsbien@mimuw.edu.pl writes:

[...]

> https://bitbucket.org/jsbien/unihistext/src/master/example/

Secondly, other users may be interested in other sets of PUA characters,
cf.

http://andron-typeforum.xobor.de/t10f13-Towards-a-linguistic-corporate-use-area-LINCUA.html
https://en.wikipedia.org/wiki/ConScript_Unicode_Registry

Best regards

Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Janusz S. Bień @ 2019-05-26 8:10 UTC
To: 32599

On Fri, Aug 31 2018 at 14:54 +02, Janusz S. Bień wrote:
> On Fri, Aug 31 2018 at 14:34 +0200, rpluim@gmail.com writes:
>> jsbien@mimuw.edu.pl (Janusz S. Bień) writes:
>>
>>> On Fri, Aug 31 2018 at 10:05 +0200, rpluim@gmail.com writes:
>>>> jsbien@mimuw.edu.pl (Janusz S. Bień) writes:
>>>>
>>>>> On Mon, Aug 27 2018 at 9:00 +0200, jsbien@mimuw.edu.pl writes:
>>>>>> Please extend 'insert-char' to allow using the PUA character names and
>>>>>> codes provided in the format of Unicode named sequences
>>>>>>
>>>>>> http://www.unicode.org/reports/tr34/tr34-23.html

[...]

> First, the MUFI data in a more convenient form are available here:
>
> On Mon, Aug 27 2018 at 9:00 +0200, jsbien@mimuw.edu.pl writes:
>
> [...]
>
>> https://bitbucket.org/jsbien/unihistext/src/master/example/

If you prefer a file patterned after UnicodeData.txt, you can find it
here:

http://www.kreativekorp.com/charset/PUADATA/PUBLIC/MUFI/

>
> Secondly, other users may be interested in other sets of PUA characters,
> cf.
>
> http://andron-typeforum.xobor.de/t10f13-Towards-a-linguistic-corporate-use-area-LINCUA.html
> https://en.wikipedia.org/wiki/ConScript_Unicode_Registry

or Under-ConScript Unicode Registry:

http://www.kreativekorp.com/ucsur/

Best regards

Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
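
Files patterned after UnicodeData.txt, like the one referred to in the message above, are plain semicolon-separated records, so turning one into a name-to-character table takes little code. A minimal sketch in Python, assuming the usual UnicodeData.txt layout (hexadecimal code point in the first field, character name in the second). The sample record and its name are hypothetical and not taken from the MUFI data, and `parse_names` is an illustrative helper, not an Emacs function:

```python
def parse_names(text):
    """Return a dict mapping character names to one-character strings.

    Assumes UnicodeData.txt-style records: fields separated by ';',
    field 0 the code point in hexadecimal, field 1 the character name.
    """
    names = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split(";")
        if len(fields) < 2 or not fields[1]:
            continue  # skip records that carry no name
        names[fields[1]] = chr(int(fields[0], 16))
    return names


# Hypothetical record for illustration only (not real MUFI data).
SAMPLE = "E000;HYPOTHETICAL PRIVATE USE LETTER;Co;0;L;;;;;N;;;;;"

table = parse_names(SAMPLE)
print(hex(ord(table["HYPOTHETICAL PRIVATE USE LETTER"])))  # prints 0xe000
```

A table like this would only cover the naming side of the request; it carries none of the other character properties discussed later in the thread.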
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Eli Zaretskii @ 2019-05-26 14:45 UTC
To: jsbien; +Cc: 32599

> From: jsbien@mimuw.edu.pl (Janusz S. Bień)
> Date: Sun, 26 May 2019 10:10:02 +0200
>
> > First, the MUFI data in a more convenient form are available here:
> >
> > On Mon, Aug 27 2018 at 9:00 +0200, jsbien@mimuw.edu.pl writes:
> >
> > [...]
> >
> >> https://bitbucket.org/jsbien/unihistext/src/master/example/
>
> If you prefer a file patterned after UnicodeData.txt, you can find it
> here:
>
> http://www.kreativekorp.com/charset/PUADATA/PUBLIC/MUFI/
>
> >
> > Secondly, other users may be interested in other sets of PUA characters,
> > cf.
> >
> > http://andron-typeforum.xobor.de/t10f13-Towards-a-linguistic-corporate-use-area-LINCUA.html
> > https://en.wikipedia.org/wiki/ConScript_Unicode_Registry
>
> or Under-ConScript Unicode Registry:
>
> http://www.kreativekorp.com/ucsur/

The UnicodeData.txt file is compiled into Emacs, but the files you
mention cannot be compiled into it, because they vary, and because
different users might want different lists of characters to be
supported. So we need to design how this will work.

In addition, I think PUA codepoints aren't really treated as
characters in Emacs, so there's a need for some infrastructure
changes.

Patches welcome.
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Janusz S. Bień @ 2019-05-26 15:18 UTC
To: Eli Zaretskii; +Cc: 32599

On Sun, May 26 2019 at 17:45 +03, Eli Zaretskii wrote:
>> From: jsbien@mimuw.edu.pl (Janusz S. Bień)
>> Date: Sun, 26 May 2019 10:10:02 +0200
>>
>> > First, the MUFI data in a more convenient form are available here:
>> >
>> > On Mon, Aug 27 2018 at 9:00 +0200, jsbien@mimuw.edu.pl writes:
>> >
>> > [...]
>> >
>> >> https://bitbucket.org/jsbien/unihistext/src/master/example/
>>
>> If you prefer a file patterned after UnicodeData.txt, you can find it
>> here:
>>
>> http://www.kreativekorp.com/charset/PUADATA/PUBLIC/MUFI/
>>
>> >
>> > Secondly, other users may be interested in other sets of PUA characters,
>> > cf.
>> >
>> > http://andron-typeforum.xobor.de/t10f13-Towards-a-linguistic-corporate-use-area-LINCUA.html
>> > https://en.wikipedia.org/wiki/ConScript_Unicode_Registry
>>
>> or Under-ConScript Unicode Registry:
>>
>> http://www.kreativekorp.com/ucsur/
>
> The UnicodeData.txt file is compiled into Emacs,

I know, and I'm curious whether it is really needed. Why can it not be
loaded at startup? The advantage would be that the user can always use
the up-to-date version of UnicodeData.txt (have you noticed that since
7th May we have Unicode 12.1 because SQUARE ERA NAME REIWA was added?).

> but the files you mention cannot be compiled into it, because they
> vary, and because different users might want different lists of
> characters to be supported. So we need to design how this will work.

My naive idea is to "cheat" Emacs by providing it with the extended data
without changing the original logic. Efficiency is less important than
convenience; perhaps you can "advise" the 'describe-char' function to
look for the data elsewhere.

> In addition, I think PUA codepoints aren't really treated as
> characters in Emacs, so there's a need for some infrastructure
> changes.

I do not propose to support the supplemental PUA planes. For the BMP
this probably boils down to the availability of the property
information. As we now have a pseudo-UnicodeData.txt for the PUA
characters (at least those I'm interested in), this doesn't seem to me
a big problem.

> Patches welcome.

Unfortunately I'm unable to provide them myself.

Best regards

Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
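
For reference, the distinction between the BMP and the supplemental planes drawn in the message above comes down to three code point ranges defined by Unicode: U+E000..U+F8FF in the BMP, U+F0000..U+FFFFD in plane 15, and U+100000..U+10FFFD in plane 16. A small Python sketch of that classification; the names `PUA_RANGES` and `pua_area` are made up for illustration:

```python
# The three Private Use Areas defined by the Unicode Standard.
PUA_RANGES = (
    ("BMP", 0xE000, 0xF8FF),
    ("plane 15", 0xF0000, 0xFFFFD),
    ("plane 16", 0x100000, 0x10FFFD),
)


def pua_area(codepoint):
    """Return a label for the PUA containing CODEPOINT, or None."""
    for label, first, last in PUA_RANGES:
        if first <= codepoint <= last:
            return label
    return None


for cp in (0x0041, 0xE000, 0xF8FF, 0xF0000, 0x10FFFD):
    print(f"U+{cp:04X}: {pua_area(cp)}")
```

Restricting support to the BMP, as proposed, would mean handling only the first of these ranges.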
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Janusz S. Bień @ 2019-05-26 15:48 UTC
To: Eli Zaretskii; +Cc: 32599

On Sun, May 26 2019 at 17:18 +02, Janusz S. Bień wrote:

[...]

>> The UnicodeData.txt file is compiled into Emacs,
>
> I know, and I'm curious whether it is really needed.

That was bad wording, as I can guess it is done for efficiency. How are
the character data stored and accessed? I am not qualified to find the
answer myself in the code.

> Why can it not be loaded at startup?

Or, better, partially overridden or supplemented? Is it prohibited by
the way the character data are stored?

Best regards

Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Eli Zaretskii @ 2019-05-26 16:51 UTC
To: jsbien; +Cc: 32599

> From: jsbien@mimuw.edu.pl (Janusz S. Bień)
> Cc: 32599@debbugs.gnu.org
> Date: Sun, 26 May 2019 17:48:43 +0200
>
> >> The UnicodeData.txt file is compiled into Emacs,
> >
> > I know, and I'm curious whether it is really needed.
>
> That was bad wording, as I can guess it is done for efficiency.

Indeed. Both processing efficiency and (perhaps more importantly)
memory efficiency. Some of the resulting data is accessed by core
features that must be very efficient: the display engine, the regexp
search functions, etc.

> How are the character data stored and accessed?

The data is stored in char-tables of a special kind. See the files
uni-*.el which are produced as part of the Emacs build process.

> > Why can it not be loaded at startup?
>
> Or, better, partially overridden or supplemented? Is it prohibited by
> the way the character data are stored?

You can override the char-tables at run time, of course, but first you
need to generate the new ones. And there's the other problem I
mentioned: the general treatment of PUA codepoints.
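
The idea of overriding or supplementing the built-in data can be illustrated outside Emacs: consult a user-supplied table first and fall back to the built-in database. A hedged Python sketch that uses the standard `unicodedata` module as a stand-in for the compiled-in data. The supplement entry is hypothetical, `lookup_char` is an illustrative name, and nothing here reflects how Emacs char-tables are actually generated or replaced:

```python
import unicodedata


def lookup_char(name, supplement):
    """Resolve NAME to a character, preferring the user's SUPPLEMENT.

    SUPPLEMENT maps upper-case names to one-character strings; names
    not found there are looked up in the built-in Unicode database.
    """
    key = name.upper()
    if key in supplement:
        return supplement[key]
    # Raises KeyError when the built-in database has no such name.
    return unicodedata.lookup(name)


# Hypothetical user-supplied table (not real registry data).
user_table = {"HYPOTHETICAL PRIVATE USE LETTER": "\ue000"}

print(lookup_char("LATIN SMALL LETTER A", user_table))
print(hex(ord(lookup_char("hypothetical private use letter", user_table))))
```

This covers name lookup only; the efficiency concerns raised above for the display engine and regexp search apply to the property tables, which a dictionary overlay of this kind does not address.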
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Eli Zaretskii @ 2019-05-26 16:56 UTC
To: jsbien; +Cc: 32599

> From: jsbien@mimuw.edu.pl (Janusz S. Bień)
> Cc: 32599@debbugs.gnu.org
> Date: Sun, 26 May 2019 17:18:21 +0200
>
> > In addition, I think PUA codepoints aren't really treated as
> > characters in Emacs, so there's a need for some infrastructure
> > changes.
>
> I do not propose to support the supplemental PUA planes. For the BMP
> this probably boils down to the availability of the property
> information. As we now have a pseudo-UnicodeData.txt for the PUA
> characters (at least those I'm interested in), this doesn't seem to me
> a big problem.

I think we are miscommunicating. The problems I alluded to start from
the fact that we exclude the PUA codepoints from the character
property database. This means you cannot change their case and access
their syntax category, for example. Functions that select a suitable
font also ignore PUA codepoints, IIRC. Etc. etc. -- Someone™ should
go over all the places where we specify character properties and use
them, and make sure PUA codepoints aren't disregarded. Otherwise,
just making Emacs know the names of these codepoints will be of very
limited value.
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Janusz S. Bień @ 2019-05-26 17:33 UTC
To: Eli Zaretskii; +Cc: 32599

On Sun, May 26 2019 at 19:56 +03, Eli Zaretskii wrote:
>> From: jsbien@mimuw.edu.pl (Janusz S. Bień)
>> Cc: 32599@debbugs.gnu.org
>> Date: Sun, 26 May 2019 17:18:21 +0200
>>
>> > In addition, I think PUA codepoints aren't really treated as
>> > characters in Emacs, so there's a need for some infrastructure
>> > changes.
>>
>> I do not propose to support the supplemental PUA planes. For the BMP
>> this probably boils down to the availability of the property
>> information. As we now have a pseudo-UnicodeData.txt for the PUA
>> characters (at least those I'm interested in), this doesn't seem to me
>> a big problem.
>
> I think we are miscommunicating.

I'm afraid it's my fault; it looks like I write too briefly.

> The problems I alluded to start from the fact that we exclude the PUA
> codepoints from the character property database.

I understand this but I expected, perhaps incorrectly, that the code for
it is rather simple.

> This means you cannot change their case and access
> their syntax category, for example. Functions that select a suitable
> font also ignore PUA codepoints, IIRC.

So the problem is that the relevant code occurs in several places.

> Etc. etc. -- Someone™ should
> go over all the places where we specify character properties and use
> them, and make sure PUA codepoints aren't disregarded.

I understand one has to look for the occurrences of a function or
constant. Definitely there are tools to make this easy/easier (many
years ago I was experimenting with tags files); now I would expect them
to be quite sophisticated.

How would you approach this problem?

Best regards

Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Eli Zaretskii @ 2019-05-26 18:52 UTC
To: jsbien; +Cc: 32599

> From: jsbien@mimuw.edu.pl (Janusz S. Bień)
> Cc: 32599@debbugs.gnu.org
> Date: Sun, 26 May 2019 19:33:20 +0200
>
> > The problems I alluded to start from the fact that we exclude the PUA
> > codepoints from the character property database.
>
> I understand this but I expected, perhaps incorrectly, that the code for
> it is rather simple.

Simple: yes. But it is also tedious, as there are many characters.
See characters.el.

With PUA codepoints there's one more complication: their attributes
are not standardized, so there should be a way of defining that
dynamically, not as characters.el does for all the other characters
whose attributes are static.

> > This means you cannot change their case and access
> > their syntax category, for example. Functions that select a suitable
> > font also ignore PUA codepoints, IIRC.
>
> So the problem is that the relevant code occurs in several places.

No. Case-fiddling depends on data set up by characters.el, so once
that data is set, all the rest should "just work". And similarly for
other attributes.

> > Etc. etc. -- Someone™ should
> > go over all the places where we specify character properties and use
> > them, and make sure PUA codepoints aren't disregarded.
>
> I understand one has to look for the occurrences of a function or
> constant. Definitely there are tools to make this easy/easier (many
> years ago I was experimenting with tags files); now I would expect them
> to be quite sophisticated.

I'm not sure such tools will help here. I think a thorough audit of
the related data and code is needed.

> How would you approach this problem?

I have no idea, sorry. I mentioned some of the issues in the hope
that will help interested individuals to find the relevant places
more easily. How to modify the code and data in order to allow use of
PUA codepoints, and on top of that to have the properties of each PUA
codepoint determined from external data which isn't known in advance,
is part of the design problem the person who'd like to work on this
will have to solve. Personally, I'm surprised people use PUA for
these purposes, and even more surprised they expect Emacs to support
this. But that's me.
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Janusz S. Bień @ 2019-05-27 5:48 UTC
To: Eli Zaretskii; +Cc: 32599

On Sun, May 26 2019 at 21:52 +03, Eli Zaretskii wrote:

[...]

> Personally, I'm surprised people use PUA for
> these purposes,

Which purposes? Cf.

https://en.wikipedia.org/wiki/Private_Use_Areas
http://www.kreativekorp.com/charset/PUADATA/
http://bit.ly/2XVTzRL-LINCUA

> and even more surprised they expect Emacs to support
> this. But that's me.

PUA characters, especially MUFI, are needed to typeset some
texts. (XeLa)TeX is still a very good typesetting system and Emacs-based
AUCTeX is still a very good tool to use TeX.

Best regards

Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Eli Zaretskii @ 2019-05-27 17:11 UTC
To: jsbien; +Cc: 32599

> From: jsbien@mimuw.edu.pl (Janusz S. Bień)
> Cc: 32599@debbugs.gnu.org
> Date: Mon, 27 May 2019 07:48:38 +0200
>
> On Sun, May 26 2019 at 21:52 +03, Eli Zaretskii wrote:
>
> [...]
>
> > Personally, I'm surprised people use PUA for
> > these purposes,
>
> Which purposes?

The purposes of mainstream text editing. Are there other comparable
applications that let users define fonts for PUA codepoints, define
their attributes, and then manipulate those characters as any other?

> > and even more surprised they expect Emacs to support
> > this. But that's me.
>
> PUA characters, especially MUFI, are needed to typeset some
> texts. (XeLa)TeX is still a very good typesetting system and Emacs-based
> AUCTeX is still a very good tool to use TeX.

Those are separate projects. If they need to use non-standard
characters with corresponding non-standard fonts, they could maintain
some add-on packages for Emacs to do that.

Asking Emacs to maintain compatibility to various ad-hoc registries
outside of Unicode is not really reasonable. How will we know which
PUA codepoints to support? How would we know which registered
codepoints are stable enough and wide-spread enough to justify their
support in Emacs? How will we keep track of the changes in these
areas? This is a task for someone who is an expert in this domain, is
aware of all the important developments there, and who can update
Emacs whenever something important changes. As things are, we have
trouble even tracking the Unicode Standard: each new edition takes
several hours to incorporate, run the tests, make the necessary
changes, etc.

I guess what I'm saying is that without a dedicated volunteer who
would take care of this issue we can only wish such support will be
added, but we have no real hope it will materialize, except by some
enormous luck.

Of course, if you know someone who could be persuaded to come on board
and work on this now and in the future, I think the feature will be
welcome.

Thanks.
* bug#32599: 25.2; Feature request: input PUA characters by name
From: Janusz S. Bień @ 2019-05-27 17:39 UTC
To: Eli Zaretskii; +Cc: 32599

On Mon, May 27 2019 at 20:11 +03, Eli Zaretskii wrote:
>> From: jsbien@mimuw.edu.pl (Janusz S. Bień)
>> Cc: 32599@debbugs.gnu.org
>> Date: Mon, 27 May 2019 07:48:38 +0200
>>
>> On Sun, May 26 2019 at 21:52 +03, Eli Zaretskii wrote:
>>
>> [...]
>>
>> > Personally, I'm surprised people use PUA for
>> > these purposes,
>>
>> Which purposes?
>
> The purposes of mainstream text editing. Are there other comparable
> applications that let users define fonts for PUA codepoints, define
> their attributes, and then manipulate those characters as any other?

I don't know and don't care, as I use only Emacs for editing.

>> > and even more surprised they expect Emacs to support
>> > this. But that's me.
>>
>> PUA characters, especially MUFI, are needed to typeset some
>> texts. (XeLa)TeX is still a very good typesetting system and Emacs-based
>> AUCTeX is still a very good tool to use TeX.
>
> Those are separate projects. If they need to use non-standard
> characters with corresponding non-standard fonts, they could maintain
> some add-on packages for Emacs to do that.

Thanks for the suggestion. I will contact the AUCTeX people and see what
they think about it.

>
> Asking Emacs to maintain compatibility to various ad-hoc registries
> outside of Unicode is not really reasonable.

Yes, but it is not what I propose. I would just like the user to be able
to use his own definition of PUA provided in the form of an additional
UnicodeData.txt (not necessarily as a part of Unicode, it can be perhaps
a different coding system).

[...]

> I guess what I'm saying is that without a dedicated volunteer who
> would take care of this issue we can only wish such support will be
> added, but we have no real hope it will materialize, except by some
> enormous luck.

As I said, we have different things in mind, so I'm not sure the above
statement really applies to my proposal.

> Of course, if you know someone who could be persuaded to come on board
> and work on this now and in the future, I think the feature will be
> welcome.

I don't know such a person now, but this can of course change in the
future.

Best regards

Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
  2019-05-27 17:39 ` Janusz S. Bień
@ 2019-05-27 18:45 ` Eli Zaretskii
  2019-05-28 5:18 ` Janusz S. Bień
  0 siblings, 1 reply; 39+ messages in thread
From: Eli Zaretskii @ 2019-05-27 18:45 UTC (permalink / raw)
  To: jsbien; +Cc: 32599

> From: jsbien@mimuw.edu.pl (Janusz S. Bień)
> Cc: 32599@debbugs.gnu.org
> Date: Mon, 27 May 2019 19:39:50 +0200
>
> > Asking Emacs to maintain compatibility to various ad-hoc registries
> > outside of Unicode is not really reasonable.
>
> Yes, but that is not what I propose. I would just like the user to be
> able to use his own definition of PUA provided in the form of an
> additional UnicodeData.txt (not necessarily as a part of Unicode, it can
> be perhaps a different coding system).

We don't have infrastructure for reliably doing such customizations on
user level. Even importing a new version of the Unicode Standard
currently requires some manual adaptations, although most of the job
is done by just rebuilding Emacs after downloading a few UCD files.

Apart from making it possible to modify/update the relevant tables
programmatically, someone should figure out which properties are
relevant to PUA codepoints, which scripts they should belong to, and
also how to specify fonts for them. And if you want users to be able
to do this stuff, all those missing infrastructure features should be
exposed as user commands which don't require rebuilding Emacs and
preferably not even restarting it.

Someone who knows enough about the issue should do this job, for the
feature to be available.
* bug#32599: 25.2; Feature request: input PUA characters by name
  2019-05-27 18:45 ` Eli Zaretskii
@ 2019-05-28 5:18 ` Janusz S. Bień
  2019-05-28 5:34 ` Eli Zaretskii
  0 siblings, 1 reply; 39+ messages in thread
From: Janusz S. Bień @ 2019-05-28 5:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 32599

On Mon, May 27 2019 at 21:45 +03, Eli Zaretskii wrote:
>> From: jsbien@mimuw.edu.pl (Janusz S. Bień)
>> Cc: 32599@debbugs.gnu.org
>> Date: Mon, 27 May 2019 19:39:50 +0200
>>
>> > Asking Emacs to maintain compatibility to various ad-hoc registries
>> > outside of Unicode is not really reasonable.
>>
>> Yes, but that is not what I propose. I would just like the user to be
>> able to use his own definition of PUA provided in the form of an
>> additional UnicodeData.txt (not necessarily as a part of Unicode, it can
>> be perhaps a different coding system).
>
> We don't have infrastructure for reliably doing such customizations on
> user level. Even importing a new version of the Unicode Standard
> currently requires some manual adaptations, although most of the job
> is done by just rebuilding Emacs after downloading a few UCD files.

I'm intrigued by the need for manual adaptation.

However, my main point today is different:

Why are our posts available neither at

https://debbugs.gnu.org/cgi/bugreport.cgi?bug=32599

nor at

https://groups.google.com/forum/#!msg/gnu.emacs.bug/SK-RO_OG_GQ/BmUzA-23AgAJ

?

This doesn't allow other users to express their opinion on the matter.

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
  2019-05-28 5:18 ` Janusz S. Bień
@ 2019-05-28 5:34 ` Eli Zaretskii
  2019-05-28 5:39 ` Janusz S. Bień
  0 siblings, 1 reply; 39+ messages in thread
From: Eli Zaretskii @ 2019-05-28 5:34 UTC (permalink / raw)
  To: jsbien; +Cc: 32599

On May 28, 2019 8:18:13 AM GMT+03:00, jsbien@mimuw.edu.pl wrote:

> However, my main point today is different:
>
> Why are our posts available neither at
>
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=32599
>
> nor at
>
> https://groups.google.com/forum/#!msg/gnu.emacs.bug/SK-RO_OG_GQ/BmUzA-23AgAJ
>
> ?
>
> This doesn't allow other users to express their opinion on the matter.

Our posts do appear on that page, not sure why you don't see them.
* bug#32599: 25.2; Feature request: input PUA characters by name
  2019-05-28 5:34 ` Eli Zaretskii
@ 2019-05-28 5:39 ` Janusz S. Bień
  2020-12-30 17:49 ` Janusz S. Bień
  0 siblings, 1 reply; 39+ messages in thread
From: Janusz S. Bień @ 2019-05-28 5:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 32599

On Tue, May 28 2019 at 8:34 +03, Eli Zaretskii wrote:
> On May 28, 2019 8:18:13 AM GMT+03:00, jsbien@mimuw.edu.pl wrote:
>
>> However, my main point today is different:
>>
>> Why are our posts available neither at
>>
>> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=32599
>>
>> nor at
>>
>> https://groups.google.com/forum/#!msg/gnu.emacs.bug/SK-RO_OG_GQ/BmUzA-23AgAJ
>>
>> ?
>>
>> This doesn't allow other users to express their opinion on the matter.
>
> Our posts do appear on that page, not sure why you don't see them.

You are right, I see them in Chrome; it looks like a strange Firefox
problem.

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
  2019-05-28 5:39 ` Janusz S. Bień
@ 2020-12-30 17:49 ` Janusz S. Bień
  2020-12-30 20:52 ` Eli Zaretskii
  0 siblings, 1 reply; 39+ messages in thread
From: Janusz S. Bień @ 2020-12-30 17:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 32599

Let me approach the problem from another angle.

Input of Unicode characters by name is done by "insert-char", defined in
editfns.c. The code is quite short:

--8<---------------cut here---------------start------------->8---
DEFUN ("insert-char", Finsert_char, Sinsert_char, 1, 3,
       "(list (read-char-by-name \"Insert character (Unicode name or hex): \")\
              (prefix-numeric-value current-prefix-arg)\
              t))",
       doc: /* Insert COUNT copies of CHARACTER.
Interactively, prompt for CHARACTER.  You can specify CHARACTER in one
of these ways:

 - As its Unicode character name, e.g. \"LATIN SMALL LETTER A\".
   Completion is available; if you type a substring of the name
   preceded by an asterisk `*', Emacs shows all names which include
   that substring, not necessarily at the beginning of the name.

[...]

The optional third argument INHERIT, if non-nil, says to inherit text
properties from adjoining text, if those properties are sticky.  If
called interactively, INHERIT is t.  */)
  (Lisp_Object character, Lisp_Object count, Lisp_Object inherit)
{
  int i, stringlen;
  register ptrdiff_t n;
  int c, len;
  unsigned char str[MAX_MULTIBYTE_LENGTH];
  char string[4000];

  CHECK_CHARACTER (character);
  if (NILP (count))
    XSETFASTINT (count, 1);
  else
    CHECK_FIXNUM (count);
  c = XFIXNAT (character);

  if (!NILP (BVAR (current_buffer, enable_multibyte_characters)))
    len = CHAR_STRING (c, str);
  else
    str[0] = c, len = 1;
  if (XFIXNUM (count) <= 0)
    return Qnil;
  if (BUF_BYTES_MAX / len < XFIXNUM (count))
    buffer_overflow ();
  n = XFIXNUM (count) * len;
  stringlen = min (n, sizeof string - sizeof string % len);
  for (i = 0; i < stringlen; i++)
    string[i] = str[i % len];
  while (n > stringlen)
    {
      maybe_quit ();
      if (!NILP (inherit))
        insert_and_inherit (string, stringlen);
      else
        insert (string, stringlen);
      n -= stringlen;
    }
  if (!NILP (inherit))
    insert_and_inherit (string, n);
  else
    insert (string, n);
  return Qnil;
}
--8<---------------cut here---------------end--------------->8---

Which part of the code is responsible for prompting, name input and
consulting the character names list?

Best regards

Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
  2020-12-30 17:49 ` Janusz S. Bień
@ 2020-12-30 20:52 ` Eli Zaretskii
  2020-12-31 6:39 ` Janusz S. Bień
  0 siblings, 1 reply; 39+ messages in thread
From: Eli Zaretskii @ 2020-12-30 20:52 UTC (permalink / raw)
  To: jsbien; +Cc: 32599

> From: jsbien@mimuw.edu.pl (Janusz S. Bień)
> Cc: 32599@debbugs.gnu.org
> Date: Wed, 30 Dec 2020 18:49:26 +0100
>
> Which part of the code is responsible for prompting, name input and
> consulting the character names list?

The interactive spec:

       "(list (read-char-by-name \"Insert character (Unicode name or hex): \")\
              (prefix-numeric-value current-prefix-arg)\
              t))",

IOW, the call to read-char-by-name does that.
* bug#32599: 25.2; Feature request: input PUA characters by name
  2020-12-30 20:52 ` Eli Zaretskii
@ 2020-12-31 6:39 ` Janusz S. Bień
  2020-12-31 7:49 ` Eli Zaretskii
  0 siblings, 1 reply; 39+ messages in thread
From: Janusz S. Bień @ 2020-12-31 6:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 32599

On Wed, Dec 30 2020 at 22:52 +02, Eli Zaretskii wrote:
>> From: jsbien@mimuw.edu.pl (Janusz S. Bień)
>> Cc: 32599@debbugs.gnu.org
>> Date: Wed, 30 Dec 2020 18:49:26 +0100
>>
>> Which part of the code is responsible for prompting, name input and
>> consulting the character names list?
>
> The interactive spec:
>
>        "(list (read-char-by-name \"Insert character (Unicode name or hex): \")\
>               (prefix-numeric-value current-prefix-arg)\
>               t))",
>
> IOW, the call to read-char-by-name does that.

Thanks. The answer was quite obvious, I'm slightly ashamed I didn't
notice it :-(

So the PUA names are to be added to ucs-names in mule-cmds.el. At first
glance it doesn't seem difficult, as I already have them in the form of
uni-name.el. I will work on it after New Year.

Regards - Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
  2020-12-31 6:39 ` Janusz S. Bień
@ 2020-12-31 7:49 ` Eli Zaretskii
  2020-12-31 8:14 ` Janusz S. Bień
  0 siblings, 1 reply; 39+ messages in thread
From: Eli Zaretskii @ 2020-12-31 7:49 UTC (permalink / raw)
  To: jsbien; +Cc: 32599

On December 31, 2020 8:39:03 AM GMT+02:00, jsbien@mimuw.edu.pl wrote:

> So the PUA names are to be added to ucs-names in mule-cmds.el. At first
> glance it doesn't seem difficult, as I already have them in the form of
> uni-name.el. I will work on it after New Year.

What would be the advantage of adding names for PUA codepoints? The
disadvantage is clear: bloating the Emacs process memory footprint by
some 137,000 strings of some non-descriptive form, like
PUA-CHARACTER-nn.

Btw, the names are not generated in mule-cmds.el, they are generated in
unidata-gen.el.
* bug#32599: 25.2; Feature request: input PUA characters by name
  2020-12-31 7:49 ` Eli Zaretskii
@ 2020-12-31 8:14 ` Janusz S. Bień
  2020-12-31 9:06 ` Eli Zaretskii
  0 siblings, 1 reply; 39+ messages in thread
From: Janusz S. Bień @ 2020-12-31 8:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 32599

On Thu, Dec 31 2020 at 9:49 +02, Eli Zaretskii wrote:
> On December 31, 2020 8:39:03 AM GMT+02:00, jsbien@mimuw.edu.pl wrote:
>> So the PUA names are to be added to ucs-names in mule-cmds.el. At first
>> glance it doesn't seem difficult, as I already have them in the form of
>> uni-name.el. I will work on it after New Year.
>
> What would be the advantage of adding names for PUA codepoints?

To make it clear: I'm not adding names to PUA codepoints, I am adding to
Emacs the names already in use for some PUA codepoints.

> The disadvantage is clear: bloating the Emacs process memory footprint
> by some 137,000 strings of some non-descriptive form, like
> PUA-CHARACTER-nn.

What do you need PUA-CHARACTER-nn for? I never proposed anything like
this.

Moreover:

There is always a price for everything.

BTW, not 137,000 but 739 for MUFI 4.0 and a little more for the current
version.

>
> Btw, the names are not generated in mule-cmds.el, they are generated
> in unidata-gen.el.

I know that and, as I said in my previous mail, I've generated them
already:

--8<---------------cut here---------------start------------->8---
             position: 4051 of 4693 (86%), column: 40
            character:  (displayed as ) (codepoint 59575, #o164267, #xe8b7)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0xE8B7
               syntax: w        which means: word
             category: L:Left-to-right (strong)
             to input: type "C-x 8 RET e8b7"
          buffer code: #xEE #xA2 #xB7
            file code: #xEE #xA2 #xB7 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    xft:-psbk-Junicode-normal-normal-normal-*-15-*-*-*-*-0-iso10646-1 (#x94C)

Character code properties: customize what to show
  name: LATIN SMALL LETTER LONG S WITH FLOURISH
  general-category: Co (Other, Private Use)
  decomposition: (59575) ('')
--8<---------------cut here---------------end--------------->8---

However, read-char-by-name takes the names from ucs-names, which I have
to update.

Regards - Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
  2020-12-31 8:14 ` Janusz S. Bień
@ 2020-12-31 9:06 ` Eli Zaretskii
  2020-12-31 9:31 ` Janusz S. Bień
  0 siblings, 1 reply; 39+ messages in thread
From: Eli Zaretskii @ 2020-12-31 9:06 UTC (permalink / raw)
  To: jsbien; +Cc: 32599

On December 31, 2020 10:14:41 AM GMT+02:00, jsbien@mimuw.edu.pl wrote:
> On Thu, Dec 31 2020 at 9:49 +02, Eli Zaretskii wrote:
> > On December 31, 2020 8:39:03 AM GMT+02:00, jsbien@mimuw.edu.pl wrote:
> >> So the PUA names are to be added to ucs-names in mule-cmds.el. At
> >> first glance it doesn't seem difficult, as I already have them in
> >> the form of uni-name.el. I will work on it after New Year.
> >
> > What would be the advantage of adding names for PUA codepoints?
>
> To make it clear: I'm not adding names to PUA codepoints, I am adding to
> Emacs the names already in use for some PUA codepoints.
>
> > The disadvantage is clear: bloating the Emacs process memory footprint
> > by some 137,000 strings of some non-descriptive form, like
> > PUA-CHARACTER-nn.
>
> What do you need PUA-CHARACTER-nn for? I never proposed anything like
> this.
>
> Moreover:
>
> There is always a price for everything.
>
> BTW, not 137,000 but 739 for MUFI 4.0 and a little more for the current
> version.

OK, it wasn't clear that you want to add MUFI codepoints.

However, if that is the intent, I think we should first discuss whether
we want to add PUA codepoints defined by the different initiatives, and
if so, which ones. MUFI is not the only such collection, there are
others.

We could instead wait until these codepoints are incorporated into the
Unicode Standard (as already happened with some MUFI codepoints).

Or we might decide that we want to add some infrastructure which could
be used by users to add support for ranges of PUA without adding that by
default.

IOW, I think this kind of change should be discussed first, and the
place to discuss it is not here, it's on emacs-devel.
* bug#32599: 25.2; Feature request: input PUA characters by name
  2020-12-31 9:06 ` Eli Zaretskii
@ 2020-12-31 9:31 ` Janusz S. Bień
  2020-12-31 10:31 ` Eli Zaretskii
  0 siblings, 1 reply; 39+ messages in thread
From: Janusz S. Bień @ 2020-12-31 9:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 32599

On Thu, Dec 31 2020 at 11:06 +02, Eli Zaretskii wrote:

[...]

> OK, it wasn't clear that you want to add MUFI codepoints.
>
> However, if that is the intent, I think we should first discuss
> whether we want to add PUA codepoints defined by the different
> initiatives, and if so, which ones. MUFI is not the only such
> collection, there are others.

Correct, e.g. the character set of the JuniusX font, which uses plane 15
(https://github.com/psb1558/Junicode-New/discussions/44)

> We could instead wait until these codepoints are incorporated into the
> Unicode Standard (as already happened with some MUFI codepoints).

Some of them will never be accepted by Unicode because they violate its
principles.

> Or we might decide that we want to add some infrastructure which could
> be used by users to add support for ranges of PUA without adding that
> by default.

That's what I have had in mind from the very beginning.

> IOW, I think this kind of change should be discussed first, and the
> place to discuss it is not here, it's on emacs-devel.

Before starting the discussion on emacs-devel I would like to have a
proof-of-concept and also a quick solution for my own use.

Regards - Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
  2020-12-31 9:31 ` Janusz S. Bień
@ 2020-12-31 10:31 ` Eli Zaretskii
  0 siblings, 0 replies; 39+ messages in thread
From: Eli Zaretskii @ 2020-12-31 10:31 UTC (permalink / raw)
  To: jsbien; +Cc: 32599

On December 31, 2020 11:31:32 AM GMT+02:00, jsbien@mimuw.edu.pl wrote:
> On Thu, Dec 31 2020 at 11:06 +02, Eli Zaretskii wrote:
>
> [...]
>
> > OK, it wasn't clear that you want to add MUFI codepoints.
> >
> > However, if that is the intent, I think we should first discuss
> > whether we want to add PUA codepoints defined by the different
> > initiatives, and if so, which ones. MUFI is not the only such
> > collections, there are others.
>
> Correct, e.g. the character set of JuniusX font which uses plane 15
> (https://github.com/psb1558/Junicode-New/discussions/44)
>
> > We could instead wait until these codepoints are incorporated into the
> > Unicode Standard (as already happened with some MUFI codepoints).
>
> Some of them will never be accepted by Unicode because they violate
> their principle.
>
> > Or we might decide that we want to add some infrastructure which could
> > be used by users to add support for ranges of PUA without adding that
> > by default.
>
> That's what I have in mind from the very beginning.

Well, your message talked about adding names for these codepoints, not
about adding infrastructure. Apologies for reading that too literally.
* bug#32599: 25.2; Feature request: input PUA characters by name
  2018-08-31 9:09 ` Janusz S. Bień
  2018-08-31 12:34 ` Robert Pluim
@ 2022-04-26 13:09 ` Lars Ingebrigtsen
  2022-04-26 13:30 ` Janusz S. Bień
  1 sibling, 1 reply; 39+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-26 13:09 UTC (permalink / raw)
  To: Janusz S. Bień; +Cc: 32599

jsbien@mimuw.edu.pl (Janusz S. Bień) writes:

> You are missing "PUA" in the topic and in the text of my feature
> request. PUA is Private Use Area. Cf. e.g.

(I'm going through old bug reports that unfortunately weren't resolved
at the time.)

If I understand correctly, you just want to be able to extend the range
of characters you get when saying `C-x 8 RET TAB'?

That can be done today by just adding elements to the `ucs-names' hash
table. Is that not sufficient for your needs?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#32599: 25.2; Feature request: input PUA characters by name
  2022-04-26 13:09 ` Lars Ingebrigtsen
@ 2022-04-26 13:30 ` Janusz S. Bień
  2022-04-26 13:37 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 39+ messages in thread
From: Janusz S. Bień @ 2022-04-26 13:30 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 32599

On Tue, Apr 26 2022 at 15:09 +02, Lars Ingebrigtsen wrote:
> jsbien@mimuw.edu.pl (Janusz S. Bień) writes:
>
>> You are missing "PUA" in the topic and in the text of my feature
>> request. PUA is Private Use Area. Cf. e.g.
>
> (I'm going through old bug reports that unfortunately weren't resolved
> at the time.)
>
> If I understand correctly, you just want to be able to extend the range
> of characters you get when saying `C-x 8 RET TAB'?

Yes. I've solved it in a quick and dirty way, but it was very cumbersome
and usable only in a specific version of Emacs:

https://github.com/jsbien/unicode4polish/tree/master/Emacs-MUFI

> That can be done today by just adding elements to the `ucs-names' hash
> table.

Which Emacs version do you mean by "today"?

What do you mean by "just adding"?

What about an example, e.g. U+F159 LATIN ABBREVIATION SIGN SMALL DE?

> Is that not sufficient for your needs?

Perhaps, but I'm not sure yet.

Best regards

Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
  2022-04-26 13:30 ` Janusz S. Bień
@ 2022-04-26 13:37 ` Lars Ingebrigtsen
  2022-04-26 13:44 ` Janusz S. Bień
  0 siblings, 1 reply; 39+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-26 13:37 UTC (permalink / raw)
  To: Janusz S. Bień; +Cc: 32599

Janusz S. Bień <jsbien@mimuw.edu.pl> writes:

>> That can be done today by just adding elements to the `ucs-names' hash
>> table.
>
> Which Emacs version do you mean by "today"?

Since about 2008.

> What do you mean by "just adding"?
>
> What about an example, e.g. U+F159 LATIN ABBREVIATION SIGN SMALL DE?

(unless ucs-names
  (ucs-names))
(setf (gethash "LATIN ABBREVIATION SIGN SMALL DE" ucs-names) #xF159)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
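The two forms above can be generalized to a whole list of names. What follows is only a minimal sketch, assuming an Emacs in which `ucs-names' holds a hash table; the `my-pua-' names are hypothetical, and the two entries are the only name/codepoint pairs quoted in this thread. As in the forms above, the table must be populated before anything is stored in it.

```elisp
;; Hypothetical table of private-use names; both entries are taken
;; from messages in this thread, nothing else is assumed about MUFI.
(defvar my-pua-names
  '(("LATIN ABBREVIATION SIGN SMALL DE" . #xF159)
    ("LATIN SMALL LETTER LONG S WITH FLOURISH" . #xE8B7))
  "Alist of (NAME . CODEPOINT) pairs to offer in `read-char-by-name'.")

(defun my-pua-register-names ()
  "Add every entry of `my-pua-names' to the `ucs-names' table.
Assumes that `ucs-names' is a hash table once it is initialized."
  ;; The variable `ucs-names' is nil until the function of the same
  ;; name has been called, so initialize it first.
  (unless ucs-names
    (ucs-names))
  (dolist (entry my-pua-names)
    (puthash (car entry) (cdr entry) ucs-names)))
```

After evaluating `(my-pua-register-names)', the new names should be offered for completion by `C-x 8 RET' alongside the standard Unicode names.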
* bug#32599: 25.2; Feature request: input PUA characters by name
  2022-04-26 13:37 ` Lars Ingebrigtsen
@ 2022-04-26 13:44 ` Janusz S. Bień
  2022-04-26 13:45 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 39+ messages in thread
From: Janusz S. Bień @ 2022-04-26 13:44 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 32599

On Tue, Apr 26 2022 at 15:37 +02, Lars Ingebrigtsen wrote:
> Janusz S. Bień <jsbien@mimuw.edu.pl> writes:
>
>>> That can be done today by just adding elements to the `ucs-names' hash
>>> table.
>>
>> Which Emacs version do you mean by "today"?
>
> Since about 2008.
>
>> What do you mean by "just adding"?
>>
>> What about an example, e.g. U+F159 LATIN ABBREVIATION SIGN SMALL DE?
>
> (unless ucs-names
>   (ucs-names))
> (setf (gethash "LATIN ABBREVIATION SIGN SMALL DE" ucs-names) #xF159)

--8<---------------cut here---------------start------------->8---
Debugger entered--Lisp error: (wrong-type-argument hash-table-p nil)
  puthash("LATIN ABBREVIATION SIGN SMALL DE" 61785 nil)
  (let* ((v ucs-names)) (puthash "LATIN ABBREVIATION SIGN SMALL DE" 61785 v))
  (progn (let* ((v ucs-names)) (puthash "LATIN ABBREVIATION SIGN SMALL DE" 61785 v)))
  eval((progn (let* ((v ucs-names)) (puthash "LATIN ABBREVIATION SIGN SMALL DE" 61785 v))) t)
  elisp--eval-last-sexp(t)
  eval-last-sexp(t)
  eval-print-last-sexp(nil)
  funcall-interactively(eval-print-last-sexp nil)
  call-interactively(eval-print-last-sexp nil nil)
  command-execute(eval-print-last-sexp)
--8<---------------cut here---------------end--------------->8---

GNU Emacs 28.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.24,
cairo version 1.16.0) compiled locally.

JSB

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
  2022-04-26 13:44 ` Janusz S. Bień
@ 2022-04-26 13:45 ` Lars Ingebrigtsen
  2022-04-26 15:33 ` Janusz S. Bień
  0 siblings, 1 reply; 39+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-26 13:45 UTC (permalink / raw)
  To: Janusz S. Bień; +Cc: 32599

Janusz S. Bień <jsbien@mimuw.edu.pl> writes:

>> (unless ucs-names
>>   (ucs-names))
>> (setf (gethash "LATIN ABBREVIATION SIGN SMALL DE" ucs-names) #xF159)
>
> Debugger entered--Lisp error: (wrong-type-argument hash-table-p nil)
>   puthash("LATIN ABBREVIATION SIGN SMALL DE" 61785 nil)

Did you eval the `unless' form first?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#32599: 25.2; Feature request: input PUA characters by name
  2022-04-26 13:45 ` Lars Ingebrigtsen
@ 2022-04-26 15:33 ` Janusz S. Bień
  2022-04-26 15:49 ` Robert Pluim
  2022-04-27 11:51 ` Lars Ingebrigtsen
  0 siblings, 2 replies; 39+ messages in thread
From: Janusz S. Bień @ 2022-04-26 15:33 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 32599

On Tue, Apr 26 2022 at 15:45 +02, Lars Ingebrigtsen wrote:
> Janusz S. Bień <jsbien@mimuw.edu.pl> writes:
>
>>> (unless ucs-names
>>>   (ucs-names))
>>> (setf (gethash "LATIN ABBREVIATION SIGN SMALL DE" ucs-names) #xF159)
>>
>> Debugger entered--Lisp error: (wrong-type-argument hash-table-p nil)
>>   puthash("LATIN ABBREVIATION SIGN SMALL DE" 61785 nil)
>
> Did you eval the `unless' form first?

I thought I did, but I used the wrong command :-(

Thank you very much for your solution.

So yes, my original problem is solved and you can close the bug report.

This is however not the end of the story. I would like describe-char to
produce the correct name of the PUA character instead of "the
character’s canonical name and other properties defined by the Unicode
Data Base;". I understand this should be posted as a separate feature
request.

Best regards

Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
  2022-04-26 15:33 ` Janusz S. Bień
@ 2022-04-26 15:49 ` Robert Pluim
  2022-04-26 16:09 ` Janusz S. Bień
  2022-04-27 11:51 ` Lars Ingebrigtsen
  1 sibling, 1 reply; 39+ messages in thread
From: Robert Pluim @ 2022-04-26 15:49 UTC (permalink / raw)
  To: Janusz S. Bień; +Cc: 32599, Lars Ingebrigtsen

>>>>> On Tue, 26 Apr 2022 17:33:37 +0200, Janusz S. Bień <jsbien@mimuw.edu.pl> said:

    Janusz> On Tue, Apr 26 2022 at 15:45 +02, Lars Ingebrigtsen wrote:
    >> Janusz S. Bień <jsbien@mimuw.edu.pl> writes:
    >>
    >>>> (unless ucs-names
    >>>>   (ucs-names))
    >>>> (setf (gethash "LATIN ABBREVIATION SIGN SMALL DE" ucs-names) #xF159)
    >>>
    >>> Debugger entered--Lisp error: (wrong-type-argument hash-table-p nil)
    >>>   puthash("LATIN ABBREVIATION SIGN SMALL DE" 61785 nil)
    >>
    >> Did you eval the `unless' form first?

    Janusz> I thought I did but I used the wrong command :-(

    Janusz> Thank you very much for your solution.

    Janusz> So yes, my original problem is solved and you can close the bug report.

    Janusz> This is however not the end of the story. I would like describe-char to
    Janusz> produce the correct name of the PUA character instead of "the
    Janusz> character’s canonical name and other properties defined by the Unicode
    Janusz> Data Base;". I understand this should posted as a separate feature
    Janusz> request.

(put-char-code-property #xf159 'name "LATIN ABBREVIATION SIGN SMALL DE")

(PUA characters donʼt have canonical names).

Robert
-- 
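Robert's one-liner extends naturally to several characters. This is a sketch only: `my-pua-set-names' is a hypothetical helper, the alist passed to it holds the two names mentioned in this thread, and it is assumed that `describe-char' reads the `name' property set this way.

```elisp
(defun my-pua-set-names (names)
  "Set the `name' character property for each (NAME . CODEPOINT) in NAMES.
NAMES is an alist supplied by the caller; nothing is read from the
Unicode data files."
  (dolist (entry names)
    ;; Same call as in the message above, applied to each pair.
    (put-char-code-property (cdr entry) 'name (car entry))))

;; Usage with the two names quoted in this thread.
(my-pua-set-names
 '(("LATIN ABBREVIATION SIGN SMALL DE" . #xF159)
   ("LATIN SMALL LETTER LONG S WITH FLOURISH" . #xE8B7)))
```

The stored value can be read back with `(get-char-code-property #xF159 'name)'.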
* bug#32599: 25.2; Feature request: input PUA characters by name
  2022-04-26 15:49 ` Robert Pluim
@ 2022-04-26 16:09 ` Janusz S. Bień
  2022-04-26 16:44 ` Eli Zaretskii
  0 siblings, 1 reply; 39+ messages in thread
From: Janusz S. Bień @ 2022-04-26 16:09 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 32599, Lars Ingebrigtsen

On Tue, Apr 26 2022 at 17:49 +02, Robert Pluim wrote:

[...]

>     Janusz> This is however not the end of the story. I would like describe-char to
>     Janusz> produce the correct name of the PUA character instead of "the
>     Janusz> character’s canonical name and other properties defined by the Unicode
>     Janusz> Data Base;". I understand this should posted as a separate feature
>     Janusz> request.
>
> (put-char-code-property #xf159 'name "LATIN ABBREVIATION SIGN SMALL DE")

Great, thank you very much!

> (PUA characters donʼt have canonical names).

I know, I'm interested only in Medieval Unicode Font Initiative names.

But this is not the end of the story :-)

What is needed to treat the character as a letter? To be precise, I
would like forward-word to go to the end of ahb; now it stops after a.

Best regards

Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
  2022-04-26 16:09 ` Janusz S. Bień
@ 2022-04-26 16:44 ` Eli Zaretskii
  2022-04-26 18:22 ` Janusz S. Bień
  0 siblings, 1 reply; 39+ messages in thread
From: Eli Zaretskii @ 2022-04-26 16:44 UTC (permalink / raw)
  To: jsbien; +Cc: 32599, rpluim, larsi

> From: Janusz S. Bień <jsbien@mimuw.edu.pl>
> Date: Tue, 26 Apr 2022 18:09:53 +0200
> Cc: 32599@debbugs.gnu.org, Lars Ingebrigtsen <larsi@gnus.org>
>
> What is needed to treat the character as a letter? To be precise, I
> would forward-word to go to the end of ahb, now it stops after a.

You need to manually assign that character its properties. See
characters.el. You may also need to manually assign some Unicode
properties using put-char-code-property.
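One way to try this advice out is sketched below. It rests on assumptions that are not confirmed anywhere in this thread: that word motion depends both on the syntax class of the character and on the script recorded for it in `char-script-table', and that treating a MUFI character as `latin' is acceptable. `my-pua-treat-as-latin-letter' is a hypothetical helper name.

```elisp
(defun my-pua-treat-as-latin-letter (char)
  "Make CHAR behave like a Latin letter for word motion.
Sketch only: gives CHAR word syntax in the standard syntax table and
records its script as `latin' in `char-script-table'."
  ;; Word-constituent syntax, so word commands do not stop on CHAR.
  (modify-syntax-entry char "w" (standard-syntax-table))
  ;; Same script as the surrounding ASCII letters, so that no word
  ;; boundary is assumed between them and CHAR.
  (set-char-table-range char-script-table char 'latin))

;; Usage with the character discussed above.
(my-pua-treat-as-latin-letter #xF159)
```

Major modes that install their own syntax table without inheriting from the standard one would need the same syntax entry added to that table.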
* bug#32599: 25.2; Feature request: input PUA characters by name
  2022-04-26 16:44 ` Eli Zaretskii
@ 2022-04-26 18:22 ` Janusz S. Bień
  0 siblings, 0 replies; 39+ messages in thread
From: Janusz S. Bień @ 2022-04-26 18:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 32599, rpluim, larsi

On Tue, Apr 26 2022 at 19:44 +03, Eli Zaretskii wrote:
>> From: Janusz S. Bień <jsbien@mimuw.edu.pl>
>> Date: Tue, 26 Apr 2022 18:09:53 +0200
>> Cc: 32599@debbugs.gnu.org, Lars Ingebrigtsen <larsi@gnus.org>
>>
>> What is needed to treat the character as a letter? To be precise, I
>> would forward-word to go to the end of ahb, now it stops after a.
>
> You need to manually assign that character its properties. See
> characters.el. You may also need to manually assign some Unicode
> properties using put-char-code-property.

Thanks for the advice.

Best regards

Janusz

-- 
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
* bug#32599: 25.2; Feature request: input PUA characters by name
  2022-04-26 15:33 ` Janusz S. Bień
  2022-04-26 15:49 ` Robert Pluim
@ 2022-04-27 11:51 ` Lars Ingebrigtsen
  1 sibling, 0 replies; 39+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-27 11:51 UTC (permalink / raw)
  To: Janusz S. Bień; +Cc: 32599

Janusz S. Bień <jsbien@mimuw.edu.pl> writes:

> So yes, my original problem is solved and you can close the bug report.

OK; done.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no