unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#51638: 26.1; Writing Romanian Characters
@ 2021-11-06 12:28 crstml
  2021-11-06 17:12 ` Eli Zaretskii
  0 siblings, 1 reply; 4+ messages in thread
From: crstml @ 2021-11-06 12:28 UTC
  To: 51638

--text follows this line--



---- BEGIN DESCRIPTION ----
Hello all,

After asking for a solution to a personal problem on the help-gnu-emacs mailing
list, I was told to submit a bug report. So here is the issue:


Sometimes I need to use Romanian characters. After I configure Emacs to use
the Romanian language environment with set-language-environment, I activate,
with C-\, an input method that allows me to write these characters.

It works very well. I use the latin-2-postfix input method to write
language-specific characters. But there is a problem:

In the above-mentioned input method, if I write

      s ,

I obtain the character

      ş (UNICODE: U+015F; ISO-8859-2/iso-latin-2: 0xBA or 186; Entity: &scedil;)

which is very similar to but NOT THE SAME as the Romanian character

      ș (UNICODE: U+0219; ISO-8859-16/iso-latin-10: 0xBA or 186; Entity: &#x219;)
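
The shared byte value can be checked outside Emacs. This is a minimal Python sketch, assuming Python's standard `iso8859_2` and `iso8859_16` codecs. The variable names are illustrative only and do not come from Emacs.

```python
import unicodedata

# The same byte value, 0xBA, decoded under the two charsets.
byte = b"\xba"
latin2_char = byte.decode("iso8859_2")    # ISO-8859-2 (iso-latin-2)
latin10_char = byte.decode("iso8859_16")  # ISO-8859-16 (iso-latin-10)

# Print the code point and the Unicode name of each result.
for ch in (latin2_char, latin10_char):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The two results have different code points and different Unicode names, even though they start from the same byte.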

These are two distinct characters. Visually there is no problem reading a
Romanian text containing U+015F instead of U+0219, but there can be problems
when we perform character conversions or searches.
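
The search problem can be illustrated with a short Python sketch. The sample word and variable names are assumptions chosen for illustration. The sketch also tries Unicode NFKC normalization, which does not unify the two letters.

```python
import unicodedata

text = "\u0219es"            # "șes" written with S WITH COMMA BELOW (U+0219)
needle_cedilla = "\u015fes"  # the same word written with S WITH CEDILLA (U+015F)

# A plain substring search does not match across the two variants.
found_raw = needle_cedilla in text

# Normalizing both sides to NFKC does not help either, because the two
# letters decompose to different combining marks (U+0327 versus U+0326).
found_nfkc = (unicodedata.normalize("NFKC", needle_cedilla)
              in unicodedata.normalize("NFKC", text))

print(found_raw, found_nfkc)  # prints: False False
```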

For example, if I write my Romanian text with U+015F instead of U+0219 and
save it in Unicode, all works well. But if I then want to convert that file
to ISO-8859-16, the converter will tell me that the character U+015F cannot
be converted to the requested character set.
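
The conversion failure is not specific to one converter. A minimal Python sketch, assuming the standard `iso8859_2` and `iso8859_16` codecs, shows the same behaviour. The helper `can_encode` is a hypothetical name introduced here.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of TEXT exists in CODEC."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


word_cedilla = "\u015fes"  # written with U+015F (cedilla)
word_comma = "\u0219es"    # written with U+0219 (comma below)

for word in (word_cedilla, word_comma):
    for codec in ("iso8859_2", "iso8859_16"):
        print(f"U+{ord(word[0]):04X} -> {codec}: {can_encode(word, codec)}")
```

Each variant can be encoded only in its own charset: U+015F in ISO-8859-2, U+0219 in ISO-8859-16.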

By running the command "describe-language-environment" in the Romanian
environment, I can see that iso-latin-10 is listed as a coding system
appropriate for this environment.

My question is: is it possible to configure Emacs to use iso-latin-10 instead
of iso-8859-2 in the Romanian environment?

Best regards
Cristian


---- END DESCRIPTION ----

In GNU Emacs 26.1 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.5)
  of 2021-01-31, modified by Debian built on x86-csail-01
Windowing system distributor 'The X.Org Foundation', version 11.0.12004000
System Description:    Debian GNU/Linux 10 (buster)

Recent messages:
Loading /var/cache/dictionaries-common/emacsen-ispell-default.el (source)...done
Loading debian-ispell...done
Loading /var/cache/dictionaries-common/emacsen-ispell-dicts.el (source)...done
Loading /etc/emacs/site-start.d/50dictionaries-common.el (source)...done
Loading paren...done
For information about GNU Emacs and the GNU system, type C-h C-a.
Mark set
ESC <drag-mouse-1> is undefined
Mark set [3 times]
ESC <mouse-1> is undefined

Configured using:
  'configure --build x86_64-linux-gnu --prefix=/usr
  --sharedstatedir=/var/lib --libexecdir=/usr/lib
  --localstatedir=/var/lib --infodir=/usr/share/info
  --mandir=/usr/share/man --enable-libsystemd --with-pop=yes
  --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/26.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/26.1/site-lisp:/usr/share/emacs/site-lisp
  --with-sound=alsa --without-gconf --with-mailutils --build
  x86_64-linux-gnu --prefix=/usr --sharedstatedir=/var/lib
  --libexecdir=/usr/lib --localstatedir=/var/lib
  --infodir=/usr/share/info --mandir=/usr/share/man --enable-libsystemd
  --with-pop=yes
  --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/26.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/26.1/site-lisp:/usr/share/emacs/site-lisp
  --with-sound=alsa --without-gconf --with-mailutils --with-x=yes
  --with-x-toolkit=gtk3 --with-toolkit-scroll-bars 'CFLAGS=-g -O2
  -fdebug-prefix-map=/build/emacs-9Yet8u/emacs-26.1+1=. -fstack-protector-strong
  -Wformat -Werror=format-security -Wall' 'CPPFLAGS=-Wdate-time
  -D_FORTIFY_SOURCE=2' LDFLAGS=-Wl,-z,relro'

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GSETTINGS NOTIFY
ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 THREADS LIBSYSTEMD LCMS2

Important settings:
   value of $LANG: en_US.UTF-8
   locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
   show-paren-mode: t
   tooltip-mode: t
   global-eldoc-mode: t
   eldoc-mode: t
   electric-indent-mode: t
   mouse-wheel-mode: t
   menu-bar-mode: t
   file-name-shadow-mode: t
   global-font-lock-mode: t
   font-lock-mode: t
   blink-cursor-mode: t
   auto-composition-mode: t
   auto-encryption-mode: t
   auto-compression-mode: t
   column-number-mode: t
   line-number-mode: t
   transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message rmc puny seq byte-opt gv
bytecomp byte-compile cconv cl-loaddefs cl-lib dired dired-loaddefs
format-spec rfc822 mml easymenu mml-sec password-cache epa derived epg
epg-config gnus-util rmail rmail-loaddefs mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums mm-util mail-prsvr mail-utils elec-pair paren
cus-start cus-load time-date mule-util tooltip eldoc electric uniquify
ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win
term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow isearch timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932
hebrew greek romanian slovak czech european ethiopic indian cyrillic
chinese composite charscript charprop case-table epa-hook jka-cmpr-hook
help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget hashtable-print-readable
backquote dbusbind inotify lcms2 dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty
make-network-process emacs)

Memory information:
((conses 16 104718 7893)
  (symbols 48 21356 2)
  (miscs 40 65 204)
  (strings 32 30055 1038)
  (string-bytes 1 768403)
  (vectors 16 14759)
  (vector-slots 8 497148 7484)
  (floats 8 51 188)
  (intervals 56 278 0)
  (buffers 992 11))







* bug#51638: 26.1; Writing Romanian Characters
  2021-11-06 12:28 bug#51638: 26.1; Writing Romanian Characters crstml
@ 2021-11-06 17:12 ` Eli Zaretskii
  2021-11-07  1:03   ` crstml
  0 siblings, 1 reply; 4+ messages in thread
From: Eli Zaretskii @ 2021-11-06 17:12 UTC
  To: crstml; +Cc: 51638

> From: crstml@libero.it
> Date: Sat, 6 Nov 2021 13:28:16 +0100
> 
> Sometimes I need to use Romanian characters. After I configure Emacs to use
> the Romanian language environment with set-language-environment, I activate,
> with C-\, an input method that allows me to write these characters.
> 
> It works very well. I use the latin-2-postfix input method to write
> language-specific characters. But there is a problem:
> 
> In the above-mentioned input method, if I write
> 
>       s ,
> 
> I obtain the character
> 
>       ş (UNICODE: U+015F; ISO-8859-2/iso-latin-2: 0xBA or 186; Entity: &scedil;)
> 
> which is very similar to but NOT THE SAME as the Romanian character
> 
>       ș (UNICODE: U+0219; ISO-8859-16/iso-latin-10: 0xBA or 186; Entity: &#x219;)
> 
> These are two distinct characters. Visually there is no problem reading a
> Romanian text containing U+015F instead of U+0219, but there can be problems
> when we perform character conversions or searches.
> 
> For example, if I write my Romanian text with U+015F instead of U+0219 and
> save it in Unicode, all works well. But if I then want to convert that file
> to ISO-8859-16, the converter will tell me that the character U+015F cannot
> be converted to the requested character set.
> 
> By running the command "describe-language-environment" in the Romanian
> environment, I can see that iso-latin-10 is listed as a coding system
> appropriate for this environment.
> 
> My question is: is it possible to configure Emacs to use iso-latin-10 instead
> of iso-8859-2 in the Romanian environment?

Please try the patch below.  After applying the patch, typing "s ,"
will show two variants in the echo-area, and you can choose between
them with C-f/C-b and the arrow keys.

Is that a satisfactory solution?

diff --git a/lisp/leim/quail/latin-post.el b/lisp/leim/quail/latin-post.el
index 8329fff..78ae896 100644
--- a/lisp/leim/quail/latin-post.el
+++ b/lisp/leim/quail/latin-post.el
@@ -215,7 +215,15 @@
   others     |    /    | s/ -> ß
 
 Doubling the postfix separates the letter and postfix: e.g. a\\='\\=' -> a\\='
-" nil t nil nil nil nil nil nil nil nil t)
+"
+ '(("\C-?" . quail-delete-last-char)
+   (">" . quail-next-translation)
+   ("\C-f" . quail-next-translation)
+   ([right] . quail-next-translation)
+   ("<" . quail-prev-translation)
+   ("\C-b" . quail-prev-translation)
+   ([left] . quail-prev-translation))
+ t nil nil nil nil nil nil nil nil t)
 
 (quail-define-rules
  ("A'" ?Á)
@@ -246,7 +254,7 @@
  ("R'" ?Ŕ)
  ("R~" ?Ř)
  ("S'" ?Ś)
- ("S," ?Ş)
+ ("S," "ŞȘ") ; the second variant is for Romanian
  ("S~" ?Š)
  ("T," ?Ţ)
  ("T~" ?Ť)
@@ -286,7 +294,7 @@
  ("r'" ?ŕ)
  ("r~" ?ř)
  ("s'" ?ś)
- ("s," ?ş)
+ ("s," "şș") ; the second variant is for Romanian
  ("s/" ?ß)
  ("s~" ?š)
  ("t," ?ţ)






* bug#51638: 26.1; Writing Romanian Characters
  2021-11-06 17:12 ` Eli Zaretskii
@ 2021-11-07  1:03   ` crstml
  2021-11-07 10:48     ` Eli Zaretskii
  0 siblings, 1 reply; 4+ messages in thread
From: crstml @ 2021-11-07  1:03 UTC
  To: Eli Zaretskii; +Cc: 51638

[-- Attachment #1: Type: text/plain, Size: 2734 bytes --]

Eli Zaretskii wrote:
>
> Please try the patch below.  After applying the patch, typing "s ,"
> will show two variants in the echo-area, and you can choose between
> them with C-f/C-b and the arrow keys.
>
> Is that a satisfactory solution?

The solution is satisfactory for personal use. Thank you very much.
But there are the following issues:

1) The patch is not complete for Romanian. A solution for Romanian
     should also handle the letters T and t. I am attaching a patch
     based on yours that also handles these letters.

2) For example, if I try to type the Romanian word "Șes", I type
     "S" and "," to write the "S WITH COMMA" and then "C-f" to select
     my letter. At that point it would be best for the user to simply
     type the next letter of the word ("e" in this case). But that
     doesn't work: if the user types "e", the first choice is inserted
     in the buffer. The user must confirm the second letter with ENTER.

3) For general use a better solution should exist, because it is not
     very user-friendly to type so many characters to select a letter.
     Probably a true Romanian language environment based on
     ISO-8859-16 should be provided. The current choice of ISO-8859-2
     for the Romanian environment is probably not the best.

    For example, I will modify your patch to insert the ISO-8859-16
    characters instead of those in ISO-LATIN-2, because when I write Romanian
    I never need the s and t variants from ISO-LATIN-2.
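
Text already typed with the cedilla variants can be repaired after the fact. The Python sketch below is an illustration only, not part of the patch. The table and function names are hypothetical, and the mapping should be applied only to Romanian text, since languages such as Turkish use the cedilla forms legitimately.

```python
# Map the four ISO-8859-2 cedilla letters to the comma-below letters
# that ISO-8859-16 provides for Romanian.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015e": "\u0218",  # Ş -> Ș
    "\u015f": "\u0219",  # ş -> ș
    "\u0162": "\u021a",  # Ţ -> Ț
    "\u0163": "\u021b",  # ţ -> ț
})


def to_romanian_comma_below(text: str) -> str:
    """Replace cedilla S/T variants with their comma-below counterparts."""
    return text.translate(CEDILLA_TO_COMMA)


sample = "\u015ees \u0163ar\u0103"       # "Şes ţară" typed with cedillas
fixed = to_romanian_comma_below(sample)  # "Șes țară" with comma-below letters
encoded = fixed.encode("iso8859_16")     # encodable once the letters are mapped
print(fixed, len(encoded))
```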

That's all. Thank you.
Cristian








[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: patch-for-s-and-t.patch --]
[-- Type: text/x-patch; name="patch-for-s-and-t.patch", Size: 1176 bytes --]

--- /usr/share/emacs/26.1/lisp/leim/quail/latin-post.el	2021-11-07 01:15:43.545644339 +0100
+++ with-t/latin-post.el	2021-11-07 00:51:48.914144632 +0100
@@ -215,7 +215,15 @@
   others     |    /    | s/ -> ß
 
 Doubling the postfix separates the letter and postfix: e.g. a\\='\\=' -> a\\='
-" nil t nil nil nil nil nil nil nil nil t)
+"
+ '(("\C-?" . quail-delete-last-char)
+   (">" . quail-next-translation)
+   ("\C-f" . quail-next-translation)
+   ([right] . quail-next-translation)
+   ("<" . quail-prev-translation)
+   ("\C-b" . quail-prev-translation)
+   ([left] . quail-prev-translation))
+ t nil nil nil nil nil nil nil nil t)
 
 (quail-define-rules
  ("A'" ?Á)
@@ -246,9 +254,9 @@
  ("R'" ?Ŕ)
  ("R~" ?Ř)
  ("S'" ?Ś)
- ("S," ?Ş)
+ ("S," "ŞȘ") ; the second variant is for Romanian
  ("S~" ?Š)
- ("T," ?Ţ)
+ ("T," "ŢȚ") ; the second variant is for Romanian
  ("T~" ?Ť)
  ("U'" ?Ú)
  ("U:" ?Ű)
@@ -286,10 +294,10 @@
  ("r'" ?ŕ)
  ("r~" ?ř)
  ("s'" ?ś)
- ("s," ?ş)
+ ("s," "şș") ; the second variant is for Romanian
  ("s/" ?ß)
  ("s~" ?š)
- ("t," ?ţ)
+ ("t," "ţț") ; the second variant is for Romanian
  ("t~" ?ť)
  ("u'" ?ú)
  ("u:" ?ű)


* bug#51638: 26.1; Writing Romanian Characters
  2021-11-07  1:03   ` crstml
@ 2021-11-07 10:48     ` Eli Zaretskii
  0 siblings, 0 replies; 4+ messages in thread
From: Eli Zaretskii @ 2021-11-07 10:48 UTC
  To: crstml; +Cc: 51638

> Cc: 51638@debbugs.gnu.org
> From: crstml@libero.it
> Date: Sun, 7 Nov 2021 02:03:12 +0100
> 
> Eli Zaretskii wrote:
> >
> > Please try the patch below.  After applying the patch, typing "s ,"
> > will show two variants in the echo-area, and you can choose between
> > them with C-f/C-b and the arrow keys.
> >
> > Is that a satisfactory solution?
> 
> The solution is satisfactory for personal use. Thank you very much.
> But there are the following issues:
> 
> 1) The patch is not complete for Romanian. A solution for Romanian
>      should also handle the letters T and t. I am attaching a patch
>      based on yours that also handles these letters.

Thanks, I installed that.

> 2) For example, if I try to type the Romanian word "Șes", I type
>      "S" and "," to write the "S WITH COMMA" and then "C-f" to select
>      my letter. At that point it would be best for the user to simply
>      type the next letter of the word ("e" in this case). But that
>      doesn't work: if the user types "e", the first choice is inserted
>      in the buffer. The user must confirm the second letter with ENTER.

That's how Emacs input methods work when there's more than one
translation of what you typed.

> 3) For general use a better solution should exist, because it is not
>      very user-friendly to type so many characters to select a letter.
>      Probably a true Romanian language environment based on
>      ISO-8859-16 should be provided. The current choice of ISO-8859-2
>      for the Romanian environment is probably not the best.
> 
>     For example, I will modify your patch to insert the ISO-8859-16
>     characters instead of those in ISO-LATIN-2, because when I write Romanian
>     I never need the s and t variants from ISO-LATIN-2.

Patches to add such a language environment will be welcome.

Thanks.





Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
