* bug#12693: 24.2.50; src/w32font.c should depend on ANSI code page
@ 2012-10-20 21:46 Kazuhiro Ito
  2012-10-23 11:52 ` Jason Rumney
  2020-09-13 16:16 ` bug#12693: [cygwin] Setting fonts with non-ascii names throws error quit Lars Ingebrigtsen
  0 siblings, 2 replies; 11+ messages in thread
From: Kazuhiro Ito @ 2012-10-20 21:46 UTC (permalink / raw)
  To: 12693

When I run Emacs on Cygwin with the native Windows UI, I can't specify
a font by a non-ASCII font name.  For example, the code below succeeds
with the precompiled binary on Windows (Japanese edition) but raises an
error on Cygwin with the native Windows UI.

(set-default-font "MS ゴシック-14")

The reason is that the lfFaceName member of the LOGFONT structure is
expected to be encoded in the ANSI code page, but Emacs encodes and
decodes it with the coding system specified by the locale-coding-system
variable.  That variable is set to utf-8-unix on Cygwin, which causes
the above problem.
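
As a rough illustration of the mismatch described above, here is a
minimal Python sketch.  It is not Emacs code and does not call the
Windows API; it only models the encoding step.  It assumes the ANSI
code page is CP932 (as on a Japanese edition of Windows), and the
helper name face_name_bytes is hypothetical.

```python
# Illustrative sketch only: models the encoding step, not the Windows API.
# Assumption: the ANSI code page is CP932 (Japanese edition of Windows).

def face_name_bytes(name: str, coding_system: str) -> bytes:
    """Return the bytes that would be copied into a byte-oriented face-name field."""
    return name.encode(coding_system)


name = "MS ゴシック"
as_locale = face_name_bytes(name, "utf-8")  # what a UTF-8 locale-coding-system yields
as_ansi = face_name_bytes(name, "cp932")    # what the ANSI code page yields

print(as_locale.hex())
print(as_ansi.hex())
print(as_locale == as_ansi)  # the two byte sequences are not the same
```

A byte buffer prepared as UTF-8 therefore names a different (most
likely nonexistent) face to an API that interprets it as CP932.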

I think the patch below, or a similar modification, is needed.


=== modified file 'src/w32font.c'
--- src/w32font.c	2012-09-17 12:07:36 +0000
+++ src/w32font.c	2012-10-20 12:12:49 +0000
@@ -34,6 +34,15 @@
 #include "font.h"
 #include "w32font.h"
 
+/* From w32select.c */
+extern Lisp_Object QANSICP;
+
+#define ENCODE_ACP(str)					\
+  (code_convert_string_norecord (str, QANSICP, 1))
+
+#define DECODE_ACP(str)					\
+  (code_convert_string_norecord (str, QANSICP, 0))
+
 /* Cleartype available on Windows XP, cleartype_natural from XP SP1.
    The latter does not try to fit cleartype smoothed fonts into the
    same bounding box as the non-antialiased version of the font.
@@ -285,7 +294,7 @@
 Lisp_Object
 intern_font_name (char * string)
 {
-  Lisp_Object str = DECODE_SYSTEM (build_string (string));
+  Lisp_Object str = DECODE_ACP (build_string (string));
   int len = SCHARS (str);
   Lisp_Object obarray = check_obarray (Vobarray);
   Lisp_Object tem = oblookup (obarray, SDATA (str), len, len);
@@ -971,10 +980,10 @@
       }
     if (name)
       font->props[FONT_FULLNAME_INDEX]
-        = DECODE_SYSTEM (build_string (name));
+        = DECODE_ACP (build_string (name));
     else
       font->props[FONT_FULLNAME_INDEX]
-	= DECODE_SYSTEM (build_string (logfont.lfFaceName));
+	= DECODE_ACP (build_string (logfont.lfFaceName));
   }
 
   font->max_width = w32_font->metrics.tmMaxCharWidth;
@@ -2035,7 +2044,7 @@
       else if (SYMBOLP (tmp))
 	{
 	  strncpy (logfont->lfFaceName,
-		   SDATA (ENCODE_SYSTEM (SYMBOL_NAME (tmp))), LF_FACESIZE);
+		   SDATA (ENCODE_ACP (SYMBOL_NAME (tmp))), LF_FACESIZE);
 	  logfont->lfFaceName[LF_FACESIZE-1] = '\0';
 	}
     }
@@ -2131,7 +2140,7 @@
       if (NILP (family))
         continue;
       else if (SYMBOLP (family))
-        name = SDATA (ENCODE_SYSTEM (SYMBOL_NAME (family)));
+        name = SDATA (ENCODE_ACP (SYMBOL_NAME (family)));
       else
 	continue;
 
@@ -2511,7 +2520,7 @@
       || logfont_to_fcname (&lf, cf.iPointSize, buf, 100) < 0)
     return Qnil;
 
-  return DECODE_SYSTEM (build_string (buf));
+  return DECODE_ACP (build_string (buf));
 }
 
 static const char *const w32font_booleans [] = {

=== modified file 'src/w32select.c'
--- src/w32select.c	2012-10-11 00:32:25 +0000
+++ src/w32select.c	2012-10-20 06:11:00 +0000
@@ -117,7 +117,8 @@
    based on current system parameters. */
 static LCID DEFAULT_LCID;
 static UINT ANSICP, OEMCP;
-static Lisp_Object QUNICODE, QANSICP, QOEMCP;
+static Lisp_Object QUNICODE, QOEMCP;
+Lisp_Object QANSICP;
 
 /* A hidden window just for the clipboard management. */
 static HWND clipboard_owner;


-- 
Kazuhiro Ito






* bug#12693: 24.2.50; src/w32font.c should depend on ANSI code page
  2012-10-20 21:46 bug#12693: 24.2.50; src/w32font.c should depend on ANSI code page Kazuhiro Ito
@ 2012-10-23 11:52 ` Jason Rumney
  2012-10-23 13:05   ` Kazuhiro Ito
  2012-10-23 16:12   ` Eli Zaretskii
  2020-09-13 16:16 ` bug#12693: [cygwin] Setting fonts with non-ascii names throws error quit Lars Ingebrigtsen
  1 sibling, 2 replies; 11+ messages in thread
From: Jason Rumney @ 2012-10-23 11:52 UTC (permalink / raw)
  To: Kazuhiro Ito; +Cc: 12693

Kazuhiro Ito <kzhr@d1.dion.ne.jp> writes:

> When I run Emacs on Cygwin with the native Windows UI, I can't specify
> a font by a non-ASCII font name.  For example, the code below succeeds
> with the precompiled binary on Windows (Japanese edition) but raises an
> error on Cygwin with the native Windows UI.
>
> (set-default-font "MS ゴシック-14")
>
> The reason is that the lfFaceName member of the LOGFONT structure is
> expected to be encoded in the ANSI code page, but Emacs encodes and
> decodes it with the coding system specified by the locale-coding-system
> variable.  That variable is set to utf-8-unix on Cygwin, which causes
> the above problem.

This is a problem with the Cygwin build's initialisation of
locale-coding-system. It is supposed to be set to the coding system that
system calls will accept, which on Windows cannot be utf-8 (maybe on
recent versions it can be, but when I tried on Windows XP, it caused all
manner of problems).








* bug#12693: 24.2.50; src/w32font.c should depend on ANSI code page
  2012-10-23 11:52 ` Jason Rumney
@ 2012-10-23 13:05   ` Kazuhiro Ito
  2012-10-23 16:22     ` Eli Zaretskii
  2012-10-23 16:12   ` Eli Zaretskii
  1 sibling, 1 reply; 11+ messages in thread
From: Kazuhiro Ito @ 2012-10-23 13:05 UTC (permalink / raw)
  To: Jason Rumney; +Cc: 12693

> > When I run Emacs on Cygwin with the native Windows UI, I can't specify
> > a font by a non-ASCII font name.  For example, the code below succeeds
> > with the precompiled binary on Windows (Japanese edition) but raises an
> > error on Cygwin with the native Windows UI.
> >
> > (set-default-font "MS ゴシック-14")
> >
> > The reason is that the lfFaceName member of the LOGFONT structure is
> > expected to be encoded in the ANSI code page, but Emacs encodes and
> > decodes it with the coding system specified by the locale-coding-system
> > variable.  That variable is set to utf-8-unix on Cygwin, which causes
> > the above problem.
> 
> This is a problem with the Cygwin build's initialisation of
> locale-coding-system. It is supposed to be set to the coding system that
> system calls will accept, which on Windows cannot be utf-8 (maybe on
> recent versions it can be, but when I tried on Windows XP, it caused all
> manner of problems).

On Cygwin, locale-coding-system's value depends on its environment.
For example,

$ env LANG=ja_JP.CP932 emacs --batch --eval '(princ locale-coding-system)'
-> japanese-cp932-unix

$ env LANG=ja_JP.UTF-8 emacs --batch --eval '(princ locale-coding-system)'
-> utf-8-unix


Also, some functions expect locale-coding-system to be set to the
locale's coding system, not the ANSI code page.
Please try the code below (Cygwin, locale is ja_JP.UTF-8).

(list
 locale-coding-system
 (let ((locale-coding-system 'utf-8))
   (format-time-string "%c"))
 (let ((locale-coding-system 'cp932))
   (format-time-string "%c")))

-> (utf-8-unix "2012年10月23日 21時30分39秒" #("2012蟷エ10譛\21023譌・ 21譎\20230蛻\20639遘\222" 4 5 (charset cp932-2-byte) 5 8 (charset katakana-sjis) 8 13 (charset cp932-2-byte) 13 17 (charset katakana-sjis) 17 26 (charset cp932-2-byte)))
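
The garbled third element can be reproduced outside Emacs.  The Python
sketch below is an illustration only, not what Emacs does internally:
it decodes UTF-8 bytes with CP932, which is the effect of binding
locale-coding-system to cp932 while the C library produces UTF-8.  The
use of errors="replace" is an assumption of the sketch; Emacs instead
keeps undecodable bytes as raw bytes (the \210 and similar escapes
above), so the two garbled strings will not match character for
character.

```python
# Sketch: UTF-8 bytes from the C library decoded with the wrong coding system.
utf8_bytes = "2012年10月23日 21時30分39秒".encode("utf-8")

# Decoding with the coding system the bytes were produced in round-trips.
correct = utf8_bytes.decode("utf-8")

# errors="replace" is an assumption of this sketch: it substitutes U+FFFD
# for byte sequences that are invalid in CP932 instead of raising.
garbled = utf8_bytes.decode("cp932", errors="replace")

print(correct)
print(garbled)
```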


At present, locale-coding-system has to be the ANSI code page for
(w32-select-font), and has to be the locale's coding system for
(format-time-string "%c").  The cause is that we use two kinds of
system calls, the Windows API and the Cygwin API (maybe three, if we
count the Windows Unicode API).

-- 
Kazuhiro Ito


* bug#12693: 24.2.50; src/w32font.c should depend on ANSI code page
  2012-10-23 11:52 ` Jason Rumney
  2012-10-23 13:05   ` Kazuhiro Ito
@ 2012-10-23 16:12   ` Eli Zaretskii
  1 sibling, 0 replies; 11+ messages in thread
From: Eli Zaretskii @ 2012-10-23 16:12 UTC (permalink / raw)
  To: Jason Rumney; +Cc: kzhr, 12693

> From: Jason Rumney <jasonr@gnu.org>
> Date: Tue, 23 Oct 2012 19:52:30 +0800
> Cc: 12693@debbugs.gnu.org
> 
> [locale-coding-system] is supposed to be set to the coding system that
> system calls will accept, which on Windows cannot be utf-8 (maybe on
> recent versions it can be, but when I tried on Windows XP, it caused all
> manner of problems).

No, UTF-8 still cannot be used on Windows, AFAIK.






* bug#12693: 24.2.50; src/w32font.c should depend on ANSI code page
  2012-10-23 13:05   ` Kazuhiro Ito
@ 2012-10-23 16:22     ` Eli Zaretskii
  2012-10-25 21:18       ` Daniel Colascione
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2012-10-23 16:22 UTC (permalink / raw)
  To: Kazuhiro Ito; +Cc: 12693

> Date: Tue, 23 Oct 2012 22:05:46 +0900
> From: Kazuhiro Ito <kzhr@d1.dion.ne.jp>
> Cc: 12693@debbugs.gnu.org
> 
> On Cygwin, locale-coding-system's value depends on its environment.
> For example,
> 
> $ env LANG=ja_JP.CP932 emacs --batch --eval '(princ locale-coding-system)'
> -> japanese-cp932-unix
> 
> $ env LANG=ja_JP.UTF-8 emacs --batch --eval '(princ locale-coding-system)'
> -> utf-8-unix

This is not necessarily relevant to Emacs, or at least doesn't provide
a definitive answer to the question of what encoding ENCODE_SYSTEM
should use in the cygw32 build, which is a kind of hybrid with respect
to encoding and decoding issues.

There are several places where this issue might (or will) pop up:

  . decoding keyboard key events
  . encoding and decoding file names
  . encoding strings passed to various non-file APIs, like the one you
    mentioned

At least the first 2 items use different single-byte encodings in the
GUI and the console frames.

Someone(TM) should analyze all these and come up with recommendations
whether cygw32 should cater to the normal Cygwin locale, or maybe for
practical reasons it should do something else.

> Please try the code below (Cygwin, locale is ja_JP.UTF-8).
> 
> (list
>  locale-coding-system
>  (let ((locale-coding-system 'utf-8))
>    (format-time-string "%c"))
>  (let ((locale-coding-system 'cp932))
>    (format-time-string "%c")))

This is but one example.  As you yourself found out, this encoding is
unsuitable for the font interface.

> At present, locale-coding-system has to be the ANSI code page for
> (w32-select-font)

So maybe we need w32-select-font to use UTF-16 in the cygw32 case, as
it does for menus.

> The cause is that we use two kinds of system calls, the Windows API
> and the Cygwin API (maybe three, if we count the Windows Unicode API).

See above: there's much more than just 3.






* bug#12693: 24.2.50; src/w32font.c should depend on ANSI code page
  2012-10-23 16:22     ` Eli Zaretskii
@ 2012-10-25 21:18       ` Daniel Colascione
  2012-10-26  7:30         ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Colascione @ 2012-10-25 21:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Kazuhiro Ito, 12693


On 10/23/2012 9:22 AM, Eli Zaretskii wrote:
>> Date: Tue, 23 Oct 2012 22:05:46 +0900
>> From: Kazuhiro Ito <kzhr@d1.dion.ne.jp>
>> Cc: 12693@debbugs.gnu.org
>>
>> On Cygwin, locale-coding-system's value depends on its environment.
>> For example,
>>
>> $ env LANG=ja_JP.CP932 emacs --batch --eval '(princ locale-coding-system)'
>> -> japanese-cp932-unix
>>
>> $ env LANG=ja_JP.UTF-8 emacs --batch --eval '(princ locale-coding-system)'
>> -> utf-8-unix
> 
> This is not necessarily relevant to Emacs, or at least doesn't provide
> a definitive answer to the question of what encoding ENCODE_SYSTEM
> should use in the cygw32 build, which is a kind of hybrid with respect
> to encoding and decoding issues.
> 
> There are several places where this issue might (or will) pop up:
> 
>   . decoding keyboard key events

Already handled, I believe.

>   . encoding and decoding file names

We talk to Cygwin here, so there's no problem using locale-coding-system.

>   . encoding strings passed to various non-file APIs, like the one you
>     mentioned

I tried to ferret these out when I was doing the initial port, but it
looks like I missed the font code.

> 
> At least the first 2 items use different single-byte encodings in the
> GUI and the console frames.
> 
> Someone(TM) should analyze all these and come up with recommendations
> whether cygw32 should cater to the normal Cygwin locale, or maybe for
> practical reasons it should do something else.

The right code for Cygw32 is to always define NTGUI_UNICODE and unconditionally
use Unicode APIs when NTGUI_UNICODE is set. Maybe, someday, we can define
NTGUI_UNICODE for the NT port too.
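
The appeal of the Unicode APIs is that the face name is passed as
UTF-16 code units, so neither the ANSI code page nor the locale enters
the picture.  The Python sketch below only models that encoding step;
it is not Emacs code and does not show how NTGUI_UNICODE is actually
handled.  The helper name to_wide_face_name is hypothetical, and
LF_FACESIZE = 32 is the value from the Windows headers.

```python
LF_FACESIZE = 32  # length of lfFaceName, in characters, per the Windows headers


def to_wide_face_name(name: str) -> bytes:
    """Encode NAME as a NUL-padded UTF-16LE buffer of LF_FACESIZE code units.

    Hypothetical helper for illustration.  It truncates to leave room for
    the terminating NUL; a real implementation would also avoid cutting a
    surrogate pair in half.
    """
    units = name.encode("utf-16-le")
    max_bytes = (LF_FACESIZE - 1) * 2      # keep one code unit for the NUL
    units = units[:max_bytes]
    return units.ljust(LF_FACESIZE * 2, b"\x00")


buf = to_wide_face_name("MS ゴシック")
print(len(buf))
print(buf.decode("utf-16-le").rstrip("\x00"))
```

The resulting buffer is the same whatever LANG is set to, which is why
this approach would not depend on locale-coding-system at all.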



* bug#12693: 24.2.50; src/w32font.c should depend on ANSI code page
  2012-10-25 21:18       ` Daniel Colascione
@ 2012-10-26  7:30         ` Eli Zaretskii
  0 siblings, 0 replies; 11+ messages in thread
From: Eli Zaretskii @ 2012-10-26  7:30 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: kzhr, 12693

> Date: Thu, 25 Oct 2012 14:18:07 -0700
> From: Daniel Colascione <dancol@dancol.org>
> CC: Kazuhiro Ito <kzhr@d1.dion.ne.jp>, 12693@debbugs.gnu.org
> 
> The right code for Cygw32 is to always define NTGUI_UNICODE and unconditionally
> use Unicode APIs when NTGUI_UNICODE is set.

I figured that much.  So I suggest that the patch to fix this issue be
reworked in that direction.

> Maybe, someday, we can define NTGUI_UNICODE for the NT port too.

That's the plan, yes.  Although I think it will not be a compile-time
test, since there's a lot of work involved, and so some old code will
have to coexist with the new for some time.  Volunteers are welcome.






* bug#12693: [cygwin] Setting fonts with non-ascii names throws error quit
  2012-10-20 21:46 bug#12693: 24.2.50; src/w32font.c should depend on ANSI code page Kazuhiro Ito
  2012-10-23 11:52 ` Jason Rumney
@ 2020-09-13 16:16 ` Lars Ingebrigtsen
  2020-09-14  8:40   ` Kazuhiro Ito
  1 sibling, 1 reply; 11+ messages in thread
From: Lars Ingebrigtsen @ 2020-09-13 16:16 UTC (permalink / raw)
  To: Kazuhiro Ito; +Cc: 12693

Kazuhiro Ito <kzhr@d1.dion.ne.jp> writes:

> When I run Emacs on Cygwin with the native Windows UI, I can't specify
> a font by a non-ASCII font name.  For example, the code below succeeds
> with the precompiled binary on Windows (Japanese edition) but raises an
> error on Cygwin with the native Windows UI.
>
> (set-default-font "MS ゴシック-14")

This was eight years ago, and this function no longer exists, so
obviously things have changed in this area.  Are you still seeing this
bug in a recent version of Emacs?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#12693: [cygwin] Setting fonts with non-ascii names throws error quit
  2020-09-13 16:16 ` bug#12693: [cygwin] Setting fonts with non-ascii names throws error quit Lars Ingebrigtsen
@ 2020-09-14  8:40   ` Kazuhiro Ito
  2020-09-14 10:52     ` Lars Ingebrigtsen
  0 siblings, 1 reply; 11+ messages in thread
From: Kazuhiro Ito @ 2020-09-14  8:40 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 12693

> > When I run Emacs on Cygwin with the native Windows UI, I can't specify
> > a font by a non-ASCII font name.  For example, the code below succeeds
> > with the precompiled binary on Windows (Japanese edition) but raises an
> > error on Cygwin with the native Windows UI.
> >
> > (set-default-font "MS ゴシック-14")
> 
> This was eight years ago, and this function no longer exists, so
> obviously things have changed in this area.  Are you still seeing this
> bug in a recent version of Emacs?

Yes.

(set-frame-font "MS ゴシック-14") raises an error on the Cygw32 build
but not on the MinGW64 build.  The x-select-font function returns an
encoded string on the Cygw32 build.  Let-binding locale-coding-system
to the correct codepage avoids the problem.

;; Chose "MS ゴシック-14"
(x-select-font)

-> "\202l\202r \203S\203V\203b\203N-14"

(let ((locale-coding-system 'cp932))
  (x-select-font))

-> #("MS ゴシック-14" 0 10 (charset cp932-2-byte))

(set-frame-font "MS ゴシック-14")

-> error

(let ((locale-coding-system 'cp932))
  (set-frame-font "MS ゴシック-14"))

-> Frame font is changed.
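
The undecoded return value of (x-select-font) above is consistent with
CP932 bytes that were never decoded with the right coding system.  The
short Python check below (outside Emacs, for illustration only) decodes
the same octal-escaped bytes.  It yields a name whose first two letters
are the full-width forms of M and S, which suggests that the face name
returned by the dialog uses full-width letters.

```python
# The bytes returned by (x-select-font) in the report, written with the
# same octal escapes (\202 is 0x82, \203 is 0x83).
raw = b"\202l\202r \203S\203V\203b\203N-14"

# Decoding with the ANSI code page assumed in this thread (CP932).
decoded = raw.decode("cp932")
print(decoded)

# Decoding the same bytes as UTF-8 fails, because 0x82 cannot start a
# UTF-8 sequence.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print(utf8_ok)
```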

-- 
Kazuhiro Ito






* bug#12693: [cygwin] Setting fonts with non-ascii names throws error quit
  2020-09-14  8:40   ` Kazuhiro Ito
@ 2020-09-14 10:52     ` Lars Ingebrigtsen
  2020-09-14 11:38       ` Kazuhiro Ito
  0 siblings, 1 reply; 11+ messages in thread
From: Lars Ingebrigtsen @ 2020-09-14 10:52 UTC (permalink / raw)
  To: Kazuhiro Ito; +Cc: 12693

Kazuhiro Ito <kzhr@d1.dion.ne.jp> writes:

> (set-frame-font "MS ゴシック-14") raises an error on the Cygw32 build
> but not on the MinGW64 build.  The x-select-font function returns an
> encoded string on the Cygw32 build.  Let-binding locale-coding-system
> to the correct codepage avoids the problem.
>
> ;; Chose "MS ゴシック-14"
> (x-select-font)
>
> -> "\202l\202r \203S\203V\203b\203N-14"

Hm...  I don't use Windows, so I can't test this, but perhaps the result
from `x-select-font' should use `detect-coding-string' or something on
the result (and then decode it) so that we get a correct string in Emacs?

> (let ((locale-coding-system 'cp932))
>   (x-select-font))
>
> -> #("MS ゴシック-14" 0 10 (charset cp932-2-byte))
>
> (set-frame-font "MS ゴシック-14")
>
> -> error
>
> (let ((locale-coding-system 'cp932))
>   (set-frame-font "MS ゴシック-14"))
>
> -> Frame font is changed.

And the same here, but the other way around: encode the string before
calling set-frame-font?

Unfortunately, on Debian, it looks like none of the fonts available here
have non-ASCII names, so I can't really test whether this idea even
makes any sense.  Anybody?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#12693: [cygwin] Setting fonts with non-ascii names throws error quit
  2020-09-14 10:52     ` Lars Ingebrigtsen
@ 2020-09-14 11:38       ` Kazuhiro Ito
  0 siblings, 0 replies; 11+ messages in thread
From: Kazuhiro Ito @ 2020-09-14 11:38 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 12693

> > (set-frame-font "MS ゴシック-14") raises an error on the Cygw32 build
> > but not on the MinGW64 build.  The x-select-font function returns an
> > encoded string on the Cygw32 build.  Let-binding locale-coding-system
> > to the correct codepage avoids the problem.
> >
> > ;; Chose "MS ゴシック-14"
> > (x-select-font)
> >
> > -> "\202l\202r \203S\203V\203b\203N-14"
> 
> Hm...  I don't use Windows, so I can't test this, but perhaps the result
> from `x-select-font' should use `detect-coding-string' or something on
> the result (and then decode it) so that we get a correct string in Emacs?

As discussed in the original thread, Emacs uses the ANSI version of
the Windows API to handle fonts.  Strings passed to or received from
these APIs should be encoded in or decoded from the ANSI codepage.  To
do that, the ENCODE_SYSTEM and DECODE_SYSTEM macros are used (see
src/w32font.c), which means that locale-coding-system is used around
the Windows font API.  That works well on MinGW64, because
locale-coding-system is the same as the ANSI codepage.  But on Cygw32,
locale-coding-system is normally utf-8, which is not the ANSI codepage.
This is the cause of the problem.

My original post makes Emacs use the ANSI codepage for the Windows font
API.  Further discussion suggested making Emacs on Windows use the
Unicode API if available, but there has been no progress since then.

-- 
Kazuhiro Ito




