From: Eli Zaretskii <eliz@gnu.org>
To: Kenichi Handa <handa@gnu.org>
Cc: 11860@debbugs.gnu.org, smias@yandex.ru
Subject: bug#11860: 24.1; Arabic - Harakat (diacritics, short vowels) don't appear
Date: Sat, 18 Aug 2012 18:33:21 +0300
Message-ID: <837gswgqpq.fsf@gnu.org>
In-Reply-To: <87k3wwimlk.fsf@gnu.org>

> From: Kenichi Handa <handa@gnu.org>
> Cc: 11860@debbugs.gnu.org, smias@yandex.ru, handa@gnu.org
> Date: Sat, 18 Aug 2012 18:19:19 +0900
> 
> > If this is the case, how come we display the diacriticals correctly on
> > Windows in other cases, e.g. with Hebrew?
> 
> For Hebrew too, on Windows, I see the same problem as what
> Steffan <smias@yandex.ru> reported:
> 
> In article <349641344144469@web8d.yandex.ru>, Steffan <smias@yandex.ru> writes:
> >>> I choose "hebrew-full" as input-method.
> >>> 
> >>> - After typing 'f' I get KAF
> >>> - then by typing 'd' I get GIMEL
> >>> - and after typing 'D' I get "the three point sign" (HEBREW POINT QUBUTS) not below the GIMEL but below the KAF!
> 
> If you don't see that problem, perhaps we are using
> different fonts.  C-u C-x = tells me that "courier new" is
> used for Hebrew too in my case.

"Courier New" is the font that is used, and I still don't see the
problem.  The HEBREW POINT QUBUTS is displayed below GIMEL, as I'd
expect.

> I've just read the function uniscribe_shape in
> w32uniscribe.c.  It seems that these are the key APIs of
> Uniscribe:
> 
> * ScriptItemize -- no idea what this does

It breaks the string to be displayed into individually shapeable
chunks, called "items".  We then pass each chunk to Uniscribe
separately for shaping.
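
As a rough illustration of what "items" look like to the caller, here is
a toy sketch in portable C.  It is not Uniscribe: struct toy_item,
toy_script_of and toy_itemize are hypothetical names, and the script
test is a crude assumption based on two Unicode blocks.  The one real
convention it mirrors is the terminal entry, which is why the loop in
w32uniscribe.c can compute a run length as
items[i+1].iCharPos - items[i].iCharPos.

```c
#include <assert.h>

/* Hypothetical stand-in for SCRIPT_ITEM: only the start offset.  */
struct toy_item { int char_pos; };

enum toy_script { TOY_OTHER, TOY_HEBREW, TOY_ARABIC };

/* Crude script test (an assumption for this sketch only).  */
static enum toy_script
toy_script_of (unsigned int c)
{
  if (c >= 0x0590 && c <= 0x05FF)
    return TOY_HEBREW;
  if (c >= 0x0600 && c <= 0x06FF)
    return TOY_ARABIC;
  return TOY_OTHER;
}

/* Split CHARS[0..NCHARS-1] into runs of the same toy script.  Store
   the start of each run in ITEMS, followed by a terminal entry whose
   char_pos is NCHARS.  MAX_ITEMS counts the terminal entry too.
   Return the number of runs, or -1 if ITEMS is too small.  */
int
toy_itemize (const unsigned int *chars, int nchars,
             struct toy_item *items, int max_items)
{
  int nitems = 0;
  int i;

  assert (nchars >= 0 && max_items >= 1);
  for (i = 0; i < nchars; i++)
    if (i == 0 || toy_script_of (chars[i]) != toy_script_of (chars[i - 1]))
      {
        /* Need room for this run and for the terminal entry.  */
        if (nitems + 1 >= max_items)
          return -1;
        items[nitems++].char_pos = i;
      }
  items[nitems].char_pos = nchars;
  return nitems;
}
```

For the six characters 'a', 'b', KAF, GIMEL, QUBUTS, 'c', this sketch
yields three runs starting at 0, 2 and 5, plus the terminal entry at 6.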

> * ScriptShape -- perhaps for glyph substitution (GSUB features of opentype)

http://msdn.microsoft.com/en-us/library/windows/desktop/dd368564%28v=vs.85%29.aspx
says that this function "Generates glyphs and visual attributes for a
Unicode run".

> * ScriptPlace -- perhaps for glyph positioning (GPOS features of opentype)
> 
> So at first please check the documentation of ScriptShape
> and figure out how it works for bidi script; i.e. what order
> does it expect for input, and what order does it produce.

From the above page:

  If fLogicalOrder is set to TRUE in the SCRIPT_ANALYSIS structure, the
  function always generates glyphs in the same order as the original
  Unicode characters. If fLogicalOrder is set to FALSE, the function
  generates right-to-left items in reverse order so that ScriptTextOut
  does not have to reverse them before calling ExtTextOut.

And w32uniscribe.c sets that flag to TRUE a few lines before it calls
ScriptShape, because Emacs itself reorders characters:

  for (i = 0; i < nitems; i++)
    {
      int nglyphs, nchars_in_run;
      nchars_in_run = items[i+1].iCharPos - items[i].iCharPos;
      /* Force ScriptShape to generate glyphs in the same order as
	 they are in the input LGSTRING, which is in the logical
	 order.  */
      items[i].a.fLogicalOrder = 1;  <<<<<<<<<<<<<<<<<<<<<<<<

      /* Context may be NULL here, in which case the cache should be
         used without needing to select the font.  */
      result = ScriptShape (context, &(uniscribe_font->cache),
			    chars + items[i].iCharPos, nchars_in_run,
			    max_glyphs - done_glyphs, &(items[i].a),
			    glyphs, clusters, attributes, &nglyphs);
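
The effect of that flag, as the quoted documentation describes it, can
be modelled with a few lines of portable C.  This is only a toy model
under that reading of the docs: toy_emit_glyphs and its parameters are
hypothetical names, and real shaping does far more than copy glyph
indices.

```c
#include <assert.h>

/* Copy NGLYPHS glyph indices from GLYPHS to OUT.  If the run is
   right-to-left (RTL nonzero) and LOGICAL_ORDER is zero, emit them in
   reverse (visual) order; otherwise keep the input (logical) order.  */
void
toy_emit_glyphs (const int *glyphs, int nglyphs,
                 int rtl, int logical_order, int *out)
{
  int i;

  assert (nglyphs >= 0);
  for (i = 0; i < nglyphs; i++)
    {
      int src = (rtl && !logical_order) ? nglyphs - 1 - i : i;
      out[i] = glyphs[src];
    }
}
```

With logical_order set, as Emacs does, a right-to-left run comes back
in the same order as the input characters, and Emacs's own bidi
reordering decides where each glyph finally lands.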

> Next please find the meaning of this code fragment:
> 
> 		  /* Detect clusters, for linking codes back to
> 		     characters.  */
> 		  if (attributes[j].fClusterStart)
> 		    {
> 		      while (from < nchars_in_run && clusters[from] < j)
> 			from++;
> 		      if (from >= nchars_in_run)
> 			from = to = nchars_in_run - 1;
> 		      else
> 			{
> 			  int k;
> 			  to = nchars_in_run - 1;
> 			  for (k = from + 1; k < nchars_in_run; k++)
> 			    {
> 			      if (clusters[k] > j)
> 				{
> 				  to = k - 1;
> 				  break;
> 				}
> 			    }
> 			}
> 		    }
> 
> The comment refers to "clusters".  I don't know exactly what
> that means in Uniscribe, but I guess it relates to
> grapheme clusters, and if so, this part seems to relate to
> the ordering of glyphs in this kind of grapheme cluster:
> 
>   [0 1 1593 969 8 1 8 12 4 nil]
>   [0 1 1593 760 0 3 6 12 4 [1 -2 0]]

No, they are character clusters, not grapheme clusters.  They could be
similar (or even identical) to grapheme clusters, but I'm not sure,
because I have only a vague idea of both.  You can find some
details here:

   http://msdn.microsoft.com/en-us/library/windows/desktop/dd317792%28v=vs.85%29.aspx

I hope this will allow you to understand the meaning of the above
code, by looking at how the results are used in the calls to
LGLYPH_SET_* macros right below the above snippet.
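
To experiment with the quoted loop outside Emacs, here is a
self-contained sketch of the same logic in portable C.  The function
names and the sample data are hypothetical, and the sketch assumes
logical order, so that clusters[] (one entry per character, giving the
index of the first glyph of that character's cluster) never decreases.

```c
#include <assert.h>

/* For glyph J, which starts a cluster, find the characters
   [*FROM, *TO] of the run that belong to that cluster.  *FROM is
   in/out: the search resumes where the previous cluster left off.  */
void
toy_cluster_range (const unsigned short *clusters, int nchars, int j,
                   int *from, int *to)
{
  int k;

  assert (nchars > 0 && *from >= 0);
  /* Skip characters whose cluster starts before glyph J.  */
  while (*from < nchars && clusters[*from] < j)
    (*from)++;
  if (*from >= nchars)
    *from = *to = nchars - 1;
  else
    {
      /* The cluster ends just before the first character that maps
         to a later glyph, or at the end of the run.  */
      *to = nchars - 1;
      for (k = *from + 1; k < nchars; k++)
        if (clusters[k] > j)
          {
            *to = k - 1;
            break;
          }
    }
}

/* Record a character range for every glyph.  Glyphs that do not
   start a cluster inherit the range of the preceding cluster start.  */
void
toy_map_clusters (const unsigned short *clusters, int nchars,
                  const unsigned char *cluster_start, int nglyphs,
                  int *from_out, int *to_out)
{
  int from = 0, to = 0, j;

  for (j = 0; j < nglyphs; j++)
    {
      if (cluster_start[j])
        toy_cluster_range (clusters, nchars, j, &from, &to);
      from_out[j] = from;
      to_out[j] = to;
    }
}
```

As a made-up example (not actual Uniscribe output), take KAF, GIMEL,
QUBUTS shaped into three glyphs with clusters = {0, 1, 1} and cluster
starts at glyphs 0 and 1.  Glyph 0 then maps to character 0 only, while
glyphs 1 and 2 both map to characters 1..2, i.e. the point stays
attached to GIMEL rather than KAF.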

Thanks.




