From: Eli Zaretskii <eliz@gnu.org>
To: Rah Guzar <aikrahguzar@gmail.com>
Cc: 50951@debbugs.gnu.org
Subject: bug#50951: Fwd: bug#50951: 28.0.50; Urdu text is not displayed correctly
Date: Sat, 02 Oct 2021 15:18:28 +0300
Message-ID: <83sfxjbox7.fsf@gnu.org>
In-Reply-To: <CAP094xBzM6mxi9Q8ahYbK8u0Dp+rcfXXXVOAMHL1qpEUcBxH_A@mail.gmail.com> (message from Rah Guzar on Sat, 2 Oct 2021 13:43:47 +0200)
> From: Rah Guzar <aikrahguzar@gmail.com>
> Date: Sat, 2 Oct 2021 13:43:47 +0200
>
> Let us consider the word نہیں
>
> It is composed of four letters. I will use the character field from `describe-char` for each of them below:
> 1) ن (displayed as ن) (codepoint 1606, #o3106, #x646)
> 2) ہ (displayed as ہ) (codepoint 1729, #o3301, #x6c1)
> 3) ی (displayed as ی) (codepoint 1740, #o3314, #x6cc)
> 4) ں (displayed as ں) (codepoint 1722, #o3272, #x6ba)
>
> It should be displayed with all 4 characters joined together; instead, they are all displayed individually.

What font displays them individually?  You should be able to tell that
if you type "C-u C-x =" on one of these characters.

For me, they display joined together.

> If I change to `NotoNastaliqUrdu` this word is displayed correctly. But there is a problem with حرف
>
> It consists of three letters:
> 1) ح (displayed as ح) (codepoint 1581, #o3055, #x62d)
> 2) ر (displayed as ر) (codepoint 1585, #o3061, #x631)
> 3) ف (displayed as ف) (codepoint 1601, #o3101, #x641)
>
> The first two characters should be joined and the last one should be on its own. This seems to be the case.
> But the two groups are rendered on top of each other making it illegible.
>
> > So isn't this a matter of finding a proper font, in particular given
> > the "Nastaliq vs Naskh" issues? NotoNastaliqUrdu is not the only font
> > supporting Nastaliq, so perhaps other fonts fare better?
>
> My knowledge here is very deficient, but my impression is that Nastaliq and Naskh are styles and shouldn't affect composition.
> NotoNastaliqUrdu was the only Urdu font available from my distro. LibreOffice, which also uses HarfBuzz, renders it correctly, so I didn't try another font at first. Like Emacs, LibreOffice also uses a Naskh font by default, but all the characters are joined properly.
>
> I did try some fonts from https://urdufonts.net/ after your suggestion and they render correctly. Specifically, the fonts I tried were:
> Jameel Noori Nastaleeq Regular
> Alvi Nastaleeq
> Zohra Unicode
> Manzor Unicode
>
> I didn't notice a problem with any of them except a very minor one for the last two, which have visible boundaries where glyphs are joined.

So would it be correct to say that using a proper font solves the
problem?

> > Since Urdu uses the Arabic characters, Emacs uses character
> > composition rules for Arabic when displaying this text. Do you know
> > if the composition rules for Urdu are different?
>
> I think using Arabic composition rules might be part of the problem. The Urdu alphabet is a superset of the Arabic alphabet, and if I don't set a font specifically designed for Urdu, the words where some characters should be joined but aren't joined always seem to include a character like ہ, which is in the Urdu alphabet but not in the Arabic one.

I don't think the problem is with compositions, because in the 2
examples you described above, there are no character compositions.
Moreover, our pattern for asking HarfBuzz to shape Arabic text is
this:

  "[\u0600-\u074F\u200C\u200D]+"

which includes all of the characters, including U+06C1 which you say
causes problems.
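
The coverage claim above can be checked outside Emacs. The following is
a minimal sketch in Python, not Emacs Lisp and not part of Emacs; it
assumes the character class quoted above can be carried over unchanged
to Python's `re` syntax. The names ARABIC_RUN, WORDS and
covered_by_pattern are made up for this illustration. It only tests
whether every character of the two reported words falls inside the
pattern; it says nothing about how HarfBuzz or a given font shapes them.

```python
import re

# Assumption: the same character class as the pattern quoted in this
# message, transliterated to Python's regex syntax.
ARABIC_RUN = re.compile(r"[\u0600-\u074F\u200C\u200D]+")

# The two words from the report, spelled out by code point
# (labels are hypothetical, for readability only).
WORDS = {
    "nahin": "\u0646\u06C1\u06CC\u06BA",  # U+0646 U+06C1 U+06CC U+06BA
    "harf": "\u062D\u0631\u0641",         # U+062D U+0631 U+0641
}


def covered_by_pattern(word: str) -> bool:
    """Return True if every character of WORD lies inside the pattern."""
    return ARABIC_RUN.fullmatch(word) is not None


if __name__ == "__main__":
    for label, word in WORDS.items():
        code_points = " ".join(f"U+{ord(ch):04X}" for ch in word)
        print(label, code_points, covered_by_pattern(word))
```

If both words are reported as covered, including the one containing
U+06C1, that is consistent with the text being handed to the shaper as
a whole, which would point back at the font rather than at the pattern.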

You could try setting current-iso639-language to the symbol 'ur'
(without the quotes); that should tell HarfBuzz to shape the text as
appropriate for Urdu.  But I think the real problem is with the font,
not with shaping.