unofficial mirror of help-gnu-emacs@gnu.org
* Inserting ே and க்ஷ without the dotted circle
@ 2015-11-09 17:44 Shakthi Kannan
  2015-11-09 18:55 ` Random832
  2015-11-09 19:04 ` Eli Zaretskii
  0 siblings, 2 replies; 14+ messages in thread
From: Shakthi Kannan @ 2015-11-09 17:44 UTC
  To: help-gnu-emacs

Hi,

I am using the fonts-lohit-taml-classical 2.5.3-2 font on an Ubuntu 14.10
system. With Tamil Unicode characters in GNU Emacs 24.5.1, I am able to
enter  ே and க்ஷ (க்ஷ = க + ் + ஷ) separately.

#1 I have a word where I need both characters put together, but
without the dotted circle. How can this be done?

I can open the font sources using FontForge:


https://fedorahosted.org/releases/l/o/lohit/lohit-tamil-classical-2.5.3.tar.gz

and when I click on U+0BC7, I see the glyph for ே without the dotted
circle. However, the definition in Unicode has the dotted circle:

  http://www.charbase.com/0bc7-unicode-tamil-vowel-sign-ee

#2 Where is this behaviour defined for such characters?

#3 Suppose I add a new glyph to one of the available Unicode slots in the
font sources, where in Emacs should this be defined in order to use the
same?

Appreciate any help.

Thanks!

SK

-- 
Shakthi Kannan
http://www.shakthimaan.com



* Re: Inserting ே and க்ஷ without the dotted circle
  2015-11-09 17:44 Inserting ே and க்ஷ without the dotted circle Shakthi Kannan
@ 2015-11-09 18:55 ` Random832
  2015-11-10  3:43   ` Shakthi Kannan
  2015-11-10 12:31   ` Alexis
  2015-11-09 19:04 ` Eli Zaretskii
  1 sibling, 2 replies; 14+ messages in thread
From: Random832 @ 2015-11-09 18:55 UTC
  To: help-gnu-emacs

Shakthi Kannan <shakthimaan@gmail.com> writes:

> Hi,
>
> I am using fonts-lohit-taml-classical 2.5.3-2 font on a Ubuntu 14.10
> system. With Tamil Unicode character and GNU Emacs 24.5.1, I am able to
> enter  ே and க்ஷ (க்ஷ = க + ் + ஷ) separately.
>
> #1 I have a word where I need both the characters put together, but,
> without the dotted circle. How can this be done?
>
> I can open the font sources using FontForge:
>
>
> https://fedorahosted.org/releases/l/o/lohit/lohit-tamil-classical-2.5.3.tar.gz
>
> and when I click on U+0BC7, I see the glyph for ே, but, without the dotted
> circle. But, the definition in Unicode has the dotted circle:

The dotted circle in the Unicode tables means that this is a combining
mark - specifically, in this case, that the glyph will appear to the
left of the preceding glyph that it is composed with.

To get the behavior I assume you want, you should enter U+0BC7 after the
others, i.e.:

0B95 0BCD 0BB7 0BC7, for this result: க்ஷே
= க + ் + ஷ + ே 

Note that this may not appear correctly in some terminals when you run
text-mode Emacs - it doesn't in mine.
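
For anyone who wants to double-check the order of code points outside
Emacs, here is a minimal Python sketch using the standard unicodedata
module. It only builds the string from the four code points listed above
and prints each one's name and general category. The names CODEPOINTS,
describe and word are illustrative and have nothing to do with Emacs or
the font.

```python
import unicodedata

# Code points from the message above, in logical (typing) order.
CODEPOINTS = [0x0B95, 0x0BCD, 0x0BB7, 0x0BC7]


def describe(codepoints):
    """Return (char, name, general category) for each code point."""
    return [
        (chr(cp), unicodedata.name(chr(cp)), unicodedata.category(chr(cp)))
        for cp in codepoints
    ]


# The vowel sign U+0BC7 comes last in the stored text, even though the
# renderer draws it to the left of the consonant cluster.
word = "".join(chr(cp) for cp in CODEPOINTS)

for ch, name, cat in describe(CODEPOINTS):
    print(f"U+{ord(ch):04X} {cat} {name}")
```

The last code point is reported with a category starting with "M" (a
combining mark), which is why it has to be typed after the consonant
cluster it attaches to.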

>
>   http://www.charbase.com/0bc7-unicode-tamil-vowel-sign-ee
>
> #2 Where is this behaviour defined for such characters?

The Unicode standard provides information on how Indic scripts should be
rendered. Version 1.0 is available at the link below; I don't know if
any later version exists.

http://www.unicode.org/versions/Unicode1.0.0/V2appA.pdf

The dotted circle is generally used in charts to illustrate where a
combining character would be placed relative to preceding characters,
and in rendering to show an ill-formed sequence (i.e. the preceding
consonant cluster is missing).
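
As a rough illustration of the ill-formed case, the following Python
sketch flags combining marks that have no base to attach to. It is a
simplification: a mark is treated as orphaned when it starts the text or
follows something that is neither a letter nor another mark. Real shaping
engines apply script-specific rules, and orphan_marks is a hypothetical
helper name, not something Emacs or any font library provides.

```python
import unicodedata


def orphan_marks(text):
    """Return indexes of combining marks that lack a preceding base.

    Assumption for this sketch: a base is any character whose general
    category starts with "L" (letter) or "M" (an earlier mark).
    """
    orphans = []
    prev_cat = None
    for i, ch in enumerate(text):
        cat = unicodedata.category(ch)
        if cat.startswith("M") and (prev_cat is None or prev_cat[0] not in "LM"):
            orphans.append(i)
        prev_cat = cat
    return orphans


# Vowel sign typed first: it has nothing to combine with.
print(orphan_marks("\u0bc7\u0b95\u0bcd\u0bb7"))
# Vowel sign typed after the consonant cluster: well-formed.
print(orphan_marks("\u0b95\u0bcd\u0bb7\u0bc7"))
```

A renderer would typically show the dotted circle at the positions this
helper reports, though the exact behaviour depends on the shaping engine.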

> #3 Suppose I add a new glyph to one of the available Unicode slots in the
> font sources, where in Emacs should this be defined in order to use the
> same?

I am not sure what layer is responsible for doing this rendering, so I
don't know if there's anything you need to do or can do at the font level.





* Re: Inserting ே and க்ஷ without the dotted circle
  2015-11-09 17:44 Inserting ே and க்ஷ without the dotted circle Shakthi Kannan
  2015-11-09 18:55 ` Random832
@ 2015-11-09 19:04 ` Eli Zaretskii
  2015-11-10  4:07   ` Shakthi Kannan
  1 sibling, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2015-11-09 19:04 UTC
  To: help-gnu-emacs

> Date: Mon, 9 Nov 2015 23:14:39 +0530
> From: Shakthi Kannan <shakthimaan@gmail.com>
> 
> I am using fonts-lohit-taml-classical 2.5.3-2 font on a Ubuntu 14.10
> system. With Tamil Unicode character and GNU Emacs 24.5.1, I am able to
> enter  ே and க்ஷ (க்ஷ = க + ் + ஷ) separately.
> 
> #1 I have a word where I need both the characters put together, but,
> without the dotted circle. How can this be done?

AFAIK, only by turning off auto-composition-mode.  That mode is on by
default.  Of course, if you do that, you won't get க்ஷ by typing
க + ் + ஷ.

> I can open the font sources using FontForge:
> 
> 
> https://fedorahosted.org/releases/l/o/lohit/lohit-tamil-classical-2.5.3.tar.gz
> 
> and when I click on U+0BC7, I see the glyph for ே, but, without the dotted
> circle. But, the definition in Unicode has the dotted circle:
> 
>   http://www.charbase.com/0bc7-unicode-tamil-vowel-sign-ee

So Emacs behaves correctly according to Unicode.

> #2 Where is this behaviour defined for such characters?

In lisp/language/indian.el, search for "tamil-composable-pattern".

> #3 Suppose I add a new glyph to one of the available Unicode slots in the
> font sources, where in Emacs should this be defined in order to use the
> same?

Sorry, I don't understand the question.  You want this new glyph to be
available for a codepoint that is different from U+0BC7?  Or do you
want something else?





* Re: Inserting ே and க்ஷ without the dotted circle
  2015-11-09 18:55 ` Random832
@ 2015-11-10  3:43   ` Shakthi Kannan
  2015-11-10 12:31   ` Alexis
  1 sibling, 0 replies; 14+ messages in thread
From: Shakthi Kannan @ 2015-11-10  3:43 UTC
  To: Random832; +Cc: help-gnu-emacs

Hi,

--- On Tue, Nov 10, 2015 at 12:25 AM, Random832 <random832@fastmail.com> wrote:
| 0B95 0BCD 0BB7 0BC7, for this result: க்ஷே
| = க + ் + ஷ + ே
\--

Worked. Thanks!

SK

-- 
Shakthi Kannan
http://www.shakthimaan.com




* Re: Inserting ே and க்ஷ without the dotted circle
  2015-11-09 19:04 ` Eli Zaretskii
@ 2015-11-10  4:07   ` Shakthi Kannan
  2015-11-10 17:54     ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Shakthi Kannan @ 2015-11-10  4:07 UTC
  To: Eli Zaretskii; +Cc: help-gnu-emacs

Hi Eli,

--- On Tue, Nov 10, 2015 at 12:34 AM, Eli Zaretskii <eliz@gnu.org> wrote:
| In lisp/language/indian.el, search for "tamil-composable-pattern".
\--

I see the mapping of characters now.

---
| Sorry, I don't understand the question.  You want this new glyph to be
| available for a codepoint that is different from U+0BC7?
\--

Yes. I guess I just have to add the Unicode code point to the table in
tamil-composable-pattern as one of the consonants.

Sorry for not being clear. The context is as follows:

In GNU Emacs, I am able to enter ஸ் + ரீ in a file, but the two parts are
not joined. If I open the same file in Gedit or in the browser (Chromium),
it gets composed correctly as ஸ்ரீ. I don't see this character in the Ubuntu
Monospace font that I use in Gedit and the browser.

Should I create a new glyph for ஸ்ரீ in the font and add its Unicode
code point to the table in tamil-composable-pattern, or can this
composition be made possible in GNU Emacs?

Thanks for your prompt replies,

SK

-- 
Shakthi Kannan
http://www.shakthimaan.com




* Re: Inserting ே and க்ஷ without the dotted circle
  2015-11-09 18:55 ` Random832
  2015-11-10  3:43   ` Shakthi Kannan
@ 2015-11-10 12:31   ` Alexis
  2015-11-10 14:27     ` Random832
  1 sibling, 1 reply; 14+ messages in thread
From: Alexis @ 2015-11-10 12:31 UTC
  To: help-gnu-emacs


Random832 <random832@fastmail.com> writes:

> Shakthi Kannan <shakthimaan@gmail.com> writes:
>
>> #2 Where is this behaviour defined for such characters?
>
> The Unicode standard provides information on how Indic scripts 
> should be rendered. Version 1.0 is available at the link below; 
> I don't know if any later version exists.
>
> http://www.unicode.org/versions/Unicode1.0.0/V2appA.pdf

Unicode is now at version 8.0.0; here's the table of contents:

    http://www.unicode.org/versions/Unicode8.0.0/UnicodeBookTOC.pdf

Tamil and its rendering are discussed in section 12.6:

    http://www.unicode.org/versions/Unicode8.0.0/ch12.pdf


Alexis.




* Re: Inserting ே and க்ஷ without the dotted circle
  2015-11-10 12:31   ` Alexis
@ 2015-11-10 14:27     ` Random832
  0 siblings, 0 replies; 14+ messages in thread
From: Random832 @ 2015-11-10 14:27 UTC
  To: help-gnu-emacs

Alexis <flexibeast@gmail.com> writes:

> Unicode is now at version 8.0.0; here's the table of contents:
>
>    http://www.unicode.org/versions/Unicode8.0.0/UnicodeBookTOC.pdf
>
> Tamil and its rendering is discussed in section 12.6:
>
>    http://www.unicode.org/versions/Unicode8.0.0/ch12.pdf

I couldn't find this via Google. Because of that, I'd assumed that it
didn't exist in "book" form anymore and tried to (and also couldn't)
find a TR about it. The version 1.0 chapter came up on a search for
"unicode character shaping". Seems like bad search optimization.





* Re: Inserting ே and க்ஷ without the dotted circle
  2015-11-10  4:07   ` Shakthi Kannan
@ 2015-11-10 17:54     ` Eli Zaretskii
  2015-11-11  5:52       ` Shakthi Kannan
  0 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2015-11-10 17:54 UTC
  To: help-gnu-emacs

> Date: Tue, 10 Nov 2015 09:37:53 +0530
> From: Shakthi Kannan <shakthimaan@gmail.com>
> Cc: help-gnu-emacs@gnu.org
> 
> In GNU Emacs, I am able to enter ஸ் + ரீ in a file. But, if I open the
> same file in Gedit or in the browser (Chromium), it gets rendered or
> composed correctly as ஸ்ரீ. I don't see this character in the Ubuntu
> Monospace font that I use in Gedit and the browser.

(FWIW, this renders correctly on my system, but it isn't Ubuntu, and
the font is Latha.)

> Should I create a new glyph for ஸ்ரீ in the font, and add its unicode
> number to the table in tamil-composable-pattern, or, can this
> composition be made possible in GNU Emacs?

My recommendation would be rather to find a font that has good
coverage of the Tamil script, and then customize your default fontset
to tell Emacs to use that font for Tamil characters.  See fontset.el
for many examples of how this can be done.





* Re: Inserting ே and க்ஷ without the dotted circle
  2015-11-10 17:54     ` Eli Zaretskii
@ 2015-11-11  5:52       ` Shakthi Kannan
  2015-11-11 15:15         ` Random832
  2015-11-11 15:38         ` Re: " Eli Zaretskii
  0 siblings, 2 replies; 14+ messages in thread
From: Shakthi Kannan @ 2015-11-11  5:52 UTC
  To: Eli Zaretskii; +Cc: help-gnu-emacs

Eli,

--- On Tue, Nov 10, 2015 at 11:24 PM, Eli Zaretskii <eliz@gnu.org> wrote:
| (FWIW, this renders correctly on my system, but it isn't Ubuntu, and
| the font is Latha.)
\--

Can you please provide a link to the Latha font that you use?

Thanks!

SK

-- 
Shakthi Kannan
http://www.shakthimaan.com




* Re: Inserting ே and க்ஷ without the dotted circle
  2015-11-11  5:52       ` Shakthi Kannan
@ 2015-11-11 15:15         ` Random832
  2015-11-11 16:12           ` Shakthi Kannan
  2015-11-11 15:38         ` Re: " Eli Zaretskii
  1 sibling, 1 reply; 14+ messages in thread
From: Random832 @ 2015-11-11 15:15 UTC
  To: help-gnu-emacs

Shakthi Kannan <shakthimaan@gmail.com> writes:

> Eli,
>
> --- On Tue, Nov 10, 2015 at 11:24 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> | (FWIW, this renders correctly on my system, but it isn't Ubuntu, and
> | the font is Latha.)
> \--
>
> Can you please provide a link to the Latha font that you use?

If it renders correctly for you in gedit, then the font is unlikely to
be the problem, unless you're using a different font in gedit than in
emacs. Can you find out what font you're using in Emacs, and what you're
using in gedit, and determine whether gedit and other GTK apps behave
the same or differently with the same font?

Also, are you using the Lucid or GTK version of Emacs (emacs24 vs
emacs24-lucid)?

Latha comes with MS Windows, and appears to cost US$69 to buy separately.





* Re: Inserting ே and க்ஷ without the dotted circle
  2015-11-11  5:52       ` Shakthi Kannan
  2015-11-11 15:15         ` Random832
@ 2015-11-11 15:38         ` Eli Zaretskii
  1 sibling, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2015-11-11 15:38 UTC
  To: help-gnu-emacs

> Date: Wed, 11 Nov 2015 11:22:25 +0530
> From: Shakthi Kannan <shakthimaan@gmail.com>
> Cc: help-gnu-emacs@gnu.org
> 
> --- On Tue, Nov 10, 2015 at 11:24 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> | (FWIW, this renders correctly on my system, but it isn't Ubuntu, and
> | the font is Latha.)
> \--
> 
> Can you please provide a link to the Latha font that you use?

It came with the box, I didn't install it.




* Re: Inserting ே and க்ஷ without the dotted circle
  2015-11-11 15:15         ` Random832
@ 2015-11-11 16:12           ` Shakthi Kannan
  2015-11-11 20:54             ` Random832
  0 siblings, 1 reply; 14+ messages in thread
From: Shakthi Kannan @ 2015-11-11 16:12 UTC
  To: Random832; +Cc: help-gnu-emacs

Hi,

--- On Wed, Nov 11, 2015 at 8:45 PM, Random832 <random832@fastmail.com> wrote:
| Can you find out what font you're using in Emacs, and what you're
| using in gedit, and determine whether gedit and other GTK apps behave
| the same or differently with the same font?
\--

In Gedit and Chromium it is Ubuntu Monospace. In Emacs, "M-x
describe-font" tells me the following:

  -unknown-Ubuntu Mono-normal-normal-normal-*-17-*-*-*-m-0-iso10646-1

When I type "C-u C-x =" on the Tamil character, the chosen font is shown as:

  xft:-unknown-Lohit Tamil Classical-normal-normal-normal-*-17-*-*-*-*-0-iso10646-1

I have all the required characters that I need in this classical font.
"Sri" is the last one that I need.

---
| Also, are you using the Lucid or GTK version of Emacs (emacs24 vs
| emacs24-lucid)?
\--

"M-x version" tells me:

  GNU Emacs 24.5.1 (x86_64-unknown-linux-gnu, GTK+ Version 3.12.2)

I compiled it from 24.5 sources available at:

  http://ftp.gnu.org/gnu/emacs/

SK

-- 
Shakthi Kannan
http://www.shakthimaan.com




* Re: Inserting ே and க்ஷ without the dotted circle
  2015-11-11 16:12           ` Shakthi Kannan
@ 2015-11-11 20:54             ` Random832
  2015-11-12 15:55               ` Shakthi Kannan
  0 siblings, 1 reply; 14+ messages in thread
From: Random832 @ 2015-11-11 20:54 UTC
  To: help-gnu-emacs

Shakthi Kannan <shakthimaan@gmail.com> writes:
> In Gedit and Chromium it is Ubuntu Monospace. In Emacs, "M-x
> describe-font" tells me the following:
>
>   -unknown-Ubuntu Mono-normal-normal-normal-*-17-*-*-*-m-0-iso10646-1
>
> When I type "C-u C-x =" on the Tamil character, the chosen font is shown as:
>
>   xft:-unknown-Lohit Tamil Classical-normal-normal-normal-*-17-*-*-*-*-0-iso10646-1
>
> I have all the required characters that I need in this classical font.
> "Sri" is the last one that I need.

Have you tried the classical font in gedit, and ubuntu mono in emacs, in
order to determine if the problem is with the font or with emacs?
When you move the cursor over it, does it become one large cursor
covering the whole thing (even though it is rendered incorrectly), or
does the cursor move over each piece of it individually?

And just to be clear, is this in the GUI Emacs or in a terminal?

These are not single characters, they're composed glyphs, and it's not
clear to me what aspects of the behavior the font is responsible for and
what the rendering engine is responsible for.

If you do have to edit the font, I think it won't be as simple as adding
a character with a unicode codepoint.





* Re: Inserting ே and க்ஷ without the dotted circle
  2015-11-11 20:54             ` Random832
@ 2015-11-12 15:55               ` Shakthi Kannan
  0 siblings, 0 replies; 14+ messages in thread
From: Shakthi Kannan @ 2015-11-12 15:55 UTC
  To: help-gnu-emacs

Hi,

--- On Thu, Nov 12, 2015 at 2:24 AM, Random832 <random832@fastmail.com> wrote:
| Have you tried the classical font in gedit
\--

I use the following to enter Tamil text in X:

  $ setxkbmap us,in dvorak,tam_unicode grp:lwin_toggle

I changed the font to "Lohit Tamil Classical", but typing Tamil
characters didn't show anything in Gedit. However, I was able to see the
characters in the browser and in GNU Emacs when I tried to type.

---
| and ubuntu mono in emacs,
\--

This is the default font that I use in GNU Emacs.

---
| When you move the cursor over it, does it become one large cursor
| covering the whole thing (even though it is rendered incorrectly), or
| does the cursor move over each piece of it individually?
\--

The latter. The cursor moves over each piece individually.

---
| If you do have to edit the font, I think it won't be as simple as adding
| a character with a unicode codepoint.
\--

I use xelatex to generate a PDF from the input Tamil file, and it
renders ஸ்ரீ correctly in the output PDF. I can live with it for now.

Thanks!

SK

--
Shakthi Kannan
http://www.shakthimaan.com


