* bug#47856: auto-fill-mode vs. oriental languages: no respect
@ 2021-04-18 0:14 積丹尼 Dan Jacobson
2021-04-18 6:46 ` Eli Zaretskii
0 siblings, 1 reply; 6+ messages in thread
From: 積丹尼 Dan Jacobson @ 2021-04-18 0:14 UTC (permalink / raw)
To: 47856
auto-fill-mode is an interactive compiled Lisp function in ‘simple.el’. ...
When Auto Fill mode is enabled, inserting a space at a column
^^^^^[1]
beyond ‘current-fill-column’ automatically breaks the line at a
previous space.
^^^^^^^^^^^^^^[2]
That is all fine and dandy. But it has no respect for oriental
languages.
What if it bro
ke a line like th
is?
That's how it treats oriental languages.
What if emacs "helpfully" turned
...if temperature > temp
then stop_nuclear_reactor()
into
...if temp
erature > temp
then stop_nuclear_reactor()
Syntax error. Meltdown!
It's like if one wore braces for five years, and along came emacs and
in one second put ugly gaps back in your teeth.
Here is my line, pre-victimization:
<p>那麼請貴司, 走過去台電大樓坐下來合作, 透過台電內部精準座標, 把這些孤兒門牌, 盡量一一歸案。</p>
And here is the mangled result:
<p>那麼請貴司, 走過去台電大樓坐下來合作, 透過台電內部精準座標, 把
這些孤兒門牌, 盡量一一歸案。</p>
In [2] we were promised "at a previous *space*".
Well it lied.
We put plenty of *spaces* in the line,
just to feed its hungry mouth.
But no. It had to go rip into
"把這些孤兒門牌,"
and put a goofy gap in:
"把 這些孤兒門牌,"
That's how the rendered HTML will look.
Might as well make it
"把 這 些 孤 兒 門 牌,"
that way readers will think you were angry.
Also if the space is accidentally inserted before presidents' names,
that will mean you support/honor/respect them.
Sure, in English, President Nixon looks better than PresidentNixon.
So you will just have to take my word that I know what I am talking about.
I.e., it is super dangerous to go inserting random spaces in oriental
languages where there was none to begin with.
If there was one to begin with, then make it two or three, be my guest.
But don't just go semi-randomly put ting gaps in gran dmas' te eth. Than k you.
If there really is no way then to break a line, then just don't break
it. It's the user's problem in that case.
Maybe it can play fast and loose with .txt files,
but it should know better how silly it will make HTML look.
"Well browsers will break your oriental lines arbitrarily anyway. Bug closed."
Yes, but they do that at the ends of lines they render. The damage that
emacs does to the source file ends up as an ugly mid-word gap.
(Unless in the rare case where the browser also breaks the line at
emacs' gap, in which case the reader will not notice any problem.)
[1] P.S., RET at the end of line will destroy the line too. Not just space.
What's worse, you probably won't notice what happened, as your eyes are
already on the next line.
Seen with emacs 27.1, using -q. LC_CTYPE=zh_TW.UTF-8 .
* bug#47856: auto-fill-mode vs. oriental languages: no respect
From: Eli Zaretskii @ 2021-04-18 6:46 UTC (permalink / raw)
To: 積丹尼 Dan Jacobson; +Cc: 47856
> From: 積丹尼 Dan Jacobson
> <jidanni@jidanni.org>
> Date: Sun, 18 Apr 2021 08:14:35 +0800
>
> auto-fill-mode is an interactive compiled Lisp function in ‘simple.el’. ...
>
> When Auto Fill mode is enabled, inserting a space at a column
> ^^^^^[1]
> beyond ‘current-fill-column’ automatically breaks the line at a
> previous space.
> ^^^^^^^^^^^^^^[2]
>
> That is all fine and dandy. But it has no respect for oriental
> languages.
>
> What if it bro
> ke a line like th
> is?
>
> That's how it treats oriental languages.
>
> What if emacs "helpfully" turned
>
> ...if temperature > temp
> then stop_nuclear_reactor()
>
> into
>
> ...if temp
> erature > temp
> then stop_nuclear_reactor()
>
> Syntax error. Meltdown!
>
> It's like if one wore braces for five years, and along came emacs and
> in one second put ugly gaps back in your teeth.
>
> Here is my line, pre-victimization:
>
> <p>那麼請貴司, 走過去台電大樓坐下來合作, 透過台電內部精準座標, 把這些孤兒門牌, 盡量一一歸案。</p>
>
> And here is the mangled result:
>
> <p>那麼請貴司, 走過去台電大樓坐下來合作, 透過台電內部精準座標, 把
> 這些孤兒門牌, 盡量一一歸案。</p>
>
> In [2] we were promised "at a previous *space*".
Emacs by default employs the "kinsoku" rules for breaking lines in CJK
languages, when it fills text. Isn't the place where it breaks the
line in this case according to Kinsoku rules? If you set
enable-kinsoku to nil, don't you get what you expected? If so, this
seems to be a documentation issue.
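A minimal sketch of how the suggestion above could be tried, starting from `emacs -q`. The buffer name and the fill-column value are assumptions for illustration; the report does not state which fill column was in effect.

```elisp
;; Sketch only: set up a scratch buffer for reproducing the report.
;; "*fill-test*" is a hypothetical buffer name, and 70 is an assumed
;; fill column (the report does not say which value was used).
(with-current-buffer (get-buffer-create "*fill-test*")
  (setq-local fill-column 70)
  (auto-fill-mode 1))

;; Turn off kinsoku processing during filling, as suggested above.
(setq enable-kinsoku nil)
```

Typing the reported line into that buffer, past the fill column, and comparing the break point with `enable-kinsoku` set to `t` and then to `nil` would show whether the variable has any effect on this case.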
* bug#47856: auto-fill-mode vs. oriental languages: no respect
From: 積丹尼 Dan Jacobson @ 2021-04-19 23:28 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 47856
>>>>> "EZ" == Eli Zaretskii <eliz@gnu.org> writes:
EZ> Emacs by default employs the "kinsoku" rules for breaking lines in CJK
EZ> languages, when it fills text. Isn't the place where it breaks the
EZ> line in this case according to Kinsoku rules? If you set
EZ> enable-kinsoku to nil, don't you get what you expected? If so, this
EZ> seems to be a documentation issue.
Try it and you will see that whatever value enable-kinsoku has does not affect
this, nor #47857. And that is a good thing too. If it did we would
really be in trouble.
* bug#47856: auto-fill-mode vs. oriental languages: no respect
From: Eli Zaretskii @ 2021-04-20 5:20 UTC (permalink / raw)
To: 積丹尼 Dan Jacobson; +Cc: 47856
> From: 積丹尼 Dan Jacobson <jidanni@jidanni.org>
> Cc: 47856@debbugs.gnu.org
> Date: Tue, 20 Apr 2021 07:28:27 +0800
>
> >>>>> "EZ" == Eli Zaretskii <eliz@gnu.org> writes:
>
> EZ> Emacs by default employs the "kinsoku" rules for breaking lines in CJK
> EZ> languages, when it fills text. Isn't the place where it breaks the
> EZ> line in this case according to Kinsoku rules? If you set
> EZ> enable-kinsoku to nil, don't you get what you expected? If so, this
> EZ> seems to be a documentation issue.
>
> Try it and you will see that whatever value enable-kinsoku has does not affect
> this, nor #47857. And that is a good thing too. If it did we would
> really be in trouble.
Why are you so unhelpful? Don't you want this issue investigated and
resolved? I asked the questions above because I don't speak Chinese
and cannot read the text you quoted in your report. Please help me
understand the issue by answering those questions, and please provide
any additional information that could be of relevance, so that we
could make some progress here. TIA.
* bug#47856: auto-fill-mode vs. oriental languages: no respect
From: Eli Zaretskii @ 2021-04-20 11:24 UTC (permalink / raw)
To: jidanni; +Cc: 47856
> From: Eli Zaretskii <eliz@gnu.org>
> Date: Tue, 20 Apr 2021 01:20:02 -0400
> Cc: 47856@debbugs.gnu.org
>
> Please help me understand the issue by answering those questions,
> and please provide any additional information that could be of
> relevance, so that we could make some progress here. TIA.
Never mind, I've managed to figure this out on my own. So, back to
the TIL department:
. For CJK scripts, Emacs's filling commands are allowed to break a
line at _any_ character, not just at whitespace. This is not just
Emacs's invention: the Unicode Line-Breaking Algorithm mandates the
same, albeit via special properties it assigns to CJK characters.
. If you load 'kinsoku', Emacs will additionally refrain from
breaking lines between some CJK characters, where there are special
rules which prohibit that. But still, a line can be broken at almost
any place in CJK text, even under the kinsoku rules.
. Conclusion: this is the intended behavior, a feature.
So yeah, it's a documentation issue, to be fixed soon enough.
> "Well browsers will break your oriental lines arbitrarily anyway. Bug closed."
>
> Yes, but they do that at the ends of lines they render.
If this is what you want, it's a different feature: you need to turn
on word-wrap (M-x visual-line-mode RET), not auto-fill. In Emacs 28,
there will be an additional option, word-wrap-by-category, which will
obey kinsoku rules in visual-line-mode.
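As an illustration, the setup described above might look like the following init-file sketch. It assumes the file is edited in html-mode; the hook and the explicit `(require 'kinsoku)` are choices made for this example rather than anything stated in the thread. The `boundp` guard is there because `word-wrap-by-category` does not exist before Emacs 28.

```elisp
;; Sketch only: wrap long lines on display instead of inserting
;; newlines into the file.  Assumes the file is edited in html-mode.
(add-hook 'html-mode-hook
          (lambda ()
            (auto-fill-mode -1)     ; never break lines in the source file
            (visual-line-mode 1)))  ; wrap them visually on screen instead

;; Emacs 28 and later only: also allow display wrapping after CJK
;; characters.  kinsoku is loaded by hand here on the assumption that
;; setting the variable with `setq' (rather than through Customize)
;; does not load it automatically.
(when (boundp 'word-wrap-by-category)
  (require 'kinsoku)
  (setq word-wrap-by-category t))
```

With this arrangement the source line stays on one physical line, so no whitespace ends up between CJK characters in the rendered HTML.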
* bug#47856: auto-fill-mode vs. oriental languages: no respect
From: Eli Zaretskii @ 2021-04-20 12:14 UTC (permalink / raw)
To: jidanni; +Cc: 47856-done
> Date: Tue, 20 Apr 2021 14:24:45 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: 47856@debbugs.gnu.org
>
> So yeah, it's a documentation issue, to be fixed soon enough.
Now done, and closing the bug.