bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
@ 2015-06-22 10:19 Marcin Borkowski
  2015-06-22 10:28 ` Marcin Borkowski
  0 siblings, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2015-06-22 10:19 UTC (permalink / raw)
  To: 20871

Hello,

today I found that fill-single-char-nobreak-p is just a bit too
simplistic.  When point is after e.g. the string " (a", it returns nil
instead of t.  I am not sure which characters should be added to the
regex, but at least the opening paren (and maybe bracket) should be
there, so I'd change the regex into [[:space:]][[(]*[[:alpha:]].  (Two
or more opening parens/brackets are unlikely, but when in doubt, I guess
it's better to return t instead of nil than the other way round.)
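
The behaviour described above can be reproduced in a temporary buffer.
The sketch below is only an illustration: it copies the predicate body
shown in the diff context for lisp/textmodes/fill.el later in this
thread under a hypothetical name, my-single-char-nobreak-p, and adds a
hypothetical helper, my-nobreak-after-p, so that it does not depend on
the installed Emacs version.

```elisp
;; Hypothetical copy of the predicate body, taken from the diff context
;; for lisp/textmodes/fill.el that appears later in this thread.
(defun my-single-char-nobreak-p ()
  "Return non-nil if a one-letter word preceded by whitespace is before point."
  (save-excursion
    (skip-chars-backward " \t")
    (backward-char 2)
    (looking-at "[[:space:]][[:alpha:]]")))

;; Hypothetical helper: insert TEXT into a temporary buffer and call the
;; predicate with point at the end of TEXT.
(defun my-nobreak-after-p (text)
  "Call `my-single-char-nobreak-p' with point at the end of TEXT."
  (with-temp-buffer
    (insert text)
    (my-single-char-nobreak-p)))

(my-nobreak-after-p "foo a ")   ; non-nil: " a" is whitespace + letter
(my-nobreak-after-p "foo (a ")  ; nil: the predicate looks at "(a"
```

The second call returns nil because the fixed (backward-char 2) lands
on the paren, which is not whitespace.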

Best regards,

-- 
Marcin Borkowski               This email was proudly sent
http://mbork.pl                from my Emacs.

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2015-06-22 10:19 bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren Marcin Borkowski
@ 2015-06-22 10:28 ` Marcin Borkowski
  2016-04-17  6:34   ` Marcin Borkowski
  0 siblings, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2015-06-22 10:28 UTC (permalink / raw)
  To: 20871


On 2015-06-22, at 12:19, Marcin Borkowski <mbork@mbork.pl> wrote:

> Hello,
>
> today I found that fill-single-char-nobreak-p is just a bit too
> simplistic.  When point is after e.g. the string " (a", it returns nil
> instead of t.  I am not sure which characters should be added to the
> regex, but at least the opening paren (and maybe bracket) should be
> there, so I'd change the regex into [[:space:]][[(]*[[:alpha:]].  (Two
> or more opening parens/brackets are unlikely, but when in doubt, I guess
> it's better to return t instead of nil than the other way round.)
>
> Best regards,

Just noticed that there is a hardcoded (backward-char 2), so it
seems that adding a few characters to the regex is not enough.  Maybe
looking-back is the way to go (though it might slow filling down)?
I don't know.
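
A minimal sketch of the looking-back idea, for illustration only.  The
names my-single-char-nobreak-lb-p and my-nobreak-lb-after-p are
hypothetical and this is not the code in fill.el; the regexp is the one
proposed in the original report, and the search limit (the beginning of
the current line) is an assumption made to keep the cost bounded.

```elisp
;; Hypothetical variant, not the code in fill.el: use `looking-back'
;; with the regexp proposed in the report instead of a fixed
;; `backward-char'.
(defun my-single-char-nobreak-lb-p ()
  "Return non-nil if a one-letter word, maybe after ( or [, is before point."
  (save-excursion
    (skip-chars-backward " \t")
    ;; Limit the backward search to the current line to bound the cost.
    (looking-back "[[:space:]][[(]*[[:alpha:]]"
                  (line-beginning-position))))

;; Hypothetical helper for trying the variant on a string.
(defun my-nobreak-lb-after-p (text)
  "Call the variant with point at the end of TEXT in a temporary buffer."
  (with-temp-buffer
    (insert text)
    (my-single-char-nobreak-lb-p)))

(my-nobreak-lb-after-p "foo a ")    ; non-nil
(my-nobreak-lb-after-p "foo (a ")   ; non-nil
(my-nobreak-lb-after-p "foo ((a ")  ; non-nil
(my-nobreak-lb-after-p "foo (ab ")  ; nil: "ab" is not a one-letter word
```

Because looking-back only succeeds when the match ends exactly at
point, a longer word after the paren is still rejected.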

Best,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2015-06-22 10:28 ` Marcin Borkowski
@ 2016-04-17  6:34   ` Marcin Borkowski
  2016-04-17 14:57     ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2016-04-17  6:34 UTC (permalink / raw)
  To: 20871

[-- Attachment #1: Type: text/plain, Size: 1071 bytes --]

On 2015-06-22, at 12:28, Marcin Borkowski <mbork@mbork.pl> wrote:

> On 2015-06-22, at 12:19, Marcin Borkowski <mbork@mbork.pl> wrote:
>
>> Hello,
>>
>> today I found that fill-single-char-nobreak-p is just a bit too
>> simplistic.  When point is after e.g. the string " (a", it returns nil
>> instead of t.  I am not sure which characters should be added to the
>> regex, but at least the opening paren (and maybe bracket) should be
>> there, so I'd change the regex into [[:space:]][[(]*[[:alpha:]].  (Two
>> or more opening parens/brackets are unlikely, but when in doubt, I guess
>> it's better to return t instead of nil than the other way round.)
>>
>> Best regards,
>
> Just noticed that there is a hardcoded (backward-char 2), so it
> seems that adding a few characters to the regex is not enough.  Maybe
> looking-back is the way to go (though it might slow filling down)?
> I don't know.

Hi there,

so here's a patch for the bug I reported some time ago.  Please review
both the patch and the commit message (I'm still learning to write
them...).

Best,

-- 
Marcin

[-- Attachment #2: 0001-Fix-fill-single-char-nobreak-p.patch --]
[-- Type: text/x-patch, Size: 1005 bytes --]

From de6c196235ef8abfff52c9cfc3d97a6350e8a5a7 Mon Sep 17 00:00:00 2001
From: Marcin Borkowski <mbork@mbork.pl>
Date: Sun, 17 Apr 2016 08:30:49 +0200
Subject: [PATCH] Fix `fill-single-char-nobreak-p'

* lisp/textmodes/fill.el (fill-single-char-nobreak-p): make space after
	opening paren and a single-letter word unbreakable (Bug#20871)
---
 lisp/textmodes/fill.el | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lisp/textmodes/fill.el b/lisp/textmodes/fill.el
index 173d1c9..bade8fa 100644
--- a/lisp/textmodes/fill.el
+++ b/lisp/textmodes/fill.el
@@ -337,7 +337,10 @@ fill-single-char-nobreak-p
   (save-excursion
     (skip-chars-backward " \t")
     (backward-char 2)
-    (looking-at "[[:space:]][[:alpha:]]")))
+    (or (looking-at "[[:space:]][[:alpha:]]")
+        (progn
+          (backward-char 1)
+          (looking-at "[[:space:]]([[:alpha:]]")))))
 
 (defcustom fill-nobreak-predicate nil
   "List of predicates for recognizing places not to break a line.
-- 
2.4.3

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-17  6:34   ` Marcin Borkowski
@ 2016-04-17 14:57     ` Eli Zaretskii
  2016-04-17 15:34       ` Marcin Borkowski
  2018-02-02  9:18       ` Michal Nazarewicz
  0 siblings, 2 replies; 45+ messages in thread
From: Eli Zaretskii @ 2016-04-17 14:57 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: 20871

> From: Marcin Borkowski <mbork@mbork.pl>
> Date: Sun, 17 Apr 2016 08:34:30 +0200
> 
> >> today I found that fill-single-char-nobreak-p is just a bit too
> >> simplistic.  When point is after e.g. the string " (a", it returns nil
> >> instead of t.  I am not sure which characters should be added to the
> >> regex, but at least the opening paren (and maybe bracket) should be
> >> there, so I'd change the regex into [[:space:]][[(]*[[:alpha:]].  (Two
> >> or more opening parens/brackets are unlikely, but when in doubt, I guess
> >> it's better to return t instead of nil than the other way round.)
> >>
> >> Best regards,
> >
> > Just noticed that there is a hardcoded (backward-char 2), so it
> > seems that adding a few characters to the regex is not enough.  Maybe
> > looking-back is the way to go (though it might slow filling down)?
> > I don't know.
> 
> Hi there,
> 
> so here's a patch for the bug I reported some time ago.

Could you please elaborate on the bug itself?

See, the function in question, fill-single-char-nobreak-p, is
documented as a possible value to use in the fill hook, for a very
specific purpose.  If you are saying that it doesn't fulfill that
purpose well enough, please show a use case where it fails to do that.
At least the situation you described, with " (a", doesn't seem to fit
the use cases which this function is supposed to cover, since the
parenthesis makes a 2-character sequence, whereas
fill-single-char-nobreak-p aims to support isolated one-character
words.

I also am not sure I understand what is so special about '(' that it
has to be hard-coded here.  What about '[' or '{' or '<' (or any other
punctuation character, for that matter)?

> Please review both the patch and the commit message (I'm still
> learning to write them...).

The commit message should begin with a capital letter.  Also, I think
this variant is more clear:

 Don't break after a single-character word that follows an opening
 parenthesis.

Thanks.

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-17 14:57     ` Eli Zaretskii
@ 2016-04-17 15:34       ` Marcin Borkowski
  2016-04-17 16:49         ` Eli Zaretskii
  2018-02-02  9:18       ` Michal Nazarewicz
  1 sibling, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2016-04-17 15:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871


On 2016-04-17, at 14:57, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Marcin Borkowski <mbork@mbork.pl>
>> Date: Sun, 17 Apr 2016 08:34:30 +0200
>> 
>> >> today I found that fill-single-char-nobreak-p is just a bit too
>> >> simplistic.  When point is after e.g. the string " (a", it returns nil
>> >> instead of t.  I am not sure which characters should be added to the
>> >> regex, but at least the opening paren (and maybe bracket) should be
>> >> there, so I'd change the regex into [[:space:]][[(]*[[:alpha:]].  (Two
>> >> or more opening parens/brackets are unlikely, but when in doubt, I guess
>> >> it's better to return t instead of nil than the other way round.)
>> >>
>> >> Best regards,
>> >
>> > Just noticed that there is a hardcoded (backward-char 2), so it
>> > seems that adding a few characters to the regex is not enough.  Maybe
>> > looking-back is the way to go (though it might slow filling down)?
>> > I don't know.
>> 
>> Hi there,
>> 
>> so here's a patch for the bug I reported some time ago.
>
> Could you please elaborate on the bug itself?

In Polish typography, it is customary to forbid line breaks after
one-letter words (and we have quite a few of them: a, i, o, w, z - they
are conjunctions or prepositions).  And it is not uncommon to have
a combination of them with a parenthesized remark or something like
that.  That's why allowing a linebreak after, say "(a" when writing
something in Polish (like an email, for instance) is a bug IMO.

> See, the function in question, fill-single-char-nobreak-p, is
> documented as a possible value to use in the fill hook, for a very
> specific purpose.  If you are saying that it doesn't fulfill that
> purpose well enough, please show a use case where it fails to do that.
> At least the situation you described, with " (a", doesn't seem to fit
> the use cases which this function is supposed to cover, since the
> parenthesis makes a 2-character sequence, whereas
> fill-single-char-nobreak-p aims to support isolated one-character
> words.

I see.  So you suggest that instead of patching
`fill-single-char-nobreak-p' I should have provided another function,
customized for Polish?

In fact, I'm not so sure about it.  The whole point of such functions
(as I see it) is to help write texts in natural languages.  It seems
unnatural to treat words preceded by a space and by a parenthesis *in
a natural language* differently, no?

> I also am not sure I understand what is so special about '(' that it
> has to be hard-coded here.  What about '[' or '{' or '<' (or any other
> punctuation character, for that matter)?

The special thing about `(' is that (unlike the other characters you
mentioned) it is actually used in text in a natural language
(though one could make a case for `[', too).

>> Please review both the patch and the commit message (I'm still
>> learning to write them...).
>
> The commit message should begin with a capital letter.  Also, I think
> this variant is more clear:
>
>  Don't break after a single-character word that follows an opening
>  parenthesis.
>
> Thanks.

Thanks and best regards,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-17 15:34       ` Marcin Borkowski
@ 2016-04-17 16:49         ` Eli Zaretskii
  2016-04-17 17:41           ` Marcin Borkowski
  2016-04-27  7:02           ` Marcin Borkowski
  0 siblings, 2 replies; 45+ messages in thread
From: Eli Zaretskii @ 2016-04-17 16:49 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: 20871

> From: Marcin Borkowski <mbork@mbork.pl>
> Cc: 20871@debbugs.gnu.org
> Date: Sun, 17 Apr 2016 17:34:04 +0200
> 
> In Polish typography, it is customary to forbid line breaks after
> one-letter words (and we have quite a few of them: a, i, o, w, z - they
> are conjunctions or prepositions).  And it is not uncommon to have
> a combination of them with a parenthesized remark or something like
> that.  That's why allowing a linebreak after, say "(a" when writing
> something in Polish (like an email, for instance) is a bug IMO.
> 
> > See, the function in question, fill-single-char-nobreak-p, is
> > documented as a possible value to use in the fill hook, for a very
> > specific purpose.  If you are saying that it doesn't fulfill that
> > purpose well enough, please show a use case where it fails to do that.
> > At least the situation you described, with " (a", doesn't seem to fit
> > the use cases which this function is supposed to cover, since the
> > parenthesis makes a 2-character sequence, whereas
> > fill-single-char-nobreak-p aims to support isolated one-character
> > words.
> 
> I see.  So you suggest that instead of patching
> `fill-single-char-nobreak-p' I should have provided another function,
> customized for Polish?

Yes, I think so.  There's already fill-french-nobreak-p, why shouldn't
there be a Polish predicate?

> In fact, I'm not so sure about it.  The whole point of such functions
> (as I see it) is to help write texts in natural languages.  It seems
> unnatural to treat words preceded by a space and by a parenthesis *in
> a natural language* differently, no?

Not necessarily: that space that precedes the word is by itself a
line-breaking opportunity.  IOW, Emacs will break before 'a' in " a",
and the penalty will be only 1 character.  By contrast, breaking
before the parenthesis in your case will yield a penalty of 2
characters, which is a different tradeoff, worthy of asking the user
explicitly to agree to.

The default value of fill-nobreak-predicate is nil for a reason.
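
For illustration, opting in could look like the sketch below.  The
helper my-fill-demo and the 10-column width are hypothetical, chosen
only so that the effect shows on a very short string; the helper binds
fill-nobreak-predicate temporarily instead of changing any user
setting.

```elisp
;; Hypothetical helper: fill TEXT at a 10-column width, optionally with
;; `fill-single-char-nobreak-p' enabled, and return the filled text.
(defun my-fill-demo (text use-predicate)
  "Fill TEXT in a temporary buffer; USE-PREDICATE enables the predicate."
  (with-temp-buffer
    (insert text)
    (let ((fill-column 10)
          (fill-nobreak-predicate
           (and use-predicate '(fill-single-char-nobreak-p))))
      (fill-region (point-min) (point-max)))
    (buffer-string)))

;; With the predicate enabled, no line of the result should end in the
;; one-letter word "a".
(my-fill-demo "aaa bbb a ccc" t)

;; A user would more likely opt in from a mode hook, for example:
;; (add-hook 'text-mode-hook
;;           (lambda ()
;;             (add-hook 'fill-nobreak-predicate
;;                       #'fill-single-char-nobreak-p nil t)))
```

Passing nil as the second argument of my-fill-demo shows the default
behaviour for comparison.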

Thanks.

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-17 16:49         ` Eli Zaretskii
@ 2016-04-17 17:41           ` Marcin Borkowski
  2016-04-27  7:02           ` Marcin Borkowski
  1 sibling, 0 replies; 45+ messages in thread
From: Marcin Borkowski @ 2016-04-17 17:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871


On 2016-04-17, at 16:49, Eli Zaretskii <eliz@gnu.org> wrote:

>> I see.  So you suggest that instead of patching
>> `fill-single-char-nobreak-p' I should have provided another function,
>> customized for Polish?
>
> Yes, I think so.  There's already fill-french-nobreak-p, why shouldn't
> there be a Polish predicate?
>
>> [...]
>
> Not necessarily: that space that precedes the word is by itself a
> line-breaking opportunity.  IOW, Emacs will break before 'a' in " a",
> and the penalty will be only 1 character.  By contrast, breaking
> before the parenthesis in your case will yield a penalty of 2
> characters, which is a different tradeoff, worthy of asking the user
> explicitly to agree to.
>
> The default value of fill-nobreak-predicate is nil for a reason.

I see.  So I'm going to prepare a patch where a new function is
introduced for Polish typography, and once it is accepted (which I hope
will happen:-)) I'm going to close this bug.

Best,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-17 16:49         ` Eli Zaretskii
  2016-04-17 17:41           ` Marcin Borkowski
@ 2016-04-27  7:02           ` Marcin Borkowski
  2016-04-27  7:20             ` Eli Zaretskii
  1 sibling, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2016-04-27  7:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871

[-- Attachment #1: Type: text/plain, Size: 1626 bytes --]


On 2016-04-17, at 16:49, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Marcin Borkowski <mbork@mbork.pl>
>> Cc: 20871@debbugs.gnu.org
>> Date: Sun, 17 Apr 2016 17:34:04 +0200
>> 
>> In Polish typography, it is customary to forbid line breaks after
>> one-letter words (and we have quite a few of them: a, i, o, w, z - they
>> are conjunctions or prepositions).  And it is not uncommon to have
>> a combination of them with a parenthesized remark or something like
>> that.  That's why allowing a linebreak after, say "(a" when writing
>> something in Polish (like an email, for instance) is a bug IMO.
>> 
>> > See, the function in question, fill-single-char-nobreak-p, is
>> > documented as a possible value to use in the fill hook, for a very
>> > specific purpose.  If you are saying that it doesn't fulfill that
>> > purpose well enough, please show a use case where it fails to do that.
>> > At least the situation you described, with " (a", doesn't seem to fit
>> > the use cases which this function is supposed to cover, since the
>> > parenthesis makes a 2-character sequence, whereas
>> > fill-single-char-nobreak-p aims to support isolated one-character
>> > words.
>> 
>> I see.  So you suggest that instead of patching
>> `fill-single-char-nobreak-p' I should have provided another function,
>> customized for Polish?
>
> Yes, I think so.  There's already fill-french-nobreak-p, why shouldn't
> there be a Polish predicate?

I attach a new patch. Is it better now?

Best,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University

[-- Attachment #2: 0002-Add-the-function-fill-polish-nobreak-p.patch --]
[-- Type: text/x-patch, Size: 1418 bytes --]

From 6356c46bf6d5c90057b1a756f05c69791c9ff3db Mon Sep 17 00:00:00 2001
From: Marcin Borkowski <mbork@mbork.pl>
Date: Wed, 27 Apr 2016 08:59:15 +0200
Subject: [PATCH] Add the function `fill-polish-nobreak-p'

* lisp/textmodes/fill.el (fill-polish-nobreak-p): Prevent line-breaking
after a single-letter word even if this word is not preceded by
a space.  (Bug #20871)
---
 lisp/textmodes/fill.el | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/lisp/textmodes/fill.el b/lisp/textmodes/fill.el
index 173d1c9..0a95290 100644
--- a/lisp/textmodes/fill.el
+++ b/lisp/textmodes/fill.el
@@ -329,6 +329,18 @@ fill-french-nobreak-p
 	      (and (memq (preceding-char) '(?\t ?\s))
 		   (eq (char-syntax (following-char)) ?w)))))))
 
+(defun fill-polish-nobreak-p ()
+  "Return nil if Polish style allows breaking the line at point.
+This function may be used in the `fill-nobreak-predicate' hook.
+It is almost the same as `fill-single-char-nobreak-p', with the
+exception that it does not require the one-letter word to be
+preceded by a space.  This blocks line-breaking in cases like
+\"(a jednak)\"."
+  (save-excursion
+    (skip-chars-backward " \t")
+    (backward-char 2)
+    (looking-at "[^[:alpha:]][[:alpha:]]")))
+
 (defun fill-single-char-nobreak-p ()
   "Return non-nil if a one-letter word is before point.
 This function is suitable for adding to the hook `fill-nobreak-predicate',
-- 
2.4.3

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-27  7:02           ` Marcin Borkowski
@ 2016-04-27  7:20             ` Eli Zaretskii
  2016-04-29 12:18               ` Marcin Borkowski
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2016-04-27  7:20 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: 20871

> From: Marcin Borkowski <mbork@mbork.pl>
> Cc: 20871@debbugs.gnu.org
> Date: Wed, 27 Apr 2016 09:02:36 +0200
> 
> >> I see.  So you suggest that instead of patching
> >> `fill-single-char-nobreak-p' I should have provided another function,
> >> customized for Polish?
> >
> > Yes, I think so.  There's already fill-french-nobreak-p, why shouldn't
> > there be a Polish predicate?
> 
> I attach a new patch. Is it better now?

Yes, thanks.

Just one comment, for your consideration:

> +    (looking-at "[^[:alpha:]][[:alpha:]]")))

You should be aware that starting with Emacs 25.1 [:alpha:] matches a
very large class of characters, some of them having nothing in common
with those used in Polish.  So perhaps it is better to use '\cl'
instead, which will only capture Latin characters?  Just a thought --
your call.
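
The difference can be checked with a small sketch.  The helper names
my-alpha-match-p and my-latin-category-match-p are hypothetical; the
two sample characters are written as escapes (U+0105, a with ogonek,
and U+044F, Cyrillic ya) so that the example does not depend on the
encoding of the file it is pasted into.

```elisp
;; Hypothetical helpers comparing the two regexp constructs on a
;; one-character string.
(defun my-alpha-match-p (string)
  "Return non-nil if STRING is one character matching [[:alpha:]]."
  (and (string-match-p "\\`[[:alpha:]]\\'" string) t))

(defun my-latin-category-match-p (string)
  "Return non-nil if STRING is one character in category `l' (Latin)."
  (and (string-match-p "\\`\\cl\\'" string) t))

(my-alpha-match-p "\u0105")           ; t
(my-alpha-match-p "\u044f")           ; t
(my-latin-category-match-p "\u0105")  ; t
(my-latin-category-match-p "\u044f")  ; nil: Cyrillic, not category l
```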





* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-27  7:20             ` Eli Zaretskii
@ 2016-04-29 12:18               ` Marcin Borkowski
  2016-04-30 11:21                 ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2016-04-29 12:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871

[-- Attachment #1: Type: text/plain, Size: 1376 bytes --]

On 2016-04-27, at 10:20, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Marcin Borkowski <mbork@mbork.pl>
>> Cc: 20871@debbugs.gnu.org
>> Date: Wed, 27 Apr 2016 09:02:36 +0200
>> 
>> >> I see.  So you suggest that instead of patching
>> >> `fill-single-char-nobreak-p' I should have provided another function,
>> >> customized for Polish?
>> >
>> > Yes, I think so.  There's already fill-french-nobreak-p, why shouldn't
>> > there be a Polish predicate?
>> 
>> I attach a new patch. Is it better now?
>
> Yes, thanks.
>
> Just one comment, for your consideration:
>
>> +    (looking-at "[^[:alpha:]][[:alpha:]]")))
>
> You should be aware that starting with Emacs 25.1 [:alpha:] matches a
> very large class of characters, some of them having nothing in common
> with those used in Polish.  So perhaps it is better to use '\cl'
> instead, which will only capture Latin characters?  Just a thought --
> your call.

I guess you are right, Eli - in fact, all one-letter words in Polish are
matched by [aiouwz].  I decided to go with \cl, as you suggested,
though - this way, the function could be (probably) useful also for
Slovaks, for instance.  I attach the corrected patch.
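
A small sketch for trying the predicate outside of filling.  It copies
the body from the patch attached to this message under a hypothetical
name, my-polish-nobreak-p, and uses a hypothetical helper,
my-polish-nobreak-after-p, to call it on sample strings.

```elisp
;; Hypothetical copy of the predicate from the attached patch, renamed
;; so that it cannot clash with anything in fill.el.
(defun my-polish-nobreak-p ()
  "Return non-nil if a one-letter word is before point.
Unlike `fill-single-char-nobreak-p', accept any non-letter before it."
  (save-excursion
    (skip-chars-backward " \t")
    (backward-char 2)
    (looking-at "[^[:alpha:]]\\cl")))

;; Hypothetical helper for trying the predicate on a string.
(defun my-polish-nobreak-after-p (text)
  "Call `my-polish-nobreak-p' with point at the end of TEXT."
  (with-temp-buffer
    (insert text)
    (my-polish-nobreak-p)))

(my-polish-nobreak-after-p "ale (a ")   ; non-nil: "(" then the letter "a"
(my-polish-nobreak-after-p "ale a ")    ; non-nil: space then "a"
(my-polish-nobreak-after-p "ale ba ")   ; nil: "a" is preceded by a letter
```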

Just to be sure: in my Emacs, \cl matches also ą, ę, ż, ź, á, ö etc.  Is
it intentional?  Is it documented somewhere?

Thanks and best regards,

-- 
Marcin

[-- Attachment #2: 0003-Add-the-function-fill-polish-nobreak-p.patch --]
[-- Type: text/x-patch, Size: 1411 bytes --]

From a3b46e0e260f011e9c5257bece8f2fdb123214e5 Mon Sep 17 00:00:00 2001
From: Marcin Borkowski <mbork@mbork.pl>
Date: Wed, 27 Apr 2016 08:59:15 +0200
Subject: [PATCH] Add the function `fill-polish-nobreak-p'

* lisp/textmodes/fill.el (fill-polish-nobreak-p): Prevent line-breaking
after a single-letter word even if this word is not preceded by
a space.  (Bug #20871)
---
 lisp/textmodes/fill.el | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/lisp/textmodes/fill.el b/lisp/textmodes/fill.el
index 173d1c9..80f7e96 100644
--- a/lisp/textmodes/fill.el
+++ b/lisp/textmodes/fill.el
@@ -329,6 +329,18 @@ fill-french-nobreak-p
 	      (and (memq (preceding-char) '(?\t ?\s))
 		   (eq (char-syntax (following-char)) ?w)))))))
 
+(defun fill-polish-nobreak-p ()
+  "Return nil if Polish style allows breaking the line at point.
+This function may be used in the `fill-nobreak-predicate' hook.
+It is almost the same as `fill-single-char-nobreak-p', with the
+exception that it does not require the one-letter word to be
+preceded by a space.  This blocks line-breaking in cases like
+\"(a jednak)\"."
+  (save-excursion
+    (skip-chars-backward " \t")
+    (backward-char 2)
+    (looking-at "[^[:alpha:]]\\cl")))
+
 (defun fill-single-char-nobreak-p ()
   "Return non-nil if a one-letter word is before point.
 This function is suitable for adding to the hook `fill-nobreak-predicate',
-- 
2.4.3

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-29 12:18               ` Marcin Borkowski
@ 2016-04-30 11:21                 ` Eli Zaretskii
  2016-04-30 12:26                   ` Marcin Borkowski
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2016-04-30 11:21 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: 20871

> From: Marcin Borkowski <mbork@mbork.pl>
> Cc: 20871@debbugs.gnu.org
> Date: Fri, 29 Apr 2016 14:18:34 +0200
> 
> >> +    (looking-at "[^[:alpha:]][[:alpha:]]")))
> >
> > You should be aware that starting with Emacs 25.1 [:alpha:] matches a
> > very large class of characters, some of them having nothing in common
> > with those used in Polish.  So perhaps it is better to use '\cl'
> > instead, which will only capture Latin characters?  Just a thought --
> > your call.
> 
> I guess you are right, Eli - in fact, all one-letter words in Polish are
> matched by [aiouwz].  I decided to go with \cl, as you suggested,
> though - this way, the function could be (probably) useful also for
> Slovaks, for instance.  I attach the corrected patch.

LGTM, thanks.

> Just to be sure: in my Emacs, \cl matches also ą, ę, ż, ź, á, ö etc.  Is
> it intentional?

Yes.  \cl matches any character that belongs to any of the Latin
blocks.
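
One way to check a particular character, sketched with a hypothetical
helper, my-char-latin-category-p.  It relies on char-category-set
returning a bool-vector indexed by category mnemonic, the same idiom
fill.el uses for category `|'.

```elisp
;; Hypothetical helper: non-nil if CHAR belongs to category `l' (Latin)
;; in the current buffer's category table.
(defun my-char-latin-category-p (char)
  "Return non-nil if CHAR is in category `l'."
  (aref (char-category-set char) ?l))

(my-char-latin-category-p ?a)        ; t
(my-char-latin-category-p ?\u0105)   ; t: a with ogonek
(my-char-latin-category-p ?\u044f)   ; nil: Cyrillic ya

;; All category mnemonics of a character, as a string:
(category-set-mnemonics (char-category-set ?\u0105))
```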

> Is it documented somewhere?

Not sure what needs to be documented, please elaborate.

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-30 11:21                 ` Eli Zaretskii
@ 2016-04-30 12:26                   ` Marcin Borkowski
  2016-04-30 12:38                     ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2016-04-30 12:26 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871


On 2016-04-30, at 13:21, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Marcin Borkowski <mbork@mbork.pl>
>> Cc: 20871@debbugs.gnu.org
>> Date: Fri, 29 Apr 2016 14:18:34 +0200
>> 
>> >> +    (looking-at "[^[:alpha:]][[:alpha:]]")))
>> >
>> > You should be aware that starting with Emacs 25.1 [:alpha:] matches a
>> > very large class of characters, some of them having nothing in common
>> > with those used in Polish.  So perhaps it is better to use '\cl'
>> > instead, which will only capture Latin characters?  Just a thought --
>> > your call.
>> 
>> I guess you are right, Eli - in fact, all one-letter words in Polish are
>> matched by [aiouwz].  I decided to go with \cl, as you suggested,
>> though - this way, the function could be (probably) useful also for
>> Slovaks, for instance.  I attach the corrected patch.
>
> LGTM, thanks.

Thanks!

>> Just to be sure: in my Emacs, \cl matches also ą, ę, ż, ź, á, ö etc.  Is
>> it intentional?
>
> Yes.  \cl matches any character that belongs to any of the Latin
> blocks.
>
>> Is it documented somewhere?
>
> Not sure what needs to be documented, please elaborate.

Well, at first I thought that "Latin" means "matching [a-z]".  Finding
out that accented letters qualify, too, was a (pleasant) surprise.
Finding that out using `describe-categories' is a bit tricky, since its
output contains ranges, and I don't know which of them e.g. "ą"
belongs to.  The output of `describe-categories' says "Legend of category
mnemonics (see the tail for the longer description)"; I guess the
"longer" description might say something more.  For instance, this line:

(define-category ?l "Latin")

in characters.el

could be replaced by

(define-category ?l "Latin
Latin letters (including those with diacritics)")

This way, there would be at least a hint at the bottom of the *Help*
buffer displayed by `describe-categories'.

WDYT?  Would you like me to prepare a patch?

Best,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-30 12:26                   ` Marcin Borkowski
@ 2016-04-30 12:38                     ` Eli Zaretskii
  2016-04-30 16:41                       ` Marcin Borkowski
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2016-04-30 12:38 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: 20871

> From: Marcin Borkowski <mbork@mbork.pl>
> Cc: 20871@debbugs.gnu.org
> Date: Sat, 30 Apr 2016 14:26:28 +0200
> 
> (define-category ?l "Latin")
> 
> in characters.el
> 
> could be replaced by
> 
> (define-category ?l "Latin
> Latin letters (including those with diacritics)")

That doesn't sound right: why single out diacritics?  And why only for
Latin?

If we want to enhance those doc strings, each one of them should state
what Unicode blocks are covered.  (It would be okay to say something
like "all Latin blocks", instead of enumerating them all, and
similarly for the other categories.)

Thanks.

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-30 12:38                     ` Eli Zaretskii
@ 2016-04-30 16:41                       ` Marcin Borkowski
  2016-04-30 17:01                         ` Eli Zaretskii
  2016-04-30 17:42                         ` Drew Adams
  0 siblings, 2 replies; 45+ messages in thread
From: Marcin Borkowski @ 2016-04-30 16:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871


On 2016-04-30, at 14:38, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Marcin Borkowski <mbork@mbork.pl>
>> Cc: 20871@debbugs.gnu.org
>> Date: Sat, 30 Apr 2016 14:26:28 +0200
>> 
>> (define-category ?l "Latin")
>> 
>> in characters.el
>> 
>> could be replaced by
>> 
>> (define-category ?l "Latin
>> Latin letters (including those with diacritics)")
>
> That doesn't sound right: why single out diacritics?  And why only for
> Latin?

Because I don't really know what this category is about?

I think that my confusion is a sufficient proof that more precise
documentation is needed.

> If we want to enhance those doc strings, each one of them should state
> what Unicode blocks are covered.  (It would be okay to say something
> like "all Latin blocks", instead of enumerating them all, and
> similarly for the other categories.)

A user might not know what exactly a "Unicode block" is.  (I don't.)
I think a pointer to some sources or a few words of explanation are
really needed.  (I won't make a patch, though, since I clearly know too
little about it to do it correctly.)

> Thanks.

Best,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-30 16:41                       ` Marcin Borkowski
@ 2016-04-30 17:01                         ` Eli Zaretskii
  2016-04-30 18:42                           ` Marcin Borkowski
  2016-04-30 17:42                         ` Drew Adams
  1 sibling, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2016-04-30 17:01 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: 20871

> From: Marcin Borkowski <mbork@mbork.pl>
> Cc: 20871@debbugs.gnu.org
> Date: Sat, 30 Apr 2016 18:41:32 +0200
> 
> >> (define-category ?l "Latin
> >> Latin letters (including those with diacritics)")
> >
> > That doesn't sound right: why single out diacritics?  And why only for
> > Latin?
> 
> Because I don't really know what this category is about?
> 
> I think that my confusion is a sufficient proof that more precise
> documentation is needed.

I didn't object to expanding the doc string, I only suggested that we
do it right.

> > If we want to enhance those doc strings, each one of them should state
> > what Unicode blocks are covered.  (It would be okay to say something
> > like "all Latin blocks", instead of enumerating them all, and
> > similarly for the other categories.)
> 
> A user might not know what exactly a "Unicode block" is.  (I don't.)
> I think a pointer to some sources or a few words of explanation are
> really needed.

A reference to the ELisp manual should do (the explanation should be
added to the manual first, of course).

> (I won't make a patch, though, since I clearly know too little about
> it to do it correctly.)

An opportunity to learn, I'd say.  You could start with
admin/unidata/Blocks.txt, for example.  We use it to generate
charscript.el (and categories are a semi-obsolete facility that
predates char-script-table).
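
As a rough illustration of the difference between the two facilities,
here is a minimal sketch to evaluate in *scratch*.  It assumes an Emacs
recent enough to have char-script-table; the sample characters are
arbitrary, and no category mnemonics are shown because they can differ
between versions.

```elisp
;; Script of a character, looked up in the char-table that is
;; generated from the Unicode data files.
(aref char-script-table ?a)   ; a plain Latin letter
(aref char-script-table ?ł)   ; a Latin letter with a diacritic

;; All categories of a character, as a string of mnemonics.
(category-set-mnemonics (char-category-set ?a))

;; Test membership in category `l' ("Latin") the way a regexp would;
;; "\\cl" matches one character belonging to that category.
(string-match-p "\\cl" "a")
```

The last form uses the same "\\cl" construct that a regexp-based fill
predicate could rely on, which is why the doc string of the category
matters here.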





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-30 16:41                       ` Marcin Borkowski
  2016-04-30 17:01                         ` Eli Zaretskii
@ 2016-04-30 17:42                         ` Drew Adams
  2016-04-30 18:24                           ` Eli Zaretskii
  1 sibling, 1 reply; 45+ messages in thread
From: Drew Adams @ 2016-04-30 17:42 UTC (permalink / raw)
  To: Marcin Borkowski, Eli Zaretskii; +Cc: 20871

> I think that my confusion is a sufficient proof that more precise
> documentation is needed.

+1.  Maybe not "more precise" (I cannot judge whether what is
there is precise), but more user-friendly.  It is fine (great)
to use the precise Unicode terminology and to point to the
Unicode standard for more information.  But it is also helpful
to provide a little coaching in the Emacs doc, for us users
who are not Unicode pros.  It probably would not take much
additional explanation.

> > If we want to enhance those doc strings, each one of them should state
> > what Unicode blocks are covered.  (It would be okay to say something
> > like "all Latin blocks", instead of enumerating them all, and
> > similarly for the other categories.)
> 
> A user might not know what exactly a "Unicode block" is.  (I don't.)
> I think a pointer to some sources or a few words of explanation are
> really needed.  (I won't make a patch, though, since I clearly know too
> little about it to do it correctly.)

+1

This kind of user feedback is helpful.  It should not be
ignored, IMO, especially if the reason for ignoring is just
that the doc is accurate and precise.  It's also about
being amenable (dare I say even "inviting") to an average
Emacs user.





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-30 17:42                         ` Drew Adams
@ 2016-04-30 18:24                           ` Eli Zaretskii
  2016-04-30 18:41                             ` Marcin Borkowski
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2016-04-30 18:24 UTC (permalink / raw)
  To: Drew Adams; +Cc: mbork, 20871

> Date: Sat, 30 Apr 2016 09:42:28 -0800 (GMT-08:00)
> From: Drew Adams <drew.adams@oracle.com>
> Cc: 20871@debbugs.gnu.org
> 
> This kind of user feedback is helpful.  It should not be
> ignored

Did I ignore it?





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-30 18:24                           ` Eli Zaretskii
@ 2016-04-30 18:41                             ` Marcin Borkowski
  0 siblings, 0 replies; 45+ messages in thread
From: Marcin Borkowski @ 2016-04-30 18:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871


On 2016-04-30, at 20:24, Eli Zaretskii <eliz@gnu.org> wrote:

>> Date: Sat, 30 Apr 2016 09:42:28 -0800 (GMT-08:00)
>> From: Drew Adams <drew.adams@oracle.com>
>> Cc: 20871@debbugs.gnu.org
>> 
>> This kind of user feedback is helpful.  It should not be
>> ignored
>
> Did I ignore it?

I don't think so.

Best,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-30 17:01                         ` Eli Zaretskii
@ 2016-04-30 18:42                           ` Marcin Borkowski
  2016-04-30 19:01                             ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2016-04-30 18:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871


On 2016-04-30, at 19:01, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Marcin Borkowski <mbork@mbork.pl>
>> Cc: 20871@debbugs.gnu.org
>> Date: Sat, 30 Apr 2016 18:41:32 +0200
>> 
>> >> (define-category ?l "Latin
>> >> Latin letters (including those with diacritics)")
>> >
>> > That doesn't sound right: why single out diacritics?  And why only for
>> > Latin?
>> 
>> Because I don't really know what this category is about?
>> 
>> I think that my confusion is a sufficient proof that more precise
>> documentation is needed.
>
> I didn't object to expanding the doc string, I only suggested that we
> do it right.

I didn't claim you objected.

>> > If we want to enhance those doc strings, each one of them should state
>> > what Unicode blocks are covered.  (It would be okay to say something
>> > like "all Latin blocks", instead of enumerating them all, and
>> > similarly for the other categories.)
>> 
>> A user might not know what exactly a "Unicode block" is.  (I don't.)
>> I think a pointer to some sources or a few words of explanation are
>> really needed.
>
> A reference to the ELisp manual should do (the explanation should be
> added to the manual first, of course).
>
>> (I won't make a patch, though, since I clearly know too little about
>> it to do it correctly.)
>
> An opportunity to learn, I'd say.  You could start with
> admin/unidata/Blocks.txt, for example.  We use it to generate
> charscript.el (and categories are a semi-obsolete facility that
> predates char-script-table).

You got me here.  The problem is, I'm not going to have a lot of free
time in the next few weeks/months.  (I'll also probably have to stop any
work on Emacs bugs for some time, for instance.)  But I'll try to look
into this one.

Best,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-30 18:42                           ` Marcin Borkowski
@ 2016-04-30 19:01                             ` Eli Zaretskii
  2017-12-07 13:28                               ` Marcin Borkowski
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2016-04-30 19:01 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: 20871

> From: Marcin Borkowski <mbork@mbork.pl>
> Cc: 20871@debbugs.gnu.org
> Date: Sat, 30 Apr 2016 20:42:43 +0200
> 
> > An opportunity to learn, I'd say.  You could start with
> > admin/unidata/Blocks.txt, for example.  We use it to generate
> > charscript.el (and categories are a semi-obsolete facility that
> > predates char-script-table).
> 
> You got me here.  The problem is, I'm not going to have a lot of free
> time in the next few weeks/months.

There's no rush.  You shouldn't feel hard-pressed for a quick job.

> But I'll try to look into this one.

Great, thanks.





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-30 19:01                             ` Eli Zaretskii
@ 2017-12-07 13:28                               ` Marcin Borkowski
  2017-12-09 11:52                                 ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2017-12-07 13:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871

On 2016-04-30, at 21:01, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Marcin Borkowski <mbork@mbork.pl>
>> Cc: 20871@debbugs.gnu.org
>> Date: Sat, 30 Apr 2016 20:42:43 +0200
>> 
>> > An opportunity to learn, I'd say.  You could start with
>> > admin/unidata/Blocks.txt, for example.  We use it to generate
>> > charscript.el (and categories are a semi-obsolete facility that
>> > predates char-script-table).
>> 
>> You got me here.  The problem is, I'm not going to have a lot of free
>> time in the next few weeks/months.
>
> There's no rush.  You shouldn't feel hard-pressed for a quick job.
>
>> But I'll try to look into this one.
>
> Great, thanks.

OK, so let me revive this old thread.  I decided to spend some (not too
much...) time on Emacs bugs again.  First things first: I think I fixed
this one.  (This is a bit embarrassing: I did it many months ago, pushed
a branch and forgot to mention it here.)  Here it is:
http://git.savannah.gnu.org/cgit/emacs.git/log/?h=fix/bug-20871 - can
anyone look at it?

In the next few days I'm going to look at the documentation problem we
discussed here.

Best,

-- 
Marcin Borkowski

^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2017-12-07 13:28                               ` Marcin Borkowski
@ 2017-12-09 11:52                                 ` Eli Zaretskii
  2017-12-09 15:45                                   ` Marcin Borkowski
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2017-12-09 11:52 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: 20871

> From: Marcin Borkowski <mbork@mbork.pl>
> Cc: 20871@debbugs.gnu.org
> Date: Thu, 07 Dec 2017 14:28:20 +0100
> 
> OK, so let me revive this old thread.  I decided to spend some (not too
> much...) time on Emacs bugs again.  First things first: I think I fixed
> this one.  (This is a bit embarrassing: I did it many months ago, pushed
> a branch and forgot to mention it here.)  Here it is:
> http://git.savannah.gnu.org/cgit/emacs.git/log/?h=fix/bug-20871 - can
> anyone look at it?

I took a look.  It looks reasonable, but I think it will also need to
be mentioned in the Emacs User manual, node "Fill Commands".  In
addition, could you write a couple of tests for this feature?

Thanks.





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2017-12-09 11:52                                 ` Eli Zaretskii
@ 2017-12-09 15:45                                   ` Marcin Borkowski
  2017-12-19 11:44                                     ` Marcin Borkowski
  0 siblings, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2017-12-09 15:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871


On 2017-12-09, at 12:52, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Marcin Borkowski <mbork@mbork.pl>
>> Cc: 20871@debbugs.gnu.org
>> Date: Thu, 07 Dec 2017 14:28:20 +0100
>> 
>> OK, so let me revive this old thread.  I decided to spend some (not too
>> much...) time on Emacs bugs again.  First things first: I think I fixed
>> this one.  (This is a bit embarrassing: I did it many months ago, pushed
>> a branch and forgot to mention it here.)  Here it is:
>> http://git.savannah.gnu.org/cgit/emacs.git/log/?h=fix/bug-20871 - can
>> anyone look at it?
>
> I took a look.  It looks reasonable, but I think it will also need to
> be mentioned in the Emacs User manual, node "Fill Commands".  In
> addition, could you write a couple of tests for this feature?

OK, I'll look into it and get back here.

Best,

-- 
Marcin Borkowski





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2017-12-09 15:45                                   ` Marcin Borkowski
@ 2017-12-19 11:44                                     ` Marcin Borkowski
  2017-12-19 16:15                                       ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2017-12-19 11:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871


On 2017-12-09, at 16:45, Marcin Borkowski <mbork@mbork.pl> wrote:

> On 2017-12-09, at 12:52, Eli Zaretskii <eliz@gnu.org> wrote:
>
>>> From: Marcin Borkowski <mbork@mbork.pl>
>>> Cc: 20871@debbugs.gnu.org
>>> Date: Thu, 07 Dec 2017 14:28:20 +0100
>>>
>>> OK, so let me revive this old thread.  I decided to spend some (not too
>>> much...) time on Emacs bugs again.  First things first: I think I fixed
>>> this one.  (This is a bit embarrassing: I did it many months ago, pushed
>>> a branch and forgot to mention it here.)  Here it is:
>>> http://git.savannah.gnu.org/cgit/emacs.git/log/?h=fix/bug-20871 - can
>>> anyone look at it?
>>
>> I took a look.  It looks reasonable, but I think it will also need to
>> be mentioned in the Emacs User manual, node "Fill Commands".  In
>> addition, could you write a couple of tests for this feature?
>
> OK, I'll look into it and get back here.

So I did take a look.  The mention in the manual is already there,
I forgot I did that back then.  Where should I put the tests?
I expected something in e.g. test/lisp/textmodes/fill.el, but there's
nothing like that there.  Also, should I test only my function, or
filling with it, or both?

Best,

--
Marcin Borkowski





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2017-12-19 11:44                                     ` Marcin Borkowski
@ 2017-12-19 16:15                                       ` Eli Zaretskii
  2018-01-02  8:55                                         ` Marcin Borkowski
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2017-12-19 16:15 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: 20871

> From: Marcin Borkowski <mbork@mbork.pl>
> Cc: 20871@debbugs.gnu.org
> Date: Tue, 19 Dec 2017 12:44:21 +0100
> 
> >> I took a look.  It looks reasonable, but I think it will also need to
> >> be mentioned in the Emacs User manual, node "Fill Commands".  In
> >> addition, could you write a couple of tests for this feature?
> >
> > OK, I'll look into it and get back here.
> 
> So I did take a look.  The mention in the manual is already there,
> I forgot I did that back then.

Right, sorry I missed that somehow.

> Where should I put the tests?
> I expected something in e.g. test/lisp/textmodes/fill.el, but there's
> nothing like that there.

If the file doesn't exist, create it.

> Also, should I test only my function, or filling with it, or both?

I think filling with your function is the most important thing to
test.

Thanks.





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2017-12-19 16:15                                       ` Eli Zaretskii
@ 2018-01-02  8:55                                         ` Marcin Borkowski
  2018-01-13  8:46                                           ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2018-01-02  8:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871


On 2017-12-19, at 17:15, Eli Zaretskii <eliz@gnu.org> wrote:

>> So I did take a look.  The mention in the manual is already there,
>> I forgot I did that back then.
>
> Right, sorry I missed that somehow.

No problem.

>> Where should I put the tests?
>> I expected something in e.g. test/lisp/textmodes/fill.el, but there's
>> nothing like that there.
>
> If the file doesn't exist, create it.
>
>> Also, should I test only my function, or filling with it, or both?
>
> I think filling with your function is the most important thing to
> test.
>
> Thanks.

So that's what I did (see commit 1ad94a126b on branch fix/bug-20871,
http://git.savannah.gnu.org/cgit/emacs.git/log/?h=fix/bug-20871).  Is
that ok?  I admit that these tests are rather simplistic, and I'm still
not sure about my commit message.

Best,

-- 
Marcin Borkowski





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2018-01-02  8:55                                         ` Marcin Borkowski
@ 2018-01-13  8:46                                           ` Eli Zaretskii
  2018-01-13 16:01                                             ` Marcin Borkowski
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2018-01-13  8:46 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: 20871

> From: Marcin Borkowski <mbork@mbork.pl>
> Cc: 20871@debbugs.gnu.org
> Date: Tue, 02 Jan 2018 09:55:47 +0100
> 
> >> Where should I put the tests?
> >> I expected something in e.g. test/lisp/textmodes/fill.el, but there's
> >> nothing like that there.
> >
> > If the file doesn't exist, create it.
> >
> >> Also, should I test only my function, or filling with it, or both?
> >
> > I think filling with your function is the most important thing to
> > test.
> >
> > Thanks.
> 
> So that's what I did (see commit 1ad94a126b on branch fix/bug-20871,
> http://git.savannah.gnu.org/cgit/emacs.git/log/?h=fix/bug-20871).  Is
> that ok?

Yes.

> I admit that these tests are rather simplistic, and I'm still
> not sure about my commit message.

I think this is ready to be installed, and if there are issues with
commit messages, we will deal with them when installing.

So please post the full patch for this issue, and let's take it from
there.

Thanks.





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2018-01-13  8:46                                           ` Eli Zaretskii
@ 2018-01-13 16:01                                             ` Marcin Borkowski
  2018-01-13 16:53                                               ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2018-01-13 16:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871


On 2018-01-13, at 09:46, Eli Zaretskii <eliz@gnu.org> wrote:

>> So that's what I did (see commit 1ad94a126b on branch fix/bug-20871,
>> http://git.savannah.gnu.org/cgit/emacs.git/log/?h=fix/bug-20871).  Is
>> that ok?
>
> Yes.
>
>> I admit that these tests are rather simplistic, and I'm still
>> not sure about my commit message.
>
> I think this is ready to be installed, and if there are issues with
> commit messages, we will deal with them when installing.
>
> So please post the full patch for this issue, and let's take it from
> there.

Thanks.  Why not just merge my branch into master?

Best,

-- 
Marcin Borkowski





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2018-01-13 16:01                                             ` Marcin Borkowski
@ 2018-01-13 16:53                                               ` Eli Zaretskii
  2018-01-13 17:02                                                 ` Eli Zaretskii
  2018-01-15  5:13                                                 ` Marcin Borkowski
  0 siblings, 2 replies; 45+ messages in thread
From: Eli Zaretskii @ 2018-01-13 16:53 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: 20871

> From: Marcin Borkowski <mbork@mbork.pl>
> Cc: 20871@debbugs.gnu.org
> Date: Sat, 13 Jan 2018 17:01:14 +0100
> 
> > So please post the full patch for this issue, and let's take it from
> > there.
> 
> Thanks.  Why not just merge my branch into master?

Because that would not have the patch recorded by the tracker, and
because I'd like to have the patch reviewed by others if they want.

Thanks.





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2018-01-13 16:53                                               ` Eli Zaretskii
@ 2018-01-13 17:02                                                 ` Eli Zaretskii
  2018-01-15  5:13                                                   ` Marcin Borkowski
  2018-01-15  5:13                                                 ` Marcin Borkowski
  1 sibling, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2018-01-13 17:02 UTC (permalink / raw)
  To: mbork; +Cc: 20871

> Date: Sat, 13 Jan 2018 18:53:52 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: 20871@debbugs.gnu.org
> 
> > Thanks.  Why not just merge my branch into master?
> 
> Because that would not have the patch recorded by the tracker, and
> because I'd like to have the patch reviewed by others if they want.

Maybe I misunderstood you: if you were asking whether you can merge
your branch onto master, then please go ahead.  That would not let
you fix your commit messages easily, though (if that's what you wanted
to do).





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2018-01-13 16:53                                               ` Eli Zaretskii
  2018-01-13 17:02                                                 ` Eli Zaretskii
@ 2018-01-15  5:13                                                 ` Marcin Borkowski
  2018-01-15  5:30                                                   ` Marcin Borkowski
  1 sibling, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2018-01-15  5:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871


On 2018-01-13, at 17:53, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Marcin Borkowski <mbork@mbork.pl>
>> Cc: 20871@debbugs.gnu.org
>> Date: Sat, 13 Jan 2018 17:01:14 +0100
>> 
>> > So please post the full patch for this issue, and let's take it from
>> > there.
>> 
>> Thanks.  Why not just merge my branch into master?
>
> Because that would not have the patch recorded by the tracker, and
> because I'd like to have the patch reviewed by others if they want.

I see, I'll do it in a minute.

Best,

-- 
Marcin Borkowski





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2018-01-13 17:02                                                 ` Eli Zaretskii
@ 2018-01-15  5:13                                                   ` Marcin Borkowski
  0 siblings, 0 replies; 45+ messages in thread
From: Marcin Borkowski @ 2018-01-15  5:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871


On 2018-01-13, at 18:02, Eli Zaretskii <eliz@gnu.org> wrote:

>> Date: Sat, 13 Jan 2018 18:53:52 +0200
>> From: Eli Zaretskii <eliz@gnu.org>
>> Cc: 20871@debbugs.gnu.org
>> 
>> > Thanks.  Why not just merge my branch into master?
>> 
>> Because that would not have the patch recorded by the tracker, and
>> because I'd like to have the patch reviewed by others if they want.
>
> Maybe I misunderstood you: if you were asking whether you can merge
> your branch onto master, then please go ahead.  That would not let
> you fix your commit messages easily, though (if that's what you wanted
> to do).

Good point, I didn't think about it!

-- 
Marcin Borkowski





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2018-01-15  5:13                                                 ` Marcin Borkowski
@ 2018-01-15  5:30                                                   ` Marcin Borkowski
  2018-01-15 13:07                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2018-01-15  5:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871

[-- Attachment #1: Type: text/plain, Size: 359 bytes --]


On 2018-01-15, at 06:13, Marcin Borkowski <mbork@mbork.pl> wrote:

> On 2018-01-13, at 17:53, Eli Zaretskii <eliz@gnu.org> wrote:
>
>>> > So please post the full patch for this issue, and let's take it from
>>> > there.
>
> I see, I'll do it in a minute.

I attach the patch (which is made by squashing both the commits I made).

Best,

-- 
Marcin Borkowski

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-the-function-fill-polish-nobreak-p-with-tests.patch --]
[-- Type: text/x-patch, Size: 4969 bytes --]

From 96b4bd4cef5008561f336d8d1de47548d738c1ff Mon Sep 17 00:00:00 2001
From: Marcin Borkowski <mbork@mbork.pl>
Date: Wed, 27 Apr 2016 08:59:15 +0200
Subject: [PATCH] Add the function `fill-polish-nobreak-p' with tests

* lisp/textmodes/fill.el (fill-polish-nobreak-p): Prevent
line-breaking after a single-letter word even if this word is not
preceded by a space.  Fixes bug #20871.
---
 doc/emacs/text.texi               |  7 ++++--
 etc/NEWS                          |  5 ++++
 lisp/textmodes/fill.el            | 12 ++++++++++
 test/lisp/textmodes/fill-tests.el | 50 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 72 insertions(+), 2 deletions(-)
 create mode 100644 test/lisp/textmodes/fill-tests.el

diff --git a/doc/emacs/text.texi b/doc/emacs/text.texi
index 846d9fe8c6..2f180f82ca 100644
--- a/doc/emacs/text.texi
+++ b/doc/emacs/text.texi
@@ -636,8 +636,11 @@ Fill Commands
 break the line there.  Functions you can use there include:
 @code{fill-single-word-nobreak-p} (don't break after the first word of
 a sentence or before the last); @code{fill-single-char-nobreak-p}
-(don't break after a one-letter word); and @code{fill-french-nobreak-p}
-(don't break after @samp{(} or before @samp{)}, @samp{:} or @samp{?}).
+(don't break after a one-letter word preceded by a whitespace
+character); @code{fill-french-nobreak-p} (don't break after @samp{(}
+or before @samp{)}, @samp{:} or @samp{?}); and
+@code{fill-polish-nobreak-p} (don't break after a one-letter word,
+even if preceded by a non-whitespace character).
 
 @node Fill Prefix
 @subsection The Fill Prefix
diff --git a/etc/NEWS b/etc/NEWS
index 1d546c4ec1..ed1f931547 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -69,6 +69,11 @@ detect built-in libxml support, instead of testing for that
 indirectly, e.g., by checking that functions like
 'libxml-parse-html-region' return nil.
 
++++
+** New function 'fill-polish-nobreak-p', to be used in 'fill-nobreak-predicate'.
+It blocks line breaking after a one-letter word, also in the case when
+this word is preceded by a non-space, but non-alphanumeric character.
+
 \f
 * Editing Changes in Emacs 27.1
 
diff --git a/lisp/textmodes/fill.el b/lisp/textmodes/fill.el
index a46f0b2a4c..6d37be870b 100644
--- a/lisp/textmodes/fill.el
+++ b/lisp/textmodes/fill.el
@@ -339,6 +339,18 @@ fill-french-nobreak-p
 	      (and (memq (preceding-char) '(?\t ?\s))
 		   (eq (char-syntax (following-char)) ?w)))))))
 
+(defun fill-polish-nobreak-p ()
+  "Return nil if Polish style allows breaking the line at point.
+This function may be used in the `fill-nobreak-predicate' hook.
+It is almost the same as `fill-single-char-nobreak-p', with the
+exception that it does not require the one-letter word to be
+preceded by a space.  This blocks line-breaking in cases like
+\"(a jednak)\"."
+  (save-excursion
+    (skip-chars-backward " \t")
+    (backward-char 2)
+    (looking-at "[^[:alpha:]]\\cl")))
+
 (defun fill-single-char-nobreak-p ()
   "Return non-nil if a one-letter word is before point.
 This function is suitable for adding to the hook `fill-nobreak-predicate',
diff --git a/test/lisp/textmodes/fill-tests.el b/test/lisp/textmodes/fill-tests.el
new file mode 100644
index 0000000000..03323090f9
--- /dev/null
+++ b/test/lisp/textmodes/fill-tests.el
@@ -0,0 +1,50 @@
+;;; fill-tests.el --- ERT tests for fill.el -*- lexical-binding: t -*-
+
+;; Copyright (C) 2017 Free Software Foundation, Inc.
+
+;; Author:     Marcin Borkowski <mbork@mbork.pl>
+;; Keywords:   text, wp
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+
+;; This package defines tests for the filling feature, specifically
+;; the `fill-polish-nobreak-p' function.
+
+;;; Code:
+
+(require 'ert)
+
+(ert-deftest fill-test-no-fill-polish-nobreak-p nil
+  "Tests of the `fill-polish-nobreak-p' function."
+  (with-temp-buffer
+    (insert "Abc d efg (h ijk).")
+    (setq fill-column 8)
+    (setq-local fill-nobreak-predicate '())
+    (fill-paragraph)
+    (should (string= (buffer-string) "Abc d\nefg (h\nijk).")))
+  (with-temp-buffer
+    (insert "Abc d efg (h ijk).")
+    (setq fill-column 8)
+    (setq-local fill-nobreak-predicate '(fill-polish-nobreak-p))
+    (fill-paragraph)
+    (should (string= (buffer-string) "Abc\nd efg\n(h ijk)."))))
+
+
+(provide 'fill-tests)
+
+;;; fill-tests.el ends here
-- 
2.15.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2018-01-15  5:30                                                   ` Marcin Borkowski
@ 2018-01-15 13:07                                                     ` Eli Zaretskii
  2018-01-24  9:34                                                       ` Marcin Borkowski
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2018-01-15 13:07 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: 20871

> From: Marcin Borkowski <mbork@mbork.pl>
> Cc: 20871@debbugs.gnu.org
> Date: Mon, 15 Jan 2018 06:30:20 +0100
> 
> I attach the patch (which is made by squashing both the commits I made).

Thanks, this LGTM.  Please push to master.





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2018-01-15 13:07                                                     ` Eli Zaretskii
@ 2018-01-24  9:34                                                       ` Marcin Borkowski
  2018-01-24 19:16                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2018-01-24  9:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 20871


On 2018-01-15, at 14:07, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Marcin Borkowski <mbork@mbork.pl>
>> Cc: 20871@debbugs.gnu.org
>> Date: Mon, 15 Jan 2018 06:30:20 +0100
>> 
>> I attach the patch (which is made by squashing both the commits I made).
>
> Thanks, this LGTM.  Please push to master.

Done, thanks.

(I'm afraid that I messed up the commit message, which included another
branch name I used locally - sorry for that.  Also, after the merge it
occurred to me that I could have used rebase instead.  Would it be
a better idea in the future?)

Best,

-- 
Marcin Borkowski





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2018-01-24  9:34                                                       ` Marcin Borkowski
@ 2018-01-24 19:16                                                         ` Eli Zaretskii
  0 siblings, 0 replies; 45+ messages in thread
From: Eli Zaretskii @ 2018-01-24 19:16 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: 20871

> From: Marcin Borkowski <mbork@mbork.pl>
> Cc: 20871@debbugs.gnu.org
> Date: Wed, 24 Jan 2018 10:34:38 +0100
> 
> Also, after the merge it occurred to me that I could have used rebase
> instead.  Would it be a better idea in the future?)

No, we don't recommend using rebase between public branches,
especially if there were merges from master to the feature branch.





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2016-04-17 14:57     ` Eli Zaretskii
  2016-04-17 15:34       ` Marcin Borkowski
@ 2018-02-02  9:18       ` Michal Nazarewicz
  1 sibling, 0 replies; 45+ messages in thread
From: Michal Nazarewicz @ 2018-02-02  9:18 UTC (permalink / raw)
  To: Eli Zaretskii, Marcin Borkowski; +Cc: 20871

On Sun, Apr 17 2016, Eli Zaretskii wrote:
> Could you please elaborate on the bug itself?
>
> See, the function in question, fill-single-char-nobreak-p, is
> documented as a possible value to use in the fill hook, for a very
> specific purpose.  If you are saying that it doesn't fulfill that
> purpose well enough, please show a use case where it fails to do that.
> At least the situation you described, with " (a", doesn't seem to fit
> the use cases which this function is supposed to cover, since the
> parenthesis makes a 2-character sequence, whereas
> fill-single-char-nobreak-p aims to support isolated one-character
> words.

As the person who wrote ‘fill-single-char-nobreak-p’ I can say that its
intention was to work for Polish and Czech typography.  In other words,
what Marcin reported is a genuine defect in the function.

In particular, the function’s documentation mentions *one-letter* sequences,
not *one-character* sequences:

	Return non-nil if a one-letter word is before point.

(Admittedly, the name of the function is misleading.)

As such, I would suggest applying the fix to
‘fill-single-char-nobreak-p’ rather than introducing a new function.

(And perhaps adding an alias and deprecating the old name, if the misleading
name is too big a problem.)

As another point of context, ideally, ‘tildify-mode’ and
‘fill-single-char-nobreak-p’ would use the exact same logic since they
both work to address the same underlying typographic conventions.
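
For illustration, a minimal sketch of a predicate that accepts the " (a"
case from the original report, using the regexp proposed there.  The name
‘my-fill-one-letter-nobreak-p’ is hypothetical, and this is not necessarily
the code that is (or should be) in fill.el:

```elisp
;; Hypothetical sketch, not the code in fill.el.
(defun my-fill-one-letter-nobreak-p ()
  "Return non-nil if a one-letter word is before point.
The letter may be preceded by opening parens or brackets."
  (save-excursion
    ;; Step back over the whitespace at the candidate break position.
    (skip-chars-backward " \t")
    ;; Whitespace, any opening parens/brackets, then exactly one letter
    ;; ending at point.
    (looking-back "[[:space:]][([]*[[:alpha:]]"
                  (line-beginning-position))))
```

Like the existing predicate, it requires whitespace before the word, so
a one-letter word at the very beginning of a line is not matched.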

-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»

^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
       [not found]   ` <CA+pa1O3rcDznoR0u8i_AN5iHe9mM+FHqkAO=yVM2Gu5_Gc40jQ@mail.gmail.com>
@ 2019-08-17  6:57     ` Eli Zaretskii
  2019-08-17 14:17       ` Marcin Borkowski
  2019-08-19 14:07       ` Michał Nazarewicz
  0 siblings, 2 replies; 45+ messages in thread
From: Eli Zaretskii @ 2019-08-17  6:57 UTC (permalink / raw)
  To: Michał Nazarewicz; +Cc: mattiase, mbork, 20871

> From: Michał Nazarewicz <mina86@mina86.com>
> Date: Fri, 16 Aug 2019 16:30:55 +0100
> Cc: Mattias Engdegård <mattiase@acm.org>, mbork@mbork.pl, 
> 	20871@debbugs.gnu.org
> 
> On Fri, 16 Aug 2019 at 13:53, Eli Zaretskii <eliz@gnu.org> wrote:
> > Michal says that the original fill-single-char-nobreak-p function
> > was supposed to handle the Polish case as well.  If that's true,
> > perhaps we should deprecate fill-polish-nobreak-p.
> 
> I’m in favour of deprecating one of the functions, yes.

OK, but see below.

> > However, I do wonder how can a general function handle these cases,
> > since in general there's no problem in breaking a line after
> > single-letter words.
> 
> We could try and look at `tildify-mode’ for inspiration and especially
> ‘tildify-space-pattern’¹ which is:
> 
>     "[,:;(][ \t]*[a]\\|\\<[AIKOSUVWZikosuvwz]"

My problem is conceptual rather than practical.  Since in, e.g.,
English it is okay to break a line after single-letter words, whereas
in Polish it is not, I wonder how we can have a single function
satisfy both requirements.  We would need some option, and then we
would need to decide what is the trigger for changing the value of
that option -- it could be the user, or the language environment, or
maybe something else.

tildify.el explicitly says that its defaults are for a specific
language, so I don't think it solves the problem that bothers me, as
described above.  This is why I originally suggested a separate
function -- having that is equivalent to having an option which
determines a behavior that depends on the language.

I'm also okay with extending tildify.el to support more than just
Czech rules, but that's a separate issue.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2019-08-17  6:57     ` Eli Zaretskii
@ 2019-08-17 14:17       ` Marcin Borkowski
  2019-08-17 15:04         ` Eli Zaretskii
  2019-08-19 14:07       ` Michał Nazarewicz
  1 sibling, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2019-08-17 14:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: mattiase, 20871, Michał Nazarewicz

AFAIR, I was the submitter of this bug report a few years ago.  Thanks
for working on this.

On 2019-08-17, at 08:57, Eli Zaretskii <eliz@gnu.org> wrote:

> My problem is conceptual rather than practical.  Since in, e.g.,
> English it is okay to break a line after single-letter words, whereas
> in Polish it is not [...]

FWIW, I remember some English native speaker (from the UK, I guess)
saying that he likes the rule about not breaking the line after e.g. "a"
and would like English to have a similar one.

Best,

-- 
Marcin Borkowski
http://mbork.pl

^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2019-08-17 14:17       ` Marcin Borkowski
@ 2019-08-17 15:04         ` Eli Zaretskii
  2019-08-17 15:58           ` Marcin Borkowski
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2019-08-17 15:04 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: mattiase, 20871, mina86

> From: Marcin Borkowski <mbork@mbork.pl>
> Cc: Michał Nazarewicz <mina86@mina86.com>,
>  mattiase@acm.org, 20871@debbugs.gnu.org
> Date: Sat, 17 Aug 2019 16:17:04 +0200
> 
> > My problem is conceptual rather than practical.  Since in, e.g.,
> > English it is okay to break a line after single-letter words, whereas
> > in Polish it is not [...]
> 
> FWIW, I remember some English native speaker (from the UK, I guess)
> saying that he likes the rule about not breaking the line after e.g. "a"
> and would like English to have a similar one.

I could understand that, but someone's personal preferences are not
necessarily good for the rest of us.





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2019-08-17 15:04         ` Eli Zaretskii
@ 2019-08-17 15:58           ` Marcin Borkowski
  0 siblings, 0 replies; 45+ messages in thread
From: Marcin Borkowski @ 2019-08-17 15:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: mattiase, 20871, mina86


On 2019-08-17, at 17:04, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Marcin Borkowski <mbork@mbork.pl>
>> Cc: Michał Nazarewicz <mina86@mina86.com>,
>>  mattiase@acm.org, 20871@debbugs.gnu.org
>> Date: Sat, 17 Aug 2019 16:17:04 +0200
>>
>> > My problem is conceptual rather than practical.  Since in, e.g.,
>> > English it is okay to break a line after single-letter words, whereas
>> > in Polish it is not [...]
>>
>> FWIW, I remember some English native speaker (from the UK, I guess)
>> saying that he likes the rule about not breaking the line after e.g. "a"
>> and would like English to have a similar one.
>
> I could understand that, but someone's personal preferences are not
> necessarily good for the rest of us.

Agreed, this is not enough to actually implement anything, but I think
it's interesting to know.  (Actually, I never break a line after an "a"
when I write in English, and I think this is consistent with the general
rule that linebreaks should not come between things constituting some
kind of one entity - but this is just me.)

Best,

--
Marcin Borkowski
http://mbork.pl





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2019-08-17  6:57     ` Eli Zaretskii
  2019-08-17 14:17       ` Marcin Borkowski
@ 2019-08-19 14:07       ` Michał Nazarewicz
  2019-08-19 15:01         ` Eli Zaretskii
  1 sibling, 1 reply; 45+ messages in thread
From: Michał Nazarewicz @ 2019-08-19 14:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Mattias Engdegård, mbork, 20871

On Sat, 17 Aug 2019 at 07:57, Eli Zaretskii <eliz@gnu.org> wrote:
> My problem is conceptual rather than practical.  Since in, e.g.,
> English it is okay to break a line after single-letter words, whereas
> in Polish it is not, I wonder how we can have a single function
> satisfy both requirements.  We would need some option, and then we
> would need to decide what is the trigger for changing the value of
> that option -- it could be the user, or the language environment, or
> maybe something else.

This sounds related to what we’ve discussed when I was working on the
Unicode casing rules.  There, too, some rules need to be enabled only
if the text is in a certain language.  Having a buffer-local variable or
a text property defining the language of the buffer sounds to me like the
only possible solution if we want this fully automatic.

> tildify.el explicitly says that its defaults are for a specific
> language, so I don't think it solves the problem that bothers me, as
> described above.  This is why I originally suggested a separate
> function -- having that is equivalent to having an option which
> determines a behavior that depends on the language.

To be honest, I don’t understand the issue.  If having separate
functions is equivalent to having an option, then users already have an
option: either add the function to ‘fill-nobreak-predicate’ or not.
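
As a rough sketch of that opt-in (the helper name
‘my-enable-one-letter-nobreak’ is made up for this example; the hook and
the predicate are the existing ones named above):

```elisp
;; Hypothetical helper; only the hook variable and the predicate are
;; existing names from fill.el.
(defun my-enable-one-letter-nobreak ()
  "Keep one-letter words off line ends when filling in this buffer."
  ;; The final t makes the addition buffer-local, so buffers holding
  ;; text in other languages keep their default filling behaviour.
  (add-hook 'fill-nobreak-predicate #'fill-single-char-nobreak-p nil t))

;; Opt in only where the convention applies, for instance from a mode
;; hook used for Polish or Czech prose:
;; (add-hook 'text-mode-hook #'my-enable-one-letter-nobreak)
```

Users who do not want the rule simply never call the helper.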

As discussed previously, ‘fill-single-char-nobreak-p’ and
‘fill-polish-nobreak-p’ serve pretty much the same purpose.  When
I wrote the former I had Polish typography in mind and obviously the
latter is meant to handle the same case.  As such, having those two
functions doesn’t give the user much of a choice.

> I'm also okay with extending tildify.el to support more than just
> Czech rules, but that's a separate issue.

The differences between Czech and Polish can largely be ignored.  The
regex in tildify.el will work for Polish just fine.  The character
group simply lists all existing one-letter Czech words, which is
a superset of the one-letter Polish words.

At the same time, those two languages are, to my knowledge, the only
ones which observe this typographic rule, so there’s no need to add
support for more.

‘fill-single-char-nobreak-p’ matches any single-letter word just for
the sake of simplicity plus it then works with other languages, which
may have different one-letter words, if someone wishes to use it with
them.

-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»

^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2019-08-19 14:07       ` Michał Nazarewicz
@ 2019-08-19 15:01         ` Eli Zaretskii
  2019-08-19 15:36           ` Michał Nazarewicz
  0 siblings, 1 reply; 45+ messages in thread
From: Eli Zaretskii @ 2019-08-19 15:01 UTC (permalink / raw)
  To: Michał Nazarewicz; +Cc: mattiase, mbork, 20871

> From: Michał Nazarewicz <mina86@mina86.com>
> Date: Mon, 19 Aug 2019 15:07:56 +0100
> Cc: Mattias Engdegård <mattiase@acm.org>, mbork@mbork.pl, 
> 	20871@debbugs.gnu.org
> 
> > tildify.el explicitly says that its defaults are for a specific
> > language, so I don't think it solves the problem that bothers me, as
> > described above.  This is why I originally suggested a separate
> > function -- having that is equivalent to having an option which
> > determines a behavior that depends on the language.
> 
> To be honest, I don’t understand the issue.  If having separate
> functions is equivalent to having an option, then users already have an
> option: either add the function to ‘fill-nobreak-predicate’ or not.

Yes, of course.  I wrote the above in response to your suggestion to
leave just one function.

> As discussed previously, ‘fill-single-char-nobreak-p’ and
> ‘fill-polish-nobreak-p’ serve pretty much the same purpose.  When
> I wrote the former I had Polish typography in mind and obviously the
> latter is meant to handle the same case.  As such, having those two
> functions doesn’t give the user much of a choice.

If both functions attempt to produce the same behavior, then yes, we
need only one.  But then wouldn't we need a second one, to produce the
behavior expected, say, in US English?

> 
> > I'm also okay with extending tildify.el to support more than just
> > Czech rules, but that's a separate issue.
> 
> The differences between Czech and Polish can largely be ignored.

I didn't mean Polish, I meant in general languages where the
conventions are different.  Surely, there are some, and tildify
explicitly assumes that.





^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2019-08-19 15:01         ` Eli Zaretskii
@ 2019-08-19 15:36           ` Michał Nazarewicz
  2019-08-19 16:16             ` Eli Zaretskii
  0 siblings, 1 reply; 45+ messages in thread
From: Michał Nazarewicz @ 2019-08-19 15:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Mattias Engdegård, mbork, 20871

> > From: Michał Nazarewicz <mina86@mina86.com>
> > Date: Mon, 19 Aug 2019 15:07:56 +0100

> > As discussed previously, ‘fill-single-char-nobreak-p’ and
> > ‘fill-polish-nobreak-p’ serve pretty much the same purpose.  When
> > I wrote the former I had Polish typography in mind and obviously the
> > latter is meant to handle the same case.  As such, having those two
> > functions doesn’t give the user much of a choice.

On Mon, 19 Aug 2019 at 16:01, Eli Zaretskii <eliz@gnu.org> wrote:
> If both functions attempt to produce the same behavior, then yes, we
> need only one.  But then wouldn't we need a second one, to produce the
> behavior expected, say, in US English?

The expected behaviour for US English is achieved by not using the
function at all.

Unless you mean that someone wants to follow the rule in English even
though it’s not an established thing in that language.  In that case
they can just use the existing functions and they will work for
English.

In that case I could see a potential reason to have multiple
functions:
- ‘fill-polish-nobreak-p’ – don’t break after a, e, i, o, u, w or z;
- ‘fill-czech-nobreak-p’ – don’t break after a, i, k, o, s, u, v or z;
  and
- ‘fill-single-char-nobreak-p’ – don’t break after any single letter
  word.
This can also be achieved by a single function and a variable listing
all the characters.
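
A minimal sketch of that single-function variant follows.  Both
‘my-fill-nobreak-letters’ and ‘my-fill-listed-letter-nobreak-p’ are
hypothetical names, the default value is the Polish list given above, and
the variable is assumed to hold plain letters only:

```elisp
;; Hypothetical names; nothing below exists in fill.el.
(defvar my-fill-nobreak-letters "aeiouwz"
  "One-letter words after which a line must not be broken.
Assumed to be a non-empty string of plain letters (no \"]\", \"^\" or \"-\").")

(defun my-fill-listed-letter-nobreak-p ()
  "Return non-nil if a word listed in `my-fill-nobreak-letters' is before point."
  (save-excursion
    ;; Step back over the whitespace at the candidate break position.
    (skip-chars-backward " \t")
    ;; Ignore case so that a sentence-initial capital also counts.
    (let ((case-fold-search t))
      ;; A listed letter ending at point, preceded by a non-letter or
      ;; by the beginning of the line.
      (looking-back (concat "\\(?:^\\|[^[:alpha:]]\\)["
                            my-fill-nobreak-letters "]")
                    (line-beginning-position)))))
```

A Czech setup would then only need
(setq-local my-fill-nobreak-letters "aikosuvz") in the relevant buffers.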

Note also that there is a different rule, applying to all languages,
which deals with breaking a line between a number and a unit,
e.g. ‘60 s’, ‘100 m’ etc.  I’m not sure how this fits with the
current discussion since neither tildify nor the *-nobreak-p functions
deal with that case.

> > > I'm also okay with extending tildify.el to support more than just
> > > Czech rules, but that's a separate issue.
> >
> > The differences between Czech and Polish can largely be ignored.
>
> I didn't mean Polish, I meant in general languages where the
> conventions are different.  Surely, there are some, and tildify
> explicitly assumes that.

I don’t think there are.  It is possible that I’m incorrect but all
the materials I’ve found talked about Polish and Czech only.  Polish
Wikipedia entry¹ explicitly states that this rule is only for those two
languages.

¹ https://pl.wikipedia.org/wiki/Sierotka_(typografia) (note that
  ‘sierotka’ literally translates to ‘orphan’ but is a different thing
  than ‘orphan’ in English typography).

--
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»

^ permalink raw reply	[flat|nested] 45+ messages in thread

* bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
  2019-08-19 15:36           ` Michał Nazarewicz
@ 2019-08-19 16:16             ` Eli Zaretskii
  0 siblings, 0 replies; 45+ messages in thread
From: Eli Zaretskii @ 2019-08-19 16:16 UTC (permalink / raw)
  To: Michał Nazarewicz; +Cc: mattiase, mbork, 20871

> From: Michał Nazarewicz <mina86@mina86.com>
> Date: Mon, 19 Aug 2019 16:36:15 +0100
> Cc: Mattias Engdegård <mattiase@acm.org>, mbork@mbork.pl, 
> 	20871@debbugs.gnu.org
> 
> > If both functions attempt to produce the same behavior, then yes, we
> > need only one.  But then wouldn't we need a second one, to produce the
> > behavior expected, say, in US English?
> 
> The expected behaviour for US English is achieved by not using the
> function at all.

If that's true (and I'm not an expert to say it is), then all we need
is a suitable change to the doc string of fill-nobreak-predicate,
since currently it says nothing about what should be done for English.

> In that case I could see a potential reason to have multiple
> functions:
> - ‘fill-polish-nobreak-p’ – don’t break after a, e, i, o, u, w or z;
> - ‘fill-czech-nobreak-p’ – don’t break after a, i, k, o, s, u, v or z;
>   and
> - ‘fill-single-char-nobreak-p’ – don’t break after any single letter
>   word.

Don't forget fill-french-nobreak-p.





^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2019-08-19 16:16 UTC | newest]

Thread overview: 45+ messages
2015-06-22 10:19 bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren Marcin Borkowski
2015-06-22 10:28 ` Marcin Borkowski
2016-04-17  6:34   ` Marcin Borkowski
2016-04-17 14:57     ` Eli Zaretskii
2016-04-17 15:34       ` Marcin Borkowski
2016-04-17 16:49         ` Eli Zaretskii
2016-04-17 17:41           ` Marcin Borkowski
2016-04-27  7:02           ` Marcin Borkowski
2016-04-27  7:20             ` Eli Zaretskii
2016-04-29 12:18               ` Marcin Borkowski
2016-04-30 11:21                 ` Eli Zaretskii
2016-04-30 12:26                   ` Marcin Borkowski
2016-04-30 12:38                     ` Eli Zaretskii
2016-04-30 16:41                       ` Marcin Borkowski
2016-04-30 17:01                         ` Eli Zaretskii
2016-04-30 18:42                           ` Marcin Borkowski
2016-04-30 19:01                             ` Eli Zaretskii
2017-12-07 13:28                               ` Marcin Borkowski
2017-12-09 11:52                                 ` Eli Zaretskii
2017-12-09 15:45                                   ` Marcin Borkowski
2017-12-19 11:44                                     ` Marcin Borkowski
2017-12-19 16:15                                       ` Eli Zaretskii
2018-01-02  8:55                                         ` Marcin Borkowski
2018-01-13  8:46                                           ` Eli Zaretskii
2018-01-13 16:01                                             ` Marcin Borkowski
2018-01-13 16:53                                               ` Eli Zaretskii
2018-01-13 17:02                                                 ` Eli Zaretskii
2018-01-15  5:13                                                   ` Marcin Borkowski
2018-01-15  5:13                                                 ` Marcin Borkowski
2018-01-15  5:30                                                   ` Marcin Borkowski
2018-01-15 13:07                                                     ` Eli Zaretskii
2018-01-24  9:34                                                       ` Marcin Borkowski
2018-01-24 19:16                                                         ` Eli Zaretskii
2016-04-30 17:42                         ` Drew Adams
2016-04-30 18:24                           ` Eli Zaretskii
2016-04-30 18:41                             ` Marcin Borkowski
2018-02-02  9:18       ` Michal Nazarewicz
     [not found] <9A9C6F59-CB27-42D1-911E-F027B443B9BE@acm.org>
     [not found] ` <8336i1p8zd.fsf@gnu.org>
     [not found]   ` <CA+pa1O3rcDznoR0u8i_AN5iHe9mM+FHqkAO=yVM2Gu5_Gc40jQ@mail.gmail.com>
2019-08-17  6:57     ` Eli Zaretskii
2019-08-17 14:17       ` Marcin Borkowski
2019-08-17 15:04         ` Eli Zaretskii
2019-08-17 15:58           ` Marcin Borkowski
2019-08-19 14:07       ` Michał Nazarewicz
2019-08-19 15:01         ` Eli Zaretskii
2019-08-19 15:36           ` Michał Nazarewicz
2019-08-19 16:16             ` Eli Zaretskii
