From: Visuwesh <visuweshm@gmail.com>
To: "André A. Gomes" <andremegafone@gmail.com>
Cc: Lars Ingebrigtsen <larsi@gnus.org>,
	56844@debbugs.gnu.org, Juri Linkov <juri@linkov.net>
Subject: bug#56844: [PATCH] Refactor repunctuate-sentences to accommodate corner case.
Date: Tue, 02 Aug 2022 18:18:18 +0530
Message-ID: <87czdi3j71.fsf@gmail.com>
In-Reply-To: <8735eezx8x.fsf@gmail.com> ("André A. Gomes"'s message of "Tue, 02 Aug 2022 14:43:42 +0300")

[Tuesday, August 02, 2022] André A. Gomes wrote:

> Juri Linkov <juri@linkov.net> writes:
>
>>>> It now gracefully handles the case when abbreviations such as e.g. or
>>>> i.e. are used in sentences.
>>>
>>> [...]
>>>
>>>> +        (regexp "\\([]\"')]?\\)\\([.?!]\\)\\([]\"')]?\\) +\\([\"')[:upper:]]\\)")
>>>
>>> I'm not quite sure I understand this patch.  Are you changing this to
>>> only consider punctuation that's followed by an upper-case character to
>>> be sentence-end punctuation?
>>
>> It would be better to add such heuristics to repunctuate-sentences-filter,
>> so anyone could customize it.
>
> In general I'd agree with you, but this patch is actually fixing a bug,
> not introducing a personal preference.  That's how I see it at least.

This breaks repunctuate-sentences for languages whose scripts have no
distinction between upper-case and lower-case characters.  Try
repunctuate-sentences with and without your patch on the following text:

தொழிற்சாலை யந்திரங்கள் தேவையான மட்டும் அந்தத் தொழிலாளர்களது சக்தியை உறிஞ்சித்
தீர்த்துவிடுவதோடு அந்த நாள் விழுங்கப்பட்டுவிடும். எந்தவிதமான எச்சமிச்சங்களும் இல்லாமல்
அன்றையப் பொழுது அழிந்து கழியும்; மனிதனும் தனது சவக்குழியை நோக்கி ஓரடி
முன்னேறிவிடுவான். ஆனால் இப்போதோ ஒய்வின்
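
To make the failure concrete, here is a minimal sketch that checks the
regexp quoted above against a cased and a caseless sample.  The helper
name `my/sentence-end-match' is hypothetical and not part of Emacs, and
binding `case-fold-search' to nil is my assumption so that [:upper:]
is tested strictly; the command itself may search differently.

```elisp
;; Sketch only: `my/sentence-end-match' is a hypothetical helper,
;; not something provided by Emacs or by the patch.
(defun my/sentence-end-match (text)
  "Return the index where the patched regexp matches in TEXT, or nil."
  (let ((case-fold-search nil)          ; assumption: keep [:upper:] strict
        ;; Regexp copied verbatim from the quoted patch.
        (regexp "\\([]\"')]?\\)\\([.?!]\\)\\([]\"')]?\\) +\\([\"')[:upper:]]\\)"))
    (string-match-p regexp text)))

;; English: the period is followed by a capital letter, so this is non-nil.
(my/sentence-end-match "It ended. Then it began.")

;; Tamil has no letter case, so [:upper:] cannot match the character
;; after the period and this returns nil.
(my/sentence-end-match "விழுங்கப்பட்டுவிடும். எந்தவிதமான")
```

With the patched regexp, the sentence boundary in the Tamil sample is
never found, so the spacing after that period would be left untouched.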


