unofficial mirror of help-gnu-emacs@gnu.org
* FW: bug#32758: 26.1 emacs-mac 7.2; forward-sentence in eww
@ 2018-10-10 10:17 Van L
  2018-10-10 11:28 ` Yuri Khan
  2018-10-10 15:17 ` FW: " Stefan Monnier
  0 siblings, 2 replies; 5+ messages in thread
From: Van L @ 2018-10-10 10:17 UTC (permalink / raw)
  To: Emacs


Hello,

I am looking for a sentence-ending rule that lets M-e jump by sentences when punctuation is followed by a single space.

Setting the following variable to nil makes M-e stop short at Nov. or Gov. (see below):

┌────
│ (setq sentence-end-double-space nil)
└────

┌────
│ Haley’s departure also stoked speculation she could replace Lindsey Graham as the senator from
│ South Carolina, a possibility that Trump played down. Talk in Washington is that should Trump
│ replace Attorney General Jeff Sessions with Graham after the Nov. 6 congressional elections,
│ South Carolina Gov. Henry McMaster would be responsible for selecting a replacement to serve
│ until the 2020 election. McMaster was previously Haley’s No. 2 in the state.
└────

and I’m aware of names like A. B. C. Nurmagomedov which will stop early, too.

What rules would identify, almost perfectly, the end of a sentence that is followed by a single space, so that they could be fitted into the sentence-end function?

Is WordNet useful for this?

https://en.wikipedia.org/wiki/Wordnet

┌────
│ (defun sentence-end ()
│   "Return the regexp describing the end of a sentence.
│ 
│ This function returns either the value of the variable `sentence-end'
│ if it is non-nil, or the default value constructed from the
│ variables `sentence-end-base', `sentence-end-double-space',
│ `sentence-end-without-period' and `sentence-end-without-space'.
│ 
│ The default value specifies that in order to be recognized as the
│ end of a sentence, the ending period, question mark, or exclamation point
│ must be followed by two spaces, with perhaps some closing delimiters
│ in between.  See Info node `(elisp)Standard Regexps'."
│   (or sentence-end
│       ;; We accept non-break space along with space.
│       (concat (if sentence-end-without-period "\\w[ \u00a0][ \u00a0]\\|")
│               "\\("
│               sentence-end-base
│               (if sentence-end-double-space
│                   "\\($\\|[ \u00a0]$\\|\t\\|[ \u00a0][ \u00a0]\\)" "\\($\\|[\t \u00a0]\\)")
│               "\\|[" sentence-end-without-space "]+"
│               "\\)"
│               "[ \u00a0\t\n]*")))
└────





* Re: FW: bug#32758: 26.1 emacs-mac 7.2; forward-sentence in eww
  2018-10-10 10:17 FW: bug#32758: 26.1 emacs-mac 7.2; forward-sentence in eww Van L
@ 2018-10-10 11:28 ` Yuri Khan
  2018-10-10 14:19   ` Van L
  2018-10-10 15:17 ` FW: " Stefan Monnier
  1 sibling, 1 reply; 5+ messages in thread
From: Yuri Khan @ 2018-10-10 11:28 UTC (permalink / raw)
  To: van; +Cc: help-gnu-emacs

On Wed, Oct 10, 2018 at 5:17 PM Van L <van@scratch.space> wrote:

> I am looking for a one-space after punctuation sentence-ending for M-e to jump by.
>
> Disabling the following variable stops short at Nov. or Gov. (see below)
>
> ┌────
> │ (setq sentence-end-double-space nil)
> └────
>
> ┌────
> │ Haley’s departure also stoked speculation she could replace Lindsey Graham as the senator from
> │ South Carolina, a possibility that Trump played down. Talk in Washington is that should Trump
> │ replace Attorney General Jeff Sessions with Graham after the Nov. 6 congressional elections,
> │ South Carolina Gov. Henry McMaster would be responsible for selecting a replacement to serve
> │ until the 2020 election. McMaster was previously Haley’s No. 2 in the state.
> └────
>
> and I’m aware of names like A. B. C. Nurmagomedov which will stop early, too.

It might be a good idea to treat the sequence “period followed by a
single non-breaking space” as not ending a sentence. This, coupled
with the proper use of non-breaking spaces with abbreviations and
initials, will go a long way in solving the above false positive in
sentence end detection.
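
One way this NBSP rule might be sketched in Elisp is shown below. It assumes the text really does use U+00A0 after abbreviations and initials; `my-sentence-end-nbsp-aware' and `my-forward-sentence-nbsp-aware' are hypothetical names, not part of Emacs. The regexp reuses the standard `sentence-end-base' but accepts only end of line, a plain space, or a tab right after the punctuation, so a period followed by a no-break space is not treated as a sentence end.

```elisp
;; Sketch only: the two `my-...' names are hypothetical, not Emacs API.
(defvar my-sentence-end-nbsp-aware
  (concat sentence-end-base
          ;; Right after the punctuation allow end of line, a plain
          ;; space, or a tab, but NOT a no-break space (U+00A0).
          "\\(?:$\\|[ \t]\\)"
          ;; Then swallow any further whitespace, as the stock regexp does.
          "[ \u00a0\t\n]*")
  "Single-space sentence-end regexp that rejects a period followed by NBSP.")

(defun my-forward-sentence-nbsp-aware (&optional arg)
  "Like `forward-sentence', but do not stop at a period followed by NBSP."
  (interactive "p")
  ;; The function `sentence-end' returns the variable `sentence-end'
  ;; when it is non-nil, so binding the variable is enough here.
  (let ((sentence-end my-sentence-end-nbsp-aware))
    (forward-sentence (or arg 1))))
```

This does nothing for text where the abbreviation is followed by an ordinary space, which is the case in the quoted passage unless the source page supplies the NBSPs.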

> ┌────
> │ 178  (defun sentence-end ()
> │ 191        ;; We accept non-break space along with space.
> │ 199         "[ \u00a0\t\n]*")))
> └────




* Re: bug#32758: 26.1 emacs-mac 7.2; forward-sentence in eww
  2018-10-10 11:28 ` Yuri Khan
@ 2018-10-10 14:19   ` Van L
  0 siblings, 0 replies; 5+ messages in thread
From: Van L @ 2018-10-10 14:19 UTC (permalink / raw)
  To: help-gnu-emacs


>> ┌────
>> │ Haley’s departure also stoked speculation she could replace Lindsey Graham as the senator from
>> │ South Carolina, a possibility that Trump played down. Talk in Washington is that should Trump
>> │ replace Attorney General Jeff Sessions with Graham after the Nov. 6 congressional elections,
>> │ South Carolina Gov. Henry McMaster would be responsible for selecting a replacement to serve
>> │ until the 2020 election. McMaster was previously Haley’s No. 2 in the state.
>> └────
>> 
>> and I’m aware of names like A. B. C. Nurmagomedov which will stop early, too.
> 
> It might be a good idea to treat the sequence “period followed by a
> single non-breaking space” as not ending a sentence. This, coupled
> with the proper use of non-breaking spaces with abbreviations and
> initials, will go a long way in solving the above false positive in
> sentence end detection.

The text passage is generated in eww-mode after pressing R, in case that sparks any ideas.



* Re: FW: bug#32758: 26.1 emacs-mac 7.2; forward-sentence in eww
  2018-10-10 10:17 FW: bug#32758: 26.1 emacs-mac 7.2; forward-sentence in eww Van L
  2018-10-10 11:28 ` Yuri Khan
@ 2018-10-10 15:17 ` Stefan Monnier
  2018-10-10 15:56   ` Yuri Khan
  1 sibling, 1 reply; 5+ messages in thread
From: Stefan Monnier @ 2018-10-10 15:17 UTC (permalink / raw)
  To: help-gnu-emacs

> and I’m aware of names like A. B. C. Nurmagomedov which will stop early, too.

As mentioned by Yuri, NBSP can help in this case.  Another heuristic is to
assume sentences don't end with a single-capital-letter word.

This said, there's also the occasional "Mr. Foo" or "Dr. Bar".
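
A rough sketch of these two heuristics combined could look like the following. It assumes `sentence-end' is left at its default of nil so that `sentence-end-double-space' takes effect; the abbreviation list and all `my-...' names are made up for illustration and are not part of Emacs. It simply keeps calling the stock `forward-sentence' while the candidate stop follows a single capital letter or a listed abbreviation.

```elisp
;; Sketch only: every `my-...' name below is hypothetical.
(defvar my-sentence-abbrevs '("Mr" "Dr" "Gov" "Nov" "No" "St" "Mt")
  "Abbreviations (without the period) assumed not to end a sentence.")

(defun my-false-sentence-end-p ()
  "Return non-nil if point follows an initial or a listed abbreviation."
  (let ((case-fold-search nil))        ; keep [A-Z] limited to capitals
    (looking-back
     (concat "\\b\\(?:[A-Z]\\|" (regexp-opt my-sentence-abbrevs) "\\)\\.")
     (line-beginning-position))))

(defun my-forward-sentence (&optional arg)
  "Move forward ARG sentences, skipping stops after initials/abbreviations."
  (interactive "p")
  (let ((sentence-end-double-space nil))
    (dotimes (_ (or arg 1))
      (let ((start (point)))
        (forward-sentence 1)
        ;; Keep going while we made progress, are not at the end of the
        ;; buffer, and the stop looks like an abbreviation or initial.
        (while (and (> (point) start)
                    (not (eobp))
                    (my-false-sentence-end-p))
          (setq start (point))
          (forward-sentence 1))))))
```

A sentence that genuinely ends in an initial or a listed abbreviation will be skipped over, so this trades one kind of error for another.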

I suggest you collect examples to add them to
test/lisp/textmodes/paragraphs-tests.el (and then write some Elisp code
that tries to handle them all correctly).
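
A collected example might be recorded as an ERT test along these lines. The test name and sample sentence are invented for illustration, and the test is marked as an expected failure because, as reported in this thread, the stock single-space setting stops after "Nov." rather than after "year.".

```elisp
(require 'ert)

;; Sketch only: the test name and sample text are hypothetical.
(ert-deftest my-forward-sentence-single-space-abbrev ()
  "Desired: skip \"Nov.\" and stop after \"year.\" with single spacing."
  ;; Stock `forward-sentence' is expected to fail this for now.
  :expected-result :failed
  (with-temp-buffer
    (insert "Elections are on Nov. 6 this year. Next sentence.")
    (goto-char (point-min))
    (let ((sentence-end-double-space nil))
      (forward-sentence 1))
    (should (looking-back "year\\." (point-min)))))
```

Once code exists that handles the case, the `:expected-result' line can be dropped so the test guards against regressions.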


        Stefan





* Re: FW: bug#32758: 26.1 emacs-mac 7.2; forward-sentence in eww
  2018-10-10 15:17 ` FW: " Stefan Monnier
@ 2018-10-10 15:56   ` Yuri Khan
  0 siblings, 0 replies; 5+ messages in thread
From: Yuri Khan @ 2018-10-10 15:56 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: help-gnu-emacs

On Wed, Oct 10, 2018 at 10:18 PM Stefan Monnier
<monnier@iro.umontreal.ca> wrote:

> > and I’m aware of names like A. B. C. Nurmagomedov which will stop early, too.
>
> As mentioned by Yuri NBSP can help this case.  Another heuristic is to
> assume sentences don't end with a single-capital-letter word.

They sometimes do; we need plan B.

> This said, there's also the occasional "Mr. Foo" or "Dr. Bar".

These are not exceptions to the NBSP rule. Neither are St. Patrick
and Mt. Fuji.



