all messages for Emacs-related lists mirrored at yhetil.org
From: ken <gebser@mousecar.com>
To: Eric Abrahamsen <eric@ericabrahamsen.net>,
	 GNU Emacs List <help-gnu-emacs@gnu.org>
Subject: Re: Understanding Word and Sentence Boundaries
Date: Tue, 11 Dec 2012 10:17:36 -0500
Message-ID: <50C74E90.9010704@mousecar.com>
In-Reply-To: <878v94n6c4.fsf@ericabrahamsen.net>

On 12/11/2012 07:03 AM Eric Abrahamsen wrote:
> ken<gebser@mousecar.com>  writes:
>
>> On 06/26/2010 11:05 PM Deniz Dogan wrote:
>>> 2010/6/27 ken<gebser@mousecar.com>:
>>>>
>>>> On 06/26/2010 06:53 AM Paul Drummond wrote:
>>>>> Thanks for the responses guys.
>>>>>
>>>>> ....
>>>>>
>>>>> I presume that Emacs hackers either a) put up with it or b) spend a lot
>>>>> of time fixing each case until they are happy.
>>>>>
>>>>> I suspect the answer is b. ;-)
>>>>>
>>>>> I wish there was a single minor-mode that fixes all the word boundary
>>>>> issues for every major-mode I use!  I can but dream.   Or maybe I will
>>>>> get round to doing it myself one day!  ;)
>>>>>
>>>>> Cheers,
>>>>> Paul Drummond
>>>>
>>>>
>>>> Is it possible to specify word boundaries for a particular mode?
>>>>
>>>
>>> Yes, it's part of the syntax table. See e.g. `modify-syntax-entry'.
>>
>> Thanks for the pointer to that function.
>>
>> The behavior I see in need of repair is the role of so-called "comments"
>> in sentence syntax.</tag>   For instance, immediately before this
>> sentence are two spaces... which should signify the end of the
>> previous sentence.  But functions like "forward-sentence" and
>> "fill-paragraph" and "backward-sentence" don't recognize it.
>>
>> Said another way, the "</tag>" string obscures the relationship
>> between the period before it and the two spaces after it and so fails
>> to see that one sentence ends and another starts.  This occurs in
>> text-mode and seems to be inherited by other modes.
>>
>> If I'm reading "modify-syntax-entry" correctly, the default meanings
>> of '<' and '>' are, respectively, beginning and end of comment, so
>> modifying them wouldn't fix this problem.  Or can this be remedied by
>> a change in the syntax table?  Or is this a bug?
>
> For this particular case, I think you can modify the value of the
> `sentence-end' variable (which is returned by the `sentence-end'
> function? The whole thing is a little confusing). You'd probably be best
> off starting with the docstring for the sentence-end function, and
> working back from there.
>
> I think the `sentence-end' variable is automatically buffer-local, which
> means if you change it in a mode-hook it ought to work the way you want.
> I agree that the whole syntax thing feels like a very well-polished
> hack.
>
> E

Eric,

Yes, that would be the variable to adjust.  I took a hard look at it and 
discussed it (I believe) on this list years ago, but never came up with 
a fix.  As I see it, there are two problems:

First, one element of that RE would need to match zero or more
consecutive tags, each being a '<' followed by any number of other
characters up to the next '>'.  E.g., the RE would need to be able to
find the end of this sentence</b></i>.)</q></p></span></div>  I've
used REs successfully in quite a few instances, so with a little help
I could probably figure that part out.  But there's a second issue.
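
A rough sketch of that first point, written in Python's regexp syntax
rather than Emacs's (so it would need translating before it could go
anywhere near `sentence-end').  The names TAG, CLOSER and SENTENCE_END
are made up for illustration, and the pattern is not the value Emacs
actually uses:

```python
import re

# Hypothetical building blocks, chosen for this illustration only.
TAG = r"<[^>]*>"            # '<', anything up to the next '>', then '>'
CLOSER = r"""[\])}"']"""    # closing brackets and quotes after the terminator

SENTENCE_END = re.compile(
    r"[.?!]+"                               # one or more terminators
    + r"(?:" + TAG + r"|" + CLOSER + r")*"  # any mix of tags and closers
    + r"(?:[ \t]{2,}|[ \t]*$)"              # two or more blanks, or end of text
)

text = "find the end of this sentence</b></i>.)</q></p></span></div>  Though I've"
match = SENTENCE_END.search(text)
print(repr(match.group(0)))   # the terminator, the trailing tags, the blanks
print(text[match.end():])     # the text where the next sentence starts
```

The match spans everything from the period through the two spaces, so
the tags no longer hide the sentence break.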

My considered opinion is that in the above and similar examples, the
end of the sentence is immediately after the period ('.'), question
mark, exclamation mark, etc., and not after the </div>.  That is where
point should go when forward-sentence is executed.  This means that no
RE alone would work because, once the search finds the RE-defined
sentence-end, it then needs to go backwards within the matched string
until it encounters [.!?]+ and then move forward again to the first
character after it.  IOW, unless I'm missing some capability of REs,
"sentence-end" needs to be a function rather than an RE, and it would
be a different function from the one that finds the beginning of a
sentence.
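
One RE capability that may be relevant to the second point is the
capturing group: the pattern can match the whole run (terminator,
trailing tags, blanks) while recording where the punctuation itself
ends.  The sketch below is Python, not Emacs Lisp.  SENTENCE_END and
end_of_sentence are hypothetical names, and whether forward-sentence
could be made to use a group position instead of the end of the match
is a separate question this does not answer:

```python
import re

TAG = r"<[^>]*>"   # hypothetical tag pattern, as an assumption

SENTENCE_END = re.compile(
    r"([.?!]+)"                      # group 1: the terminators themselves
    + r"(?:" + TAG + r"|[)\]])*"     # trailing tags and closing brackets
    + r"(?:[ \t]{2,}|[ \t]*$)"       # two or more blanks, or end of text
)

def end_of_sentence(text, start=0):
    """Return the index just after the terminator, or None if none is found."""
    m = SENTENCE_END.search(text, start)
    return m.end(1) if m else None

text = "one sentence.</div>  Next one"
pos = end_of_sentence(text)
print(repr(text[:pos]))   # everything up to and including the period
print(repr(text[pos:]))   # the tags and the following sentence
```

Here the position returned is the end of group 1, not the end of the
whole match, so it lands right after the period and before the </div>.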



