From: ken
Newsgroups: gmane.emacs.help
Subject: Finding end of sentence [was Re: Understanding ... Sentence Boundaries]
Date: Wed, 12 Dec 2012 09:32:31 -0500
Message-ID: <50C8957F.6060103@mousecar.com>
To: GNU Emacs List

On 12/12/2012 02:02 AM Eric Abrahamsen wrote:
> ken writes:
>
>> On 12/11/2012 07:03 AM Eric Abrahamsen wrote:
>>> ken writes:
>>>
>>>> On 06/26/2010 11:05 PM Deniz Dogan wrote:
>>>>> 2010/6/27 ken:
>>>>>>
>>>>>> On 06/26/2010 06:53 AM Paul Drummond wrote:
>>>>>>> Thanks for the responses guys.
>>>>>>>
>>>>>>> ....
>>>>>>>
>>>>>> Is it possible to specify word boundaries for a particular mode?
>>>>>>
>>>>>
>>>>> Yes, it's part of the syntax table. See e.g. `modify-syntax-entry'.
>>>>
>>>> Thanks for the pointer to that function.
>>>>
>>>> The behavior I see in need of repair is the role of so-called "comments"
>>>> in sentence syntax. For instance, immediately before this
>>>> sentence are two spaces...
>>>> which should signify the end of the
>>>> previous sentence. But functions like "forward-sentence" and
>>>> "fill-paragraph" and "backward-sentence" don't recognize it.
>>>>
>>>> Said another way, the "" string obscures the relationship
>>>> between the period before it and the two spaces after it and so fails
>>>> to see that one sentence ends and another starts. This occurs in
>>>> text-mode and seems to be inherited by other modes.
>>>>
>>>> If I'm reading "modify-syntax-entry" correctly, the default meanings
>>>> of '<' and '>' are, respectively, beginning and end of comment, so
>>>> modifying them wouldn't fix this problem. Or can this be remedied by
>>>> a change in the syntax table? Or is this a bug?
>>>
>>> For this particular case, I think you can modify the value of the
>>> `sentence-end' variable (which is returned by the `sentence-end'
>>> function? The whole thing is a little confusing). You'd probably be best
>>> off starting with the docstring for the sentence-end function, and
>>> working back from there.
>>>
>>> I think the `sentence-end' variable is automatically buffer-local, which
>>> means if you change it in a mode-hook it ought to work the way you want.
>>> I agree that the whole syntax thing feels like a very well-polished
>>> hack.
>>>
>>> E
>>
>> Eric,
>>
>> Yes, that would be the variable to adjust. I took a hard look at it
>> and discussed it (I believe) on this list years ago, but never came up
>> with a fix. As I see it, there are two problems:
>>
>> First, "one" of the items in that RE would need to be "zero or more
>> consecutive instances of '<' followed by any number of other
>> characters up until the next '>' is found." E.g., the RE would need
>> to be able to find the end of this
>> sentence.)
>> Though I've used REs
>> successfully in quite a few instances and so with a small bit of help
>> could probably figure that part out, there's a second issue.
>> [In my original post the paragraph below was unclear. So changed it.]
>> My considered opinion is that in the above and similar examples, the
>> end of the sentence is immediately after the period ('.')... or
>> question mark, exclamation mark, etc. and not after the. That
>> is where the point should go when forward-sentence is executed. This
>> means that no RE would work because, once it finds the RE-defined
>> sentence-end, it then needs to go backwards within the found string
>> until it encounters [.!?]+ and then move the mark one char forward to the
>> character after. IOW, unless I'm missing some capability of REs,
>> "sentence-end" needs to be a function rather than an RE and would be a
>> different function than one which finds the beginning of a sentence.
>
> I'm getting way out of my depth here, both regarding regexps and emacs'
> sentence-related shenanigans, but you could consider advising the
> `sentence-end' function so that it checks the current major mode, and
> delegates to a different sentence-end function depending on the mode (or
> declines to handle and bails to the built-in sentence-end).
>
> The individual mode-specific sentence-end functions look at the text
> after point, and return a different regexp every time, one specifically
> tailored to this particular sentence in this particular mode. The call to
> `forward-sentence' or whatever happily uses a different regexp every
> time it is called.
>
> Feels hacky, but I guess `sentence-end' is already doing this in a
> sense -- potentially returning a different regexp every time.
>
> My brain is exhausted!
>
> E

If one were to write a mode-specific replacement for the existing
"forward-sentence" and "sentence-end", what are some ways in elisp to
ensure that they're invoked when working in that mode? Would it be
enough to include (the recoded) "forward-sentence" and "sentence-end" in
the code for that mode...? Or would some kind of specific hook language
need to be included in ~/.emacs?
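
To make the question concrete, here is an untested sketch of one arrangement that might work: keep the replacement command under its own name and remap `forward-sentence' to it from a mode hook. The names `my-sentence-end-with-tags' and `my-forward-sentence' are made up for illustration. text-mode-hook is used only because text-mode is where the problem shows up, and the regexp is merely an assumption about what a tag looks like. The sketch also suggests that a capture group around the punctuation, followed by (goto-char (match-end 1)), could put point right after the period without searching backwards.

```elisp
;; Hypothetical names throughout; a sketch, not a drop-in replacement
;; for the built-in `forward-sentence'.

(defvar my-sentence-end-with-tags
  "\\([.?!]+\\)\\(?:<[^>\n]*>\\)*\\(?:  \\|\n\\|\\'\\)"
  "Assumed sentence-end regexp.
Group 1 is the closing punctuation; after it come zero or more
<...> tags, then two spaces, a newline, or end of buffer.")

(defun my-forward-sentence ()
  "Move point to just after the punctuation ending the current sentence.
Tags between the punctuation and the following whitespace are
matched but left after point.  With no match, go to end of buffer."
  (interactive)
  (if (re-search-forward my-sentence-end-with-tags nil t)
      (goto-char (match-end 1))
    (goto-char (point-max))))

;; Remap the standard command in the mode's local keymap, so that the
;; key which normally runs `forward-sentence' calls the replacement.
(add-hook 'text-mode-hook
          (lambda ()
            (local-set-key [remap forward-sentence] 'my-forward-sentence)))
```

If the replacement lived in the mode's own source file, the same remapping could presumably go in that mode's keymap instead of a hook in ~/.emacs. Either way the remap only affects the key binding: Lisp code that calls `forward-sentence' directly (`kill-sentence', for one) would still get the built-in behavior.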