From: Karan Bathla
Newsgroups: gmane.emacs.help
Subject: Re: Understanding Word Boundaries
Date: Wed, 16 Jun 2010 13:07:01 -0700 (PDT)
Message-ID: <855647.31845.qm@web36205.mail.mud.yahoo.com>
To: help-gnu-emacs@gnu.org, Paul Drummond

I don't know about the word boundary behaviour in Vim, or elisp code for it, but the behaviour of backward-kill-word is simple: kill the last word, where a word is something alphanumeric. Any non-alphanumeric characters like : and ( are deleted automatically if they lie between point and the last word. There is no concept here of : or ( being word boundaries.

So if you do M-d on ":67a", the whole thing gets deleted, and in "67a:" the : remains (with point at the beginning of the string).

--- On Wed, 6/16/10, Paul Drummond wrote:

From: Paul Drummond
Subject: Understanding Word Boundaries
To: help-gnu-emacs@gnu.org
Date: Wednesday, June 16, 2010, 4:14 PM

I have been an Emacs user for a few years now, so definitely still a newbie! While initially I struggled to control its power, I eventually came round. Every issue I've had so far I've been able to fix by a quick search in EmacsWiki, except for one frustrating and recurring problem that has plagued me for years - word boundaries.

Before Emacs I used Vim exclusively and the word boundary behaviour in Vim *just worked* - I didn't even have to think about it. No matter what language I used I could navigate and manipulate words without thinking about it. The way word boundaries work in Vim is elegant and I have spent a lot of time trying to find some elisp to replicate the behaviour in Emacs, but to no avail.

I could write some elisp myself but I am still very new to it so it will take a while - it's something I would like to do but I don't have time at the moment. Regardless, an elisp solution to the problem is not the point of this post. I want to understand why word boundaries behave the way they do in Vanilla Emacs and I would greatly appreciate some views on this from some Emacs Gurus!

Every time I notice the word boundary behaviour when hacking in Emacs I wonder to myself - "I must be missing something here. Surely, experienced Emacs users don't just *put up* with this!" Yet every forum response, blog post and mailing-list post I have read suggests they do. This is atypical of the Emacs community in my experience. Usually when something behaves wrong in Emacs, it's easy to find some elisp that just fixes the problem, full stop. Yet with word boundaries all I can find is suggestions that fix a particular gripe but nothing that provides a general solution.

I have loads of examples but I will mention just a few here to hopefully kick-start further discussion.

** Example 1

I use org-mode for my journal and today I hit the word-boundary problem while entering my morning journal entry - here's a contrived example of what I entered:

** [10:27] Understanding Word Boundaries in Emacs
                        ^

With point at the end of the word "Understanding" I hit C-w (which I bind to backward-kill-word) and the word "Understanding" is killed as expected. But when I hit C-w again, it kills back to the colon. Why? Why is the colon a word boundary but the closing square bracket isn't?

** Example 2

When editing C++ files I often need to delete the "ClassName::" part when declaring functions in the header:

void ClassName::function();
     ^

With point at the start of ClassName I want to press M-d twice to delete ClassName and :: but "::" isn't recognised as a word. In Vim I just type "dw" twice and it *just works*.

** Example 3

I have loads of problems when deleting and navigating words over multiple lines. In the following C++ code for instance:

    Page *page = new _Page(this);
    page.load();
        ^

When point is after "page", before the dot on the second line, and I hit M-b (backward-word), point ends up at the first opening bracket of "Page(" !!!

Again, Vim does the right thing here - pressing 'b' takes the point to the closing bracket of Page(this) so it doesn't recognise the semi-colon as a bracket, which is intuitive and what I would expect. This is really the point I am trying to make. I have never taken the time to understand the behaviour of word boundaries in Vim because *it just works*. In Emacs I am forced to think about word boundaries because Emacs keeps surprising me with its weird behaviour!

Note: My examples happen to be C++ but I use lots of other languages too, including elisp, Clojure, JavaScript, Python and Java, and the word boundaries seem to be wrong for all of them.

I have tried several different elisp solutions but each one has at least one feature that isn't quite right. Here are some links I kept; I've tried many other solutions but don't have the links to hand:

http://stackoverflow.com/questions/2078855/about-the-forward-and-backward-a-word-behaviour-in-emacs

http://stackoverflow.com/questions/1771102/changing-emacs-forward-word-behaviour/1772365#1772365

So to wrap up, the point of this post is to kick-start a discussion about why the word boundaries in Vanilla Emacs (specifically GNU Emacs 23.1.50.1 in my case) seem to be so awkward and unintuitive.

Regards,

Paul Drummond

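Karan's description of M-d and backward-kill-word can be modelled in a few lines. What follows is only a rough Python sketch, not Emacs code: it assumes that letters and digits are the only word constituents (real Emacs consults the current mode's syntax table), and every function name in it is made up for illustration.

```python
def is_word_char(ch):
    # Assumption: only letters and digits count as word constituents,
    # as in Karan's description. Emacs itself uses the syntax table.
    return ch.isalnum()


def forward_word(text, pos):
    """Model of M-f: skip non-word characters, then skip one word."""
    n = len(text)
    while pos < n and not is_word_char(text[pos]):
        pos += 1
    while pos < n and is_word_char(text[pos]):
        pos += 1
    return pos


def backward_word(text, pos):
    """Model of M-b: skip non-word characters backwards, then one word."""
    while pos > 0 and not is_word_char(text[pos - 1]):
        pos -= 1
    while pos > 0 and is_word_char(text[pos - 1]):
        pos -= 1
    return pos


def kill_word(text, pos):
    """Model of M-d: delete from point to where M-f would land."""
    end = forward_word(text, pos)
    return text[:pos] + text[end:], pos


def backward_kill_word(text, pos):
    """Model of backward-kill-word: delete back to where M-b would land."""
    start = backward_word(text, pos)
    return text[:start] + text[pos:], start


# Karan's two cases
print(kill_word(":67a", 0))  # ('', 0)
print(kill_word("67a:", 0))  # (':', 0)

# Example 1: two presses with point just after "Understanding"
line = "** [10:27] Understanding Word Boundaries in Emacs"
point = line.index(" Word")
line, point = backward_kill_word(line, point)  # removes "Understanding"
line, point = backward_kill_word(line, point)  # removes "27] "
print(line)  # ** [10: Word Boundaries in Emacs
```

Under these assumptions the colon in Example 1 is not special: the second press skips the space and the ] as non-word characters, kills the word "27", and stops at the first non-word character before it. The same model applied to Example 2 removes "ClassName" on the first M-d and "::function" on the second, which seems to be the behaviour Paul is objecting to.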
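For comparison, the Vim behaviour Paul describes in Examples 2 and 3 can be sketched the same way. This is a simplified Python model under stated assumptions, not Vim's implementation: a "word" is taken to be either a run of letters, digits and underscores or a run of other non-blank characters (roughly what Vim's default 'iskeyword' gives), and the special cases Vim applies at line ends and on empty lines are left out. The function names are hypothetical.

```python
def char_class(ch):
    # 0 = blank, 1 = punctuation, 2 = keyword character.
    # Assumption: keyword characters are letters, digits and underscore.
    if ch.isspace():
        return 0
    if ch.isalnum() or ch == "_":
        return 2
    return 1


def vim_w(text, pos):
    """Model of 'w': move to the start of the next word."""
    n = len(text)
    if pos < n and char_class(text[pos]) != 0:
        cls = char_class(text[pos])
        # Leave the current run of same-class characters.
        while pos < n and char_class(text[pos]) == cls:
            pos += 1
    # Skip any blanks before the next word.
    while pos < n and char_class(text[pos]) == 0:
        pos += 1
    return pos


def vim_b(text, pos):
    """Model of 'b': move to the start of the previous word."""
    while pos > 0 and char_class(text[pos - 1]) == 0:
        pos -= 1
    if pos > 0:
        cls = char_class(text[pos - 1])
        while pos > 0 and char_class(text[pos - 1]) == cls:
            pos -= 1
    return pos


def vim_dw(text, pos):
    """Model of 'dw' within one line: delete up to the next word start."""
    end = vim_w(text, pos)
    return text[:pos] + text[end:], pos


# Example 2: "dw" twice with the cursor on the C of ClassName
decl = "void ClassName::function();"
decl, cur = vim_dw(decl, decl.index("ClassName"))
print(decl)  # void ::function();
decl, cur = vim_dw(decl, cur)
print(decl)  # void function();

# Example 3: 'b' from the start of "page" on the second line
code = "    Page *page = new _Page(this);\n    page.load();"
cur = vim_b(code, code.index("page.load"))
print(code[cur])  # )
```

The difference from the Emacs model is that punctuation runs are treated as words in their own right instead of being skipped, so "::" is deleted on its own and 'b' stops on the ");" at the end of the first line.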