unofficial mirror of emacs-devel@gnu.org 
From: Drew Adams <drew.adams@oracle.com>
To: Lars Ingebrigtsen <larsi@gnus.org>, emacs-devel <emacs-devel@gnu.org>
Subject: RE: Text property searching
Date: Mon, 16 Apr 2018 13:05:28 -0700 (PDT)
Message-ID: <78f73e87-367d-4ab9-abfe-a1a60cbc44eb@default>
In-Reply-To: <87muy3ypl8.fsf@mouse.gnus.org>

> Below is a draft of the documentation of this function.  Does it all
> make sense?  :-)

(I'm going only by your doc/description, not the code, which
I don't have and won't bother to try to access.)

What if someone doesn't want to gather strings but instead
wants the match-zone limits?

E.g., instead of returning buffer substrings for the matches,
return conses (beg . end).  This is (should be) mainly about
searching the _buffer_.  It is not (should not be) mainly
about gathering a list of matching strings (or a defstruct
holding such a list).

IOW, this sounds wrong, to me:

  This function is modelled after ‘search-forward’ and friends in
  that it moves point, but it returns a structure that describes the
  match instead of returning it in ‘match-beginning’ and friends.

And better than it returning (beg . end) conses is for it
to just provide access, on demand, to the matched text and
its positions using `match-data' - the usual Emacs approach.

IOW, better for it to _really_ be "modeled after
`search-forward'" - to find and return a buffer position.
(`search-forward' does not just "move point" - it returns
it.)

With `search-forward' the side effect of matching lets you
easily do various things with the `match-data' (always
only on demand).  Why return a structure here?  Why even
build a structure and put the relevant info into it?

Why not let the usual kind of `search-forward'-using code
work just as well with your minor variant: get whatever
info you want, on demand, from the `match-data'?
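
A minimal sketch of that usual idiom, for illustration only.  The
helper name `my-collect-match-zones' is hypothetical (it is not part
of Emacs or of the proposed function), and STRING is assumed to be
non-empty.  It collects (beg . end) conses by asking the match data
for positions only after each successful search:

```elisp
;; Sketch only: `my-collect-match-zones' is a hypothetical helper.
(defun my-collect-match-zones (string)
  "Return a list of (BEG . END) conses for occurrences of STRING.
Search from point to the end of the current buffer.
STRING is assumed to be non-empty."
  (let (zones)
    ;; `search-forward' returns the new value of point on success.
    ;; With NOERROR t it returns nil on failure, which ends the loop.
    (while (search-forward string nil t)
      ;; Positions are fetched on demand from the match data.
      (push (cons (match-beginning 0) (match-end 0)) zones))
    (nreverse zones)))
```

A caller that wants strings instead can use `match-string' (or
`buffer-substring' on the same positions) in place of the cons.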

The current design sounds a bit analogous to tossing out
`match-data' in favor of just `match-string'.  Except that
you even _return_ the strings, in a defstruct no less.

That might seem to be convenient for someone who always wants
the strings, but it sounds less useful generally.

Similarly, I'd think we would want all of the same optional
args and behavior as are provided by `search-forward':
limiting the search scope, raising or suppressing an error,
and repeating for a given count.  That's a proven and widely
used Emacs interface.
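
For reference, a small sketch of those three optional arguments
(BOUND, NOERROR, COUNT) as `search-forward' handles them.  The
function name and the buffer text are made up for the example:

```elisp
;; Sketch only: `my-search-forward-args-demo' is a hypothetical name.
(defun my-search-forward-args-demo ()
  "Return a list showing the effect of BOUND, COUNT, and NOERROR."
  (with-temp-buffer
    (insert "foo bar foo baz foo")
    (goto-char (point-min))
    (list
     ;; BOUND: do not search past position 10.  With NOERROR t,
     ;; a failed search returns nil instead of signaling.
     (save-excursion (search-forward "baz" 10 t))
     ;; COUNT: find the second occurrence and return the position
     ;; just after it.
     (save-excursion (search-forward "foo" nil t 2))
     ;; NOERROR nil (the default): failure signals `search-failed'.
     (condition-case nil
         (save-excursion (search-forward "qux"))
       (search-failed 'signaled)))))
```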

In sum, why isn't `search-forward' a proper model in all
respects?

>  -- Function: text-property-search-forward prop value predicate
>      Search for the next region that has text property PROP set to VALUE
>      according to PREDICATE.
> 
>      This function is modelled after ‘search-forward’ and friends in
>      that it moves point, but it returns a structure that describes the
>      match instead of returning it in ‘match-beginning’ and friends.
> 
>      If the text property can’t be found, the function returns ‘nil’.
>      If it’s found, point is placed at the end of the region that has
>      this text property match, and a ‘prop-match’ structure is returned.
> 
>      PREDICATE can either be ‘t’ (which is a synonym for ‘equal’), ‘nil’
>      (which means “not equal”), or a predicate that will be called with
>      two parameters: The first is VALUE, and the second is the value of
>      the text property we’re inspecting.
> 
>      In the examples below, imagine that you’re in a buffer that looks
>      like this:
> 
>           This is a bold and here's bolditalic and this is the end.
> 
>      That is, the “bold” words are the ‘bold’ face, and the “italic”
>      word is in the ‘italic’ face.
> 
>      With point at the start:
> 
>           (while (setq match (text-property-search-forward 'face 'bold t))
>             (push (buffer-substring (prop-match-beginning match)
>                                     (prop-match-end match))
>                   words))
> 
>      This will pick out all the words that use the ‘bold’ face.
> 
>           (while (setq match (text-property-search-forward 'face nil t))
>             (push (buffer-substring (prop-match-beginning match)
>                                     (prop-match-end match))
>                   words))
> 
>      This will pick out all the bits that have no face properties, which
>      will result in the list ‘("This is a " "and here's " "and this is
>      the end")’ (only reversed, since we used ‘push’).
> 
>           (while (setq match (text-property-search-forward 'face nil nil))
>             (push (buffer-substring (prop-match-beginning match)
>                                     (prop-match-end match))
>                   words))
> 
>      This will pick out all the regions where ‘face’ is set to
>      something, but this is split up into where the properties change,
>      so the result here will be ‘"bold" "bold" "italic"’.
> 
>      For a more realistic example where you might use this, consider
>      that you have a buffer where certain sections represent URLs, and
>      these are tagged with ‘shr-url’.
> 
>           (while (setq match (text-property-search-forward 'shr-url nil nil))
>             (push (prop-match-value match) urls))
> 
>      This will give you a list of all those URLs.
> 
> ---
> 
> Hm...  it strikes me now that the two last parameters should be
> optional, since (text-property-search-forward 'shr-url) would then be
> even more obvious in its meaning.
> 
> --
> (domestic pets only, the antidote for overdose, milk.)
>    bloggy blog: http://lars.ingebrigtsen.no
> 
> 


