From: Drew Adams <drew.adams@oracle.com>
To: emacs-devel@gnu.org
Subject: Proposed enhancement for `split-string'
Date: Mon, 14 Jul 2014 15:51:24 -0700 (PDT)
Message-ID: <7025b422-78b5-4b17-b199-70cbb1f6de93@default>
Function `split-string' currently has this signature, where SEPARATORS
is a regexp that defines (by matching) the separators used to split
the STRING:
(split-string STRING &optional SEPARATORS OMIT-NULLS TRIM)
The STRING parts returned are the non-matches for regexp SEPARATORS.
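For reference, a few calls that illustrate the current behavior described above. They use only the existing `split-string'; the input strings are made up for illustration.

```elisp
;; SEPARATORS is a regexp; the parts that do not match it are returned.
(split-string "a,b,,c" ",")          ; => ("a" "b" "" "c")

;; Non-nil OMIT-NULLS drops the empty string between the two commas.
(split-string "a,b,,c" "," t)        ; => ("a" "b" "c")

;; TRIM is a regexp removed from both ends of each returned part.
(split-string "a , b" "," t "[ ]+")  ; => ("a" "b")
```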
I have an enhancement of `split-string' to propose, which lets you
alternatively split the string based on a character predicate or a
text property, instead of based on matching a regexp.
Code: http://www.emacswiki.org/emacs-en/download/subr%2b.el
Description: http://www.emacswiki.org/emacs/SplittingStrings
I can submit the enhancement as a patch to subr.el, if there is
interest.
---
This would be the new (compatible) signature of `split-string':
  (split-string STRING &optional HOW OMIT-NULLS TRIM FLIP TEST)
                                 ^^^                 ^^^^ ^^^^
The second arg, HOW, can be a regexp, giving the same behavior as now.
Alternatively, HOW can be (1) a character predicate or (2) a doubleton
plist (PROPERTY VALUE), where PROPERTY is a text property and VALUE is
one of its possible values.
1. If HOW is a predicate then it must accept a character argument.
   Substrings whose chars satisfy the predicate are used as
   separators, so the return value is a list of substrings whose chars
   do *not* satisfy predicate HOW.

2. If HOW is (PROPERTY VALUE) then STRING is split into substrings
   whose chars do *not* have text property PROPERTY with value VALUE.

   If VALUE is nil then any non-nil value of PROPERTY matches; that
   is, only the presence of PROPERTY is tested. Characters that have
   PROPERTY belong to the separators, which are excluded.

   If VALUE is non-nil then a match occurs when the actual value of
   PROPERTY is `eq' to VALUE; that is, characters that have a PROPERTY
   of VALUE are those that are excluded.
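The two alternatives above can be sketched with existing primitives. This is only an illustration of the described semantics, not the code from subr+.el: the names `my-split-where', `my-split-by-char-predicate' and `my-split-by-property' are hypothetical, TEST and FLIP are left out, and null substrings are always kept (the message does not say how the proposed code treats them).

```elisp
;; Hypothetical helpers for illustration; this is not the code from subr+.el.

(defun my-split-where (string separator-p)
  "Split STRING at runs of positions for which SEPARATOR-P returns non-nil.
SEPARATOR-P is called with two arguments: STRING and a position in it.
Null substrings are kept."
  (let ((start 0) (i 0) (len (length string)) (parts '()))
    (while (< i len)
      (if (funcall separator-p string i)
          (progn
            ;; Close the kept substring that ends just before this run.
            (push (substring string start i) parts)
            ;; Skip the whole run of separator positions.
            (while (and (< i len) (funcall separator-p string i))
              (setq i (1+ i)))
            (setq start i))
        (setq i (1+ i))))
    ;; Whatever follows the last separator run is the final part.
    (push (substring string start) parts)
    (nreverse parts)))

(defun my-split-by-char-predicate (string pred)
  "Split STRING, using runs of chars that satisfy PRED as separators."
  (my-split-where string
                  (lambda (str pos) (funcall pred (aref str pos)))))

(defun my-split-by-property (string prop &optional value)
  "Split STRING, using runs of chars that have text property PROP as separators.
If VALUE is nil, any non-nil value of PROP matches.
Otherwise the value of PROP must be `eq' to VALUE."
  (my-split-where string
                  (lambda (str pos)
                    (let ((actual (get-text-property pos prop str)))
                      (if value (eq actual value) actual)))))

;; Digits act as separators.
(my-split-by-char-predicate "ab12cd3ef"
                            (lambda (c) (and (>= c ?0) (<= c ?9))))
;; => ("ab" "cd" "ef")

;; Chars that have a `face' property act as separators.
(my-split-by-property (concat "ab" (propertize "XY" 'face 'bold) "cd")
                      'face)
;; => ("ab" "cd")
```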
Non-nil optional arg TEST is a binary predicate that is applied to
each char in STRING and to VALUE. If it returns non-nil for a given
character occurrence then that occurrence is part of a substring that
is excluded from the result (i.e., the char is part of a separator).
IOW, there are 3 ways to define the separator strings for splitting:
regexp matching, char-predicate satisfying, and text-property
matching.
By providing non-nil TEST you can test, for example:
* Whether the actual value of text property `invisible' belongs to the
  current `buffer-invisibility-spec'.

* Whether a particular face is among the faces that are the value of
  property `face'.
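As a rough idea of what such TEST predicates could look like, here is a sketch. It ASSUMES that TEST is called with the actual property value at a character and then VALUE, in that order; the message does not spell out the calling convention, so check the code before relying on this. The name `my-face-among-p' is hypothetical.

```elisp
;; ASSUMPTION: TEST is called as (TEST ACTUAL VALUE), where ACTUAL is the
;; value of PROPERTY at a given character.  The message does not confirm this.

(defun my-face-among-p (actual face)
  "Return non-nil if FACE is ACTUAL or is a member of the list ACTUAL."
  (if (listp actual)
      (memq face actual)
    (eq face actual)))

;; For property `invisible', the built-in `invisible-p' already checks a
;; property value against the current `buffer-invisibility-spec'.  VALUE is
;; ignored here.  Note that `invisible-p' treats a number as a buffer
;; position, so this only suits non-numeric property values.
(defvar my-invisible-test
  (lambda (actual _value) (invisible-p actual)))

(my-face-among-p '(bold italic) 'italic)  ; non-nil
(my-face-among-p 'bold 'bold)             ; non-nil
(my-face-among-p nil 'bold)               ; nil
```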
Non-nil optional arg FLIP simply swaps the separators and the kept
substrings - regardless of HOW the separating is defined. The
substrings that would be returned if FLIP were nil are treated as the
separators, and the substrings that would be treated as separators if
FLIP were nil are returned as the result of splitting.
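For the regexp case, the effect of FLIP can be approximated today by collecting the matches instead of the non-matches. The sketch below does that with a hypothetical helper, `my-regexp-matches'; it is not the proposed implementation, and it simply skips empty matches, since the message does not say how FLIP treats null strings.

```elisp
;; Hypothetical helper for illustration; this is not the code from subr+.el.
(defun my-regexp-matches (string regexp)
  "Return the non-empty substrings of STRING that match REGEXP, in order."
  (let ((pos 0) (matches '()))
    (while (and (< pos (length string))
                (string-match regexp string pos))
      (if (= (match-beginning 0) (match-end 0))
          ;; Empty match: step past it so that the loop terminates.
          (setq pos (1+ (match-end 0)))
        (push (match-string 0 string) matches)
        (setq pos (match-end 0))))
    (nreverse matches)))

;; Without flipping: the parts between the separators.
(split-string "a, b;c" "[,; ]+")      ; => ("a" "b" "c")

;; Flipped: the separators themselves.
(my-regexp-matches "a, b;c" "[,; ]+") ; => (", " ";")
```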
The code I have also defines the following functions (in addition to a
few helper functions).
First, 3 specializations of `split-string', corresponding to the 3
kinds of HOW:
* `split-string-by-regexp' - `split-string' specialized for a regexp
  HOW. That is, split by separator regexp matching. This is the
  behavior of today's `split-string'.

* `split-string-by-property' - `split-string' specialized for a
  property-value HOW. That is, split by separator property-value
  matching.

* `split-string-by-predicate' - `split-string' specialized for a
  char-predicate HOW. That is, split by separator predicate
  satisfying.
Second, functions similar to `buffer-substring', which return the
region as a string, but which exclude or include only certain string
parts:
* `buffer-substring-of-propertied' - Return the parts that have a
  given PROPERTY.

* `buffer-substring-of-unpropertied' - Return the parts that do not
  have a given PROPERTY.

* `buffer-substring-of-visible' - Return the visible parts.

* `buffer-substring-of-invisible' - Return the invisible parts.

* `buffer-substring-of-faced' - Return the parts that have property
  `face'.

* `buffer-substring-of-unfaced' - Return the parts that do not have
  property `face'.
Example use case:
I use `buffer-substring-of-visible' in a function that I bind to
`filter-buffer-substring-function', to remove invisible text from the
region string (which I use as part of an indirect buffer name):
  (lambda (beg end _delete)             ; Remove invisible text.
    (let ((strg  (buffer-substring-of-visible beg end)))
      (set-text-properties 0 (length strg) () strg)
      strg))
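For readers without subr+.el at hand, the same effect can be approximated with existing primitives. The function below is a hypothetical stand-in, `my-visible-buffer-substring', not the proposed `buffer-substring-of-visible': it walks the changes of the `invisible' property, keeps the chunks that `invisible-p' reports as visible, and drops text properties as it goes, so it combines both steps of the lambda above.

```elisp
;; Hypothetical stand-in for illustration; this is not the code from subr+.el.
(defun my-visible-buffer-substring (beg end)
  "Return the visible text between BEG and END, without text properties."
  (let ((pos beg) (chunks '()))
    (while (< pos end)
      ;; NEXT is where property `invisible' next changes, or END.
      (let ((next (next-single-char-property-change pos 'invisible nil end)))
        (unless (invisible-p pos)
          (push (buffer-substring-no-properties pos next) chunks))
        (setq pos next)))
    (apply #'concat (nreverse chunks))))

(with-temp-buffer
  (insert "ab" (propertize "XY" 'invisible t) "cd")
  (my-visible-buffer-substring (point-min) (point-max)))
;; => "abcd"
```

A function with the argument list (BEG END DELETE) that calls this helper could then be used as the value of `filter-buffer-substring-function', in the same way as the lambda above.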