* Proposed enhancement for `split-string'
From: Drew Adams @ 2014-07-14 22:51 UTC (permalink / raw)
To: emacs-devel
Function `split-string' currently has this signature, where SEPARATORS
is a regexp that defines (by matching) the separators used to split
the STRING:
(split-string STRING &optional SEPARATORS OMIT-NULLS TRIM)
The STRING parts returned are the non-matches for regexp SEPARATORS.
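For reference, a few calls that show the current regexp-based behavior. The input strings are only illustrative, and TRIM assumes an Emacs recent enough to have that argument:

```elisp
;; SEPARATORS is a regexp; the text it matches is dropped from the result.
(split-string "a,b,,c" ",")            ; => ("a" "b" "" "c")

;; Non-nil OMIT-NULLS drops the empty string between the two commas.
(split-string "a,b,,c" "," t)          ; => ("a" "b" "c")

;; TRIM is a regexp stripped from both ends of each returned part.
(split-string " a , b " "," t "[ ]+")  ; => ("a" "b")
```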
I have an enhancement of `split-string' to propose, which lets you
alternatively split the string based on a character predicate or a
text property, instead of based on matching a regexp.
Code: http://www.emacswiki.org/emacs-en/download/subr%2b.el
Description: http://www.emacswiki.org/emacs/SplittingStrings
I can submit the enhancement as a patch to subr.el, if there is
interest.
---
This would be the new (compatible) signature of `split-string':
(split-string STRING &optional HOW OMIT-NULLS TRIM FLIP TEST)
                               ^^^                 ^^^^ ^^^^
The second arg, HOW, can be a regexp, giving the same behavior as now.
Alternatively, HOW can be (1) a character predicate or (2) a doubleton
plist (PROPERTY VALUE), where PROPERTY is a text property and VALUE is
one of its possible values.
1. If HOW is a predicate then it must accept a character argument.
   Substrings whose chars satisfy the predicate are used as
   separators, so the return value is a list of substrings whose chars
   do *not* satisfy predicate HOW.
2. If HOW is (PROPERTY VALUE) then STRING is split into substrings
   whose chars do *not* have text property PROPERTY with value VALUE.
   If VALUE is nil then any non-nil VALUE matches; that is, only the
   presence of PROPERTY is tested. Characters that have PROPERTY belong
   to the separators, which are excluded.
   If VALUE is non-nil then a match occurs when the actual value of
   PROPERTY is `eq' to VALUE; that is, characters that have a PROPERTY of
   VALUE are those that are excluded.
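As a rough illustration of case 1, here is a minimal sketch of splitting by a character predicate. The function name `my-split-string-by-predicate' and its OMIT-NULLS handling are assumptions made for this example; this is not the code from subr+.el.

```elisp
;; Hypothetical helper, not the subr+.el implementation.
(defun my-split-string-by-predicate (string pred &optional omit-nulls)
  "Split STRING at runs of characters that satisfy PRED.
Return the substrings whose characters do not satisfy PRED.
Non-nil OMIT-NULLS drops empty substrings from the result."
  (let ((start 0)
        (pos 0)
        (len (length string))
        (parts '()))
    (while (< pos len)
      (if (funcall pred (aref string pos))
          (progn
            ;; A separator run begins here: collect the kept piece before it,
            ;; then skip to the end of the run.
            (push (substring string start pos) parts)
            (while (and (< pos len) (funcall pred (aref string pos)))
              (setq pos (1+ pos)))
            (setq start pos))
        (setq pos (1+ pos))))
    ;; Collect whatever follows the last separator run (possibly empty).
    (push (substring string start len) parts)
    (setq parts (nreverse parts))
    (if omit-nulls (delete "" parts) parts)))

;; Digits act as separators here.
(my-split-string-by-predicate
 "ab12cd3" (lambda (c) (and (>= c ?0) (<= c ?9))) t)
;; => ("ab" "cd")
```

Like the regexp case, a separator at the end of STRING yields a trailing empty string unless OMIT-NULLS is non-nil.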
Non-nil optional arg TEST is a binary predicate that is applied to
each char in STRING and to VALUE. If it returns non-nil for a given
character occurrence then that occurrence is part of a substring that
is excluded from the result (i.e., the char is part of a separator).
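A minimal sketch of property-based splitting with an optional TEST follows. The names are hypothetical, and the sketch assumes that TEST receives the character's actual property value together with VALUE, which is only one reading of the description above. It also always omits empty substrings, to keep the example short.

```elisp
;; Hypothetical helpers, not the subr+.el implementation.
(defun my-char-matches-property-p (string pos property value &optional test)
  "Return non-nil if the char at POS in STRING counts as separator text.
Assumption: TEST, if non-nil, is called with the actual property value
and VALUE."
  (let ((actual (get-text-property pos property string)))
    (cond (test  (funcall test actual value))
          (value (eq actual value))
          (t     actual))))             ; nil VALUE: any non-nil value matches

(defun my-split-string-by-property (string property &optional value test)
  "Return the non-empty substrings of STRING that do not match PROPERTY."
  (let ((pos 0)
        (start nil)                     ; start of the current kept piece
        (len (length string))
        (parts '()))
    (while (< pos len)
      (if (my-char-matches-property-p string pos property value test)
          (when start
            (push (substring string start pos) parts)
            (setq start nil))
        (unless start (setq start pos)))
      (setq pos (1+ pos)))
    (when start (push (substring string start len) parts))
    (nreverse parts)))

;; Split at text whose `face' value includes `bold'.
(let ((s (concat "ab" (propertize "XY" 'face '(bold italic)) "cd")))
  (my-split-string-by-property
   s 'face 'bold
   (lambda (actual value)
     (memq value (if (listp actual) actual (list actual))))))
;; => ("ab" "cd")
```

Without TEST, the same call would compare the list (bold italic) to `bold' with `eq', find no match, and return the whole string as a single piece.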
IOW, there are 3 ways to define the separator strings for splitting:
regexp matching, char-predicate satisfying, and text-property
matching.
By providing non-nil TEST you can test, for example:
* Whether the actual value of text property `invisible' belongs to the
current `buffer-invisibility-spec'.
* Whether a particular face is among the faces that are the value of
property `face'.
Non-nil optional arg FLIP simply swaps the separators and the kept
substrings - regardless of HOW the separating is defined. The
substrings that would be returned if FLIP were nil are treated as the
separators, and the substrings that would be treated as separators if
FLIP were nil are returned as the result of splitting.
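For the regexp case, the effect of FLIP can be approximated by collecting the matches instead of the text between them. This is only a sketch under that assumption; `my-split-string-flipped' is a hypothetical name, not part of the proposed code.

```elisp
;; Hypothetical helper illustrating FLIP for a regexp HOW.
(defun my-split-string-flipped (string regexp)
  "Return the substrings of STRING that match REGEXP, in order."
  (let ((pos 0)
        (matches '()))
    (while (and (< pos (length string))
                (string-match regexp string pos))
      (push (match-string 0 string) matches)
      ;; Continue after the match; step one char past an empty match
      ;; so that the loop always makes progress.
      (setq pos (if (= (match-end 0) (match-beginning 0))
                    (1+ (match-end 0))
                  (match-end 0))))
    (nreverse matches)))

(split-string "a1b22c" "[0-9]+")            ; => ("a" "b" "c")
(my-split-string-flipped "a1b22c" "[0-9]+") ; => ("1" "22")
```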
The code I have also defines the following functions (in addition to a
few helper functions).
First, 3 specializations of `split-string', corresponding to the 3
kinds of HOW:
* `split-string-by-regexp' - `split-string' specialized for a regexp
HOW. That is, split by separator regexp matching. This is the
behavior of today's `split-string'.
* `split-string-by-property' - `split-string' specialized for a
property-value HOW. That is, split by separator property-value
matching.
* `split-string-by-predicate' - `split-string' specialized for a
char-predicate HOW. That is, split by separator predicate
satisfying.
Second, functions similar to `buffer-substring', which return the
region as a string, but which exclude or include only certain string
parts:
* `buffer-substring-of-propertied' - Return the parts that have a
given PROPERTY.
* `buffer-substring-of-unpropertied' - Return the parts that do not
have a given PROPERTY.
* `buffer-substring-of-visible' - Return the visible parts.
* `buffer-substring-of-invisible' - Return the invisible parts.
* `buffer-substring-of-faced' - Return the parts that have property
`face'.
* `buffer-substring-of-unfaced' - Return the parts that do not have
property `face'.
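To make the idea behind these functions concrete, here is a minimal sketch of how something like `buffer-substring-of-visible' could work. The name `my-buffer-substring-of-visible' is hypothetical and the code is an assumption for illustration, not the subr+.el definition. It walks the region by changes in the `invisible' property and keeps the chunks that `invisible-p' reports as visible.

```elisp
;; Hypothetical sketch, not the subr+.el implementation.
(defun my-buffer-substring-of-visible (beg end)
  "Return the visible text between BEG and END as one string."
  (let ((pos beg)
        (chunks '()))
    (while (< pos end)
      ;; NEXT is where `invisible' next changes, or END if it does not.
      (let ((next (next-single-char-property-change pos 'invisible nil end)))
        (unless (invisible-p pos)
          (push (buffer-substring pos next) chunks))
        (setq pos next)))
    (apply #'concat (nreverse chunks))))

(with-temp-buffer
  (insert "ab" (propertize "XY" 'invisible t) "cd")
  (my-buffer-substring-of-visible (point-min) (point-max)))
;; => "abcd"
```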
Example use case:
I use `buffer-substring-of-visible' in a function that I bind to
`filter-buffer-substring-function', to remove invisible text from the
region string (which I use as part of an indirect buffer name):
  (lambda (beg end _delete)             ; Remove invisible text.
    (let ((strg  (buffer-substring-of-visible beg end)))
      (set-text-properties 0 (length strg) () strg)
      strg))
* Proposed enhancement for `split-string'
From: Stephen J. Turnbull @ 2014-07-15 0:03 UTC (permalink / raw)
To: Drew Adams; +Cc: emacs-devel
Drew Adams writes:
> The second arg, HOW, can be a regexp, giving the same behavior as now.
> Alternatively, HOW can be (1) a character predicate or (2) a doubleton
> plist (PROPERTY VALUE), where PROPERTY is a text property and VALUE is
> one of its possible values.
Why not just allow it to be any function returning an interval (with
implicit argument = (point)), and provide appropriate functions to
accomplish the tasks you propose?
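One way to read this suggestion is sketched below. Everything here is an assumption made for illustration: the names, the use of a temporary buffer, and the convention that HOW returns the next separator as a cons (BEG . END) or nil.

```elisp
;; Hypothetical sketch of a HOW that is a function returning an interval.
(defun my-split-string-by-interval (string how)
  "Split STRING using HOW, a function of no arguments.
HOW is called with point at the current position in a temporary buffer
that holds STRING.  It returns (BEG . END), the next separator interval
at or after point, or nil if there is none."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let ((parts '())
          (start (point))
          interval)
      (while (and (< (point) (point-max))
                  (setq interval (save-excursion (funcall how)))
                  (< (car interval) (cdr interval))) ; stop on empty interval
        (push (buffer-substring start (car interval)) parts)
        (goto-char (cdr interval))
        (setq start (point)))
      (push (buffer-substring start (point-max)) parts)
      (nreverse parts))))

(defun my-next-digit-run ()
  "Return the next run of digits at or after point as (BEG . END), or nil."
  (when (re-search-forward "[0-9]+" nil t)
    (cons (match-beginning 0) (match-end 0))))

(my-split-string-by-interval "ab12cd3ef" #'my-next-digit-run)
;; => ("ab" "cd" "ef")
```

Under this reading, regexp, predicate, and property splitting would each be supplied as a separate interval function rather than as a new kind of HOW.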
> By providing non-nil TEST you can test, for example:
>
> * Whether the actual value of text property `invisible' belongs to the
> current `buffer-invisibility-spec'.
>
> * Whether a particular face is among the faces that are the value of
> property `face'.
A general predicate for HOW could do this, too.
> Non-nil optional arg FLIP simply swaps the separators and the kept
> substrings - regardless of HOW the separating is defined.
This can be done for the "standard" functions by providing an optional
FLIP argument, and using (lambda () (how-func 'flip-me)) as the HOW.
Alternatively you could provide flipped standard HOW functions.
I have no objection to a new function `split-string-à-la-drew' with
any signature you like, but `split-string' should keep as simple a
signature as possible.
* Re: Proposed enhancement for `split-string'
From: Bozhidar Batsov @ 2014-07-18 12:24 UTC (permalink / raw)
To: Stephen J. Turnbull, Drew Adams; +Cc: emacs-devel
Drew’s suggestion can be implemented as a different function in subr-x, I guess.
—
Cheers,
Bozhidar