unofficial mirror of emacs-devel@gnu.org 
* Why (substring "abc" 0 4) does not return "abc" instead of an error?
@ 2012-07-15 23:15 Bastien
  2012-07-15 23:29 ` Juanma Barranquero
  0 siblings, 1 reply; 42+ messages in thread
From: Bastien @ 2012-07-15 23:15 UTC (permalink / raw)
  To: emacs-devel

As the subject says: I wonder why 

  (substring "abc" 0 4)

does not return "abc".

Is it a design choice or due to some implementation 
constraints?
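
For reference, a minimal sketch of the behaviour being asked about,
assuming a stock Emacs.  The helper name `my-substring-error' is made
up for this illustration; it only reports which error symbol, if any,
`substring' signals:

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-substring-error (string from to)
  "Return the error symbol signalled by (substring STRING FROM TO), or nil."
  (condition-case err
      (progn (substring string from to) nil)
    (error (car err))))

(my-substring-error "abc" 0 3)   ; in range, so no error is signalled
(my-substring-error "abc" 0 4)   ; TO is past the end: args-out-of-range
```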

Thanks,

-- 
 Bastien




^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-15 23:15 Why (substring "abc" 0 4) does not return "abc" instead of an error? Bastien
@ 2012-07-15 23:29 ` Juanma Barranquero
  2012-07-15 23:59   ` Bastien
  0 siblings, 1 reply; 42+ messages in thread
From: Juanma Barranquero @ 2012-07-15 23:29 UTC (permalink / raw)
  To: Bastien; +Cc: emacs-devel

On Mon, Jul 16, 2012 at 1:15 AM, Bastien <bzg@gnu.org> wrote:

> As the subject says: I wonder why
>
>   (substring "abc" 0 4)
>
> does not return "abc".

Why should it? How is it different from (aref "abc" 4)?

    Juanma




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-15 23:29 ` Juanma Barranquero
@ 2012-07-15 23:59   ` Bastien
  2012-07-16  0:10     ` Juanma Barranquero
  0 siblings, 1 reply; 42+ messages in thread
From: Bastien @ 2012-07-15 23:59 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: emacs-devel

Juanma Barranquero <lekktu@gmail.com> writes:

> On Mon, Jul 16, 2012 at 1:15 AM, Bastien <bzg@gnu.org> wrote:
>
>> As the subject says: I wonder why
>>
>>   (substring "abc" 0 4)
>>
>> does not return "abc".
>
> Why should it? How it is different from (aref "abc" 4)?

I read (aref "abc" 4) as "return the 5th element of "abc"". 

So I expect an error here.

I read (substring "abc" 0 4) as "return the biggest substring
between 0 and 4" -- even if the string does not have 4 characters.

Surely I misread, but this would be handy in some cases, instead
of using something like (format "%.4s" "abc").
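
A short sketch of the `format' workaround mentioned above: the
precision in "%.4s" caps the output at four characters and leaves
shorter strings untouched.

```elisp
(format "%.4s" "abc")      ; => "abc"   (shorter than the precision)
(format "%.4s" "abcdef")   ; => "abcd"  (truncated to 4 characters)
```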

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-15 23:59   ` Bastien
@ 2012-07-16  0:10     ` Juanma Barranquero
  2012-07-16  7:14       ` Bastien
  2012-07-16  7:38       ` Andreas Schwab
  0 siblings, 2 replies; 42+ messages in thread
From: Juanma Barranquero @ 2012-07-16  0:10 UTC (permalink / raw)
  To: Bastien; +Cc: emacs-devel

On Mon, Jul 16, 2012 at 1:59 AM, Bastien <bzg@gnu.org> wrote:

> I read (substring "abc" 0 4) as "return the biggest substring
> between 0 and 4" -- even if the string does not have 4 characters.

"Even if the string does not have 4 characters" is not even suggested
in substring's doc.

> Surely I misread, but this would be handy in some cases, instead
> of using something like (format "%.4s" "abc").

Would it be handy? Sometimes, perhaps. At other times, having
substring check that TO is indeed in range is quite useful. Also, if
you start with substring, then it will be buffer-substring, and then
most functions that deal with string ranges, and then buffer ranges?
I'm sure use cases could be found for all these functions where your
expansive interpretation would be handy...

BTW, for unexpected behavior wrt strings, my favourite is this one
(it's not a bug):

(let ((s "ab") (m 2)) (eq (substring s 0 m) (substring s 0 m))) => nil
(let ((s "ab") (m 1)) (eq (substring s 0 m) (substring s 0 m))) => nil
(let ((s "ab") (m 0)) (eq (substring s 0 m) (substring s 0 m))) => t

    Juanma




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
@ 2012-07-16  3:45 Dmitry Gutov
  2012-07-16  7:32 ` Bastien
  2012-07-16 13:10 ` Pascal J. Bourguignon
  0 siblings, 2 replies; 42+ messages in thread
From: Dmitry Gutov @ 2012-07-16  3:45 UTC (permalink / raw)
  To: lekktu; +Cc: bzg, emacs-devel

Juanma Barranquero <lekktu@gmail.com> writes:

 > On Mon, Jul 16, 2012 at 1:59 AM, Bastien <bzg@gnu.org> wrote:
 >
 >> I read (substring "abc" 0 4) as "return the biggest substring
 >> between 0 and 4" -- even if the string does not have 4 characters.
 >
 > "Even if the string does not have 4 characters" is not even suggested
 > in substring's doc.

FWIW, it's common behavior in many other programming languages.

--Dmitry




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16  0:10     ` Juanma Barranquero
@ 2012-07-16  7:14       ` Bastien
  2012-07-16 16:15         ` Stefan Monnier
  2012-07-16  7:38       ` Andreas Schwab
  1 sibling, 1 reply; 42+ messages in thread
From: Bastien @ 2012-07-16  7:14 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: emacs-devel

Juanma Barranquero <lekktu@gmail.com> writes:

> On Mon, Jul 16, 2012 at 1:59 AM, Bastien <bzg@gnu.org> wrote:
>
>> I read (substring "abc" 0 4) as "return the biggest substring
>> between 0 and 4" -- even if the string does not have 4 characters.
>
> "Even if the string does not have 4 characters" is not even suggested
> in substring's doc.

I know -- I was just expressing my spontaneous (and erroneous)
assumption.

>> Surely I misread, but this would be handy in some cases, instead
>> of using something like (format "%.4s" "abc").
>
> Would it be handy? Sometimes, perhaps. At other times, having
> substring check that TO is indeed in range is quite useful. 

I would be happy with a third optional argument FAIL-SILENTLY.

> Also, if
> you start with substring, then it will be buffer-substring, and then
> most functions that deal with string ranges, and then buffer ranges?

I get your point and I don't mind a bit of discipline.

Still, `substring' already behaves differently than its cousins
by allowing negative values as arguments.  I have a fuzzy feeling
my expectation comes from this peculiarity, but cannot formalize how.
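
To illustrate the peculiarity: a negative index counts back from the
end of the string, but it is still range-checked, so counting back
past the start signals an error just like overshooting the end.

```elisp
(substring "abcdef" -3)      ; => "def"
(substring "abcdef" 1 -1)    ; => "bcde"
(substring "abcdef" -2 nil)  ; => "ef"

;; Counting back past the start is out of range as well.
(condition-case err
    (substring "abc" -4)
  (args-out-of-range (car err)))   ; => args-out-of-range
```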

> I'm sure use cases could be found for all these functions where your
> expansive interpretation would be handy...
>
> BTW, for unexpected behavior wrt strings, my favourite is this one
> (it's not a bug);
>
> (let ((s "ab") (m 2)) (eq (substring s 0 m) (substring s 0 m))) => nil
> (let ((s "ab") (m 1)) (eq (substring s 0 m) (substring s 0 m))) => nil
> (let ((s "ab") (m 0)) (eq (substring s 0 m) (substring s 0 m))) => t

Well, this boils down to

(eq "a" "a") => nil
(eq "" "")   => t

where "" is really nil -- which my brain can swallow easily.
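
A small sketch of the distinction at play, as far as I understand it.
`eq' tests object identity, so its result for two strings depends on
whether the implementation happens to hand back the same object; the
`t' for the empty case presumably reflects a shared empty-string
object, which is an assumption about the implementation.  Content
comparison does not have this problem, and note that the empty string
is a non-nil value:

```elisp
(equal "a" "a")        ; => t, same contents
(string= "" "")        ; => t
(null "")              ; => nil, the empty string is not nil

(let ((s "ab"))
  (equal (substring s 0 1) (substring s 0 1)))   ; => t
```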

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16  3:45 Dmitry Gutov
@ 2012-07-16  7:32 ` Bastien
  2012-07-16  7:52   ` Thierry Volpiatto
  2012-07-16 13:03   ` Dmitry Gutov
  2012-07-16 13:10 ` Pascal J. Bourguignon
  1 sibling, 2 replies; 42+ messages in thread
From: Bastien @ 2012-07-16  7:32 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: lekktu, emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

> Juanma Barranquero <lekktu@gmail.com> writes:
>
>> On Mon, Jul 16, 2012 at 1:59 AM, Bastien <bzg@gnu.org> wrote:
>>
>>> I read (substring "abc" 0 4) as "return the biggest substring
>>> between 0 and 4" -- even if the string does not have 4 characters.
>>
>> "Even if the string does not have 4 characters" is not even suggested
>> in substring's doc.
>
> FWIW, it's common behavior in many other programming languages.

Which behavior?  The one I expect?

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16  0:10     ` Juanma Barranquero
  2012-07-16  7:14       ` Bastien
@ 2012-07-16  7:38       ` Andreas Schwab
  2012-07-16  9:40         ` Juanma Barranquero
  1 sibling, 1 reply; 42+ messages in thread
From: Andreas Schwab @ 2012-07-16  7:38 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: Bastien, emacs-devel

Juanma Barranquero <lekktu@gmail.com> writes:

> BTW, for unexpected behavior wrt strings, my favourite is this one
> (it's not a bug);
>
> (let ((s "ab") (m 2)) (eq (substring s 0 m) (substring s 0 m))) => nil
> (let ((s "ab") (m 1)) (eq (substring s 0 m) (substring s 0 m))) => nil
> (let ((s "ab") (m 0)) (eq (substring s 0 m) (substring s 0 m))) => t

That's not much different from ("foo" == "foo") in C.  Using eq for
comparing non-atoms is seldom useful.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16  7:32 ` Bastien
@ 2012-07-16  7:52   ` Thierry Volpiatto
  2012-07-16  8:38     ` Bastien
  2012-07-16 13:03   ` Dmitry Gutov
  1 sibling, 1 reply; 42+ messages in thread
From: Thierry Volpiatto @ 2012-07-16  7:52 UTC (permalink / raw)
  To: emacs-devel

Bastien <bzg@gnu.org> writes:

> Dmitry Gutov <dgutov@yandex.ru> writes:
>
>> Juanma Barranquero <lekktu@gmail.com> writes:
>>
>>> On Mon, Jul 16, 2012 at 1:59 AM, Bastien <bzg@gnu.org> wrote:
>>>
>>>> I read (substring "abc" 0 4) as "return the biggest substring
>>>> between 0 and 4" -- even if the string does not have 4 characters.
>>>
>>> "Even if the string does not have 4 characters" is not even suggested
>>> in substring's doc.
>>
>> FWIW, it's common behavior in many other programming languages.
>
> Which behavior?  The one I expect?
What about protecting your code like this?

(let ((str "abc")
      (ind 9999))
  (substring str 0 (min (length str) ind)))
=>"abc"

You will never have error:

(require 'cl)  ; `loop' is provided by the cl package
(loop with str = "abcdef"
      for ind from 0 to 7
      do (princ (substring str 0 (min (length str) ind)))
      do (terpri))

-- 
  Thierry
Get my Gnupg key:
gpg --keyserver pgp.mit.edu --recv-keys 59F29997 





* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16  7:52   ` Thierry Volpiatto
@ 2012-07-16  8:38     ` Bastien
  0 siblings, 0 replies; 42+ messages in thread
From: Bastien @ 2012-07-16  8:38 UTC (permalink / raw)
  To: Thierry Volpiatto; +Cc: emacs-devel

Thierry Volpiatto <thierry.volpiatto@gmail.com> writes:

> (let ((str "abc")
>       (ind 9999))
>   (substring str 0 (min (length str) ind)))
> =>"abc"

I know, but I guess it's more costly than

  (substring str 0 ind t)

where `t' means "fail quietly".  But I may be wrong.
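
One way to check the cost rather than guess is to time the clamped
call.  This is only a rough sketch: `my-clamp-benchmark' is a made-up
name, the repetition count is arbitrary, and the numbers will vary
with the machine and with whether the code is byte-compiled.

```elisp
(require 'benchmark)

;; Hypothetical helper: time 100000 clamped `substring' calls.
(defun my-clamp-benchmark ()
  "Return (ELAPSED-SECONDS GC-COUNT GC-SECONDS) for the clamped call."
  (let ((str "abc")
        (ind 9999))
    (benchmark-run 100000
      (substring str 0 (min (length str) ind)))))

(my-clamp-benchmark)
```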

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16  7:38       ` Andreas Schwab
@ 2012-07-16  9:40         ` Juanma Barranquero
  0 siblings, 0 replies; 42+ messages in thread
From: Juanma Barranquero @ 2012-07-16  9:40 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Bastien, emacs-devel

On Mon, Jul 16, 2012 at 9:38 AM, Andreas Schwab <schwab@linux-m68k.org> wrote:

> That's not much different from ("foo" == "foo") in C.  Using eq for
> comparing non-atoms is seldom useful.

I know that, and I acknowledged it's not a bug. It's just weird that a
progression like that, in which just an index is varying, suddenly
changes result. But I know full well why it happens.

    Juanma




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16  7:32 ` Bastien
  2012-07-16  7:52   ` Thierry Volpiatto
@ 2012-07-16 13:03   ` Dmitry Gutov
  2012-07-16 14:32     ` Bastien
  1 sibling, 1 reply; 42+ messages in thread
From: Dmitry Gutov @ 2012-07-16 13:03 UTC (permalink / raw)
  To: Bastien; +Cc: lekktu, emacs-devel

On 16.07.2012 11:32, Bastien wrote:
> Dmitry Gutov <dgutov@yandex.ru> writes:
>
>> Juanma Barranquero <lekktu@gmail.com> writes:
>>
>>> On Mon, Jul 16, 2012 at 1:59 AM, Bastien <bzg@gnu.org> wrote:
>>>
>>>> I read (substring "abc" 0 4) as "return the biggest substring
>>>> between 0 and 4" -- even if the string does not have 4 characters.
>>>
>>> "Even if the string does not have 4 characters" is not even suggested
>>> in substring's doc.
>>
>> FWIW, it's common behavior in many other programming languages.
>
> Which behavior?  The one I expect?

Yes. For example, JS, Ruby, Python and apparently C++ do.
Scheme, Java and C# don't, but they don't have the "negative index = 
from the end" behavior either.




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16  3:45 Dmitry Gutov
  2012-07-16  7:32 ` Bastien
@ 2012-07-16 13:10 ` Pascal J. Bourguignon
  2012-07-16 14:40   ` Bastien
  1 sibling, 1 reply; 42+ messages in thread
From: Pascal J. Bourguignon @ 2012-07-16 13:10 UTC (permalink / raw)
  To: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

> Juanma Barranquero <lekktu@gmail.com> writes:
>
>> On Mon, Jul 16, 2012 at 1:59 AM, Bastien <bzg@gnu.org> wrote:
>>
>>> I read (substring "abc" 0 4) as "return the biggest substring
>>> between 0 and 4" -- even if the string does not have 4 characters.
>>
>> "Even if the string does not have 4 characters" is not even suggested
>> in substring's doc.
>
> FWIW, it's common behavior in many other programming languages.

(defun mysubstring (str start &optional end)
  (substring str (max start 0)
                 (if end
                     (min end (length str))
                     (length str))))

and use (mysubstring "abc" 0 4) --> "abc" 
instead of substring.

The point of lisp is to let you define your own language seamlessly.
There's no difference between (mysubstring "abc" 0 4) and (substring
"abc" 0 4), one is not compiled better than the other, or whatever.
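
A possible variant of the same idea, sketched under the assumption
that negative indices should keep their "count from the end" meaning
instead of being clamped to zero.  The name `my-substring-clamped' and
the choice to return "" when the clamped range is empty or inverted
are illustrative only:

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-substring-clamped (str &optional from to)
  "Like `substring', but clamp FROM and TO to the bounds of STR.
Negative indices count from the end, as in `substring'.
Return \"\" if the clamped range is empty or inverted."
  (let* ((len (length str))
         ;; Map an index (possibly nil or negative) into [0, len].
         (norm (lambda (i default)
                 (cond ((null i) default)
                       ((< i 0) (max 0 (+ len i)))
                       (t (min i len)))))
         (start (funcall norm from 0))
         (end (funcall norm to len)))
    (if (> start end)
        ""
      (substring str start end))))

(my-substring-clamped "abc" 0 4)     ; => "abc"
(my-substring-clamped "abcdef" -3)   ; => "def"
(my-substring-clamped "abc" -5 2)    ; => "ab"
```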

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.





* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 13:03   ` Dmitry Gutov
@ 2012-07-16 14:32     ` Bastien
  0 siblings, 0 replies; 42+ messages in thread
From: Bastien @ 2012-07-16 14:32 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: lekktu, emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

>> Which behavior?  The one I expect?
>
> Yes. For example, JS, Ruby, Python and apparently C++ do.

Good to know, thanks.

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 13:10 ` Pascal J. Bourguignon
@ 2012-07-16 14:40   ` Bastien
  2012-07-16 15:00     ` Pascal J. Bourguignon
  0 siblings, 1 reply; 42+ messages in thread
From: Bastien @ 2012-07-16 14:40 UTC (permalink / raw)
  To: Pascal J. Bourguignon; +Cc: emacs-devel

Hi Pascal,

"Pascal J. Bourguignon" <pjb@informatimago.com> writes:

> (defun mysubstring (str start end)
>   (substring str (max start 0)
>                  (if end
>                      (min end (length str))
>                      (length str))))
>
> and use (mysubstring "abc" 0 4) --> "abc" 
> instead of substring.

I know how to implement my own defun for this but thanks.

My question was about what _justifies_ the current behavior.

Dmitry said at least JS, Ruby, Python and perhaps C++ use 
the behavior I mention --  so I'm even more curious now.

I am not saying the behavior I expect is superior, it is
just the one I expect -- I would like to read a good reason
for the current one.  Juanma has a point when he said that 
the current behavior is consistent with other *-substring 
functions but again, `substring' seems different to me.

> The point of lisp is to let you define your own language seamlessly.

True.  

But the point of sharing code is also to not reinvent the 
wheel, right?

Best,

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 14:40   ` Bastien
@ 2012-07-16 15:00     ` Pascal J. Bourguignon
  2012-07-16 15:07       ` Lennart Borgman
  2012-07-16 15:46       ` Bastien
  0 siblings, 2 replies; 42+ messages in thread
From: Pascal J. Bourguignon @ 2012-07-16 15:00 UTC (permalink / raw)
  To: emacs-devel

Bastien <bzg@gnu.org> writes:

> Hi Pascal,
>
> "Pascal J. Bourguignon" <pjb@informatimago.com> writes:
>
>> (defun mysubstring (str start end)
>>   (substring str (max start 0)
>>                  (if end
>>                      (min end (length str))
>>                      (length str))))
>>
>> and use (mysubstring "abc" 0 4) --> "abc" 
>> instead of substring.
>
> I know how to implement my own defun for this but thanks.
>
> My question was about what _justifies_ the current behavior.

First, emacs and lisp were invented long before JS, Ruby, Python, C++
and a lot of other _currently_ popular languages and other languages
that were popular but are now forgotten ;-)

So emacs and lisp have another, older tradition.

If you were to invent a new lisp (or better, just writing a new lisp
application or library), then you could design a consistent set of
operators with a more "modern" look-and-feel; (the "modern" style spread
out in the 20's, it's an old style).

Technically, one good reason to signal an error instead of silently
clipping the arguments is exactly that it detects an error.  Since lisp
is a dynamically typed language, the type of the objects is controlled
by what the functions accept.  If you (or your compiler) formalize the
types accepted by the functions, then type inference can be implemented
and the program can be (for the most part) type checked statically
too.  But even without static type checking with type inference, it's
useful to set up such constraints and signal such errors.

You could also accept non integer values for start and end.  Obviously
any real would be good too (but will you truncate or round?).  What
about complex numbers (if there were complexes in emacs lisp)?  Or just
what about other objects, what if we pass a string:

    (mysubstring str "42" "end-2")

We can imagine several useful behaviors.  But would a library/language
that would accept any type of arguments and values for any parameter be
really that useful?  Have a look at PHP and similar languages that
coerce everything everywhere.  I'm not sure that entirely helps writing
clean and bug-free programs.

But mostly the point is that it's a question of language (or DSL,
library) design.  It's somewhat arbitrary.  We can say that the
designers of substring (subseq in CL, and similar functions in ancestors
of CL and emacs lisp) found that it was useful to signal an error when
passing out of bounds arguments.
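
As a sketch of how the two views can coexist: a caller who wants the
lenient behaviour can opt in at the call site by handling only the
range error, while a wrong argument type is still reported.  The
helper name `my-prefix-or-all' is hypothetical:

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-prefix-or-all (string n)
  "Return the first N characters of STRING, or STRING if N is out of range.
Only `args-out-of-range' is handled; other errors still propagate."
  (condition-case nil
      (substring string 0 n)
    (args-out-of-range string)))

(my-prefix-or-all "abc" 2)     ; => "ab"
(my-prefix-or-all "abc" 4)     ; => "abc"
;; (my-prefix-or-all "abc" "2") still signals wrong-type-argument.
```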



> Dmitry said at least JS, Ruby, Python and perhaps C++ uses 
> the behavior I mention --  so I'm even more curious now.
>
> I am not saying the behavior I expect is superior, it is
> just the one I expect -- I would like to read a good reason
> for the current one.  Juanma have a point when he said that 
> the current behavior is consistent with other *-substring 
> functions but again, `substring' seems different to me.

Well, lisp has this great advantage over other programming languages
that it is really very accommodating to your specific needs.

Look how emacs lisp provides a cl package with macros and functions
similar to those provided in Common Lisp.

Similarly, nothing prevents you from writing an emacs lisp package with
macros and functions having a Javascript, or Ruby or Python or C++
look-and-feel, that would help programmers coming from those languages
to more easily adapt and feel more comfortable with emacs lisp, just
like cl helps me, a Common Lisp programmer, be more comfortable with
emacs lisp.

(require 'js)
(js-substring "abc" "0" 1000) --> "abc" ; and let JavaScript 
                                        ; programmers be happy!

(just document those functions and macros well, so that other emacs lisp
programmers can still understand what's happening).
-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.





* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 15:00     ` Pascal J. Bourguignon
@ 2012-07-16 15:07       ` Lennart Borgman
  2012-07-16 15:19         ` Pascal J. Bourguignon
  2012-07-16 15:22         ` Bastien
  2012-07-16 15:46       ` Bastien
  1 sibling, 2 replies; 42+ messages in thread
From: Lennart Borgman @ 2012-07-16 15:07 UTC (permalink / raw)
  To: Pascal J. Bourguignon; +Cc: emacs-devel

On Mon, Jul 16, 2012 at 5:00 PM, Pascal J. Bourguignon
<pjb@informatimago.com> wrote:
> Bastien <bzg@gnu.org> writes:
>
>> Hi Pascal,
>>
>> "Pascal J. Bourguignon" <pjb@informatimago.com> writes:
>>
>>> (defun mysubstring (str start end)
>>>   (substring str (max start 0)
>>>                  (if end
>>>                      (min end (length str))
>>>                      (length str))))
>>>
>>> and use (mysubstring "abc" 0 4) --> "abc"
>>> instead of substring.
...
> Technically, one good reason to signal an error instead of silently
> clipping the arguments is that exactly it detects an error.  Since lisp

It is a good reason not to change the old behaviour. However
introducing a new function in Emacs like the one you suggest would
save time for all those who need this function. (It is hard for me
even to come up with an example where I do not want this instead of
the old one.)




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 15:07       ` Lennart Borgman
@ 2012-07-16 15:19         ` Pascal J. Bourguignon
  2012-07-16 15:22         ` Bastien
  1 sibling, 0 replies; 42+ messages in thread
From: Pascal J. Bourguignon @ 2012-07-16 15:19 UTC (permalink / raw)
  To: emacs-devel

Lennart Borgman <lennart.borgman@gmail.com> writes:

> On Mon, Jul 16, 2012 at 5:00 PM, Pascal J. Bourguignon
> <pjb@informatimago.com> wrote:
>> Bastien <bzg@gnu.org> writes:
>>
>>> Hi Pascal,
>>>
>>> "Pascal J. Bourguignon" <pjb@informatimago.com> writes:
>>>
>>>> (defun mysubstring (str start end)
>>>>   (substring str (max start 0)
>>>>                  (if end
>>>>                      (min end (length str))
>>>>                      (length str))))
>>>>
>>>> and use (mysubstring "abc" 0 4) --> "abc"
>>>> instead of substring.
> ...
>> Technically, one good reason to signal an error instead of silently
>> clipping the arguments is that exactly it detects an error.  Since lisp
>
> It is a good reason not to change the old behaviour. However
> introducing a new function in Emacs like the one you suggest would
> save time for all those who need this function. (It is hard for me
> even to come up with an example where I do not want this instead of
> the old one.)

You cut out the relevant part:

Well, lisp has this great advantage over other programming languages
that it is really very accommodating to your specific needs.

Look how emacs lisp provides a cl package with macros and functions
similar to those provided in Common Lisp.

Similarly, nothing prevents you from writing an emacs lisp package with
macros and functions having a Javascript, or Ruby or Python or C++
look-and-feel, that would help programmers coming from those languages
to more easily adapt and feel more comfortable with emacs lisp, just
like cl helps me, a Common Lisp programmer, be more comfortable with
emacs lisp.

(require 'js)
(js-substring "abc" "0" 1000) --> "abc" ; and let JavaScript 
                                        ; programmers be happy!

(just document those functions and macros well, so that other emacs lisp
programmers can still understand what's happening).

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.





* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 15:07       ` Lennart Borgman
  2012-07-16 15:19         ` Pascal J. Bourguignon
@ 2012-07-16 15:22         ` Bastien
  1 sibling, 0 replies; 42+ messages in thread
From: Bastien @ 2012-07-16 15:22 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: Pascal J. Bourguignon, emacs-devel

Lennart Borgman <lennart.borgman@gmail.com> writes:

> It is a good reason not to change the old behaviour. However
> introducing a new function in Emacs like the one you suggest would
> save time for all those who need this function. 

Well, a third argument to let the function fail quietly would do.

> (It is hard for me even to come up with an example where I do not want
> this instead of the old one.)

For me too :)

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 15:00     ` Pascal J. Bourguignon
  2012-07-16 15:07       ` Lennart Borgman
@ 2012-07-16 15:46       ` Bastien
  2012-07-16 15:49         ` Bastien
  2012-07-16 15:56         ` Pascal J. Bourguignon
  1 sibling, 2 replies; 42+ messages in thread
From: Bastien @ 2012-07-16 15:46 UTC (permalink / raw)
  To: Pascal J. Bourguignon; +Cc: emacs-devel

Hi Pascal,

"Pascal J. Bourguignon" <pjb@informatimago.com> writes:

> First, emacs and lisp were invented long before JS, Ruby, Python, C++
> and a lot of other _currently_ popular languages and other languages
> that were popular but are now forgotten ;-)
>
> So emacs and lisp have another, older tradition.
>
> If you were to invent a new lisp (or better, just writing a new lisp
> application or library), then you could design a consistent set of
> operators with a more "modern" look-and-feel; (the "modern" style spread
> out in the 20's, it's an old style).

I'm not into an ancient vs. modern quarrel.

> Technically, one good reason to signal an error instead of silently
> clipping the arguments is that exactly it detects an error.  Since lisp
> is a dynamically typed language, the type of the objects is controlled
> by what the functions accept.  If you (or your compiler) formalize the
> types accepted by the functions, then type inference can be implemented
> and the program can be (for the most parts) type checked statically
> too.  But even without static type checking with type inference, it's
> useful to set up such constraints and signal such errors.
>
> You could also accept non integer values for start and end.  Obviously
> any real would be good too (but will you truncate or round?).  What
> about complex numbers (if there were complexes in emacs lisp)?  Or just
> what about other objects, what if we pass a string:
>
>     (mysubstring str "42" "end-2")

I'm not interested in doing crazy stuff, I'm interested in

(substring "abc" 0 4 t)
  => "abc"

where `t' is the value of an optional third NOERROR argument.

(substring "abc" 0 4) would still throw an error, so that
the change does not break any code.
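
A rough sketch of what such an extension could look like, written as
a separate wrapper rather than a change to `substring' itself.  The
name `my-substring-noerror' and the exact clamping rules are
assumptions made for illustration:

```elisp
;; Hypothetical wrapper, not part of Emacs.
(defun my-substring-noerror (string from &optional to noerror)
  "Like `substring', but with non-nil NOERROR clamp FROM and TO to STRING.
Indices are clamped to the range [-LEN, LEN].  An inverted range
\(FROM after TO) still signals an error even when NOERROR is non-nil."
  (if noerror
      (let ((len (length string)))
        (substring string
                   (max (- len) (min from len))
                   (and to (max (- len) (min to len)))))
    (substring string from to)))

(my-substring-noerror "abc" 0 4 t)   ; => "abc"
(my-substring-noerror "abc" 0 2)     ; => "ab", same as plain `substring'
;; (my-substring-noerror "abc" 0 4) signals args-out-of-range, as before.
```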

> We can imagine several useful behaviors.  But would a library/language
> that would accept any type of arguments and values for any parameter be
> really that useful?  Have a look at PHP and similar languages that
> coerce everything everywhere.  I'm not sure that entirely helps writing
> clean and bug-free programs.

:)  But please, this is not a language issue, just a suggestion
on a useful extension to `substring'.

> Similarly, nothing prevents you to write an emacs lisp package with
> macros and functions having a Javascript, or Ruby or Python or C++
> look-and-feel, that would help programmers coming from those languages
> to more easily adapt and feel more comfortable with emacs lisp, just
> like cl helps me, a Common Lisp programmer, be more comfortable with
> emacs lisp.

I'm not really interested in other programmers, just in what I could
write in Elisp.

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 15:46       ` Bastien
@ 2012-07-16 15:49         ` Bastien
  2012-07-16 19:49           ` Thien-Thi Nguyen
  2012-07-16 15:56         ` Pascal J. Bourguignon
  1 sibling, 1 reply; 42+ messages in thread
From: Bastien @ 2012-07-16 15:49 UTC (permalink / raw)
  To: Pascal J. Bourguignon; +Cc: emacs-devel

Bastien <bzg@gnu.org> writes:

> I'm not interested in doing crazy stuff, I'm interested in
>
> (substring "abc" 0 4 t)
>   => "abc"
>
> where `t' is the value of an option third NOERROR argument.
                                      ^^^^^

Fourth, actually.

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 15:46       ` Bastien
  2012-07-16 15:49         ` Bastien
@ 2012-07-16 15:56         ` Pascal J. Bourguignon
  2012-07-16 16:13           ` Bastien
  1 sibling, 1 reply; 42+ messages in thread
From: Pascal J. Bourguignon @ 2012-07-16 15:56 UTC (permalink / raw)
  To: emacs-devel

Bastien <bzg@gnu.org> writes:

> Hi Pascal,
>
> "Pascal J. Bourguignon" <pjb@informatimago.com> writes:
>
>> First, emacs and lisp were invented long before JS, Ruby, Python, C++
>> and a lot of other _currently_ popular languages and other languages
>> that were popular but are now forgotten ;-)
>>
>> So emacs and lisp have another, older tradition.
>>
>> If you were to invent a new lisp (or better, just writing a new lisp
>> application or library), then you could design a consistent set of
>> operators with a more "modern" look-and-feel; (the "modern" style spread
>> out in the 20's, it's an old style).
>
> I'm not into a ancient vs. modern quarrel.

It's not a quarrel, it's just to situate your mindset: you've learned
first another language and you come in emacs lisp with that other
language mindset.  In your own relative history it looks like emacs lisp
is new and changed gratuitously from your "old" language.

But in the absolute time, lisp and emacs lisp came first, and it's the
other languages that tend to differ gratuitously from lisp.  (And then
usually evolve back toward lisp, but THAT is another story).


But it's OK, as I said, you can easily import your habits in lisp, lisp
is very malleable.

 
>> Technically, one good reason to signal an error instead of silently
>> clipping the arguments is that exactly it detects an error.  Since lisp
>> is a dynamically typed language, the type of the objects is controlled
>> by what the functions accept.  If you (or your compiler) formalize the
>> types accepted by the functions, then type inference can be implemented
>> and the program can be (for the most parts) type checked statically
>> too.  But even without static type checking with type inference, it's
>> useful to set up such constraints and signal such errors.
>>
>> You could also accept non integer values for start and end.  Obviously
>> any real would be good too (but will you truncate or round?).  What
>> about complex numbers (if there were complexes in emacs lisp)?  Or just
>> what about other objects, what if we pass a string:
>>
>>     (mysubstring str "42" "end-2")
>
> I'm not interested in doing crazy stuff, I'm interested in
>
> (substring "abc" 0 4 t)
>   => "abc"
>
> where `t' is the value of an option third NOERROR argument.
>
> (substring "abc" 0 4) would still throw an error, so that
> the change does not break any code.

I would not advise that nonetheless.  It can bite you.  I tried things
like that, and it bit me.  Better use a different name.

If there were CL-like packages in emacs lisp, we could use a symbol
named substring in a different package and it would be nice and clean.
But not yet in emacs lisp.


>> We can imagine several useful behaviors.  But would a library/language
>> that would accept any type of arguments and values for any parameter be
>> really that useful?  Have a look at PHP and similar languages that
>> coerce everything everywhere.  I'm not sure that entirely helps writing
>> clean and bug-free programs.
>
> :)  But please, this is not a language issue, just a suggestion
> on a useful extension to `substring'.

A library defines a "language" or a DSL (Domain Specific Language).

The only thing is that with some languages, the languages defined by
libraries are very constrained, syntactically and semantically.  But not
so in lisp.  Or rather, since lisp restricts itself to the sexp
low-level syntax, all language elements, either provided by the system
or user defined share the same status.


>> Similarly, nothing prevents you from writing an emacs lisp package with
>> macros and functions having a Javascript, or Ruby or Python or C++
>> look-and-feel, that would help programmers coming from those languages
>> to more easily adapt and feel more comfortable with emacs lisp, just
>> like cl helps me, a Common Lisp programmer, be more comfortable with
>> emacs lisp.
>
> I'm not really interested in other programmers, just in what I could
> write in Elisp.

Yes, that's why my first suggestion was for you to define a mysubstring
function.

With the current tools, you cannot easily use a different symbol named
substring in emacs lisp, and you cannot easily and safely redefine the
symbol substring, because other functions use it and may depend on its
current behavior.  It's better to just use a different symbol bound to
your own function in your own code.


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.





* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 15:56         ` Pascal J. Bourguignon
@ 2012-07-16 16:13           ` Bastien
  0 siblings, 0 replies; 42+ messages in thread
From: Bastien @ 2012-07-16 16:13 UTC (permalink / raw)
  To: Pascal J. Bourguignon; +Cc: emacs-devel

"Pascal J. Bourguignon" <pjb@informatimago.com> writes:

> It's not a quarrel, it's just to situate your mindset: you learned
> another language first and you come to emacs lisp with that other
> language's mindset.

Elisp is the first language I learned.

>> I'm not interested in doing crazy stuff, I'm interested in
>>
>> (substring "abc" 0 4 t)
>>   => "abc"
>>
>> where `t' is the value of an option third NOERROR argument.
>>
>> (substring "abc" 0 4) would still throw an error, so that
>> the change does not break any code.
>
> I would not advise that nonetheless.  It can bite you.  I tried things
> like that, and it bit me.  Better use a different name.

Why?

> With the current tools, you cannot easily use a different symbol named
> substring in emacs lisp, and you cannot easily and safely redefine the
> symbol substring, because other functions use it and may depend on its
> current behavior.  It's better to just use a different symbol bound to
> your own function in your own code.

Or to wait for the maintainers to agree with me :)

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16  7:14       ` Bastien
@ 2012-07-16 16:15         ` Stefan Monnier
  2012-07-16 16:22           ` Bastien
  2012-07-16 16:46           ` Bastien
  0 siblings, 2 replies; 42+ messages in thread
From: Stefan Monnier @ 2012-07-16 16:15 UTC (permalink / raw)
  To: Bastien; +Cc: Juanma Barranquero, emacs-devel

>> Would it be handy? Sometimes, perhaps. At other times, having
>> substring check that TO is indeed in range is quite useful. 
> I would be happy with a third optional argument FAIL-SILENTLY.

We have general functionality when you want to ignore some errors, such
as condition-case.
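
A minimal sketch of that approach, assuming a hypothetical wrapper named
`my-substring-or-nil' (it is not part of Emacs).  It turns the range
error into a nil return value and lets every other error through:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-substring-or-nil (string from &optional to)
  "Return the substring of STRING from FROM to TO, or nil if out of range."
  (condition-case nil
      (substring string from to)
    ;; Only the range error is caught; a wrong-type-argument error,
    ;; for example, still propagates to the caller.
    (args-out-of-range nil)))

(my-substring-or-nil "abc" 0 2)   ; => "ab"
(my-substring-or-nil "abc" 0 4)   ; => nil
```

Note that this catches every out-of-range case alike, including a FROM
index that is larger than TO.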


        Stefan




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 16:15         ` Stefan Monnier
@ 2012-07-16 16:22           ` Bastien
  2012-07-16 16:46           ` Bastien
  1 sibling, 0 replies; 42+ messages in thread
From: Bastien @ 2012-07-16 16:22 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Juanma Barranquero, emacs-devel

Stefan Monnier <monnier@IRO.UMontreal.CA> writes:

>>> Would it be handy? Sometimes, perhaps. At other times, having
>>> substring check that TO is indeed in range is quite useful. 
>> I would be happy with a third optional argument FAIL-SILENTLY.
>
> We have general functionality when you want to ignore some errors, such
> as condition-case.

I know, thanks.  My idea is that preventing this error is expected
enough to deserve a new optional argument.  Just as `re-search-forward'
has one, for example.

I can live with the current `substring', but I was being curious.

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 16:15         ` Stefan Monnier
  2012-07-16 16:22           ` Bastien
@ 2012-07-16 16:46           ` Bastien
  2012-07-16 17:57             ` Tassilo Horn
                               ` (2 more replies)
  1 sibling, 3 replies; 42+ messages in thread
From: Bastien @ 2012-07-16 16:46 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Juanma Barranquero, emacs-devel

Stefan Monnier <monnier@IRO.UMontreal.CA> writes:

> We have general functionality when you want to ignore some errors, such
> as condition-case.

Also, I'm fine with

  (substring "abc" -1 1)
    => #ERROR

so using `condition-case' would not help me distinguish
between the case above and (substring "abc" 0 4), which
is what I want.

I see the benefit of having 

  (substring "abc" 0 4)
    => "abc"

in terms of simplifying Elisp writing -- and I still fail
to see the harm (but maybe Pascal will tell me where he has
been bitten by this.)

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 16:46           ` Bastien
@ 2012-07-16 17:57             ` Tassilo Horn
  2012-07-16 18:51               ` Lars Magne Ingebrigtsen
  2012-07-16 19:25               ` Bastien
  2012-07-16 20:19             ` Pascal J. Bourguignon
  2012-07-16 20:30             ` Stefan Monnier
  2 siblings, 2 replies; 42+ messages in thread
From: Tassilo Horn @ 2012-07-16 17:57 UTC (permalink / raw)
  To: Bastien; +Cc: Juanma Barranquero, Stefan Monnier, emacs-devel

Bastien <bzg@gnu.org> writes:

>> We have general functionality when you want to ignore some errors,
>> such as condition-case.
>
> Also, I'm fine with
>
>   (substring "abc" -1 1)
>     => #ERROR

I don't see why that justifies an error and (substring "abc" 0 4) does
not.  -1 is a valid FROM index meaning the length of the string minus
one.  It's just that the TO index is smaller than FROM here, but IMO
that's the same class of errors as a too large TO index.

> so using `condition-case' would not help me distinguish
> between the case above and (substring "abc" 0 4), which
> is what I want.

A condition-case handler has access to the args given to the erroring
form, so the cases could be distinguished although both signal
args-out-of-range.  Well, not that it would help you much here.
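
A rough sketch of how a handler could tell the two cases apart, assuming
a hypothetical helper named `my-substring-clip-to' and assuming the
error data has the shape (STRING FROM TO).  It clips only when TO is
past the end and re-signals everything else:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-substring-clip-to (string from to)
  "Like `substring', but clip TO when it exceeds the length of STRING.
Any other range problem still signals `args-out-of-range'."
  (condition-case err
      (substring string from to)
    (args-out-of-range
     ;; ERR is assumed to look like (args-out-of-range STRING FROM TO).
     (let ((bad-to (nth 3 err)))
       (if (and (integerp bad-to) (> bad-to (length string)))
           ;; TO was too large: retry with TO omitted (end of STRING).
           (substring string from)
         ;; Something else was wrong: re-signal the original error.
         (signal (car err) (cdr err)))))))

(my-substring-clip-to "abc" 0 4)    ; => "abc"
(my-substring-clip-to "abc" -1 1)   ; signals args-out-of-range
```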

> I see the benefit of having 
>
>   (substring "abc" 0 4)
>     => "abc"
>
> in terms of simplifying Elisp writing -- and I still fail
> to see the harm (but maybe Pascal will tell me where he has
> been bitten by this.)

In my experience, out-of-range indices into strings or arrays are
almost always programming errors.  I'm not even able to come up with
some concrete use-case where I'd like to have the suggested behavior.
Either I know exactly what I'm operating on and use indices, or I have
only some general assumptions and then use more fuzzy things like
splitting by regular expressions.

BTW: C++ string's substr method doesn't quite have the suggested
behavior.  Its arguments are not two indices, but one index and one
length n (the number of characters that should be returned).  If the
index is out of range, you'll get an out_of_range exception.  n however
may span longer than the rest of the string in which case the returned
string is shorter than the given length n.  But that's a different
story: indexes have to be in range.

Ditto for Ruby: String::slice also gets an index and a length, not two
indices.
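
For comparison, the index-plus-length convention described above can be
emulated in Emacs Lisp.  This is only a sketch with a hypothetical name,
`my-substr', and it assumes a non-negative START; it is not how C++ or
Ruby implement it:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-substr (string start n)
  "Return at most N characters of STRING starting at index START.
START must be a valid index; N is a length and is clipped."
  (substring string start (min (+ start n) (length string))))

(my-substr "abc" 1 10)  ; => "bc"
(my-substr "abc" 4 1)   ; signals args-out-of-range
```

The length may overshoot the end of the string, but the index may not.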

Bye,
Tassilo




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 17:57             ` Tassilo Horn
@ 2012-07-16 18:51               ` Lars Magne Ingebrigtsen
  2012-07-16 19:30                 ` Bastien
                                   ` (2 more replies)
  2012-07-16 19:25               ` Bastien
  1 sibling, 3 replies; 42+ messages in thread
From: Lars Magne Ingebrigtsen @ 2012-07-16 18:51 UTC (permalink / raw)
  To: Tassilo Horn; +Cc: Bastien, Juanma Barranquero, Stefan Monnier, emacs-devel

Tassilo Horn <tassilo@member.fsf.org> writes:

> I'm not even able to come up with some concrete use-case where I'd
> like to have the suggested behavior.

It's a very common use case for me.  I know that a string can't be
longer than X for some particular use, so I have to say

(insert (if (> (length string) 4)
            (substring string 0 4)
          string))

or something.          

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
@ 2012-07-16 19:00 Dmitry Gutov
  2012-07-16 19:51 ` Tassilo Horn
  0 siblings, 1 reply; 42+ messages in thread
From: Dmitry Gutov @ 2012-07-16 19:00 UTC (permalink / raw)
  To: tassilo; +Cc: bzg, lekktu, monnier, emacs-devel

Tassilo Horn <tassilo@member.fsf.org> writes:
 > BTW: C++ string's substr method doesn't quite have the suggested
 > behavior.  Its arguments are not two indices, but one index and one
 > length n (the number of characters that should be returned).  If the
 > index is out of range, you'll get an out_of_range exception.  n however
 > may span longer than the rest of the string in which case the returned
 > string is shorter than the given length n.  But that's a different
 > story: indexes have to be in range.
 >
 > Ditto for Ruby: String::slice also gets an index and a length, not two
 > indices.

True, but it also accepts a range as its parameter. Neither form
raises an error:

irb(main):007:0> "abc".slice(1, 10)
=> "bc"
irb(main):008:0> "abc".slice(0..1)
=> "ab"

Same thing with JavaScript:

--
[22:51:01.141] "abc".substring(0,1)
[22:51:01.147] "a"
[22:51:03.341] "abc".substring(1,2)
[22:51:03.345] "b"
[22:51:13.902] "abc".substring(1,20)
[22:51:13.905] "bc"
[22:52:00.768] "abc".substr(2,1)
[22:52:00.772] "c"

So far I don't see another language, aside from Emacs Lisp, that
interprets a negative value of the second index as "count from the end", yet
raises an "out of range" error if that value is too big.
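
The combination can be seen directly in Emacs Lisp.  A small sketch of
the three cases, using only `substring' and `condition-case':

```elisp
(substring "abc" 0 -1)   ; => "ab"  (TO counts from the end)
(substring "abc" -2)     ; => "bc"  (FROM counts from the end)

;; A TO index beyond the end is not clipped; it signals an error.
(condition-case err
    (substring "abc" 0 4)
  (args-out-of-range (car err)))   ; => args-out-of-range
```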

--Dmitry




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 17:57             ` Tassilo Horn
  2012-07-16 18:51               ` Lars Magne Ingebrigtsen
@ 2012-07-16 19:25               ` Bastien
  2012-07-16 19:43                 ` Bastien
  1 sibling, 1 reply; 42+ messages in thread
From: Bastien @ 2012-07-16 19:25 UTC (permalink / raw)
  To: Tassilo Horn; +Cc: Juanma Barranquero, Stefan Monnier, emacs-devel

Hi Tassilo,

Tassilo Horn <tassilo@member.fsf.org> writes:

> Bastien <bzg@gnu.org> writes:
>
>>> We have general functionality when you want to ignore some errors,
>>> such as condition-case.
>>
>> Also, I'm fine with
>>
>>   (substring "abc" -1 1)
>>     => #ERROR
>
> I don't see why that justifies an error and (substring "abc" 0 4) does
> not.  -1 is a valid FROM index meaning the length of the string minus
> one.  It's just that the TO index is smaller than FROM here, but IMO
> that's the same class of errors as a too large TO index.

I would distinguish problems caused by only one index from those
caused by the relationship between two indices.  Just nit-picking.

>> so using `condition-case' would not help me distinguish
>> between the case above and (substring "abc" 0 4), which
>> is what I want.
>
> A condition-case handler has access to the args given to the erroring
> form, so the cases could be distinguished although both signal
> args-out-of-range.  Well, not that it would help you much here.

Fair enough.

>> I see the benefit of having 
>>
>>   (substring "abc" 0 4)
>>     => "abc"
>>
>> in terms of simplifying Elisp writing -- and I still fail
>> to see the harm (but maybe Pascal will tell me where he has
>> been bitten by this.)
>
> In my experience, out-of-range indices into strings or arrays are
> almost always programming errors.  I'm not even able to come up with
> some concrete use-case where I'd like to have the suggested behavior.
> Either I know exactly what I'm operating on and use indices, or I have
> only some general assumptions and then use more fuzzy things like
> splitting by regular expressions.

I get your point about programming errors.  And I think it's just
part of my brain that cannot help thinking of the third argument as
a _length_ instead of an _index_.  I now see how allowing it to fail
quietly would make this misconception more common.

How about these two defun-future-defsubst candidates?

(defun string-head (string n)
  "Return N characters starting from the beginning of STRING.
If N is larger than the length of STRING, return it."
  (substring string 0 (min n (length string))))

(defun string-tail (string n)
  "Return N characters starting from the end of STRING.
If N is larger than the length of STRING, return it."
  (let* ((l (length string))
	 (s (- l n))
	 (d (if (> s 0) s 0)))
    (substring string d l)))

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 18:51               ` Lars Magne Ingebrigtsen
@ 2012-07-16 19:30                 ` Bastien
  2012-07-16 19:30                 ` Tassilo Horn
  2012-07-16 20:20                 ` Pascal J. Bourguignon
  2 siblings, 0 replies; 42+ messages in thread
From: Bastien @ 2012-07-16 19:30 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen
  Cc: Juanma Barranquero, Tassilo Horn, Stefan Monnier, emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Tassilo Horn <tassilo@member.fsf.org> writes:
>
>> I'm not even able to come up with some concrete use-case where I'd
>> like to have the suggested behavior.
>
> It's a very common use case for me.  I know that a string can't be
> longer than X for some particular use, so I have to say
>
> (insert (if (> (length string) 4)
>             (substring string 0 4)
>           string))

Well, see `string-head' and `string-tail' in my other message.

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 18:51               ` Lars Magne Ingebrigtsen
  2012-07-16 19:30                 ` Bastien
@ 2012-07-16 19:30                 ` Tassilo Horn
  2012-07-16 20:20                 ` Pascal J. Bourguignon
  2 siblings, 0 replies; 42+ messages in thread
From: Tassilo Horn @ 2012-07-16 19:30 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen
  Cc: Bastien, Juanma Barranquero, Stefan Monnier, emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

>> I'm not even able to come up with some concrete use-case where I'd
>> like to have the suggested behavior.
>
> It's a very common use case for me.  I know that a string can't be
> longer than X for some particular use, so I have to say
>
> (insert (if (> (length string) 4)
>             (substring string 0 4)
>           string))
>
> or something.          

Well, in those cases I'd just use (format "%.4s" string), possibly with
"%4.4s" / "%-4.4s" if I also want padding to generate tabular-like
aligned output.
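
A short sketch of what those format directives do.  The precision (the
number after the dot) truncates the string, and the field width pads it:

```elisp
(format "%.4s" "abcdef")   ; => "abcd"  (truncated to 4 characters)
(format "%.4s" "ab")       ; => "ab"    (shorter strings are unchanged)
(format "%4.4s" "ab")      ; => "  ab"  (padded on the left to width 4)
(format "%-4.4s" "ab")     ; => "ab  "  (padded on the right to width 4)
```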

In any case, IMHO the little drawback in conciseness is worth the added
safety.  At least for me, whenever I have an index wrong it's because I
had false assumptions on my data (or because of my general arithmetic
disorder).  And then I'm happy if the error occurs as soon as possible
rather than having wrong results or an error later on where it might be
much harder to find the real cause.

Of course, that's no argument against another optional parameter to
substring to make it explicit that the TO index might be larger than the
length of the string.

Bye,
Tassilo




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 19:25               ` Bastien
@ 2012-07-16 19:43                 ` Bastien
  0 siblings, 0 replies; 42+ messages in thread
From: Bastien @ 2012-07-16 19:43 UTC (permalink / raw)
  To: Tassilo Horn; +Cc: Juanma Barranquero, Stefan Monnier, emacs-devel

Bastien <bzg@gnu.org> writes:

> (defun string-tail (string n)
>   "Return N characters starting from the end of STRING.
> If N is larger than the length of STRING, return it."
>   (let* ((l (length string))
> 	 (s (- l n))
> 	 (d (if (> s 0) s 0)))
>     (substring string d l)))

Or simpler:

(defun string-tail (string n)
  (let ((l (length string)))
    (if (> n l)
	string
      (substring string (- l n) l))))

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 15:49         ` Bastien
@ 2012-07-16 19:49           ` Thien-Thi Nguyen
  2012-07-16 22:32             ` Bastien
  0 siblings, 1 reply; 42+ messages in thread
From: Thien-Thi Nguyen @ 2012-07-16 19:49 UTC (permalink / raw)
  To: Bastien; +Cc: Pascal J. Bourguignon, emacs-devel

() Bastien <bzg@gnu.org>
() Mon, 16 Jul 2012 17:49:49 +0200

   Bastien <bzg@gnu.org> writes:

   > I'm not interested in doing crazy stuff, I'm interested in
   >
   > (substring "abc" 0 4 t)
   >   => "abc"
   >
   > where `t' is the value of an option third NOERROR argument.
                                         ^^^^^

   Fourth, actually.

If you prefer this slackful ‘substring’ and would like to use it
everywhere, then in practice "extending" ‘substring’ has the same
cost as writing a personalized version.

How's that?

Imagine if ‘substring’ were to indeed be changed to support this
NOERROR argument.  Since you want to use it everywhere, you would
start converting all the callsites to include NOERROR ‘t’.  After
a bit, you'd look askance at this extra verbosity, and decide to
abstract it w/ a personalized function, say "subs".

  (defun subs (string beg end)
    (substring string beg end t))

"Ah, much better!" you think.  Well i would agree.  But now that
you have ‘subs’ (and use it everywhere), why precisely do you need
the extended ‘substring’ anymore?  You could have saved the effort
of going north 1 then east 1 by going northeast 1.414...

  (defun subs (string beg end)
    (substring string beg (and (< end (length string))
                               end)))

...in the first place.  Overall, the design principle is that you
want the platform to be firm and the performer flexible, not the
other way around.  The audience may be pleased either way, but as
producer, you want the roadies to sleep better lest they revolt...
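
A possible variant of the second definition, sketched under a
hypothetical name, `my-subs', so that END can also be omitted as with
`substring' itself.  It relies on `substring' treating a nil END as the
end of the string:

```elisp
;; Hypothetical variant of `subs', for illustration only.
(defun my-subs (string beg &optional end)
  "Like `substring', but treat an END past the end of STRING as the end."
  ;; `and' yields END only when it is non-nil and in range; otherwise
  ;; it yields nil, which `substring' reads as "up to the end".
  (substring string beg (and end (< end (length string)) end)))

(my-subs "abc" 0 4)    ; => "abc"
(my-subs "abc" 0 2)    ; => "ab"
(my-subs "abc" 1)      ; => "bc"
```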




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 19:00 Dmitry Gutov
@ 2012-07-16 19:51 ` Tassilo Horn
  0 siblings, 0 replies; 42+ messages in thread
From: Tassilo Horn @ 2012-07-16 19:51 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: bzg, lekktu, monnier, emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

>> Ditto for Ruby: String::slice also gets an index and a length, not two
>> indices.
>
> True, but it also accepts a range as its parameter. Neither form
> raises an error:
>
> irb(main):007:0> "abc".slice(1, 10)
> => "bc"
> irb(main):008:0> "abc".slice(0..1)
> => "ab"

I just wanted to point out that the (int, int) version specifies its
second argument to be a length, not an index.

But, yes, the version that gets a range argument with the meaning of
taking the range's first and last value as start/end indices invalidates
my argument that indexes are usually strict quite a bit.

> Same thing with JavaScript:

Yes, but JS explicitly says that both arguments are indexes, just like
Elisp's substring.  So JS's substring function is really different while
C++ and Ruby are not really comparable because their arguments have a
different meaning.

But JS is different (unexpected) in many other ways, too:

  https://www.destroyallsoftware.com/talks/wat/

> So far I don't see another language, aside from Emacs Lisp, that
> interprets a negative value of the second index as "count from the end", yet
> raises an "out of range" error if that value is too big.

Neither do I.

Bye,
Tassilo




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 16:46           ` Bastien
  2012-07-16 17:57             ` Tassilo Horn
@ 2012-07-16 20:19             ` Pascal J. Bourguignon
  2012-07-16 20:30             ` Stefan Monnier
  2 siblings, 0 replies; 42+ messages in thread
From: Pascal J. Bourguignon @ 2012-07-16 20:19 UTC (permalink / raw)
  To: emacs-devel

Bastien <bzg@gnu.org> writes:

> Stefan Monnier <monnier@IRO.UMontreal.CA> writes:
>
>> We have general functionality when you want to ignore some errors, such
>> as condition-case.
>
> Also, I'm fine with
>
>   (substring "abc" -1 1)
>     => #ERROR
>
> so using `condition-case' would not help me distinguish
> between the case above and (substring "abc" 0 4), which
> is what I want.
>
> I see the benefit of having 
>
>   (substring "abc" 0 4)
>     => "abc"
>
> in terms of simplifying Elisp writing -- and I still fail
> to see the harm (but maybe Pascal will tell me where he has
> been bitten by this.)

There would have been no harm if the language/library had been designed
that way.  It's arbitrary.  But since it has been designed the other
way, there would be harm if that changed.  There is a ton of code that
expects the original behavior.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.





* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 18:51               ` Lars Magne Ingebrigtsen
  2012-07-16 19:30                 ` Bastien
  2012-07-16 19:30                 ` Tassilo Horn
@ 2012-07-16 20:20                 ` Pascal J. Bourguignon
  2 siblings, 0 replies; 42+ messages in thread
From: Pascal J. Bourguignon @ 2012-07-16 20:20 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Tassilo Horn <tassilo@member.fsf.org> writes:
>
>> I'm not even able to come up with some concrete use-case where I'd
>> like to have the suggested behavior.
>
> It's a very common use case for me.  I know that a string can't be
> longer than X for some particular use, so I have to say
>
> (insert (if (> (length string) 4)
>             (substring string 0 4)
>           string))

(insert (substring string 0 (min 4 (length string))))

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.





* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 16:46           ` Bastien
  2012-07-16 17:57             ` Tassilo Horn
  2012-07-16 20:19             ` Pascal J. Bourguignon
@ 2012-07-16 20:30             ` Stefan Monnier
  2012-07-16 22:28               ` Lennart Borgman
  2 siblings, 1 reply; 42+ messages in thread
From: Stefan Monnier @ 2012-07-16 20:30 UTC (permalink / raw)
  To: Bastien; +Cc: Juanma Barranquero, emacs-devel

> I see the benefit of having 

>   (substring "abc" 0 4)
>     => "abc"

> in terms of simplifying Elisp writing -- and I still fail
> to see the harm (but maybe Pascal will tell me where he has
> been bitten by this.)

I don't think such a semantics is harmful, indeed.  It's just different,
and I don't think this issue affects enough code that it's worth
changing from one to the other, nor is it worth adding a special
argument for it.


        Stefan




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 20:30             ` Stefan Monnier
@ 2012-07-16 22:28               ` Lennart Borgman
  2012-07-16 22:48                 ` Bastien
  0 siblings, 1 reply; 42+ messages in thread
From: Lennart Borgman @ 2012-07-16 22:28 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Bastien, Juanma Barranquero, emacs-devel

On Mon, Jul 16, 2012 at 10:30 PM, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
>> I see the benefit of having
>
>>   (substring "abc" 0 4)
>>     => "abc"
>
>> in terms of simplifying Elisp writing -- and I still fail
>> to see the harm (but maybe Pascal will tell me where he has
>> been bitten by this.)
>
> I don't think such a semantics is harmful, indeed.  It's just different,
> and I don't think this issue affects enough code that it's worth
> changing from one to the other, nor is it worth adding a special
> argument for it.

I really disagree. If someone takes the effort to implement the added
argument then please accept it as part of Emacs. (If no one does then
just let it be.)




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 19:49           ` Thien-Thi Nguyen
@ 2012-07-16 22:32             ` Bastien
  0 siblings, 0 replies; 42+ messages in thread
From: Bastien @ 2012-07-16 22:32 UTC (permalink / raw)
  To: Thien-Thi Nguyen; +Cc: Pascal J. Bourguignon, emacs-devel

Thien-Thi Nguyen <ttn@gnuvola.org> writes:

> Overall, the design principle is that you
> want the platform to be firm and the performer flexible, not the
> other way around.  The audience may be pleased either way, but as
> producer, you want the roadies to sleep better lest they revolt...

Got it.  I'll sleep on this.

Now that the social pressure of nice Emacs fellows convinced me 
there was something wrong in my expectations, I still vote for
including `string-head' and `string-tail' (which see).

I see two reasons for not including them: one would be because
they are useless; the other because they would inevitably make 
someone request a new argument ERROR to throw an error when the
length of the head/tail is bigger than the string... but who
would be such a twisted mind?

Anyway, no more argument for tonight.

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 22:28               ` Lennart Borgman
@ 2012-07-16 22:48                 ` Bastien
  2012-07-16 22:53                   ` Lennart Borgman
  0 siblings, 1 reply; 42+ messages in thread
From: Bastien @ 2012-07-16 22:48 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: Juanma Barranquero, Stefan Monnier, emacs-devel

Lennart Borgman <lennart.borgman@gmail.com> writes:

> I really disagree. 

Would `string-head' suit your needs?

I find that most of the use-cases I have for my slackful
error-free substring version are ones where the first index is 0.

So this is the same as `string-head', really.

`string-head' is also what Lars would need in the example he
gave.

-- 
 Bastien




* Re: Why (substring "abc" 0 4) does not return "abc" instead of an error?
  2012-07-16 22:48                 ` Bastien
@ 2012-07-16 22:53                   ` Lennart Borgman
  0 siblings, 0 replies; 42+ messages in thread
From: Lennart Borgman @ 2012-07-16 22:53 UTC (permalink / raw)
  To: Bastien; +Cc: Juanma Barranquero, Stefan Monnier, emacs-devel

On Tue, Jul 17, 2012 at 12:48 AM, Bastien <bzg@gnu.org> wrote:
> Lennart Borgman <lennart.borgman@gmail.com> writes:
>
>> I really disagree.
>
> Would `string-head' suit your needs?
>
> I find that most of the use-cases I have for my slackful
> error-free substring version are ones where the first index is 0.
>
> So this is the same as `string-head', really.
>
> `string-head' is also what Lars would need in the example he
> gave.

I think I like your suggestion with a fourth argument to substring better.


