From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Steve Yegge <steve.yegge@gmail.com>
Newsgroups: gmane.emacs.devel
Subject: Re: proposal to make null string handling more emacs-y
Date: Thu, 26 Apr 2012 21:17:36 -0700
Message-ID: <CAGtm15=vkZ6K+Q246J4nx1rob_UjjQbwH_ZkrNJa7qdGr46jxQ@mail.gmail.com>
References: <CAGtm15=Wb1Uv_WvYKQ7hMBkb=2Y1qUJjGnFFkBj6fcPotgXV8g@mail.gmail.com>
	<83d36wfcf1.fsf@gnu.org> <jwvwr53j3i9.fsf-monnier+emacs@gnu.org>
	<834ns7g9r8.fsf@gnu.org> <jwvliljj0er.fsf-monnier+emacs@gnu.org>
	<CAGtm15=gkeHkVQdau0ywcBC6tHVLF4pHYzVhDO5B4Nob23z_zg@mail.gmail.com>
	<jwv1unaf0e8.fsf-monnier+emacs@gnu.org>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary=20cf305b12665ea39e04bea16204
X-Trace: dough.gmane.org 1335500274 32067 80.91.229.3 (27 Apr 2012 04:17:54 GMT)
X-Complaints-To: usenet@dough.gmane.org
NNTP-Posting-Date: Fri, 27 Apr 2012 04:17:54 +0000 (UTC)
Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
To: Stefan Monnier <monnier@iro.umontreal.ca>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Fri Apr 27 06:17:53 2012
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by plane.gmane.org with esmtp (Exim 4.69)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1SNccw-00030q-1Q
	for ged-emacs-devel@m.gmane.org; Fri, 27 Apr 2012 06:17:50 +0200
Original-Received: from localhost ([::1]:46392 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1SNccv-00047i-8i
	for ged-emacs-devel@m.gmane.org; Fri, 27 Apr 2012 00:17:49 -0400
Original-Received: from eggs.gnu.org ([208.118.235.92]:59060)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <steve.yegge@gmail.com>) id 1SNccq-00047R-NV
	for emacs-devel@gnu.org; Fri, 27 Apr 2012 00:17:46 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <steve.yegge@gmail.com>) id 1SNcco-0006WA-DJ
	for emacs-devel@gnu.org; Fri, 27 Apr 2012 00:17:44 -0400
Original-Received: from mail-yw0-f41.google.com ([209.85.213.41]:52505)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <steve.yegge@gmail.com>)
	id 1SNcck-0006Vo-Dv; Fri, 27 Apr 2012 00:17:38 -0400
Original-Received: by yhr47 with SMTP id 47so194026yhr.0
	for <multiple recipients>; Thu, 26 Apr 2012 21:17:36 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type;
	bh=G+nsPhwDXy0yDXVfwvTKh5NHZ3hMswXz2TD/2cNtRss=;
	b=ly0hGPV7Ib0pFZqJV9k35qIPr14Lrq0JfTVFlgqO7DUuNUpuhgQAX8RN7JTpo/I7yo
	PG9lUEQYvcRoz/zSLmZ2p4PJWpztI4LE8xhI0LowSGdEfQvpZpVAoThew0CmPzTTuGDt
	Ty0IJ4HtOS48s5uMsL0OavPA2tVPc1Jh/jG4ofRbDXfwWYvvBy0uIt1zR8DHX9li5Gtp
	tA/DWrZAShv+ZUF0pA+bjmKXikNqYHCxjapLuxf3JKqXsATDlzhEY+38c/KRw5i47AKZ
	vIJq5IZlKpjA7i4bDjzfpnfcs3346ns3MIggoiH3VNyFI+zVckfPsmM+pUku9n6PYaMK
	/4jg==
Original-Received: by 10.236.181.39 with SMTP id k27mr9576559yhm.52.1335500256223; Thu,
	26 Apr 2012 21:17:36 -0700 (PDT)
Original-Received: by 10.236.156.72 with HTTP; Thu, 26 Apr 2012 21:17:36 -0700 (PDT)
In-Reply-To: <jwv1unaf0e8.fsf-monnier+emacs@gnu.org>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not
	recognized.
X-Received-From: 209.85.213.41
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:150079
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/150079>

--20cf305b12665ea39e04bea16204
Content-Type: text/plain; charset=ISO-8859-1

On Thu, Apr 26, 2012 at 6:10 PM, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> > There's a lot of code out there that's forced to do type assertions on
> > string args that could be simplified if these common functions could
> > accept nils.
>
> I can believe that (it's often handy to use "nil-in-nil-out" when
> composing functions which may return nil).


Indeed!


> But there's also the risk
> that you hide real errors, leading to weird behaviors that are more
> difficult to track down.
>

Yes, this is a risk.  However, it's happening right now, as we speak,
all the time.  When stuff works strangely, users find workarounds, and/or
ask developer lists to help them diagnose it.

When stuff doesn't work at all -- which is what happens when Emacs
starts throwing runtime errors -- then it's a lot harder for end-users to
find workarounds.  So it seems to me that we are optimizing for our own
convenience rather than that of end users.


> E.g. one package uses some other package's var before it's initialized,
> so with the current semantics you might get a clean error, whereas with
> your proposed semantics you might get some weird behavior where the user
> says
>
>  why doesn't bar find my thingy even though C-h v foo-var tells me it's
>  set to "/some/path"?
>

I think we need to be really careful here.  We're talking about these errors
as if they are type assertions in the classic type-theoretic sense, and we
are pointing at the well-documented benefits of early type warnings as a
justification for leaving the errors in place.

But type errors are for *developers*.  They are supposed to happen at
compile time.  When you're running the byte compiler or the unit tests, you
want it to fail loudly and early.  But when you ship the software, and it's
in the hands of end-users who may not be able to debug it, then unless
it's running a nuclear reactor or an airplane, you want the software to be
robust.  Your browser shouldn't crash because of a misbehaved site; your
web page shouldn't fail to load because of a misbehaved widget; your
CAD program shouldn't stop functioning because of a misbehaved polygon.
Yes, the resulting bug will be annoying, but it's far less annoying than
having your work interrupted altogether.

With that in mind, I'm really wondering what the big fear is here.  As I
said, I'm willing to concede that the generalized nil-always-acts-like-""
solution may be too risky or too intrusive.  But for the list of specific
functions that I followed up with, I think that for many of them it's quite
natural to assume that they'd take nils.

Here are examples of functions that already take nil as a string argument:

(string< nil "") => nil
(string= nil "") => nil
(concat "a" nil "b") => "ab" (yes, it's "effectively" taking nil, but it's convenient!)
(string-to-sequence nil 'vector) => []
(string-to-list nil) => nil

I don't see any of these creating world-ending bugs, contrary to predictions.
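
For illustration, here is a minimal sketch of the kind of guard callers
end up writing today around functions that do reject nil.  The helper name
my-string-or-empty is hypothetical (it is not part of Emacs), and this is
only one way to write it:

```elisp
;; Hypothetical helper, not part of Emacs: coerce nil to "" so that
;; string functions which reject nil can be composed without guards.
(defun my-string-or-empty (s)
  "Return S if it is a string, or \"\" if S is nil.
Signal `wrong-type-argument' for anything else, so that real type
errors are still reported."
  (cond ((null s) "")
        ((stringp s) s)
        (t (signal 'wrong-type-argument (list 'stringp s)))))

;; (upcase nil) and (string-match "foo" nil) both signal
;; wrong-type-argument; with the guard they behave like the "" case.
(upcase (my-string-or-empty nil))              ; => ""
(string-match "foo" (my-string-or-empty nil))  ; => nil
```

Note that the guard still signals on non-string, non-nil arguments, so
it only relaxes the nil case.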

Similarly, I see a whole bunch of file-name manipulation functions that
accept the empty string as an argument.  But the empty string has no
semantic meaning whatsoever as a valid filename.  Without any semantic
meaning, it's arguably a bug to pass one in.  Yet someone came up with
arbitrary definitions for how all these functions should handle an empty
path.
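
To make that concrete, here are a few of these calls with the results as I
understand the current behavior.  Treat them as a sketch and re-check with
M-: in your own build before relying on them:

```elisp
;; Each file-name function below accepts "" without signaling an error
;; and returns a conventional value instead.
(file-name-nondirectory "")   ; => ""
(file-name-directory "")      ; => nil
(file-name-as-directory "")   ; => "./"
```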

Let's suppose -- bear with me here -- that long ago a mechanism had been
introduced into Emacs to prevent passing the empty string ("") to file-name
manipulation functions, because it's not a valid path on any system.  So
all file-name functions would signal an "invalid path" error when passed
the empty string.  And let's say I came along and proposed that we ought
to accept the empty string for both user and programmer convenience, as
it is clearly possible to give it reasonable semantics for the entire set of
file-name functions.

In this hypothetical (yet quite similar) scenario, I can guarantee you that
people would be up in arms about how type assertions -- in this case, the
assertion that the path is well-formed by virtue of being nonempty -- can
help find all sorts of errors that might otherwise go undetected.  And there
would be dire predictions about introducing difficult-to-diagnose bugs.
C'mon... we all know this is what would happen.

But the empty strings are just fine.  And the nils will be too.  I promise! =)

-steve

--20cf305b12665ea39e04bea16204--