From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Daniel Colascione <dancol@dancol.org>
Newsgroups: gmane.emacs.devel
Subject: Re: [Emacs-diffs] trunk r114593: * lisp.h (eassert): Don't
	use	'assume'.
Date: Fri, 11 Oct 2013 02:55:44 -0700
Message-ID: <5257CB20.4030809@dancol.org>
References: <E1VTxwB-0001h8-7E@vcs.savannah.gnu.org>
	<52576305.9000703@dancol.org> <52579C68.1040904@cs.ucla.edu>
	<83iox4pa0w.fsf@gnu.org> <5257AB8C.40309@dancol.org>
	<83eh7sp6v0.fsf@gnu.org> <5257B489.2050609@dancol.org>
	<83k3hkrxao.fsf@gnu.org> <5257C27B.9090400@dancol.org>
	<83hacorvww.fsf@gnu.org>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Trace: ger.gmane.org 1381485426 22544 80.91.229.3 (11 Oct 2013 09:57:06 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Fri, 11 Oct 2013 09:57:06 +0000 (UTC)
Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org
To: Eli Zaretskii <eliz@gnu.org>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8;
	rv:24.0) Gecko/20100101 Thunderbird/24.0
In-Reply-To: <83hacorvww.fsf@gnu.org>
Xref: news.gmane.org gmane.emacs.devel:164084
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/164084>

On 10/11/13 2:36 AM, Eli Zaretskii wrote:
>> Date: Fri, 11 Oct 2013 02:18:51 -0700
>> From: Daniel Colascione <dancol@dancol.org>
>> CC: eggert@cs.ucla.edu, emacs-devel@gnu.org
>>
>>> That is a hypothetical situation.  In Emacs, the code is already
>>> written on the assumption that *b != 0, so it is already "optimized"
>>> for that.
>>
>> While the programmer may have written her C code under the assumption
>> that an asserted condition holds, the compiler can't know that the
>> asserted condition holds when generating its machine code. The point of
>> the assume mechanism is to provide this information to the compiler.
>
> The compiler will be unable to take advantage of that information,
> because there's no source code to apply that information to.

I don't understand this argument. In my example, the assume would inform 
the compiler that it didn't have to emit code to handle division in the 
case that *b is zero. Similar principles apply to assertions about 
numeric ranges, pointer non-NULL-ness, etc. The C99 restrict keyword is 
similar, in a sense.
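
To make that concrete, here is a minimal sketch. The names (assume, 
checked_div, scaled_div) are made up for illustration and are not the 
definitions in lisp.h; the macro assumes a compiler that provides 
__builtin_unreachable, such as GCC or Clang. Whether the zero check is 
actually dropped depends on the compiler and optimization level.

```c
/* Hypothetical 'assume' for illustration; not the lisp.h definition.
   Relies on __builtin_unreachable (GCC/Clang); a no-op elsewhere.  */
#if defined __GNUC__
# define assume(cond) ((cond) ? (void) 0 : __builtin_unreachable ())
#else
# define assume(cond) ((void) 0)
#endif

/* General-purpose helper with a defensive path for a zero divisor.  */
static inline int
checked_div (int a, const int *b)
{
  if (*b == 0)
    return 0;			/* defensive path */
  return a / *b;
}

/* The caller knows *B is nonzero and tells the compiler so.  Once
   checked_div is inlined, the optimizer is allowed to treat the
   zero check as dead code.  */
int
scaled_div (int a, const int *b)
{
  assume (*b != 0);
  return checked_div (a, b);
}
```

The C source is the same with or without the assume; only the 
information available to the optimizer changes.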

>>> In most cases, you won't see any code that can be optimized
>>> out using this assumption, as the programmer already did that --
>>> that's why she added the assertion in the first place.
>>
>> At the C level, not the code generation level.
>
> Code is generated from the C code, not out of thin air.

And we're talking about giving the compiler more information about the C 
code.

>>> You are obviously thinking about a different reason of using
>>> assertions: to assert certain invariants in the code and enforce
>>> contracts of certain APIs.  But that is almost never the case in
>>> Emacs, because (gasp!) almost every Emacs API has no well-defined
>>> contract at all!
>>
>> Aren't these two cases actually the same?
>
> No, they are not.  E.g., if the programmer's assumption was wrong, and
> there are in fact valid use cases where her assertion doesn't hold,
> then in production code (where the assertions are defined away) your
> 'assume' will degrade the code quality for legitimate use cases.

Any assertion that might not actually hold is a bug. That a programmer 
might, after an assertion, insert code to act sanely in the case that an 
assertion does not hold is irrelevant: it's reacting to an error, not 
dealing with a situation that ought to occur in practice. If the 
programmer really expects an assertion not to hold, he can use a variant 
that doesn't assume the condition holds. But these cases should be rare.

>>>>> , and lumping them
>>>>> together into a single construct is likely to continue bumping upon
>>>>> problems that stem from basic incompatibility between the use cases,
>>>>> which target two different non-overlapping build types.
>>>>
>>>> Only when we have side effects.
>>>
>>> For now, yes.  I'm afraid that's just the tip of the iceberg, though.
>>
>> What other problems can you imagine?
>
> How should I know?  Does anyone know under which conditions, exactly,
> a badly engineered bridge will collapse?

So this point boils down to "I have a bad feeling about this"?

>>> The problem is to make sure an assertion obviously _is_ free of side
>>> effects.  With Emacs's massive use of macros, which call other macros,
>>> which call other macros, which... -- that is extremely hard.  And why
>>> should a programmer who just wants to assert something go to such
>>> great lengths?  That is just a maintenance burden for which I find no
>>> good justification.
>>
>> What great lengths? Most common macros --- things like EQ --- are
>> clearly free of side effects.
>
> There are a lot of macros much more complex than EQ.

So don't use the assume-and-assert macros for questionable cases.
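
For reference, a sketch of the side effect hazard we're talking about. 
All names here are hypothetical, and the assume definition again relies 
on GCC/Clang's __builtin_unreachable. An assertion that is compiled out 
never evaluates its argument, while an assume built this way does, so 
the two differ exactly when the condition has side effects.

```c
/* Hypothetical macros for illustration only.  */
#define assert_disabled(cond) ((void) 0)	/* COND is not evaluated */
#define assume(cond) ((cond) ? (void) 0 : __builtin_unreachable ())

static int calls;		/* number of times the check ran */

/* A condition with a side effect: it bumps a counter.  */
static int
check_and_count (int x)
{
  calls++;
  return x > 0;
}

/* Return how many times the condition was evaluated.  X must be
   positive, otherwise the assume below is violated.  */
int
count_evaluations (int x)
{
  calls = 0;
  assert_disabled (check_and_count (x));	/* skipped entirely */
  assume (check_and_count (x));			/* evaluated once */
  return calls;
}
```

A condition like EQ (a, b) has no such counter to bump, which is why 
simple cases are safe to assume.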

>> The more exotic assertions probably aren't worth assuming anyway.
>
> Not sure I understand what you are saying here.

I'm speculating that the optimization value to be gained from assuming 
very complex conditions is smaller than the value gained from assuming 
relatively simple conditions.

>> If GCC had some builtin that allowed us to determine whether an
>> expression was free of side effects, we could use that to make the
>> decision automatically at compile time. Until we get such a facility,
>> though, providing some kind of eassert_and_assume macro helps people
>> make the best of simple assertions while avoiding the side effect
>> problem for more complicated ones.
>
> I think you are wrong, but I guess I'm unable to convince you.

AFAICT, your opposition here boils down to the idea that there's some 
fundamental difference between, on one hand, statements we make about 
program behavior when we're debugging, and on the other hand, statements 
we make about program behavior when we're optimizing. Except for some 
cases involving deficiencies in the language-level mechanisms with which 
we make these statements, I don't see why we should regard these 
statements as different at all. To the extent these deficiencies force us 
to care about the difference --- the side effect issue --- we can get 
around the problem by using one macro to express "assert and assume" and 
another to express "just assert", since it's always safe to assert a 
statement we're assuming to be true, but not necessarily the other way 
around.
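
Roughly, the pair could look like the sketch below. This is an 
illustration under stated assumptions, not a patch: eassert_and_assume 
is the name floated earlier in this thread, while die_sketch, assume and 
first_byte are invented for the example, and keying the switch on 
ENABLE_CHECKING is my assumption about how it would be wired up.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical 'assume'; relies on __builtin_unreachable (GCC/Clang).  */
#if defined __GNUC__
# define assume(cond) ((cond) ? (void) 0 : __builtin_unreachable ())
#else
# define assume(cond) ((void) 0)
#endif

#ifdef ENABLE_CHECKING
/* Hypothetical failure handler for the checking build.  */
static void
die_sketch (const char *msg, const char *file, int line)
{
  fprintf (stderr, "%s:%d: assertion failed: %s\n", file, line, msg);
  abort ();
}
/* Checking build: both macros test the condition at run time.  */
# define eassert(cond) \
   ((cond) ? (void) 0 : die_sketch (#cond, __FILE__, __LINE__))
# define eassert_and_assume(cond) eassert (cond)
#else
/* Production build: plain assertions vanish without evaluating COND;
   assert-and-assume hands COND to the optimizer and evaluates it.  */
# define eassert(cond) ((void) 0)
# define eassert_and_assume(cond) assume (cond)
#endif

/* Example use with a condition that is clearly free of side effects.  */
int
first_byte (const unsigned char *p)
{
  eassert_and_assume (p != NULL);
  return p[0];
}
```

Existing eassert calls keep their current meaning; only call sites that 
opt in to eassert_and_assume need a side-effect-free condition.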

You could argue that having two macros instead of one imposes a 
maintenance burden and that there isn't a payoff sufficient to justify 
this burden, but I don't think the maintenance cost of having another 
macro is very large, especially if we leave existing assertions as they 
are and use the assume-and-assert macro only for cases that are clearly 
free of side effects.