unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* Verifying Toolchain Semantics
@ 2014-10-02 14:54 Ian Grant
  2014-10-03  6:23 ` Mark H Weaver
  0 siblings, 1 reply; 27+ messages in thread
From: Ian Grant @ 2014-10-02 14:54 UTC (permalink / raw)
  To: llvmdev, gcc@gcc.gnu.org, guile-devel, mlton-devel
  Cc: Linus Torvalds, Theo deRaadt, Richard Stallman, Markus Kuhn,
	schellr, John Harrison, monnier, Taylan Bayırlı

Dear programming language types,

I wrote this to try once again to explain what is the nature of the
problem that one would have in verifying the integrity of _any_
software toolchain, whether it is aimed ultimately at the production
of other software, or of hardware.

    http://livelogic.blogspot.com/2014/10/the-foundation-part-i.html

This three page text is ostensively about verifying the integrity of a
communications link operating over an un-trusted transport layer, but
a compiler is really a type of communications channel.

I am sure everyone still reading this has wondered about the
possibilities of using universal programming languages (universal in
the Church-Turing sense) as communications protocols. For example, one
could establish a point-to-point connection by writing a program which,
when run, output two more programs: one which, when run, outputs a
decoding pad for the next mesage one would transmit over that channel,
and the other the decoder which prints the message text together with
another program, which was the encoder for returning an
acknowledgement. Both endpoints would do this, and so the
programs/messages would be exchanged, with each one encoding the text
of the other.  Then these programs could include decisions not only
about the encoding of data, choice of one-time pads, etc. but perhaps
also the routes of messages, sending different parts via different
trusted routes over similar channels etc. etc. The variations are
endless, and limited only by the ingenuity of the programmers
communicating over those channels.

And really, I sorely pity anyone charged with organising any kind of
surveillance of a group of people who enjoy that sort of
game. Cracking "the code" would be practically impossible, because
there need never be any fixed concrete representation whatsoever of
the fundamental encoding as it is transmitted over the transport
medium: _all_ of the knowledge about the current and future encodings
can be sent over the previous "incarnations" of that and/or another
channel, and the encoding of channels thereby made non-deterministic:
this means that there could never be _in principle,_ any mechanical
process _whatsoever_ which could decode more than a few parts of any
of those messages. After this brief success, the poor would-be spy
would be right back at square one.

What I try to explain here is the essential distinction between what I
call _actual knowledge,_ as opposed to mere _represented knowledge,_
such as a password, or an SSL certificate, or the documentation for
some file format appearing on a web page. The distinction is that only
in the case of actual knowledge does one know _how_ and _why_ one
knows.

The motivation is the idea that by using actual rather than
represented knowledge, it is possible to construct such a trustworthy
system in practice. But there's a catch! The catch is that this will
only work for an organisation whose motives and governance are
completely open and transparent. This is because the technique relies
upon mutual trust, which is something that cannot exist without
openness and transparency. Bad guys just won't get it! To understand
why (in case it is not immediately obvious to you, that is) you will
need to read (or at least think carefully about) about how
error-detection would work in such a system.

The text consists of a title page with the abstract, and two full
pages. So it should be an easy read. I earlier sent out a nine page
document entitled GNU Thunder, in which I attempted to describe what I
consider to be essentially the same idea, but with the emphasis on
compilers and interpreters, rather than communications. The Thunder
text was a concrete suggestion for an implementation. This text
however is more abstract. But these two documents could be
considered to be complementary in two different senses.

I hope everyone enjoys this, and that it stimulates some interesting
thoughts, and subsequent discussion, and that those thoughts are all
good, and the discussion open and transparent. That way we could save
ourselves an awful lot of _really hairy_ metaprogramming!

Please feel free to copy either text and/or this message and pass it
around. Neither of these two texts are copyright, and the more people
that see them the better. Bad guys in particular need to know about
this much more than the good ones do.

Ian


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-02 14:54 Verifying Toolchain Semantics Ian Grant
@ 2014-10-03  6:23 ` Mark H Weaver
  2014-10-03  7:15   ` William ML Leslie
                     ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Mark H Weaver @ 2014-10-03  6:23 UTC (permalink / raw)
  To: Ian Grant; +Cc: guile-devel

Ian Grant <ian.a.n.grant@googlemail.com> writes:

> Dear programming language types,
>
> I wrote this to try once again to explain what is the nature of the
> problem that one would have in verifying the integrity of _any_
> software toolchain, whether it is aimed ultimately at the production
> of other software, or of hardware.
>
>     http://livelogic.blogspot.com/2014/10/the-foundation-part-i.html

I downloaded the PDF linked in that blog entry and attempted to view it
using Emacs's docview mode, which reported that the pdf->png process
died with a segfault.

It's ironic that someone who claims to be so concerned with security
steadfastly refuses to provide his most important essays in a simple,
transparent format.  Instead, he insists to distribute them in an opaque
format that can only be interpreted by a small handful of very complex
programs with a large attack surface.

For that matter, it's also interesting that someone concerned about
Thompson viruses would suggest that Guile should distribute it's
compiler in the form of pre-compiled intermediate C code (compiled from
Scheme) instead of bootstrapping from source code, in order to speed up
the compilation process.

I've wasted more time than I should have reading Ian's writings, looking
for an answer to this apparent contradiction in his views, and I haven't
found it.

While we're on the subject of paranoid theories, here's one for you:
maybe Ian Grant's true motive is to induce some of the most important
developers of free toolchains and the Linux kernel to load PDFs that
infect their computers with malware, in order to subvert our core
infrastructure.

Ian: tell me again, why do you refuse to distribute your essays in plain
text?  I read GNU Thunder and I don't remember seeing anything in there
that justifies the use of such a complex format.  As I recall, it's just
plain text anyway.

     Mark



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-03  6:23 ` Mark H Weaver
@ 2014-10-03  7:15   ` William ML Leslie
  2014-10-03 12:56     ` Taylan Ulrich Bayirli/Kammer
  2014-10-03  8:45   ` Nala Ginrut
  2014-10-05  1:35   ` Ian Grant
  2 siblings, 1 reply; 27+ messages in thread
From: William ML Leslie @ 2014-10-03  7:15 UTC (permalink / raw)
  To: guile-devel

[-- Attachment #1: Type: text/plain, Size: 929 bytes --]

On 3 October 2014 16:23, Mark H Weaver <mhw@netris.org> wrote:

> Instead, he insists to distribute them in an opaque
> format that can only be interpreted by a small handful of very complex
> programs with a large attack surface.
>

​Oh, interesting point.  Maybe we should define PDF as an abstract
semantics that we can convert into a wide range of equivalent document
layout languages?  If the attacker can't tell exactly what xsl-fo or dsssl
the tool will output, or what software you're using to render the result,
it will magically make it more difficult to attack!​

-- 
William Leslie

Notice:
Likely much of this email is, by the nature of copyright, covered under
copyright law.  You absolutely MAY reproduce any part of it in accordance
with the copyright law of the nation you are reading this in.  Any attempt
to DENY YOU THOSE RIGHTS would be illegal without prior contractual
agreement.

[-- Attachment #2: Type: text/html, Size: 1349 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-03  6:23 ` Mark H Weaver
  2014-10-03  7:15   ` William ML Leslie
@ 2014-10-03  8:45   ` Nala Ginrut
  2014-10-05  1:35   ` Ian Grant
  2 siblings, 0 replies; 27+ messages in thread
From: Nala Ginrut @ 2014-10-03  8:45 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Ian Grant, guile-devel

On Fri, 2014-10-03 at 02:23 -0400, Mark H Weaver wrote:
> Ian Grant <ian.a.n.grant@googlemail.com> writes:
> 
> > Dear programming language types,
> >
> > I wrote this to try once again to explain what is the nature of the
> > problem that one would have in verifying the integrity of _any_
> > software toolchain, whether it is aimed ultimately at the production
> > of other software, or of hardware.
> >
> >     http://livelogic.blogspot.com/2014/10/the-foundation-part-i.html
> 
> I downloaded the PDF linked in that blog entry and attempted to view it
> using Emacs's docview mode, which reported that the pdf->png process
> died with a segfault.
> 

I encountered error when I tried to open it yesterday, several times. So
it maybe not a data transmitting problem.

> It's ironic that someone who claims to be so concerned with security
> steadfastly refuses to provide his most important essays in a simple,
> transparent format.  Instead, he insists to distribute them in an opaque
> format that can only be interpreted by a small handful of very complex
> programs with a large attack surface.

Maybe the author want a better layout of an article. It's better to
provide multi formats include TXT. Some guys just want to get to known
the idea quickly, like me.
But I guess there're several maths formulas in the article, so maybe TXT
is not enough for it.



-- 
People who are really serious about software should make their own
hardware. -- Alan Kay

nalaginrut.com





^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-03  7:15   ` William ML Leslie
@ 2014-10-03 12:56     ` Taylan Ulrich Bayirli/Kammer
  2014-10-03 17:13       ` Mark H Weaver
  2014-10-05 13:57       ` William ML Leslie
  0 siblings, 2 replies; 27+ messages in thread
From: Taylan Ulrich Bayirli/Kammer @ 2014-10-03 12:56 UTC (permalink / raw)
  To: William ML Leslie; +Cc: guile-devel

William ML Leslie <william.leslie.ttg@gmail.com> writes:

> ​Oh, interesting point. Maybe we should define PDF as an abstract
> semantics that we can convert into a wide range of equivalent document
> layout languages? If the attacker can't tell exactly what xsl-fo or
> dsssl the tool will output, or what software you're using to render
> the result, it will magically make it more difficult to attack!​

Come on, that snark is undue. :-)

It won't make it "magically" more difficult, it will make it actually
more difficult.  Though I think your analogy is wrong.  PDF is already
an abstract semantics, like any formal language spec.  The question is
whether an implementation realizes precisely those semantics, or adds
"oh and insert a backdoor" to some pieces of code in that language, "as
an addition to the language semantics."  This augmentation to the
semantics needn't be apparent in the source code of the implementation,
since the language it's written in might again be compiled by a compiler
that "augments" the semantics of the language it compiles, and that in
turn needn't be apparent in the compiler's source ... until you arrive
at the since long vanished initial source code that planted the virus.

That's just reiterating what the Thompson hack is.  It relies on us
using a single lineage of compilers/software.  (Or a couple lineages
could be compromised separately; that would still be plausible if it's
plausible for one.)  Because at every layer the semantics can only be
compromised enough to propagate the compromise through one concrete
code-base.  In your PDF analogy, the solution is to write a spurious
amount of PDF implementations.  Or for C, to implement a spurious amount
of C compilers.  That is impractical because C is complex.  What might
be practical might be to write one new C compiler (guaranteed clean,
since it's new), and verify GCC with that[*].  What's more useful in the
long term is to define a much simpler language, and base our everything
on that instead of on C.  Then we can verify our toolchain more often,
since we needn't write a whole freaking C compiler every time. :-)

[*] Compile version X of GCC with the Clean C Compiler™, compile version
X of GCC with some GCC binary, then compile some GCC with both GCC-X
binaries, one of which is guaranteed clean, to see if they behave same
while compiling GCC; thus we get both a guaranteed clean GCC, and we get
to know whether the other was compromised in first place, or whether we
wasted a lot of time.

I think Ian is doing himself a disservice by using many fancy abstract
terms and, if I may be blunt, acting in some tinfoil hat-like ways, when
in fact the underlying ideas are relatively sane and simple.  It's all
there if you read a little between the lines, and a little around the
parts written in all-caps, like in:

http://lists.gnu.org/archive/html/guile-devel/2014-09/msg00035.html

Taylan



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-03 12:56     ` Taylan Ulrich Bayirli/Kammer
@ 2014-10-03 17:13       ` Mark H Weaver
  2014-10-05 13:57       ` William ML Leslie
  1 sibling, 0 replies; 27+ messages in thread
From: Mark H Weaver @ 2014-10-03 17:13 UTC (permalink / raw)
  To: Taylan Ulrich Bayirli/Kammer; +Cc: guile-devel

Taylan Ulrich Bayirli/Kammer <taylanbayirli@gmail.com> writes:

> In your PDF analogy, the solution is to write a spurious
> amount of PDF implementations.  Or for C, to implement a spurious amount
> of C compilers.  That is impractical because C is complex.  What might
> be practical might be to write one new C compiler (guaranteed clean,
> since it's new), and verify GCC with that[*].  What's more useful in the
> long term is to define a much simpler language, and base our everything
> on that instead of on C.

Okay, so why does he insist on using PDF to distribute an essay that is
just plain text anyway?  He should practice what he preaches.

      Mark



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-03  6:23 ` Mark H Weaver
  2014-10-03  7:15   ` William ML Leslie
  2014-10-03  8:45   ` Nala Ginrut
@ 2014-10-05  1:35   ` Ian Grant
  2014-10-05  5:39     ` Nala Ginrut
  2014-10-05  6:58     ` Mike Gerwitz
  2 siblings, 2 replies; 27+ messages in thread
From: Ian Grant @ 2014-10-05  1:35 UTC (permalink / raw)
  To: Mark H Weaver, Markus Kuhn; +Cc: guile-devel

On Fri, Oct 3, 2014 at 2:23 AM, Mark H Weaver <mhw@netris.org> wrote:
>>     http://livelogic.blogspot.com/2014/10/the-foundation-part-i.html
> I downloaded the PDF linked in that blog entry and attempted to view it
> using Emacs's docview mode, which reported that the pdf->png process
> died with a segfault.

This means that there is a bug in that program. PDF (or ISO 32000) is
a well-defined text file format. See
http://www.adobe.com/devnet/pdf/pdf_reference.html

So no software tools should ever segfault reading a PDF file. Even one
that contained a trojan program. If the software were properly
designed and written then it would be _impossible_ for there to be a
trojan in a PDF file.

> It's ironic that someone who claims to be so concerned with security
> steadfastly refuses to provide his most important essays in a simple,
> transparent format.  Instead, he insists to distribute them in an opaque
> format that can only be interpreted by a small handful of very complex
> programs with a large attack surface.

The "attack surface" is indeed huge, but only because of the shoddy
standards of the software developers who implement it!

> For that matter, it's also interesting that someone concerned about
> Thompson viruses would suggest that Guile should distribute it's
> compiler in the form of pre-compiled intermediate C code (compiled from
> Scheme) instead of bootstrapping from source code, in order to speed up
> the compilation process.

I have already explained why this is an invalid argument. It is also
hypocritical, because the guile source distribution already includes
over 50,000 lines of intermediate shell script. This is unintelligible
(I have tried reading ./configure, there are these functions which
construct strings of 512 backslashes, and stuff like that, and _no_
explanation for why that is necessary!)

On the contrary, the intermediate code I suggest shipping need not
necessarily be unintelligible. It would be very clear, and quite
concise. For an example (in Standard ML) of the sort of
code-generating code I am talking about, see

   http://livelogic.blogspot.com/2014/07/writing-assembler-using-standardml.html

Basically, the inductive structure of the parser and interpreter will
still be very clear in the application of the assembler-generating
primitives. Now Standard ML is nothing much more than than typed
scheme, with a _very_ powerful "module system" which has a sound
formal basis in constructive logic: an ML functor application is
essentiallya formal correctness proof. So we could quite easily
translate Standard ML to scheme, and the generated scheme would be
perfectly readable and auditable.

> I've wasted more time than I should have reading Ian's writings, looking
> for an answer to this apparent contradiction in his views, and I haven't
> found it.

You have indeed wasted your time, because the contradiction is not in
my writing, it's only in your own mind.

> While we're on the subject of paranoid theories, here's one for you:
> maybe Ian Grant's true motive is to induce some of the most important
> developers of free toolchains and the Linux kernel to load PDFs that
> infect their computers with malware, in order to subvert our core
> infrastructure.

Well, if I do succeed in distributing malware, it will be a good
demonstration of what I have been arguing for months now, which is
that your "core infrastructure" is _very,_ _very_ flaky, and that far
from being "the most important developers," you are in fact just
part-time amateur hackers playing at your 'hobbies'.

What I am trying to do here is wake you people up from what will
otherwise prove to be terminal sleep. This is not a hobby, you are
combatants in a global information war, and it will cost some of you
your lives,

> Ian: tell me again, why do you refuse to distribute your essays in plain
> text?  I read GNU Thunder and I don't remember seeing anything in there
> that justifies the use of such a complex format.  As I recall, it's just
> plain text anyway.

I don't distribute plain text because it is too easy to alter. Once I
send one of these "essays" out I have no control over what happens to
it. So I try to make it as hard as I reasonably can for people to edit
what I have written.

If you were "the most important developers" you would be able to write
a program to reliably display PDF.

Ian



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-05  1:35   ` Ian Grant
@ 2014-10-05  5:39     ` Nala Ginrut
  2014-10-05 14:14       ` Ian Grant
  2014-10-05  6:58     ` Mike Gerwitz
  1 sibling, 1 reply; 27+ messages in thread
From: Nala Ginrut @ 2014-10-05  5:39 UTC (permalink / raw)
  To: Ian Grant; +Cc: Mark H Weaver, Markus Kuhn, guile-devel

[-- Attachment #1: Type: text/plain, Size: 1388 bytes --]

2014年10月5日 上午9:35于 "Ian Grant" <ian.a.n.grant@googlemail.com>写道:

>
> Well, if I do succeed in distributing malware, it will be a good
> demonstration of what I have been arguing for months now, which is
> that your "core infrastructure" is _very,_ _very_ flaky, and that far
> from being "the most important developers," you are in fact just
> part-time amateur hackers playing at your 'hobbies'.
>

Ian, you don't have to say so, since people here are the guys who could
actually help you. Or you don't have to email them.
I understand you've been trying to explain you idea. And I believe that
there're many people are listening to you silently. Your mind is profound
and too complex so that your essays are very important for them.
The real problem here, is the provided PDF can't be opened normally. That's
bad, for your idea. It's your mistake, not others. IMO, the point is not
what you concerned that PDF is more security than TXT to protect your
original mind. You have to let them see the essay successfully and
understand your idea first.

> If you were "the most important developers" you would be able to write
> a program to reliably display PDF.
>

As I said, don't be so ridiculous and aggressive.  I am one of the guys who
are interested your mind, please consider to build your mind easier, rather
than sarcasm.

> Ian
>

[-- Attachment #2: Type: text/html, Size: 1702 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-05  1:35   ` Ian Grant
  2014-10-05  5:39     ` Nala Ginrut
@ 2014-10-05  6:58     ` Mike Gerwitz
  2014-10-05 16:11       ` Ian Grant
  1 sibling, 1 reply; 27+ messages in thread
From: Mike Gerwitz @ 2014-10-05  6:58 UTC (permalink / raw)
  To: Ian Grant; +Cc: Mark H Weaver, Markus Kuhn, guile-devel

[-- Attachment #1: Type: text/plain, Size: 3376 bytes --]

On Sat, Oct 04, 2014 at 09:35:09PM -0400, Ian Grant wrote:
> Well, if I do succeed in distributing malware, it will be a good
> demonstration of what I have been arguing for months now, which is
> that your "core infrastructure" is _very,_ _very_ flaky, and that far
> from being "the most important developers," you are in fact just
> part-time amateur hackers playing at your 'hobbies'.
> 
> What I am trying to do here is wake you people up from what will
> otherwise prove to be terminal sleep. This is not a hobby, you are
> combatants in a global information war, and it will cost some of you
> your lives,

As has been stated---your concerns are substantiated and understood,
and you clearly have much experience and information to contribute,
but your unnecessary and unsubstantiated insults and holier-than-thou
attitude prevent meaningful discussion, especially from those who are
spectating and unwilling to participate in a discussion that is
consequently destined to yield little more than childish banter and
silence, albeit sprinkled with bits of very interesting information
and resources.

The additional drama you infuse into the conversation---an example
being the latter paragraph above---also works against you. There are
many things that may cost us our lives, and I'm fairly certain that
this does not make the top million or so for most of us. I'm killing
myself sitting here typing this message.[0] From my understanding,
you're allowing your body to degenerate as we speak.

> I don't distribute plain text because it is too easy to alter. Once I
> send one of these "essays" out I have no control over what happens to
> it. So I try to make it as hard as I reasonably can for people to edit
> what I have written.

This argument is not valid---why is it hard to alter a PDF? In fact,
PDF manipulation is a dark (and probably cancer-causing) art that's
automated by countless businesses worldwide; it is a topic that eats
up a significant portion of development time at my employer's office.

Have you considered just distributing a GPG/PGP signature with your
works, or even signing the work itself? After all, this whole
discussion is about proving the unlikelihood of and preventing the
modification of data. Unlike the topic of complex binaries, your works
are trivially verifiable even by hand---take advantage of that. If in ASCII,
verification is a simple matter of diffing, even without cryptographic
assurances, provided that your original work is archived in a number of
reputable places (though I'd still sign my works); however, PDFs introduce
an infinite number of display modifications that can be produce a document
yielding a text isomorphic to the original---just because two PDFs of your
work are 99% different when binary-compared doesn't mean that the visual
meaning of text it renders is not 100% the same.

(To be fair: I'm fine with PDFs; it's hard to convert most TeX-heavy
writings using equations into meaningful ASCII, but I still provide ASCII
alternatives whenever reasonable, which is >90% of the time. Unicode is
often suitable when ASCII isn't.)

[0]: http://apps.washingtonpost.com/g/page/national/the-health-hazards-of-sitting/750/

-- 
Mike Gerwitz
Free Software Hacker | GNU Maintainer
http://mikegerwitz.com
FSF Member #5804 | GPG Key ID: 0x8EE30EAB

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-03 12:56     ` Taylan Ulrich Bayirli/Kammer
  2014-10-03 17:13       ` Mark H Weaver
@ 2014-10-05 13:57       ` William ML Leslie
  1 sibling, 0 replies; 27+ messages in thread
From: William ML Leslie @ 2014-10-05 13:57 UTC (permalink / raw)
  To: guile-devel

[-- Attachment #1: Type: text/plain, Size: 9146 bytes --]

On 3 October 2014 22:56, Taylan Ulrich Bayirli/Kammer <
taylanbayirli@gmail.com> wrote:

> William ML Leslie <william.leslie.ttg@gmail.com> writes:
>
> > ​Oh, interesting point. Maybe we should define PDF as an abstract
> > semantics that we can convert into a wide range of equivalent document
> > layout languages? If the attacker can't tell exactly what xsl-fo or
> > dsssl the tool will output, or what software you're using to render
> > the result, it will magically make it more difficult to attack!​
>
> Come on, that snark is undue. :-)
>
> It won't make it "magically" more difficult, it will make it actually
> more difficult.


It makes it harder in the same way ASLR makes attacks on a lack of memory
safety harder: there is less you can rely on, but it's still basically the
same game.

Say, for example, that I can guess you will end up viewing the document
with one of five output programs - maybe, evince, chrome, tk, gecko
translating xsl-fo to xul or firefox viewing html 4 directly, and say that
I only have exploits for the first three.  Now I have a bit of a problem: I
need to find a way to get your tool to emit the specific structure that
will invoke the exploit.  Since it will show a document that looks correct,
although underneath may have a different structure, I will have to account
for variation in what it emits but there will be a limited class of
algebraic manipulations that it can make without changing the semantics of
the document.

Maybe I would not find a way to /reliably/ produce a document that will
infect your machine.  Or maybe I would.  It's hypothetical.  But you know
as well as I do that fuzzing a document generator to see what instructions
produce exploitable output is easier than either writing a verifiable PDF
viewer or writing a tool that bisimulates a PDF document with a class of
layout languages.  The cherry on top is that all that extra code you
introduce is surely ripe with additional vulnerabilities - so maybe I don't
need to bother with producing those back-end exploits at all.
​​

> ​​
>   Though I think your analogy is wrong.  PDF is already
> ​​
> an abstract semantics, like any formal language spec.  The question is
> ​​
> whether an implementation realizes precisely those semantics, or adds
> "oh and insert a backdoor" to some pieces of code in that language, "as
> an addition to the language semantics."  This augmentation to the
> semantics needn't be apparent in the source code of the implementation,
> since the language it's written in might again be compiled by a compiler
> that "augments" the semantics of the language it compiles, and that in
> turn needn't be apparent in the compiler's source ... until you arrive
> at the since long vanished initial source code that planted the virus.
>

​I wasn't suggesting that someone had a thompson hack in your pdf viewer
(do you consider postscript a reasonable language for a bootstrap compiler?
:-)​.  I was implying that infection via PDF is similar to a Thompson hack
in an interesting way:  having many possibly pdf pipelines makes the
problem only marginally harder, and in this, the problem is dual to that of
the Thompson hack.

Where it would take a great fool to rely on syntactic properties - a
/specific set of instructions/ - to recognise and infect a C compiler in
practice, it would take similar foolishness to imagine that applying a
class of machine-describable translations to a source would so reduce the
ability to get exploitable code to come out at the other end.  It is
*possible*, at least, it is possible to get to the point where you now only
have to rely on the program consuming the output, be it x.org or a linux
vt, by verfying that the output program is well typed; but then you could
have just written the well-typed viewer and been done with it.


>
> That's just reiterating what the Thompson hack is.  It relies on us
> using a single lineage of compilers/software.  (Or a couple lineages
> could be compromised separately; that would still be plausible if it's
> plausible for one.)  Because at every layer the semantics can only be
> compromised enough to propagate the compromise through one concrete
> code-base.


​No no no.  This is the point.  Anyone smart enough to have pulled of this
attack - someone who has compromised the sha function, too - is not relying
on syntactic features, they are doing decent static analysis of
higher-order behaviour.  Saying, this looks like - semantically - like
register allocation, that looks like it's producing an ELF header (very
easy to do), that has appropriate allocation and loop structure to do SSA
transformation.  Looking at the higher-order behaviour of code is not
difficult, it's exactly what Ian is suggesting we /do/ to combat the
problem.  I think it's probably easier to use this technique to propogate a
Thompson hack than to ​combat it.



>   In your PDF analogy, the solution is to write a spurious
> amount of PDF implementations.


​Except that in doing so, you've structured your vulnerabilities.  Please
keep in mind that nobody sets out to write a PDF reader that has
vulnerabilities (or, well, maybe.  hello Adobe!) - they are there by
accident​.  Prodicing a wide range of reader implementations just means
that the attacker needs to make some guesses.  You may have introduced more
severe bugs by this process.



>   Or for C, to implement a spurious amount
> of C compilers.  That is impractical because C is complex.  What might
> be practical might be to write one new C compiler (guaranteed clean,
> since it's new)


I hope by now you understand why I think this to be nonsense.  It's still a
C compiler.  You know that, I know that.  Why wouldn't the machine be able
to tell?​



> , and verify GCC with that[*].  What's more useful in the
> long term is to define a much simpler language, and base our everything
> on that instead of on C.  Then we can verify our toolchain more often,
> since we needn't write a whole freaking C compiler every time. :-)
>
> [*] Compile version X of GCC with the Clean C Compiler™, compile version
> X of GCC with some GCC binary, then compile some GCC with both GCC-X
> binaries, one of which is guaranteed clean, to see if they behave same
> while compiling GCC; thus we get both a guaranteed clean GCC, and we get
> to know whether the other was compromised in first place, or whether we
> wasted a lot of time.
>
> I think Ian is doing himself a disservice by using many fancy abstract
> terms and, if I may be blunt, acting in some tinfoil hat-like ways, when
> in fact the underlying ideas are relatively sane and simple.  It's all
> there if you read a little between the lines, and a little around the
> parts written in all-caps, like in:
>
> http://lists.gnu.org/archive/html/guile-devel/2014-09/msg00035.html
>

​This asssumption that I haven't been reading everything ​is unusual and
I'm not sure where it's coming from.  I guess Ian didn't manage to infect
me?  Maybe I wrote my own PDF viewer, maybe I use out of date software, or
maybe I just read the postscript in my terminal.

The problem that I have with this discussion is that rather than address
many* of the points in my email, he instead resorted to insisting I was a
shill.  It's very difficult to have a conversation with someone who uses
silencing tactics.  Or better - it's pointless.

If it were true, then you all lost a long time ago, I'm sorry.  So I
perpetuated a Thompson hack on a large, well-known free software codebase
while I was in primary school and without regular computer access, and
nobody noticed?  And now he doesn't realise I'm editing his emails and PDFs
in-flight to make him sound arrogant and clueless?

To be fair, I have seen similar levels of gaslighting before.  It's safe to
always assume /something/ is up.  But throwing around those sort of
allegations just because someone pokes holes in your arguments is an
acceptable cause for me to dismiss you as a nut, imho.

I don't have an issue with fancy abstract terms, but I do see clearly how
the proposition only solves a narrow subclass of the problems, and
specifically a subclass that would be better addressed with static analysis
and ideally types at the assembly level.  Show me a Thompson attack that
this 'semantic fixpoint' can work around, and I'll show you a crack that
can work around any similar 'semantic fixpoint'.
​​
​* well, he responded to the point about hardware.  If people want to know
why he's wrong, I can elaborate.  But by that point, I hoped people could
see for themselves that there's no reason to talk any further.

-- 
William Leslie

Notice:
Likely much of this email is, by the nature of copyright, covered under
copyright law.  You absolutely MAY reproduce any part of it in accordance
with the copyright law of the nation you are reading this in.  Any attempt
to DENY YOU THOSE RIGHTS would be illegal without prior contractual
agreement.

[-- Attachment #2: Type: text/html, Size: 12091 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-05  5:39     ` Nala Ginrut
@ 2014-10-05 14:14       ` Ian Grant
  2014-10-05 15:15         ` Nala Ginrut
  0 siblings, 1 reply; 27+ messages in thread
From: Ian Grant @ 2014-10-05 14:14 UTC (permalink / raw)
  To: Nala Ginrut; +Cc: Mark H Weaver, Markus Kuhn, guile-devel

On Sun, Oct 5, 2014 at 1:39 AM, Nala Ginrut <nalaginrut@gmail.com> wrote:
>
> The real problem here, is the provided PDF can't be opened normally. That's
> bad, for your idea. It's your mistake, not others.

Then tell me the name, the sha512sum of the file, the URL from which
you downloaded it and the size of the file as reported by your
filesystem. Also tell me the exact error, and the program (including
the version number) and if possible the file location of the error.

Thanks
Ian



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-05 14:14       ` Ian Grant
@ 2014-10-05 15:15         ` Nala Ginrut
  2014-10-05 16:24           ` Ian Grant
  0 siblings, 1 reply; 27+ messages in thread
From: Nala Ginrut @ 2014-10-05 15:15 UTC (permalink / raw)
  To: Ian Grant; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 857 bytes --]

On Sun, Oct 5, 2014 at 10:14 PM, Ian Grant <ian.a.n.grant@googlemail.com>
wrote:

> On Sun, Oct 5, 2014 at 1:39 AM, Nala Ginrut <nalaginrut@gmail.com> wrote:
> >
> > The real problem here, is the provided PDF can't be opened normally.
> That's
> > bad, for your idea. It's your mistake, not others.
>
> Then tell me the name, the sha512sum of the file, the URL from which
> you downloaded it and the size of the file as reported by your
> filesystem. Also tell me the exact error, and the program (including
> the version number) and if possible the file location of the error.
>
>
Alright, I changed a system and try it again with evince successfully.
Anyway, I did't find any maths or special symbols in it, so it could be
published on your blog as plain text. But you may insist on the opinion of
PDF.
It's not your mistake but mine. ;-)

Thanks
> Ian
>

[-- Attachment #2: Type: text/html, Size: 1650 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-05  6:58     ` Mike Gerwitz
@ 2014-10-05 16:11       ` Ian Grant
  2014-10-06  4:23         ` Mike Gerwitz
  0 siblings, 1 reply; 27+ messages in thread
From: Ian Grant @ 2014-10-05 16:11 UTC (permalink / raw)
  To: Ian Grant, Mark H Weaver, Markus Kuhn, guile-devel

On Sun, Oct 5, 2014 at 2:58 AM, Mike Gerwitz <mikegerwitz@gnu.org> wrote:
> On Sat, Oct 04, 2014 at 09:35:09PM -0400, Ian Grant wrote:
>> Well, if I do succeed in distributing malware, it will be a good
>> demonstration of what I have been arguing for months now, which is
>> that your "core infrastructure" is _very,_ _very_ flaky, and that far
>> from being "the most important developers," you are in fact just
>> part-time amateur hackers playing at your 'hobbies'.
>>
>> What I am trying to do here is wake you people up from what will
>> otherwise prove to be terminal sleep. This is not a hobby, you are
>> combatants in a global information war, and it will cost some of you
>> your lives,
>
> As has been stated---your concerns are substantiated and understood,

I wasn't aware that my concerns _have_ been substantiated. How? I am
not sure they have been understood, either.

> and you clearly have much experience and information to contribute,
> but your unnecessary and unsubstantiated insults and holier-than-thou
> attitude prevent meaningful discussion, especially from those who are
> spectating and unwilling to participate in a discussion that is
> consequently destined to yield little more than childish banter and
> silence, albeit sprinkled with bits of very interesting information
> and resources.

I have not insulted anyone, as I explained in the other thread. I am
"holier than though" And I wish that wasn't the case. So if someone
can prove me wrong, then please do so, the sooner the better!

> The additional drama you infuse into the conversation---an example
> being the latter paragraph above---also works against you. There are
> many things that may cost us our lives, and I'm fairly certain that
> this does not make the top million or so for most of us. I'm killing
> myself sitting here typing this message.[0] From my understanding,
> you're allowing your body to degenerate as we speak.

Well my body is getting on for 50 years old, so it doesn't matter so
much. Here's a "better" example:

   http://en.wikipedia.org/wiki/Aaron_Swartz

>> I don't distribute plain text because it is too easy to alter. Once I
>> send one of these "essays" out I have no control over what happens to
>> it. So I try to make it as hard as I reasonably can for people to edit
>> what I have written.
>
> This argument is not valid---why is it hard to alter a PDF? In fact,
> PDF manipulation is a dark (and probably cancer-causing) art that's
> automated by countless businesses worldwide; it is a topic that eats
> up a significant portion of development time at my employer's office.

It is valid. If you want to see why, then try to alter one of the PDFs
I've sent out.
What will you do when I later send out a series of checksums using a
checksum algorithm that neither you or anyone else ever heard of
before?

> Have you considered just distributing a GPG/PGP signature with your
> works, or even signing the work itself? After all, this whole
> discussion is about proving the unlikelihood of and preventing the
> modification of data.

Of course I have thought of that, and rejected it as a pointless waste
of time! If you read what I've written, on this list and also in that
blog article Mark pointed up, you will see why I think this. And even
if public key crypto was positively secure, it is worthless when I
have no rational basis for confidence in the security of the system
which holds the private keys. PK crypto, in this state we're in, has
negative value: it just creates a false sense oif security, and
_discourages_ people from actually thinking about the real problem
because they "know" their comms. are secure. They don't know that at
all.

> ... Unlike the topic of complex binaries, your works
> are trivially verifiable even by hand---take advantage of that. If in ASCII,
> verification is a simple matter of diffing, even without cryptographic
> assurances, provided that your original work is archived in a number of
> reputable places (though I'd still sign my works);

I could ship them as low-res PNG files. But really, it would be much
better just to get good working PDF viewers. There are _millions_ of
important documents which are in PDF form. We need to be able to
reliably verify and reference text within them. These include things
like the Intel architecture reference manuals.

> [...] however, PDFs introduce
> an infinite number of display modifications that can be produce a document
> yielding a text isomorphic to the original---just because two PDFs of your
> work are 99% different when binary-compared doesn't mean that the visual
> meaning of text it renders is not 100% the same.

I am well aware of this, and it is one reason why we need to be able
to read PDFs reliably. We can then vary the concrete representation
and defeat attempts at mass-surveillance and also prevent people from
using automated methods to edit the files in transit. This was
implicit in the very first message I sent Stallman, Torvalds, deRaadt
etc. and which I later cc'ed to this list:

  http://lists.gnu.org/archive/html/guile-devel/2014-09/msg00021.html

And before some smart-ass says "Oh yeah, and how are you going to
verify the contents with cryptographic checksums?" I warn you to
_think_ first, because I will metaphorically tear you into fifty thin,
bloody little strips before you can say "Doh!"!

You do all know were making a movie about this whole affair, don't you?

   http://livelogic.blogspot.com/2014/04/what-is-sixth-estate.html

Ian



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-05 15:15         ` Nala Ginrut
@ 2014-10-05 16:24           ` Ian Grant
  2014-10-06  8:25             ` Nala Ginrut
  0 siblings, 1 reply; 27+ messages in thread
From: Ian Grant @ 2014-10-05 16:24 UTC (permalink / raw)
  To: Nala Ginrut, guile-devel

On Sun, Oct 5, 2014 at 11:15 AM, Nala Ginrut <nalaginrut@gmail.com> wrote:
>
> Alright, I changed a system and try it again with evince successfully.
> Anyway, I did't find any maths or special symbols in it, so it could be
> published on your blog as plain text. But you may insist on the opinion of
> PDF.

There is another reason why I use PDF. It's much nicer to read. I love
Garamond. Did you see the Jobs movie? Do you remember what he said
about Garamond?

> It's not your mistake but mine. ;-)

Well being mistaken about somebody else's mistake that wasn't a
mistake is about the least mistaken it's possible to be: it really
doesn't matter at all.

What we really need to do is find out what software Mark was using to
pdf->png. If it segfaults then there is a good chance that that bug
can be turned into a working exploit. Because all binaries distributed
by the binary-distributors are identical, attackers can analyse the
file and work out how to turn a bad pointer dereference into an
exploitable "PDF attack vector" which executes binary code contained
in the PDF.

Ian



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
@ 2014-10-05 17:42 Ian Grant
  2014-10-05 18:19 ` Ian Grant
  0 siblings, 1 reply; 27+ messages in thread
From: Ian Grant @ 2014-10-05 17:42 UTC (permalink / raw)
  To: guile-devel, Taylan Bayırlı, William Leslie,
	Richard Stallman, Mark H Weaver

Taylan wrote:
> In your PDF analogy, the solution is to write a spurious
> amount of PDF implementations.  Or for C, to implement
> a spurious amount of C compilers.  That is impractical
> because C is complex.

It's not as complex as you might think. In the space of a couple of
months, I wrote what I reckon is 50% of a C interpreter all on my own.
And I have the basis of an abstract language for specifying semantics
as just sets of term-rewriting rules, which work on AST
representations of program source.

https://github.com/IanANGrant/red-october/blob/master/src/dynlibs/ffi/CPPLexer.lex
https://github.com/IanANGrant/red-october/blob/master/src/dynlibs/ffi/CParser.grm
https://github.com/IanANGrant/red-october/blob/master/src/dynlibs/ffi/CSyntax.sml

> What might be practical might be to write one new C
> compiler (guaranteed clean, since it's new), and verify
> GCC with that[*].

It will take a year or so to get a C compiler written in C working. So
it won't be new once you've got it to the point of being able to
compile GCC. And by the time that happens, if there has been a
compiler trap door, then the GCC binaries, and probably your "new" C
compiler binaries would "know about" each other. So you would not have
any more reason to believe the
new binaries are good than you had at the beginning. You will have
wasted a year.

> What's more useful in the
> long term is to define a much simpler language, and base
> our everything on that instead of on C.  Then we can verify
> our toolchain more often, since we needn't write a whole
> freaking C compiler every time. :-)

Almost, but that "Much Simpler Language" needs to be one designed to
write language interpreters by reading abstract specifications of the
syntax and semantics from "metadata" files. Then a standard
"interpreter interpreter for the 'MSL'" can be defined to do this job
automatically, and it could even implement itself. See the Reynolds'
paper I referenced in my first reply to William.

Then we will be able to write "a spurious number of whole freakin' C
interpreters" So for example, as I said to Richard a month ago, we
will be able to implement a C compiler in Microsoft Word BASIC, or in
COBOL, and that will be capable of compiling GCC, if we had a year or
so to wait while it does it ... but it could still be capable of
compiling checksumming programs, or diff programs, or unzipping
programs, and these would be extra data points to check whether other
tools like binary file viewers etc, have been compromised by a
compiler trap door.

Ian



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-05 17:42 Ian Grant
@ 2014-10-05 18:19 ` Ian Grant
  0 siblings, 0 replies; 27+ messages in thread
From: Ian Grant @ 2014-10-05 18:19 UTC (permalink / raw)
  To: guile-devel, Taylan Bayırlı, William Leslie,
	Richard Stallman, Mark H Weaver

On Sun, Oct 5, 2014 at 1:42 PM, Ian Grant <ian.a.n.grant@googlemail.com> wrote:
> [we] will be able to implement a C compiler in Microsoft Word BASIC, or in
> COBOL, and that will be capable of compiling GCC, if we had a year or
> so to wait while it does it ...

This is not true. Word BASIC or COBOL could easily write out an
executable binary defined by an abstract formal description in a text
file. It would take a while, but the binary would be able to compile
GCC pretty quickly.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
@ 2014-10-06  0:30 Ian Grant
       [not found] ` <CAKFjmdzxAMvcry8N6B_atM_8vGyzA1Dfz9ygWxSgh3fD7EUAuQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 27+ messages in thread
From: Ian Grant @ 2014-10-06  0:30 UTC (permalink / raw)
  To: William Leslie, Taylan Bayırlı, guile-devel, lightning

http://lists.gnu.org/archive/html/guile-devel/2014-10/msg00016.html

From:    William ML Leslie
Date:    Mon, 6 Oct 2014 00:57:49 +1100
On 3 October 2014 22:56, Taylan Ulrich Bayirli/Kammer <address@hidden> wrote:

> Say, for example, that I can guess you will end up viewing the
> document with one of five output programs - maybe, evince, chrome,
> tk, gecko translating xsl-fo to xul or firefox viewing html 4
> directly, and say that I only have exploits for the first three.
> Now I have a bit of a problem: I need to find a way to get your tool
> to emit the specific structure that will invoke the exploit.  Since
> it will show a document that looks correct, although underneath may
> have a different structure, I will have to account for variation in
> what it emits but there will be a limited class of algebraic
> manipulations that it can make without changing the semantics of the
> document.

It's not about this. These are all the programs that have problems.
We're not going to use these sorts of programs anymore. We're going to
use auto-generated, nearly perfect code. And after a few years it
won't be just near perfect, it'll be perfect.

> Maybe I would not find a way to /reliably/ produce a document that
> will infect your machine.  Or maybe I would.  It's hypothetical.
> But you know as well as I do that fuzzing a document generator to
> see what instructions produce exploitable output is easier than
> either writing a verifiable PDF viewer or writing a tool that
> bisimulates a PDF document with a class of layout languages.  The
> cherry on top is that all that extra code you introduce is surely
> ripe with additional vulnerabilities - so maybe I don't need to
> bother with producing those back-end exploits at all.

This is wrong. If you look at the automatically-generated assembler
code from a formal spec, in the form of a functional program
generating that code:

  https://github.com/IanANGrant/red-october/blob/master/src/dynlibs/ffi/AbstractMachineWord.sml

you will see that the extra code has no extra attack vectors. It's all
machine-generated. If this were a piece of code for reading a binary
file, then it would have been generated from a grammar describing
exactly what are the valid words in that language. (Using words and
language in the technical sense as "valid strings" and "set of all
valid strings".) So buffer overruns won't happen. And if all the
semantics are formalised, then off-by-one errors won't happen
either. If they do, they'll happen all over the place and you will
find out very quickly.

> Though I think your analogy is wrong.  PDF is already an abstract
> semantics, like any formal language spec.  The question is whether
> an implementation realizes precisely those semantics, or adds "oh
> and insert a backdoor" to some pieces of code in that language, "as
> an addition to the language semantics."  This augmentation to the
> semantics needn't be apparent in the source code of the
> implementation, since the language it's written in might again be
> compiled by a compiler that "augments" the semantics of the language
> it compiles, and that in turn needn't be apparent in the compiler's
> source ... until you arrive at the since long vanished initial
> source code that planted the virus.

> I hope by now you understand why I think this to be nonsense.  It's
> still a C compiler.  You know that, I know that.  Why wouldn't the
> machine be able to tell?

Because it's formally undecidable. That's Rice's Theorem. And it's not
"in practice" probably detectable by semantantic methods, because we
are going to extra lengths to obscure the effective operational
semantics by multiple re-encodings. How are you going to recognise a
compiled Fortran90 program translating a C program into an abstract
assembler like lightning, as a C compiler? You don't even know what
the opcode numbers are: we can permute them.

> The problem that I have with this discussion is that rather than
> address many* of the points in my email, he instead resorted to
> insisting I was a shill.  It's very difficult to have a conversation
> with someone who uses silencing tactics.  Or better - it's
> pointless.

What are you talking about? Here is what you wrote to me:

   http://lists.gnu.org/archive/html/guile-devel/2014-09/msg00036.html

> Please understand that our lack of enthusiasm for it is not without
> our own individual commitments to the security and freedoms of
> users; rather while the extra assurances your project may provide
> sound nice, they are frankly rather minimal and don't have much
> additional utility, and so are not priority one.

Now if you wanted me to continue answering your questions, then well,
I'm sorry, I just got the wrong end of stick completely. I thought you
wanted me to fuck off!

So I replied as below. To you, privately. And that was it. Silencing
tactics? You then copied it to the world, and a guy who I _told_ you,
was deputy director of the NSA's National Computer Security Centre!

And I didn't mind. (I did wonder what the hell does he think he's
doing?! But it's your privilege to say what you want, to whom you want,
and none of my business.)

Date: Sun, 7 Sep 2014 11:39:44 +1000
Message-ID: <CAHgd1hEP2=dnqXTQx-EdozAXc0vjYh0ucthbyWFnPecQqB=7pw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
Subject: Re: GNU Thunder
From: William ML Leslie <william.leslie.ttg-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Ian Grant <ian.a.n.grant-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org>, guile-devel <guile-devel-mXXj517/zsQ@public.gmane.org>,
    lightning <lightning-mXXj517/zsQ@public.gmane.org>, schellr-EkmVulN54Sk@public.gmane.org
X-List-Received-Date: Sun, 07 Sep 2014 01:39:50 -0000

--e89a8ff1ca20e0b37105026fc673
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On 7 September 2014 01:49, Ian Grant <ian.a.n.grant-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org> wrote:

> You are just not at all convincing, I'm afraid. Tell your boss they didn'=
t
> train you properly, and can you get assigned somewhere else.
>

=E2=80=8BUnfortunately, this is a reality we have to deal with when discuss=
ing
security on the internet: we've got to assume the enemy is listening.  It
could be me, I could even be intercepting your emails and making them sound
incoherrent, and upping the tinfoil-hat quotient by =E2=80=8Badding people =
like
Theo and Linus to the To: list.  I could be suggesting that effort be spent
in a project that will provide very little value, wasting your time.

Remember:  I'm not suggesting what the outcome of your project will be,
just that if the result is negative, we still know nothing.  When testing a
system for subterfuge, we need to examine *all* of the moving parts, even
those that appear to be unused.  If the system you're building your
assembler on is compromised, it can still give you a negative answer.
That's what was so scary about this particular type of attack.

--=20
William Leslie

[...]

Here's my e-mail:

MIME-Version: 1.0
Received: by 10.194.219.234 with HTTP; Sat, 6 Sep 2014 08:49:15 -0700 (PDT)
In-Reply-To: <CAHgd1hH+VD9HqE=pRZOHaPK-fOgMJbHpv_RgPt5LUdLkTXCBwA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
References: <CAKFjmdzP89qSD03_MGqS1UawQvaq6yvme-abKcmLuA8DfUQE+A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
    <E1XKsRt-0002zo-64-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>
    <CAKFjmdyUphk2LmDdDE_7gkDSKAu4COurtvafBwO5XwCgyM1OfA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
    <E1XLP5I-0005zC-Sn-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>
    <CAKFjmdxt4jWAHAYXjzwPeUw+dTUBTPC94YJDNifsO7JVkNHjTQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
    <E1XNssG-00083e-BB-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>
    <CAKFjmdzdkC5rW3DjBRbRueSiBs9dmJUSxGkPp2-6uc7SH9o79g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
    <E1XPA13-0002hD-Kp-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>
    <CAKFjmdyc0D5vYBK=rQKzKNK+WRmWhkkL-RXqBMSHvhOzX3fHiw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
    <E1XPifZ-0004g6-KA-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>
    <CAKFjmdwEs8TtPZjXWDYKoQXz4FEKy6p3T9+8jWLMY87Onn=VaQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
    <CAHgd1hH+VD9HqE=pRZOHaPK-fOgMJbHpv_RgPt5LUdLkTXCBwA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
Date: Sat, 6 Sep 2014 11:49:15 -0400
Delivered-To: ian.a.n.grant-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org
Message-ID: <CAKFjmdzzRoTdzWxAOh2byRbWDtszWZtt8Z2k0eLCFYR+qmAC9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
Subject: Re: GNU Thunder
From: Ian Grant <ian.a.n.grant-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org>
To: William ML Leslie <william.leslie.ttg-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Content-Type: multipart/alternative; boundary=001a1136144a26f19e05026787e6

--001a1136144a26f19e05026787e6
Content-Type: text/plain; charset=UTF-8

You are just not at all convincing, I'm afraid. Tell your boss they didn't
train you properly, and can you get assigned somewhere else.

Ian

--001a1136144a26f19e05026787e6
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">You are just not at all convincing, I&#39;m afraid. Tell y=
our boss they didn&#39;t train you properly, and can you get assigned somew=
here else.<br><div class=3D"gmail_extra"><br></div><div class=3D"gmail_extr=
a">Ian<br></div><div class=3D"gmail_extra"><br></div></div>

--001a1136144a26f19e05026787e6--

=============================================================================

You see, I thought you were a Microsoft plant. I just couldn't figure
out any other reason why someone who clearly didn't know anything
about computers, would say they were going to write a secure operating
system! :-)

Ian

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-05 16:11       ` Ian Grant
@ 2014-10-06  4:23         ` Mike Gerwitz
       [not found]           ` <20141006042323.GA31390-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>
  0 siblings, 1 reply; 27+ messages in thread
From: Mike Gerwitz @ 2014-10-06  4:23 UTC (permalink / raw)
  To: Ian Grant; +Cc: Mark H Weaver, Markus Kuhn, guile-devel

[-- Attachment #1: Type: text/plain, Size: 10091 bytes --]

On Sun, Oct 05, 2014 at 12:11:00PM -0400, Ian Grant wrote:
> > As has been stated---your concerns are substantiated and understood,
> 
> I wasn't aware that my concerns _have_ been substantiated. How? I am
> not sure they have been understood, either.

They were substantiated long ago by the very references you've provided.
Your application to the GNU project doesn't make this a novel discussion.

The problem is understood. A proper and **practical** solution to the
problem may be less well understood, but you haven't yet convinced those in
charge of making such decisions that this is a threat (a) greater than all
other threats worth devoting time to protect against; and (b) worthy of such
an enormous time investment at this time.

It is worthy of addressing, and addressing it in the manner that you and
Dijkstra suggest will certainly go a long way to preventing nearly all
problems contributing to (a), but you need to take a different approach if
you're going to incite such change.

> Well my body is getting on for 50 years old, so it doesn't matter so
> much. Here's a "better" example:
> 
>    http://en.wikipedia.org/wiki/Aaron_Swartz

I hope very much that using Aaron as an example is to further your argument
that programming is a life-or-death situation, and that it's not a "better"
example of personal neglect or instability. The former has no relation to
the topic at hand, and the latter is---well, let's just hope that you poorly
represented yourself.

> > This argument is not valid---why is it hard to alter a PDF? In fact,
> > PDF manipulation is a dark (and probably cancer-causing) art that's
> > automated by countless businesses worldwide; it is a topic that eats
> > up a significant portion of development time at my employer's office.
> 
> It is valid. If you want to see why, then try to alter one of the PDFs
> I've sent out.
> What will you do when I later send out a series of checksums using a
> checksum algorithm that neither you or anyone else ever heard of
> before?

Your argument favored PDF over plain text. The point that I had made was
that you are unable to verify that two PDFs---unless binary
identical---render text that is unchanged from the original. If I convert
your PDF to a PNG and back to a PDF, it renders the same text. What does
that mean to you? Does that mean that the text has been altered? Do you care
that your work has been distribute in an alternative form? At what point
does it become "modified"? And how can you distinguish reasonable
alterations from unreasonable ones?

What does your checksum algorithm accomplish than existing checksum or
hashing algorithms do not? Why can that same principle not be applied to
plain text---why does the implementation of your checksum make it any more
difficult to modify the PDF? And why is your checksum better than an
established and heavily studied cryptographic standard (e.g. SHA-256)?

> > Have you considered just distributing a GPG/PGP signature with your
> > works, or even signing the work itself? After all, this whole
> > discussion is about proving the unlikelihood of and preventing the
> > modification of data.
> 
> Of course I have thought of that, and rejected it as a pointless waste
> of time! If you read what I've written, on this list

You have written a lot on this list, and there has been an absurd amount of
redundancy. I have not read it all.

> and also in that blog article Mark pointed up, you will see why I think
> this. And even if public key crypto was positively secure, it is worthless
> when I have no rational basis for confidence in the security of the system
> which holds the private keys.

In the case of PGP, you hold your private keys. I suppose if you're worried
about a Thompson virus, you can't even provide yourself assurances of the
security of your own system, but in such a severe case, there is nothing you
do on your system that can be trusted, so surely you must not enter your
password to access your e-mail on any of those systems. Because if you did,
surely someone could just impersonate you to begin with, and there'd be no
point in signing anything cryptographically, because such attempts would
have been preempted.

All that aside, it is still possible to host your private key on a machine
that is wholly segregated from the rest of the world, perform your
cryptographic operation on that, transfer its output (doesn't really matter
how at this point, as long as it's output-only and cannot be used to
compromise the machine---it only matters that the same output is supplied to
each machine) to N machines with N different lineages (even if one is a
PDP-11!), and verify the result of the cryptographic operation on those
machines to ensure that your data were not modified in the process of
applying the cryptographic operation.

But we're not talking about such assurances here, so most of those steps
would be unnecessary---upon receiving your signed request from the machine
hosting your private key, it *does not matter* what its output is: the plain
text is left, well, plain. You can verify it by hand. Then everyone else in
the world with your public key can verify that it is indeed the signature
produced by the same machine that produced the signature for all of your
other archived messages all over the internet (mailing lists, your own blog
entries, etc)---a task that can be trivially automated to scan for
discrepancies.

Your checksum would merely state "this PDF has not been modified". This
would say "this PDF has not been modified and I am actually the same person
that wrote all these other articles / e-mail messages, while claiming to be
Ian". And even if you have no trust in the cryptography that links the
signature to the private key, or have no trust in public key cryptography,
the signature *still* contains a hash[0], and so you get the former
assurance for free, assuming a [currently] trusted digest algorithm.

So tell me: how does your yet-to-be-released checksum algorithm provide
any better assurances than this?

Oh, but let's not stop there! Your signature relies on the security of the
hashing algorithm---if a collision is found, what does that mean for
cleartext signing? Well, if you had distributed a plain-text document in
ASCII (and Unicode is almost certainly fine too), it's very highly likely,
unless the hashing algorithm is just brutally and fundamentally flawed, that
the collision will contain a lot of garbage---garbage that would be
immediately noticeable. But if you distribute a PDF, it might be possible to
hide all of that garbage within its constructs.

> PK crypto, in this state we're in, has negative value: it just creates a
> false sense oif security, and _discourages_ people from actually thinking
> about the real problem because they "know" their comms. are secure. They
> don't know that at all.

I certainly hope that developers of systems that use public-key cryptography
understand the obvious, fundamental principle that you described in the
article that Mark linked.[1] Of course, I understand that this is not often
the case, and it's certainly not the case for most *users* of the software.
And that is dangerous, because GPG/PGP/PK-crypto users should certainly be
aware that any message they encrypt, even if the private key is discarded,
may very well one day be broken. And the adversary may be little 10yo Bobby
on his modern-era PC in the not-too-distant future.

But for the purpose of *this* discussion, the integrity of your chosen PK
algorithm isn't entirely relevant. Yes, your identity assurance is nice to
have, but at the *very least*, your signature provides a hash that is
verifiable.

Let's not play the "well the software performing the verification may be
compromised" card---it's still possible to manually perform the verification
if you really felt it to be necessary, if the situation were dire enough.

So again---have you considered distributing a signature with your documents?

> I could ship them as low-res PNG files. But really, it would be much
> better just to get good working PDF viewers. There are _millions_ of
> important documents which are in PDF form. We need to be able to
> reliably verify and reference text within them. These include things
> like the Intel architecture reference manuals.

Certainly, and I'm not arguing that. I'm arguing against your assertion that
PDFs are harder to modify than plain text and somehow provide greater
assurances; I'm arguing the exact opposite.

> I am well aware of this, and it is one reason why we need to be able
> to read PDFs reliably. We can then vary the concrete representation
> and defeat attempts at mass-surveillance and also prevent people from
> using automated methods to edit the files in transit.

Preventing file modifications in transit is a well-understood problem
(and the problems with various implementations understood) unrelated to this
one.

But having a PDF reader that is generated from a formal definition won't
prevent the problems that I described (although it'd prevent the majority of
today's attacks against, say, well-known proprietary readers), because *PDF
exists to present formatted text*, and there are countless ways to present
what appears to be the exact same formatting. So the answer to the
fundamental question "do PDFs X and Y present equivalent content?" is
non-trivial.

> And before some smart-ass says "Oh yeah, and how are you going to
> verify the contents with cryptographic checksums?" I warn you to
> _think_ first, because I will metaphorically tear you into fifty thin,
> bloody little strips before you can say "Doh!"!

I don't follow. Surely your method isn't novel.

[0]: http://tools.ietf.org/html/rfc4880
[1]: http://livelogic.blogspot.dk/2014/09/the-free-fair-and-forged.html

-- 
Mike Gerwitz
Free Software Hacker | GNU Maintainer
http://mikegerwitz.com
FSF Member #5804 | GPG Key ID: 0x8EE30EAB

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-05 16:24           ` Ian Grant
@ 2014-10-06  8:25             ` Nala Ginrut
  0 siblings, 0 replies; 27+ messages in thread
From: Nala Ginrut @ 2014-10-06  8:25 UTC (permalink / raw)
  To: Ian Grant; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 1466 bytes --]

On Mon, Oct 6, 2014 at 12:24 AM, Ian Grant <ian.a.n.grant@googlemail.com>
wrote:

> On Sun, Oct 5, 2014 at 11:15 AM, Nala Ginrut <nalaginrut@gmail.com> wrote:
> >
> > Alright, I changed a system and try it again with evince successfully.
> > Anyway, I did't find any maths or special symbols in it, so it could be
> > published on your blog as plain text. But you may insist on the opinion
> of
> > PDF.
>
> There is another reason why I use PDF. It's much nicer to read. I love
> Garamond. Did you see the Jobs movie? Do you remember what he said
> about Garamond?
>
>
Well, I'm a naive font fan so I may understand you. I usually use my fav
font on page according to the design with HTML5/CSS stuff.
Forget about it, it's far from the topic.



> What we really need to do is find out what software Mark was using to
> pdf->png. If it segfaults then there is a good chance that that bug
> can be turned into a working exploit. Because all binaries distributed
> by the binary-distributors are identical, attackers can analyse the
> file and work out how to turn a bad pointer dereference into an
> exploitable "PDF attack vector" which executes binary code contained
> in the PDF.
>
>
Agreed. But I think the PDF is more complicated than TXT. One of my fav
ezine, Phrack is plain txt, and it's cool.
So my suggestion is to provide both TXT and PDF if the article is important.

Anyway, the font is a big issue for whom loves it, so I don't involve such
argue. ;-)

[-- Attachment #2: Type: text/html, Size: 2271 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
       [not found] ` <CAKFjmdzxAMvcry8N6B_atM_8vGyzA1Dfz9ygWxSgh3fD7EUAuQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-10-06  8:51   ` William ML Leslie
  0 siblings, 0 replies; 27+ messages in thread
From: William ML Leslie @ 2014-10-06  8:51 UTC (permalink / raw)
  To: guile-devel, lightning


[-- Attachment #1.1: Type: text/plain, Size: 8930 bytes --]

On 6 October 2014 11:30, Ian Grant <ian.a.n.grant-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org> wrote:

> http://lists.gnu.org/archive/html/guile-devel/2014-10/msg00016.html
>
> From:    William ML Leslie
> Date:    Mon, 6 Oct 2014 00:57:49 +1100
> On 3 October 2014 22:56, Taylan Ulrich Bayirli/Kammer <address@hidden>
> wrote:
>
> > Say, for example, that I can guess you will end up viewing the
> > document with one of five output programs - maybe, evince, chrome,
> > tk, gecko translating xsl-fo to xul or firefox viewing html 4
> > directly, and say that I only have exploits for the first three.
> > Now I have a bit of a problem: I need to find a way to get your tool
> > to emit the specific structure that will invoke the exploit.  Since
> > it will show a document that looks correct, although underneath may
> > have a different structure, I will have to account for variation in
> > what it emits but there will be a limited class of algebraic
> > manipulations that it can make without changing the semantics of the
> > document.
>
> It's not about this. These are all the programs that have problems.
> We're not going to use these sorts of programs anymore. We're going to
> use auto-generated, nearly perfect code. And after a few years it
> won't be just near perfect, it'll be perfect.
>

​Verifiable, safe renderers for these kind of formats is something I really
want to see, but if you have that, why do you need the code to have this
amorphous `semantic-fixpoint` quality?​  What does it give you?
​​

>
> > Maybe I would not find a way to /reliably/ produce a document that
> > will infect your machine.  Or maybe I would.  It's hypothetical.
> > But you know as well as I do that fuzzing a document generator to
> > see what instructions produce exploitable output is easier than
> > either writing a verifiable PDF viewer or writing a tool that
> > bisimulates a PDF document with a class of layout languages.  The
> > cherry on top is that all that extra code you introduce is surely
> > ripe with additional vulnerabilities - so maybe I don't need to
> > bother with producing those back-end exploits at all.
>
> This is wrong. If you look at the automatically-generated assembler
> code from a formal spec, in the form of a functional program
> generating that code:
>
>
> https://github.com/IanANGrant/red-october/blob/master/src/dynlibs/ffi/AbstractMachineWord.sml
>

​The paragraph I wrote above talks about three things, in increasing order
of labour:

0. Fuzzing #2
1. A verifiable PDF viewer (a formal model, together with some
implementation of that model, generated or not)
2. ​A tool that bisimulates a veriety of layout languages
​​
​I was trying to suggest #1 as the ideal solution, as you seem to be here.
The original straw-man was #2, intended to be dual to the semantic fixpoint
you described.



>
> you will see that the extra code has no extra attack vectors. It's all
> machine-generated. If this were a piece of code for reading a binary
> file, then it would have been generated from a grammar describing
> exactly what are the valid words in that language. (Using words and
> language in the technical sense as "valid strings" and "set of all
> valid strings".) So buffer overruns won't happen. And if all the
> semantics are formalised, then off-by-one errors won't happen
> either. If they do, they'll happen all over the place and you will
> find out very quickly.
>
> > Though I think your analogy is wrong.  PDF is already an abstract
> > semantics, like any formal language spec.  The question is whether
> > an implementation realizes precisely those semantics, or adds "oh
> > and insert a backdoor" to some pieces of code in that language, "as
> > an addition to the language semantics."  This augmentation to the
> > semantics needn't be apparent in the source code of the
> > implementation, since the language it's written in might again be
> > compiled by a compiler that "augments" the semantics of the language
> > it compiles, and that in turn needn't be apparent in the compiler's
> > source ... until you arrive at the since long vanished initial
> > source code that planted the virus.
>
> > I hope by now you understand why I think this to be nonsense.  It's
> > still a C compiler.  You know that, I know that.  Why wouldn't the
> > machine be able to tell?
>
> Because it's formally undecidable. That's Rice's Theorem. And it's not
> "in practice" probably detectable by semantantic methods, because we
> are going to extra lengths to obscure the effective operational
> semantics by multiple re-encodings. How are you going to recognise a
> compiled Fortran90 program translating a C program into an abstract
> assembler like lightning, as a C compiler? You don't even know what
> the opcode numbers are: we can permute them.
>

Worse comes to worse, you feed it a small C program as input inside an
abstract interpreter.  It's the victim's CPU that is being burned, so
nontermination of the algorithm is a minor problem for the attacker, who
needs to ensure utilisation is low enough to avoid questions.

Rice's theorem applies to the entire class of programs that people can
write in a turing-complete language, but in practice, people write programs
that can be checked for denotational equivalence.  It is costly
(computationally) to do this with a general purpose programming language,​
but I think you'd be surprised at the wide range of existing programs that
we can analyse that way. *

I don't think that a semantic fixed point would be problematic for
detection of C compilers.  While the elaboration you write to go from
formal semantics to concrete implementation will obviously be
non-deterministic, it will still be algorithmic, in that someone ultimately
needs to list ways that definitions can be turned into concrete
implementations.  These make up an initial algebra, the basic terms of
which can be treated individually, and in particular, the 'lossy' aspects
of the translation that should make it difficult to reverse the process
(such as introduction of spurious effects) can be categorised.  The output
program can be partially normalised, and the resulting effect graph
analysed with the effect types of the implementation terms.

If nothing else, reducing to the first few elements of the IO colist and
normalising the values involved is unlikely to be prohibitive (otherwise
the program wouldn't run at an acceptible speed anyway).

​Hah - in a real bind, you don't need to detect the C compiler at all -
always add code that wraps IO calls / external effects, and augments the
output if need be.​

​* Can't seem to find a good paper on this at the moment, but I did enjoy
this post that has just been doing the rounds:


http://www.haskellforall.com/2014/09/morte-intermediate-language-for-super.html
​

  Papers saying "I tried to do it and failed" are more common, but I think
it's just a common task for the directionless PhD student.



> > The problem that I have with this discussion is that rather than
> > address many* of the points in my email, he instead resorted to
> > insisting I was a shill.  It's very difficult to have a conversation
> > with someone who uses silencing tactics.  Or better - it's
> > pointless.
>
> What are you talking about? Here is what you wrote to me:
>
>    http://lists.gnu.org/archive/html/guile-devel/2014-09/msg00036.html
>
> > Please understand that our lack of enthusiasm for it is not without
> > our own individual commitments to the security and freedoms of
> > users; rather while the extra assurances your project may provide
> > sound nice, they are frankly rather minimal and don't have much
> > additional utility, and so are not priority one.
>
> Now if you wanted me to continue answering your questions, then well,
> I'm sorry, I just got the wrong end of stick completely. I thought you
> wanted me to fuck off!
>

​I see.  I can be a bit literal sometimes.  I was being honest about my
priorities in response to a request to down-tools and work on this.



>
> So I replied as below. To you, privately. And that was it. Silencing
> tactics? You then copied it to the world, and a guy who I _told_ you,
> was deputy director of the NSA's National Computer Security Centre!
>
>
​I was talking about this:

​http://lists.gnu.org/archive/html/guile-devel/2014-09/msg00038.html

Maybe that's not what you were trying to do.

-- 
William Leslie

Notice:
Likely much of this email is, by the nature of copyright, covered under
copyright law.  You absolutely MAY reproduce any part of it in accordance
with the copyright law of the nation you are reading this in.  Any attempt
to DENY YOU THOSE RIGHTS would be illegal without prior contractual
agreement.

[-- Attachment #1.2: Type: text/html, Size: 12369 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Lightning mailing list
Lightning-mXXj517/zsQ@public.gmane.org
https://lists.gnu.org/mailman/listinfo/lightning

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
       [not found]           ` <20141006042323.GA31390-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>
@ 2014-10-07 17:18             ` Ian Grant
       [not found]               ` <CAKFjmdx+jzfapvrq6EEO8Skx2L2UZwi-DZ22xiq9t1438E7kOw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2014-10-08  3:55               ` Mike Gerwitz
  0 siblings, 2 replies; 27+ messages in thread
From: Ian Grant @ 2014-10-07 17:18 UTC (permalink / raw)
  To: Ian Grant, Mark H Weaver, Markus Kuhn, guile-devel, Mike Gerwitz,
	lightning, Richard Stallman

On Mon, Oct 6, 2014 at 12:23 AM, Mike Gerwitz <mikegerwitz-mXXj517/zsQ@public.gmane.org> wrote:
> On Sun, Oct 05, 2014 at 12:11:00PM -0400, Ian Grant wrote:
>> > As has been stated---your concerns are substantiated and understood,
>>
>> I wasn't aware that my concerns _have_ been substantiated. How? I am
>> not sure they have been understood, either.
>
> They were substantiated long ago by the very references you've provided.
> Your application to the GNU project doesn't make this a novel discussion.

This issue is not all of my concerns. This issue of insecure software
is a tiny part of the project.I am not "applying" for anything from
"the GNU project" I am _giving_ the GNU project a lifeline. I have
given all my ideas about how to write software properly, which you
desperately need to know. And I am giving you my time, teaching you
how to think about software and computing. I don't expect anything in
return. Some respect would be nice, but it's not essential, as you can
see.

> The problem is understood. A proper and **practical** solution to the
> problem may be less well understood, but you haven't yet convinced those in
> charge of making such decisions that this is a threat (a) greater than all
> other threats worth devoting time to protect against; and (b) worthy of such
> an enormous time investment at this time.

The problem is not understood by _you._ otherwise you wouldn't say it
was an enormous investment of time. We would have had half of it done
by now if people had responded to me constructively. But instead I
have had to spend over a month countering incoherent objections made
by smart-ass kids who can't tell one end of a computer from the other.

> It is worthy of addressing, and addressing it in the manner that you and
> Dijkstra suggest will certainly go a long way to preventing nearly all
> problems contributing to (a), but you need to take a different approach if
> you're going to incite such change.

Why do you think that?

>> Well my body is getting on for 50 years old, so it doesn't matter so
>> much. Here's a "better" example:
>>
>>    http://en.wikipedia.org/wiki/Aaron_Swartz
>
> I hope very much that using Aaron as an example is to further your argument
> that programming is a life-or-death situation, and that it's not a "better"
> example of personal neglect or instability.

No I did not mean it was a better example of personal neglect. But
arguably suicide *is* a better example of personal neglect and
instability. I wouldn't have been so crass as to say that though,
especially since I _sincerely doubt_ Aaron Swartz did commit suicide.
It is _much_ more likely that he was murdered.

> The former has no relation to the topic at hand, and the latter
> is---well, let's just hope that you poorly represented yourself.

Yes it _does_ have a relation to the topic at hand. I said "you are
combatants in a global information war which will cost some you your
lives" and you replied saying I was being melodramatic and that you
personally were sure there were a million things more likely to take
your life, and suggested that remaining seated for extended periods
and/or neglecting personal hygeine was one of them. Aaron's was a
better example of how this war could take someone's life. I  put
"better" in quotes, thinking people would know then that I didn't in
any way think that what happened to Aaron was in any way good.

>> > This argument is not valid---why is it hard to alter a PDF? In fact,
>> > PDF manipulation is a dark (and probably cancer-causing) art that's
>> > automated by countless businesses worldwide; it is a topic that eats
>> > up a significant portion of development time at my employer's office.
>>
>> It is valid. If you want to see why, then try to alter one of the PDFs
>> I've sent out.
>> What will you do when I later send out a series of checksums using a
>> checksum algorithm that neither you or anyone else ever heard of
>> before?
>
> Your argument favored PDF over plain text. The point that I had made was
> that you are unable to verify that two PDFs---unless binary
> identical---render text that is unchanged from the original. If I convert
> your PDF to a PNG and back to a PDF, it renders the same text. What does
> that mean to you? Does that mean that the text has been altered? Do you care
> that your work has been distribute in an alternative form? At what point
> does it become "modified"? And how can you distinguish reasonable
> alterations from unreasonable ones?

You can convert my PDF to anything you like, and if it represents the
same text, then I don't care. It would be a pointless thing to do
however, unless you wanted to defeat some automated mass-surveillance
system like Echelon.  What I don't want is people producing something
that says things I didn't say but which others might think I wrote. I
also don't want shoddy reproductions of text being passed around as
emails, accumulating quote marks and comments. I also want people to
see the text as I typeset it. I encode messages in these texts, and
they are only decodable from the typesetting.

> What does your checksum algorithm accomplish than existing checksum or
> hashing algorithms do not? Why can that same principle not be applied to
> plain text---why does the implementation of your checksum make it any more
> difficult to modify the PDF? And why is your checksum better than an
> established and heavily studied cryptographic standard (e.g. SHA-256)?

Do you mind me asking: do you have a university degree-level
qualification in computer science? If so, from which university is it?
(I don't have any such thing, in case you are wondering.)  Even
cryptographic checksums are not unique. So if you know what checksum
you are trying to defeat (MD5, say) you could make a new PDF with
different text, but which had the same checksum. The fact that this
wasn't obvious to you demonstrates something important: people
mistakenly use cryptography as a substitute for actual knowledge, even
when they _know_ they don't know how or why it actually works.

>> > Have you considered just distributing a GPG/PGP signature with your
>> > works, or even signing the work itself? After all, this whole
>> > discussion is about proving the unlikelihood of and preventing the
>> > modification of data.
>>
>> Of course I have thought of that, and rejected it as a pointless waste
>> of time! If you read what I've written, on this list
>
> You have written a lot on this list, and there has been an absurd amount of
> redundancy. I have not read it all.

The absurd amount of redundancy is because people like you and "Manicz
Patzigkeit" and William Leslie and Mark Weaver and Oleg and Richard
Stallman argue with me before they have understood what I've already
written on this list. It's a self-perpetuating error, it seems.

>> and also in that blog article Mark pointed up, you will see why I think
>> this. And even if public key crypto was positively secure, it is worthless
>> when I have no rational basis for confidence in the security of the system
>> which holds the private keys.
>
> In the case of PGP, you hold your private keys. I suppose if you're worried
> about a Thompson virus, you can't even provide yourself assurances of the
> security of your own system, but in such a severe case, there is nothing you
> do on your system that can be trusted, so surely you must not enter your
> password to access your e-mail on any of those systems. Because if you did,
> surely someone could just impersonate you to begin with, and there'd be no
> point in signing anything cryptographically, because such attempts would
> have been preempted.

I have zero confidence in the security of my own system. And not
because of a "Thompson virus" (It's not a virus, and it's no more
associated with Thompson than it is with the NSA.) it's because it's
mostly GNU software, which I know is so totally insecure you wouldn't
need a compiler trap door to get into any and every system.

> All that aside, it is still possible to host your private key on a machine
> that is wholly segregated from the rest of the world, perform your
> cryptographic operation on that, transfer its output (doesn't really matter
> how at this point, as long as it's output-only and cannot be used to
> compromise the machine---it only matters that the same output is supplied to
> each machine) to N machines with N different lineages (even if one is a
> PDP-11!), and verify the result of the cryptographic operation on those
> machines to ensure that your data were not modified in the process of
> applying the cryptographic operation.

What is amazing to me is that you think I don't know what you're
telling me. I have been writing software professionally for over 30
years, and I have been a unix sys-admin for fifteen years in an
institution where some of the secretaries apparently know more about
computing than you do.

> But we're not talking about such assurances here, so most of those steps
> would be unnecessary---upon receiving your signed request from the machine
> hosting your private key, it *does not matter* what its output is: the plain
> text is left, well, plain. You can verify it by hand. Then everyone else in
> the world with your public key can verify that it is indeed the signature
> produced by the same machine that produced the signature for all of your
> other archived messages all over the internet (mailing lists, your own blog
> entries, etc)---a task that can be trivially automated to scan for
> discrepancies.
>
> Your checksum would merely state "this PDF has not been modified". This
> would say "this PDF has not been modified and I am actually the same person
> that wrote all these other articles / e-mail messages, while claiming to be
> Ian". And even if you have no trust in the cryptography that links the
> signature to the private key, or have no trust in public key cryptography,
> the signature *still* contains a hash[0], and so you get the former
> assurance for free, assuming a [currently] trusted digest algorithm.
>
> So tell me: how does your yet-to-be-released checksum algorithm provide
> any better assurances than this?

I think I've explained that.

> Oh, but let's not stop there! Your signature relies on the security of the
> hashing algorithm---if a collision is found, what does that mean for
> cleartext signing? Well, if you had distributed a plain-text document in
> ASCII (and Unicode is almost certainly fine too), it's very highly likely,
> unless the hashing algorithm is just brutally and fundamentally flawed, that
> the collision will contain a lot of garbage---garbage that would be
> immediately noticeable. But if you distribute a PDF, it might be possible to
> hide all of that garbage within its constructs.

Let's do stop there. For your sake: you are in grave danger of making
an even bigger fool of yourself than you already have.

>> PK crypto, in this state we're in, has negative value: it just creates a
>> false sense oif security, and _discourages_ people from actually thinking
>> about the real problem because they "know" their comms. are secure. They
>> don't know that at all.
>
> I certainly hope that developers of systems that use public-key cryptography
> understand the obvious, fundamental principle that you described in the
> article that Mark linked.[1] Of course, I understand that this is not often
> the case, and it's certainly not the case for most *users* of the software.
> And that is dangerous, because GPG/PGP/PK-crypto users should certainly be
> aware that any message they encrypt, even if the private key is discarded,
> may very well one day be broken. And the adversary may be little 10yo Bobby
> on his modern-era PC in the not-too-distant future.

I sincerely doubt that developers of systems that use public-key
cryptography know that. In fact, I doubt anyone will be able to
provide me a reference to a commercially published source that clearly
and explicitly says essentially what I said there. I doubt anyone
could even give me a URL for a text that clearly and explicitly states
that. And to show it is widely acknowledged you would need to point to
dozens of instances.

> But for the purpose of *this* discussion, the integrity of your chosen PK
> algorithm isn't entirely relevant. Yes, your identity assurance is nice to
> have, but at the *very least*, your signature provides a hash that is
> verifiable.

I don't give a shit about my identity. It's the identity of the *text*
that's important.

> Let's not play the "well the software performing the verification may be
> compromised" card---it's still possible to manually perform the verification
> if you really felt it to be necessary, if the situation were dire enough.
>
> So again---have you considered distributing a signature with your documents?

No. You have not told me anything I didn't already know. But I hope
you appreciate I have told you things you should have known, had you
only thought a little about them.

>> I could ship them as low-res NG files. But really, it would be much
>> better just to get good working PDF viewers. There are _millions_ of
>> important documents which are in PDF form. We need to be able to
>> reliably verify and reference text within them. These include things
>> like the Intel architecture reference manuals.
>
> Certainly, and I'm not arguing that. I'm arguing against your assertion that
> PDFs are harder to modify than plain text and somehow provide greater
> assurances; I'm arguing the exact opposite.

PDFs _are_ harder to modify than plain text. If you really don't
believe me, then modify one of the PDFs I sent and tell me how easy it
was to do. And PDFs can provide greater assurances that the content is
original because they are structured documents, not linear strings of
characters. The same "multiple representations of a given text"
arguments apply to plain text files. PDFs also look better, and they
are easier for people to read. And some of them include complex
formatting that is more clearly expressed using high-quality
typesetting than linear text strings. PDFs can also be incrementally
and _verifiably_ modified.

It's really funny. You kids are all up in arms over an international
standard, meticulously defined, beautifully documented format that you
don't actually seem to know anything about. I suppose that's because
the specification is _quel horreur!_ a PDF file! And if you had just
done what I told you to do, and got a decent lexical analyzer
generator working for guile, and typed in the grammar, and got the
lightning interface for guile working,, you would have had a scheme
program that generated PDF reader programs by now. And that would have
given you a guile PDF display plugin for Firefox, for example.

>> I am well aware of this, and it is one reason why we need to be able
>> to read PDFs reliably. We can then vary the concrete representation
>> and defeat attempts at mass-surveillance and also prevent people from
>> using automated methods to edit the files in transit.
>
> Preventing file modifications in transit is a well-understood problem
> (and the problems with various implementations understood) unrelated to this
> one.

It is related, because those documents can be transformed if/when
necessary, and there are hopefully hundreds of copies around the place
by now, so it will be a bit hard for anyone to have them all deleted.

> But having a PDF reader that is generated from a formal definition won't
> prevent the problems that I described (although it'd prevent the majority of
> today's attacks against, say, well-known proprietary readers), because *PDF
> exists to present formatted text*, and there are countless ways to present
> what appears to be the exact same formatting. So the answer to the
> fundamental question "do PDFs X and Y present equivalent content?" is
> non-trivial.

You've lost the point of your argument now. This is incoherent. I am
not interested in automatically determining whether two PDFs represent
the same content, and I never was. That is formally undecidable, just
like William's "do programs X and Y present equivalent content?". It
requires actual human knowledge about what the glyphs represent.

>> And before some smart-ass says "Oh yeah, and how are you going to
>> verify the contents with cryptographic checksums?" I warn you to
>> _think_ first, because I will metaphorically tear you into fifty thin,
>> bloody little strips before you can say "Doh!"!
>
> I don't follow. Surely your method isn't novel.

Well if the method involved anything "novel" would you expect to be
able to know what it was just by thinking about it?

> Free Software Hacker | GNU Maintainer

GNU software needs a lot more developers and far fewer maintainers
than it has. We don't need to maintain crap, we need to replace it.

And maybe one-day you won't feel embarrassed to call yourself a
programmer, or a software engineer. Here's hoping, anyway.

Ian

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
       [not found]               ` <CAKFjmdx+jzfapvrq6EEO8Skx2L2UZwi-DZ22xiq9t1438E7kOw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-10-07 17:28                 ` Mark H Weaver
  2014-10-07 17:56                   ` Ian Grant
  0 siblings, 1 reply; 27+ messages in thread
From: Mark H Weaver @ 2014-10-07 17:28 UTC (permalink / raw)
  To: Ian Grant
  Cc: Mike Gerwitz, Markus Kuhn, lightning, Richard Stallman,
	guile-devel

Ian, please stop posting to guile-devel.  You've made your points, and
I've even called attention to what I think is the best exposition of
your ideas.  At this point you're just repeating yourself and hurling
gratuitous insults.  Enough!

     Mark

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-07 17:28                 ` Mark H Weaver
@ 2014-10-07 17:56                   ` Ian Grant
       [not found]                     ` <CAKFjmdwNTjJ7nU-rKEWkA+5whsfyrpqJ6RkhU+VRbUW6rqT03A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 27+ messages in thread
From: Ian Grant @ 2014-10-07 17:56 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Markus Kuhn, lightning, Richard Stallman, guile-devel

On Tue, Oct 7, 2014 at 1:28 PM, Mark H Weaver <mhw@netris.org> wrote:
> Ian, please stop posting to guile-devel.  You've made your points, and
> I've even called attention to what I think is the best exposition of
> your ideas.  At this point you're just repeating yourself and hurling
> gratuitous insults.  Enough!

Well, it's insulting when you speak to me like a child, and when you
make childish suggestions about my motives for posting to this list.
But I don't complain, and I certainly wouldn't attempt what William
calls "silencing tactics".
.
Remember, we're making a movie about this, and the whole world is
watching what we say to each other, and it is all a matter of public
record, distributed across thousands of independent machines,

I am responding constructively to questions asked me by a guile
developer who is also an official representative of the FSF. Will the
FSF prevent me from doing so on an FSF forum. And if so, will any
guile developers respond to the mails I sent regarding guile? The one
about throw-handlers, and the one about block-allocations of cons
cells, the one about a liightning interface for guile. (People will
want to know why you ignored that.) and the one about 50,000 lines of
shell script with no explanation for things like this which are
basically setting up for an exploit. Once this sets the environment
variable, any programs knows it's on a back-level Solaris install, and
can infer a catalogue of exploits. But why is this check necessary
anyway? I know this not guile-specific, but it relates to my original
suggestion which was to replace autoconf with an abstract prolog
database for inferring system properties from formal descriptions, and
which wouldn't be vulnerable to this sort of nonsense., This is
something to which guile is ideally suited.

In short, no I won't stop responding to people who make stupid
comments on this list, either about me, or things I've written on this
list.

Speaking of which, what is the name and version of the program that
your emacs uses for "pdf->png" conversion? Your report, blaming me for
sending bad PDF, indicates a fairly fundamental misunderstanding of
what a program meant to do when it reads a file that supposed to be in
a defined format.

And lastly, just be happy we're not discussing the FSFs 2013 financial
filing, which you presumably haven't read, because it's a PDF file ...

Ian

as_nl='
'
export as_nl
# Printing a long string crashes Solaris 7 /usr/bin/printf.
as_echo='\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\'
as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo
as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo$as_echo
# Prefer a ksh shell builtin over an external printf program on Solaris,
# but without wasting forks for bash or zsh.
if test -z "$BASH_VERSION$ZSH_VERSION" \
    && (test "X`print -r -- $as_echo`" = "X$as_echo") 2>/dev/null; then
  as_echo='print -r --'
  as_echo_n='print -rn --'
elif (test "X`printf %s $as_echo`" = "X$as_echo") 2>/dev/null; then
  as_echo='printf %s\n'
  as_echo_n='printf %s'
else
  if test "X`(/usr/ucb/echo -n -n $as_echo) 2>/dev/null`" = "X-n $as_echo"; then
    as_echo_body='eval /usr/ucb/echo -n "$1$as_nl"'
    as_echo_n='/usr/ucb/echo -n'
  else
    as_echo_body='eval expr "X$1" : "X\\(.*\\)"'
    as_echo_n_body='eval
      arg=$1;
      case $arg in #(
      *"$as_nl"*)
    expr "X$arg" : "X\\(.*\\)$as_nl";
    arg=`expr "X$arg" : ".*$as_nl\\(.*\\)"`;;
      esac;
      expr "X$arg" : "X\\(.*\\)" | tr -d "$as_nl"
    '
    export as_echo_n_body
    as_echo_n='sh -c $as_echo_n_body as_echo'
  fi
  export as_echo_body
  as_echo='sh -c $as_echo_body as_echo'
fi


>      Mark



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
       [not found]                     ` <CAKFjmdwNTjJ7nU-rKEWkA+5whsfyrpqJ6RkhU+VRbUW6rqT03A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-10-07 19:24                       ` Philip Herron
       [not found]                         ` <CAEvRbeoEJPTtoDu0nDudJyfBoaT1vpuvHzL=t+TkJr_ZGkzYEQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2014-10-08 18:26                       ` Mark H Weaver
  1 sibling, 1 reply; 27+ messages in thread
From: Philip Herron @ 2014-10-07 19:24 UTC (permalink / raw)
  To: Ian Grant
  Cc: Richard Stallman, guile-devel, Mark H Weaver, Markus Kuhn,
	Mike Gerwitz, lightning

I think what we want is your posts really are not 100% related to
development of the respective development. Its flooding mailboxes.

If you want to contribute, open source is a meritocracy we don't
really hold meetings talking about ideas. You implement something and
it goes from there. Your probably a great guy but your posts over the
last few months irritate my a lot because its just all open ended
discussion stuff.


On 7 October 2014 18:56, Ian Grant <ian.a.n.grant-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org> wrote:
> On Tue, Oct 7, 2014 at 1:28 PM, Mark H Weaver <mhw-StlzRsPvAncdnm+yROfE0A@public.gmane.org> wrote:
>> Ian, please stop posting to guile-devel.  You've made your points, and
>> I've even called attention to what I think is the best exposition of
>> your ideas.  At this point you're just repeating yourself and hurling
>> gratuitous insults.  Enough!
>
> Well, it's insulting when you speak to me like a child, and when you
> make childish suggestions about my motives for posting to this list.
> But I don't complain, and I certainly wouldn't attempt what William
> calls "silencing tactics".
> .
> Remember, we're making a movie about this, and the whole world is
> watching what we say to each other, and it is all a matter of public
> record, distributed across thousands of independent machines,
>
> I am responding constructively to questions asked me by a guile
> developer who is also an official representative of the FSF. Will the
> FSF prevent me from doing so on an FSF forum. And if so, will any
> guile developers respond to the mails I sent regarding guile? The one
> about throw-handlers, and the one about block-allocations of cons
> cells, the one about a liightning interface for guile. (People will
> want to know why you ignored that.) and the one about 50,000 lines of
> shell script with no explanation for things like this which are
> basically setting up for an exploit. Once this sets the environment
> variable, any programs knows it's on a back-level Solaris install, and
> can infer a catalogue of exploits. But why is this check necessary
> anyway? I know this not guile-specific, but it relates to my original
> suggestion which was to replace autoconf with an abstract prolog
> database for inferring system properties from formal descriptions, and
> which wouldn't be vulnerable to this sort of nonsense., This is
> something to which guile is ideally suited.
>
> In short, no I won't stop responding to people who make stupid
> comments on this list, either about me, or things I've written on this
> list.
>
> Speaking of which, what is the name and version of the program that
> your emacs uses for "pdf->png" conversion? Your report, blaming me for
> sending bad PDF, indicates a fairly fundamental misunderstanding of
> what a program meant to do when it reads a file that supposed to be in
> a defined format.
>
> And lastly, just be happy we're not discussing the FSFs 2013 financial
> filing, which you presumably haven't read, because it's a PDF file ...
>
> Ian
>
> as_nl='
> '
> export as_nl
> # Printing a long string crashes Solaris 7 /usr/bin/printf.
> as_echo='\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\'
> as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo
> as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo$as_echo
> # Prefer a ksh shell builtin over an external printf program on Solaris,
> # but without wasting forks for bash or zsh.
> if test -z "$BASH_VERSION$ZSH_VERSION" \
>     && (test "X`print -r -- $as_echo`" = "X$as_echo") 2>/dev/null; then
>   as_echo='print -r --'
>   as_echo_n='print -rn --'
> elif (test "X`printf %s $as_echo`" = "X$as_echo") 2>/dev/null; then
>   as_echo='printf %s\n'
>   as_echo_n='printf %s'
> else
>   if test "X`(/usr/ucb/echo -n -n $as_echo) 2>/dev/null`" = "X-n $as_echo"; then
>     as_echo_body='eval /usr/ucb/echo -n "$1$as_nl"'
>     as_echo_n='/usr/ucb/echo -n'
>   else
>     as_echo_body='eval expr "X$1" : "X\\(.*\\)"'
>     as_echo_n_body='eval
>       arg=$1;
>       case $arg in #(
>       *"$as_nl"*)
>     expr "X$arg" : "X\\(.*\\)$as_nl";
>     arg=`expr "X$arg" : ".*$as_nl\\(.*\\)"`;;
>       esac;
>       expr "X$arg" : "X\\(.*\\)" | tr -d "$as_nl"
>     '
>     export as_echo_n_body
>     as_echo_n='sh -c $as_echo_n_body as_echo'
>   fi
>   export as_echo_body
>   as_echo='sh -c $as_echo_body as_echo'
> fi
>
>
>>      Mark
>
> _______________________________________________
> Lightning mailing list
> Lightning-mXXj517/zsQ@public.gmane.org
> https://lists.gnu.org/mailman/listinfo/lightning

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
       [not found]                         ` <CAEvRbeoEJPTtoDu0nDudJyfBoaT1vpuvHzL=t+TkJr_ZGkzYEQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-10-07 19:47                           ` Ian Grant
  0 siblings, 0 replies; 27+ messages in thread
From: Ian Grant @ 2014-10-07 19:47 UTC (permalink / raw)
  To: Philip Herron
  Cc: Richard Stallman, guile-devel, Mark H Weaver, Markus Kuhn,
	Mike Gerwitz, lightning

On Tue, Oct 7, 2014 at 3:24 PM, Philip Herron
<herron.philip-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org> wrote:
> I think what we want is your posts really are not 100% related to
> development of the respective development. Its flooding mailboxes.

Yes, well, mailboxes get flooded ... Haven't you got a kill list, or a
filter that says "Ian Grant's crap."

The reason it gets flooded is that people keep writing unreasonable
things about me, or about what I've written, on this list, (and
sometimes don't even send them directly to me (I am not subscribed)).
I think I have the right to respond to these things, since they appear
on a public list. If you think otherwise, then you are welcome to tell
me why. If you don't want me to reply to the list, then send me a
_private_ mail, clearly giving the reasons why you think I don't have
this right. If you send it to the list, all the better, but consider
that my reply may also go the list, and thereby be the cause of more
traffic that _others_ may not be interested in reading.

> If you want to contribute, open source is a meritocracy we don't
> really hold meetings talking about ideas. You implement something and
> it goes from there. Your probably a great guy but your posts over the
> last few months irritate my a lot because its just all open ended
> discussion stuff.

You read it all then? A lot of it is "all open ended discussion", and
that is what irritates me too. I posted 8 pages explaining why we need
to have these discussions, so that we can steer separate projects in
mutually beneficial directions. We need to discuss these possible
directions in public, and in a cooperative atmosphere of intelligent
mutual exploration of possibilities, with contributions from all the
different projects.

It's no good just individually "hacking code" and sending in patches,
because the overall direction gets lost, and people don't benefit from
working together, they end up working alone, and competing for
developer attention against other developers on the same project!!

And it's clear to me that compared to how this project worked sixteen
or so years ago, it has completely lost its direction, and it has
hardly progressed at all.  And I daresay this will provoke screams of
outrage from dozens of people that didn't even know what guile was
sixteen years ago, .... but that's "hackers" for you!

Ian

> On 7 October 2014 18:56, Ian Grant <ian.a.n.grant-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org> wrote:
>> On Tue, Oct 7, 2014 at 1:28 PM, Mark H Weaver <mhw-StlzRsPvAncdnm+yROfE0A@public.gmane.org> wrote:
>>> Ian, please stop posting to guile-devel.  You've made your points, and
>>> I've even called attention to what I think is the best exposition of
>>> your ideas.  At this point you're just repeating yourself and hurling
>>> gratuitous insults.  Enough!
>>
>> Well, it's insulting when you speak to me like a child, and when you
>> make childish suggestions about my motives for posting to this list.
>> But I don't complain, and I certainly wouldn't attempt what William
>> calls "silencing tactics".
>> .
>> Remember, we're making a movie about this, and the whole world is
>> watching what we say to each other, and it is all a matter of public
>> record, distributed across thousands of independent machines,
>>
>> I am responding constructively to questions asked me by a guile
>> developer who is also an official representative of the FSF. Will the
>> FSF prevent me from doing so on an FSF forum. And if so, will any
>> guile developers respond to the mails I sent regarding guile? The one
>> about throw-handlers, and the one about block-allocations of cons
>> cells, the one about a liightning interface for guile. (People will
>> want to know why you ignored that.) and the one about 50,000 lines of
>> shell script with no explanation for things like this which are
>> basically setting up for an exploit. Once this sets the environment
>> variable, any programs knows it's on a back-level Solaris install, and
>> can infer a catalogue of exploits. But why is this check necessary
>> anyway? I know this not guile-specific, but it relates to my original
>> suggestion which was to replace autoconf with an abstract prolog
>> database for inferring system properties from formal descriptions, and
>> which wouldn't be vulnerable to this sort of nonsense., This is
>> something to which guile is ideally suited.
>>
>> In short, no I won't stop responding to people who make stupid
>> comments on this list, either about me, or things I've written on this
>> list.
>>
>> Speaking of which, what is the name and version of the program that
>> your emacs uses for "pdf->png" conversion? Your report, blaming me for
>> sending bad PDF, indicates a fairly fundamental misunderstanding of
>> what a program meant to do when it reads a file that supposed to be in
>> a defined format.
>>
>> And lastly, just be happy we're not discussing the FSFs 2013 financial
>> filing, which you presumably haven't read, because it's a PDF file ...
>>
>> Ian
>>
>> as_nl='
>> '
>> export as_nl
>> # Printing a long string crashes Solaris 7 /usr/bin/printf.
>> as_echo='\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\'
>> as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo
>> as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo$as_echo
>> # Prefer a ksh shell builtin over an external printf program on Solaris,
>> # but without wasting forks for bash or zsh.
>> if test -z "$BASH_VERSION$ZSH_VERSION" \
>>     && (test "X`print -r -- $as_echo`" = "X$as_echo") 2>/dev/null; then
>>   as_echo='print -r --'
>>   as_echo_n='print -rn --'
>> elif (test "X`printf %s $as_echo`" = "X$as_echo") 2>/dev/null; then
>>   as_echo='printf %s\n'
>>   as_echo_n='printf %s'
>> else
>>   if test "X`(/usr/ucb/echo -n -n $as_echo) 2>/dev/null`" = "X-n $as_echo"; then
>>     as_echo_body='eval /usr/ucb/echo -n "$1$as_nl"'
>>     as_echo_n='/usr/ucb/echo -n'
>>   else
>>     as_echo_body='eval expr "X$1" : "X\\(.*\\)"'
>>     as_echo_n_body='eval
>>       arg=$1;
>>       case $arg in #(
>>       *"$as_nl"*)
>>     expr "X$arg" : "X\\(.*\\)$as_nl";
>>     arg=`expr "X$arg" : ".*$as_nl\\(.*\\)"`;;
>>       esac;
>>       expr "X$arg" : "X\\(.*\\)" | tr -d "$as_nl"
>>     '
>>     export as_echo_n_body
>>     as_echo_n='sh -c $as_echo_n_body as_echo'
>>   fi
>>   export as_echo_body
>>   as_echo='sh -c $as_echo_body as_echo'
>> fi
>>
>>
>>>      Mark
>>
>> _______________________________________________
>> Lightning mailing list
>> Lightning-mXXj517/zsQ@public.gmane.org
>> https://lists.gnu.org/mailman/listinfo/lightning

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
  2014-10-07 17:18             ` Ian Grant
       [not found]               ` <CAKFjmdx+jzfapvrq6EEO8Skx2L2UZwi-DZ22xiq9t1438E7kOw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-10-08  3:55               ` Mike Gerwitz
  1 sibling, 0 replies; 27+ messages in thread
From: Mike Gerwitz @ 2014-10-08  3:55 UTC (permalink / raw)
  To: Ian Grant
  Cc: Mark H Weaver, Markus Kuhn, lightning, Richard Stallman,
	guile-devel

[-- Attachment #1: Type: text/plain, Size: 6047 bytes --]

To limit your attack area for a response (out of respect for the
recipients---feel free to continue this conversation with me in private) I'm
going to keep this relatively brief.

On Tue, Oct 07, 2014 at 01:18:31PM -0400, Ian Grant wrote:
> The problem is not understood by _you._ otherwise you wouldn't say it
> was an enormous investment of time.

Then do it. You already have a large time investment on these lists.

> Even cryptographic checksums are not unique. So if you know what checksum
> you are trying to defeat (MD5, say) you could make a new PDF with
> different text, but which had the same checksum. The fact that this wasn't
> obvious to you demonstrates something important

I'm not sure how you even remotely got that impression, especially
considering that collisions were at the core of one of my arguments.

> I have zero confidence in the security of my own system. And not
> because of a "Thompson virus" (It's not a virus, and it's no more
> associated with Thompson than it is with the NSA.) it's because it's
> mostly GNU software, which I know is so totally insecure you wouldn't
> need a compiler trap door to get into any and every system.

Perhaps you should spend less time complaining about it and submit patches.
Even if they implement your suggestions---which you stated isn't much of a
time investment anyway---you will have worked much further toward solving
the problem than you already have.

If the system is so fundamentally insecure, why has that not been
aggressively demonstrated by crackers, and why have you not demonstrated
numerous exploits?

> > So tell me: how does your yet-to-be-released checksum algorithm provide
> > any better assurances than this?
> 
> I think I've explained that.

No, you have explained that you think I don't know what I'm talking about,
without producing any actual content or substantiating your opinion.

> > I certainly hope that developers of systems that use public-key cryptography
> > understand the obvious, fundamental principle that you described in the
> > article that Mark linked.[...]
> 
> I sincerely doubt that developers of systems that use public-key
> cryptography know that. In fact, I doubt anyone will be able to
> provide me a reference to a commercially published source that clearly
> and explicitly says essentially what I said there. I doubt anyone
> could even give me a URL for a text that clearly and explicitly states
> that. And to show it is widely acknowledged you would need to point to
> dozens of instances.

The link itself[8] contains a reference (one of Schneier's books). In fact,
you didn't provide any actual details aside from a mention of "the
factorization problem"---just generalizations.

  "Computing discrete logarithms is believed to be difficult. No efficient
  general method for computing discrete logarithms on conventional computers
  is known, and several important algorithms in public-key cryptography base
  their security on the assumption that the discrete logarithm problem has
  no efficient solution."[0]

The discrete logarithm problem is also mentioned, directly or indirectly, by
[1], [2], [3], [4], and is broken by [5], but a mention of how to avoid the
problems caused by Shor's attack on elliptic curve cryptography is mentioned
in [6]. Shor's paper mentioning the issue is at [7].

I intentionally linked a large number of Wikipedia articles to show its
ubiquity in such a common public resource. It is *core* to the discussion of
the Diffie-Hellman probrem (and, consequently, ElGamal). Research into
the discovery of primes dates back to ancient times (an example being
Eratosthenes of Cyrene); it's an elementary topic that naturally segues
into, at the very least, limitations of modern algorithms and future
outlook.

I'm not sure why you think that this is some secret.

> No. You have not told me anything I didn't already know. But I hope
> you appreciate I have told you things you should have known, had you
> only thought a little about them.

You did tell me something: that you were too busy trying to poke holes in my
response that you didn't take the time to grok what I had written, and
dismissed it as anything but ammunition against me.

This is precisely why you are being pushed away or ignored. You have no
intent to *actually help*, or you would have.

> [...],you would have had a scheme program that generated PDF reader
> programs by now. And that would have given you a guile PDF display plugin
> for Firefox, for example.

I guess we're just not that sweet, huh?

But you sound like you are. Perhaps you could implement this for us as a
demonstration of the power of the methodology you tout.

> And maybe one-day you won't feel embarrassed to call yourself a
> programmer, or a software engineer. Here's hoping, anyway.

You seem to be assuming that you have somehow made myself---or anyone else,
for that matter---feel embarrassed. The GNU Project is composed of, like it
or not, individuals that are *proud* to be part of the prestigious (and
the original) free operating system. You have falsely assumed that a lack of
knowledge on a narrow subset of computer science is somehow indicative of
incompetence, and that you somehow are of superior competence *because* of
knowledge in those areas.


[0]: http://en.wikipedia.org/wiki/Discrete_logarithm_problem
[1]: http://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange
[2]: http://en.wikipedia.org/wiki/Public_key_cryptography
[3]: http://en.wikipedia.org/wiki/Integer_factorization
[4]: http://en.wikipedia.org/wiki/RSA_(algorithm)
[5]: http://en.wikipedia.org/wiki/Shor%27s_algorithm
[6]: http://en.wikipedia.org/wiki/Elliptic_curve_cryptography
[7]: http://arxiv.org/abs/quant-ph/9508027
[8]: http://livelogic.blogspot.dk/2014/09/the-free-fair-and-forged.html

-- 
Mike Gerwitz
Free Software Hacker | GNU Maintainer
http://mikegerwitz.com
FSF Member #5804 | GPG Key ID: 0x8EE30EAB

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Verifying Toolchain Semantics
       [not found]                     ` <CAKFjmdwNTjJ7nU-rKEWkA+5whsfyrpqJ6RkhU+VRbUW6rqT03A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2014-10-07 19:24                       ` Philip Herron
@ 2014-10-08 18:26                       ` Mark H Weaver
  1 sibling, 0 replies; 27+ messages in thread
From: Mark H Weaver @ 2014-10-08 18:26 UTC (permalink / raw)
  To: Ian Grant
  Cc: Mike Gerwitz, Markus Kuhn, lightning, Richard Stallman,
	guile-devel

Ian Grant <ian.a.n.grant-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org> writes:

> On Tue, Oct 7, 2014 at 1:28 PM, Mark H Weaver <mhw-StlzRsPvAncdnm+yROfE0A@public.gmane.org> wrote:
>> Ian, please stop posting to guile-devel.  You've made your points, and
>> I've even called attention to what I think is the best exposition of
>> your ideas.  At this point you're just repeating yourself and hurling
>> gratuitous insults.  Enough!
>
> Well, it's insulting when you speak to me like a child, and when you
> make childish suggestions about my motives for posting to this list.

I don't see how either of these claims applies to what I wrote above.
I do, however, feel as though you're treating all of us in an extremely
condescending manner.  You're also acting like a bully.

But more relevantly to the points you wish to discuss: you have not
adequately responded to the excellent points by William ML Leslie.
Specifically, he made a persuasive argument that semantic fixed-points
are still vulnerable to variants of the Thompson hack.  You have failed
to explain why this is not the case.

And what about the hardware?  Even setting aside blatant back doors like
Intel ME/AMT, the semantics of our hardware is unknown to us.  In case
you don't know, there's a huge amount of proprietary software running
inside of every mainstream processor, which implements the "machine
code" in terms of a lower level "microcode", and we don't know the
semantics of the microcode either.  The actual hardware designs are
generated by something very analogous to software, written in languages
like Verilog and VHDL.

Personally, I won't have much confidence in our machines until they are
based on free designs all the way down to the physical transistors, and
when we have widely available tools to verify that a physical machine
precisely matches the published designs.  If we must destroy a machine
to verify it, we can still acquire N identical machines and randomly
choose some subset of them to verify.

Now, suppose we accomplish this, and *every* part of the implementation
down to the physical layer was automatically generated from formal
specifications.  FWIW, I think this would be quite satisfactory, but I
still don't understand how you'd know anything at all about the
semantics of the resulting machine, by the standards you've set for us.

The problem you'd then run into is that we cannot be sure what the
physical laws actually are, as you've already pointed out.  But suppose
for the moment that Quantum Electrodynamics (QED) truly is an adequate
theory to model our machine, as I've been led to believe.  We still have
the problems that (1) we lack the means to solve the equations for any
system of non-trivial size, (2) we can't know whether the actual
physical configuration of our machine corresponds to what we've been
told about it to the level relevant to QED, and (3) even if we had
solutions to (1) and (2), we cannot produce a machine that will be
completely reliable, because quantum mechanics only allows us to compute
the *probabilities* of various outcomes.

In summary, given our current understanding of physical law, we have no
way (even in principle) to build a machine that reliably follows our
instructions.

And if you really want to blow your mind, consider that we have no
knowledge whatsoever of whether the "die rolls" in quantum mechanics are
truly random.  All we have is *absence of knowledge* of any discernable
pattern to these die rolls.

For all we know, our entire universe could be running as a simulation
inside of a machine in some other universe, and the beings who control
that simulation are rolling the dice for us.  They could then control
*anything* in our universe that they wish to, simply by selecting the
results of those die rolls in carefully selected situations of
importance.

Or perhaps it's just your own mind that is in a simulator, and all of us
are just puppets designed to perform experiments on you as part of some
evolutionary programming process, to create a better tool for some
larger machine.

The unfortunate fact is that none of us has truly reliable knowledge of
anything at all.  I think you are understandably uncomfortable with
this, and are trying mightily to establish a solid base on which to
stand.  I sympathize with this, believe me, but I think it's impossible,
even in principle.

Anyway, coming back down to the realm of realistic vulnerabilities that
we have any hope of addressing: the truth of the matter is that we'll
have to fix more than just our computers.  We'll have to fix
*ourselves*.  How long do you suppose it will be before it is feasible
for someone to introduce some nanobots into your body somehow -- the
possibilities are nearly endless; I've heard of research into using
mosquitoes, but they could also be introduced by food or water -- and
then using them to rewire your neural connections to change your mind in
potentially arbitrary ways?

> I am responding constructively to questions asked me by a guile
> developer who is also an official representative of the FSF. Will the
> FSF prevent me from doing so on an FSF forum.

I don't think anyone could reasonably claim that we haven't given you
ample freedom to post on our forums.  There are limits, however.

> And if so, will any guile developers respond to the mails I sent
> regarding guile?

I intend to do so, yes.  Please understand that we are very busy people,
and that we are volunteers.

You should also understand that when you post such a large volume of
highly redundant and often gratuitously insulting material, people are
likely to filter you out (mentally if not mechanically), and then
everything you write will be ignored, including posts that are relevant
to practical issues concerning Guile.

> Speaking of which, what is the name and version of the program that
> your emacs uses for "pdf->png" conversion? Your report, blaming me for
> sending bad PDF, indicates a fairly fundamental misunderstanding of
> what a program meant to do when it reads a file that supposed to be in
> a defined format.

Sorry, this was my mistake.  Your blog post included a URL that ended
with ".pdf", so I assumed it would actually be a PDF.  Turns out it was
actually a web page with more links to download the actual PDF.

So, what I had was HTML in a file with a .pdf extension.  The program
that crashed when fed this HTML file was GNU Ghostscript 9.06.0.  I've
since switched to another PDF renderer.

     Mark

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2014-10-08 18:26 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-02 14:54 Verifying Toolchain Semantics Ian Grant
2014-10-03  6:23 ` Mark H Weaver
2014-10-03  7:15   ` William ML Leslie
2014-10-03 12:56     ` Taylan Ulrich Bayirli/Kammer
2014-10-03 17:13       ` Mark H Weaver
2014-10-05 13:57       ` William ML Leslie
2014-10-03  8:45   ` Nala Ginrut
2014-10-05  1:35   ` Ian Grant
2014-10-05  5:39     ` Nala Ginrut
2014-10-05 14:14       ` Ian Grant
2014-10-05 15:15         ` Nala Ginrut
2014-10-05 16:24           ` Ian Grant
2014-10-06  8:25             ` Nala Ginrut
2014-10-05  6:58     ` Mike Gerwitz
2014-10-05 16:11       ` Ian Grant
2014-10-06  4:23         ` Mike Gerwitz
     [not found]           ` <20141006042323.GA31390-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>
2014-10-07 17:18             ` Ian Grant
     [not found]               ` <CAKFjmdx+jzfapvrq6EEO8Skx2L2UZwi-DZ22xiq9t1438E7kOw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-10-07 17:28                 ` Mark H Weaver
2014-10-07 17:56                   ` Ian Grant
     [not found]                     ` <CAKFjmdwNTjJ7nU-rKEWkA+5whsfyrpqJ6RkhU+VRbUW6rqT03A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-10-07 19:24                       ` Philip Herron
     [not found]                         ` <CAEvRbeoEJPTtoDu0nDudJyfBoaT1vpuvHzL=t+TkJr_ZGkzYEQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-10-07 19:47                           ` Ian Grant
2014-10-08 18:26                       ` Mark H Weaver
2014-10-08  3:55               ` Mike Gerwitz
  -- strict thread matches above, loose matches on Subject: below --
2014-10-05 17:42 Ian Grant
2014-10-05 18:19 ` Ian Grant
2014-10-06  0:30 Ian Grant
     [not found] ` <CAKFjmdzxAMvcry8N6B_atM_8vGyzA1Dfz9ygWxSgh3fD7EUAuQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-10-06  8:51   ` William ML Leslie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).