unofficial mirror of help-guix@gnu.org 
 help / color / mirror / Atom feed
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: Christoph Buck <dev@icepic.de>
Cc: Zack Weinberg <zack@owlfolio.org>,
	 Efraim Flashner <efraim@flashner.co.il>,
	 help-guix@gnu.org
Subject: Re: ABI mismatch on boot on arm32 system
Date: Wed, 04 Dec 2024 20:50:44 +0900	[thread overview]
Message-ID: <87y10vll9n.fsf@gmail.com> (raw)
In-Reply-To: <87jzdg4q1y.fsf@icepic.de> (Christoph Buck's message of "Wed, 06 Nov 2024 11:25:29 +0100")

Hi Christoph,

Sorry for my delayed reply.

Christoph Buck <dev@icepic.de> writes:

> Hi Guix!
>
> So i looked into the guile source code and, as expected, the `scm_hash`
> function (see hash.c in guile) uses `unsigned long` wich is 8 bytes on
> x64 and 4 bytes on arm32/i868. If `string-hash` is called with the size
> parameter `n`, the hash value is limited to size by calculating the
> modulo `n` of the hash value, see scm_ihash in hash.c:440, namely
>
>> (unsigned long) scm_raw_ihash (obj, 10) % n
>
> (The `10` can be ignored as far as i can tell). Since the hash values
> are different on different platforms the modulo is different as well.
>
> However, if one steps through the call stack of `string-hash` you can
> see that the actual hash value is calculated by the
> `JENKINS_LOOKUP3_HASHWORD2` macro, which contains are rather
> interesting comment and a possible workaround for the abi problem,
> namely
>
> /* Scheme can access symbol-hash, which exposes this value.  For    \
>    cross-compilation reasons, we ensure that the high 32 bits of    \
>    the hash on a 64-bit system are equal to the hash on a 32-bit    \
>    system.  The low 32 bits just add more entropy.  */              \
> if (sizeof (ret) == 8)                                              \
>     ret = (((unsigned long) c) << 32) | b;                          \
> else                                                                \
>     ret = c;                                                        \
>
>
> in hash.c:82.
>
> Meaning, if executed on a x64 platform, the higher 32bit of the
> resulting 64bit hash result are equal to the hash value on a 32bit
> platform. A simple test case in c++ looks like this:
>
> int main(int args, char** argv)
> {
>     scm_init_guile();
>     auto strToHash = scm_from_locale_string ("((device) (mount-point))");
>     auto maxULong = scm_from_ulong(ULONG_MAX);
>     auto hashResult = scm_hash(strToHash,maxULong);
>     auto hashResultUL = scm_to_ulong(hashResult);
>     std::cout << "Max ULONG_MAX: " << ULONG_MAX <<std::endl;
>     std::cout << "Original hashResult ulong: " << hashResultUL << std::endl;
>
>     if(sizeof(hashResultUL) == 8)
>     {
>         std::cout << "Corrected for 32bit: " << (hashResultUL >> 32) << std::endl;
>     }
> }
>
>
> which results on x64 in
>
>> Max ULONG_MAX: 18446744073709551615
>> Original hashResult ulong: 10454028974864831
>> Corrected for 32bit: 2434018
>
> and on arm32 to
>
>> Max ULONG_MAX: 4294967295
>> Original hashResult ulong: 2434018
>
> This suggest the following workaround. Always limit the hash size to
> 32bit even if executed on a 64bit platform (or to be more specific a
> platform where ulong is 8bytes big). Do this by right shift the hash
> value 32bits and don't rely on the size parameter of the `string-hash`
> function.
>
> In code it could look something like this
>
> (define (compute-abi-cookie field-specs)
>     ;; Compute an "ABI cookie" for the given FIELD-SPECS.  We use
>     ;; 'string-hash' because that's a better hash function that 'hash' on a
>     ;; list of symbols.
>     (let ((hash
>            (syntax-case field-specs ()
>              (((field get properties ...) ...)
>               (let ((hash-value (string-hash (object->string
>                                               (syntax->datum #'((field properties ...) ...))))))
>                 (if (= (native-word-size) 8)
>                     (ash hash-value -32)
>                     hash-value)))))
>           (fd (syntax-case field-specs ()
>                 (((field get properties ...) ...)
>                  (object->string
>                   (syntax->datum #'((field properties ...) ...)))))))
>       
>       (format #t "Compute-abi-cookie: ~a~%" hash)
>       hash))
>
>
> where `native-word-size` is define by 
>
> (define (native-word-size)
>   ((@ (system foreign) sizeof) '*))

This is a thorough investigation, and the above fix/workaround that can
be applied to Guix is the cherry on the cake!  Thank you for producing
it.

> (taken from `cross-compilation.test`). There might be a cleaner way to
> formulate this, but you get the point.
>
> This seems to work for all combinations on my machine. I tested
> x64 -> arm, x64 -> i868, i868 -> x64...
>
> I can only think of two drawbacks.
>
> 1) Lost entropy on 64 bit machines
> 2) Abi break because on new compilation the hash values on 64bit
>    platforms will change.
>
> 1) is imho irrelevant, because it is not cryptophically important. For
> 2) i am not sure how important this is.
>
> Any thoughts on this?

I don't think 1 or 2 are too important; about 2 for example, we break
the ABI every time we add or remove a new record field.

> Might this be something worth fixing and sending a patch in?

Totally, if you haven't done so already.

-- 
Thanks,
Maxim


  parent reply	other threads:[~2024-12-04 11:51 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-16 10:11 ABI mismatch on boot on arm32 system Christoph Buck
2024-10-16 20:05 ` Richard Sent
2024-10-20 15:15   ` Christoph Buck
2024-10-18 20:58 ` Denis 'GNUtoo' Carikli
2024-10-20 15:23 ` Christoph Buck
2024-10-20 15:39   ` Zack Weinberg
2024-10-20 17:24     ` Christoph Buck
2024-10-21  9:55       ` Christoph Buck
2024-10-29 17:11         ` Christoph Buck
2024-10-30  7:09           ` Efraim Flashner
2024-10-30 13:24             ` Christoph Buck
2024-11-06 10:25               ` Christoph Buck
2024-11-11  7:47                 ` Christoph Buck
2024-12-04 11:50                 ` Maxim Cournoyer [this message]
2024-12-07 13:44                   ` Christoph Buck
2024-12-09  0:44                     ` Maxim Cournoyer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87y10vll9n.fsf@gmail.com \
    --to=maxim.cournoyer@gmail.com \
    --cc=dev@icepic.de \
    --cc=efraim@flashner.co.il \
    --cc=help-guix@gnu.org \
    --cc=zack@owlfolio.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).