From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: Christoph Buck <dev@icepic.de>
Cc: Zack Weinberg <zack@owlfolio.org>,
Efraim Flashner <efraim@flashner.co.il>,
help-guix@gnu.org
Subject: Re: ABI mismatch on boot on arm32 system
Date: Wed, 04 Dec 2024 20:50:44 +0900 [thread overview]
Message-ID: <87y10vll9n.fsf@gmail.com> (raw)
In-Reply-To: <87jzdg4q1y.fsf@icepic.de> (Christoph Buck's message of "Wed, 06 Nov 2024 11:25:29 +0100")
Hi Christoph,
Sorry for my delayed reply.
Christoph Buck <dev@icepic.de> writes:
> Hi Guix!
>
> So i looked into the guile source code and, as expected, the `scm_hash`
> function (see hash.c in guile) uses `unsigned long` wich is 8 bytes on
> x64 and 4 bytes on arm32/i868. If `string-hash` is called with the size
> parameter `n`, the hash value is limited to size by calculating the
> modulo `n` of the hash value, see scm_ihash in hash.c:440, namely
>
>> (unsigned long) scm_raw_ihash (obj, 10) % n
>
> (The `10` can be ignored as far as i can tell). Since the hash values
> are different on different platforms the modulo is different as well.
>
> However, if one steps through the call stack of `string-hash` you can
> see that the actual hash value is calculated by the
> `JENKINS_LOOKUP3_HASHWORD2` macro, which contains are rather
> interesting comment and a possible workaround for the abi problem,
> namely
>
> /* Scheme can access symbol-hash, which exposes this value. For \
> cross-compilation reasons, we ensure that the high 32 bits of \
> the hash on a 64-bit system are equal to the hash on a 32-bit \
> system. The low 32 bits just add more entropy. */ \
> if (sizeof (ret) == 8) \
> ret = (((unsigned long) c) << 32) | b; \
> else \
> ret = c; \
>
>
> in hash.c:82.
>
> Meaning, if executed on a x64 platform, the higher 32bit of the
> resulting 64bit hash result are equal to the hash value on a 32bit
> platform. A simple test case in c++ looks like this:
>
> int main(int args, char** argv)
> {
> scm_init_guile();
> auto strToHash = scm_from_locale_string ("((device) (mount-point))");
> auto maxULong = scm_from_ulong(ULONG_MAX);
> auto hashResult = scm_hash(strToHash,maxULong);
> auto hashResultUL = scm_to_ulong(hashResult);
> std::cout << "Max ULONG_MAX: " << ULONG_MAX <<std::endl;
> std::cout << "Original hashResult ulong: " << hashResultUL << std::endl;
>
> if(sizeof(hashResultUL) == 8)
> {
> std::cout << "Corrected for 32bit: " << (hashResultUL >> 32) << std::endl;
> }
> }
>
>
> which results on x64 in
>
>> Max ULONG_MAX: 18446744073709551615
>> Original hashResult ulong: 10454028974864831
>> Corrected for 32bit: 2434018
>
> and on arm32 to
>
>> Max ULONG_MAX: 4294967295
>> Original hashResult ulong: 2434018
>
> This suggest the following workaround. Always limit the hash size to
> 32bit even if executed on a 64bit platform (or to be more specific a
> platform where ulong is 8bytes big). Do this by right shift the hash
> value 32bits and don't rely on the size parameter of the `string-hash`
> function.
>
> In code it could look something like this
>
> (define (compute-abi-cookie field-specs)
> ;; Compute an "ABI cookie" for the given FIELD-SPECS. We use
> ;; 'string-hash' because that's a better hash function that 'hash' on a
> ;; list of symbols.
> (let ((hash
> (syntax-case field-specs ()
> (((field get properties ...) ...)
> (let ((hash-value (string-hash (object->string
> (syntax->datum #'((field properties ...) ...))))))
> (if (= (native-word-size) 8)
> (ash hash-value -32)
> hash-value)))))
> (fd (syntax-case field-specs ()
> (((field get properties ...) ...)
> (object->string
> (syntax->datum #'((field properties ...) ...)))))))
>
> (format #t "Compute-abi-cookie: ~a~%" hash)
> hash))
>
>
> where `native-word-size` is define by
>
> (define (native-word-size)
> ((@ (system foreign) sizeof) '*))
This is a thorough investigation, and the above fix/workaround that can
be applied to Guix is the cherry on the cake! Thank you for producing
it.
> (taken from `cross-compilation.test`). There might be a cleaner way to
> formulate this, but you get the point.
>
> This seems to work for all combinations on my machine. I tested
> x64 -> arm, x64 -> i868, i868 -> x64...
>
> I can only think of two drawbacks.
>
> 1) Lost entropy on 64 bit machines
> 2) Abi break because on new compilation the hash values on 64bit
> platforms will change.
>
> 1) is imho irrelevant, because it is not cryptophically important. For
> 2) i am not sure how important this is.
>
> Any thoughts on this?
I don't think 1 or 2 are too important; about 2 for example, we break
the ABI every time we add or remove a new record field.
> Might this be something worth fixing and sending a patch in?
Totally, if you haven't done so already.
--
Thanks,
Maxim
next prev parent reply other threads:[~2024-12-04 11:51 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-16 10:11 ABI mismatch on boot on arm32 system Christoph Buck
2024-10-16 20:05 ` Richard Sent
2024-10-20 15:15 ` Christoph Buck
2024-10-18 20:58 ` Denis 'GNUtoo' Carikli
2024-10-20 15:23 ` Christoph Buck
2024-10-20 15:39 ` Zack Weinberg
2024-10-20 17:24 ` Christoph Buck
2024-10-21 9:55 ` Christoph Buck
2024-10-29 17:11 ` Christoph Buck
2024-10-30 7:09 ` Efraim Flashner
2024-10-30 13:24 ` Christoph Buck
2024-11-06 10:25 ` Christoph Buck
2024-11-11 7:47 ` Christoph Buck
2024-12-04 11:50 ` Maxim Cournoyer [this message]
2024-12-07 13:44 ` Christoph Buck
2024-12-09 0:44 ` Maxim Cournoyer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://guix.gnu.org/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87y10vll9n.fsf@gmail.com \
--to=maxim.cournoyer@gmail.com \
--cc=dev@icepic.de \
--cc=efraim@flashner.co.il \
--cc=help-guix@gnu.org \
--cc=zack@owlfolio.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).