Hi Ludovic, Ludovic Courtès writes: > Please add a “Fixes” line in the commit log. Done, thanks. > I’m not reviewing the code in depth and I trust your judgment. > > The risks of bugs I can think of are: missed ASCII references (a > regression, whereby some ASCII references would not get rewritten), This is indeed a risk whenever we modify the grafting code, which is written for efficiency, not clarity. I've tried to be careful, and have checked that my newly grafted system and user profiles do not retain references to ungrafted code, modulo the following pre-existing issues: (ibus-daemon launches ungrafted subprocesses) (Chunked store references in .zo files in Racket 8) > and unrelated UTF-{16,32}-base32-looking references getting rewritten. > > I guess the latter is very unlikely because only sequences found in the > replacement table may be rewritten, right? Yes, that's correct. > The former should be caught by ‘tests/grafts.scm’ but we could also > check the closure of real-world systems, for instance, to make sure it > doesn’t refer to ungrafted things. As I mention above, I've already done so for my own GNOME-based Guix system. > Do you know how this affects performance? I have not yet measured it, but subjectively, I do not notice any obvious difference in speed. I expect that the main performance impact is that large blocks of NULs will be processed more slowly by the current draft version of the new grafting code. That's because NULs can now be part of a nix hash, and therefore the new grafting code can only advance 1 byte position when seeing a NUL, whereas previously it would skip ahead 33 positions in that case. If desired, the handling of NULs could be made more efficient, at the cost of a bit more complexity. When seeing a NUL, we could check the adjacent bytes to see if this is part of a nix-base32 character in UTF-16 or UTF-32 encoding. If not, we could skip ahead. > Perhaps add short docstrings for clarity. Done. >> +(for-each >> + (lambda (char-size1) >> + (for-each >> + (lambda (char-size2) >> + (for-each >> + (lambda (gap) >> + (for-each >> + (lambda (offset) >> + (test-equal (format #f "replace-store-references, char-sizes ~a ~a, gap ~s, offset ~a" >> + char-size1 char-size2 gap offset) [...] >> + ;; offsets to test >> + (map (lambda (i) (- buffer-size (* 40 char-size1) i)) >> + (iota 30)))) >> + ;; gaps >> + '("" "-" " " "a"))) >> + ;; char-size2 values to test >> + '(1 2))) >> + ;; char-size1 values to test >> + '(1 2 4)) > > For clarity, perhaps you can define a top-level procedure for the test > and call it from ‘for-each’. Good idea. I'd like to optimize the tests a bit before pushing this. They take a fairly long time to run, and lead to a huge 1.5G grafts.log file. It might be sufficient to avoid 'test-equal' here. > Modulo these very minor issues, it looks like it’s ready to go! Sounds good. Thanks for the review! Below, I've attached my current revision of this draft patch, which incorporates your suggestions regarding the main code. What remains is to improve the tests. Regards, Mark