On 25 Sep 2020 at 01:54, Lars Ingebrigtsen wrote:

> I went ahead and checked in a new C-level function string-search, which
> should be an efficient way to search for strings in strings (using
> memmem, which Emacs has via Gnulib?), and this fixed these corner cases.

Thank you! Here are some proposed tweaks (diff attached):

1. Check the range of the START-POS argument so that we don't crash.
   The permitted range is [0..N], where N is (length HAYSTACK); thus we
   permit a start right after the last character but no further. We
   could also return nil in these cases, but I think an error is more
   useful.

2. Make the docs more precise about various things.

3. Slight simplification of the implementation logic to avoid testing
   the same conditions multiple times.

4. More tests, especially for edge cases. Can't have too many!

One test still fails:

  (string-search "ø" "\303\270")

which should return nil but currently matches. I think it's wrong to
convert the needle to unibyte (using Fstring_as_unibyte) in this case,
but I haven't decided what the best solution would be.

We should also consider these optimisations:

- If SCHARS(needle) > SCHARS(haystack), then no match is possible.
- If either needle or haystack is all-ASCII (all bytes in 0..127), then
  we can use memmem without conversion.
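To make the range check and the two optimisations concrete, here is a minimal standalone sketch in plain C. It is not the attached diff and not Emacs code: all_ascii, find_bytes, search_from and the SEARCH_* codes are hypothetical names, the strings are treated as raw bytes (so characters and bytes coincide), and a return code stands in for signalling an error. find_bytes is a portable stand-in for memmem, which is not part of ISO C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; the real function would return nil or
   signal an error instead.  */
enum { SEARCH_NO_MATCH = -1, SEARCH_BAD_START = -2 };

/* True if every byte of S is in 0..127.  Per the second optimisation,
   this is the test that would allow a raw byte search without
   converting either string.  */
static bool
all_ascii (const char *s, size_t n)
{
  const unsigned char *p = (const unsigned char *) s;
  for (size_t i = 0; i < n; i++)
    if (p[i] > 127)
      return false;
  return true;
}

/* Portable stand-in for memmem: byte offset of the first occurrence
   of NEEDLE in HAY, or -1 if there is none.  */
static ptrdiff_t
find_bytes (const char *hay, size_t hay_len,
            const char *needle, size_t needle_len)
{
  if (needle_len > hay_len)
    return -1;
  if (needle_len == 0)
    return 0;
  for (size_t i = 0; i + needle_len <= hay_len; i++)
    if (memcmp (hay + i, needle, needle_len) == 0)
      return (ptrdiff_t) i;
  return -1;
}

/* Search for NEEDLE in HAY starting at byte offset START.  */
static ptrdiff_t
search_from (const char *hay, size_t hay_len,
             const char *needle, size_t needle_len, ptrdiff_t start)
{
  assert (hay != NULL && needle != NULL);

  /* Permitted range is [0..N] with N = hay_len: a start right after
     the last byte is allowed, anything further is rejected.  */
  if (start < 0 || (size_t) start > hay_len)
    return SEARCH_BAD_START;

  /* A needle longer than what remains of the haystack cannot match.  */
  size_t remaining = hay_len - (size_t) start;
  if (needle_len > remaining)
    return SEARCH_NO_MATCH;

  ptrdiff_t off = find_bytes (hay + start, remaining, needle, needle_len);
  return off < 0 ? SEARCH_NO_MATCH : start + off;
}
```

With this sketch, an empty needle searched from START = N matches at N, while START = N + 1 is rejected. The offsets are byte offsets; mapping them back to character positions in a multibyte haystack is deliberately left out.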