Hi,

> 1. I tried to fine-tune the SQL a bit:
>    - Open/close the database only once for the whole indexing.
>    - Use "insert" instead of "insert or replace".
>    - Use numeric ID as key instead of path.
>
> Result: Still around 15-20 minutes to build. Switching to numeric
> indices shrank the database by half.

sqlite insert statements can be very fast. sqlite.org claims 50000 or
more insert statements per second. But in order to achieve that speed
all insert statements have to be grouped together in a single
transaction. See https://www.sqlite.org/faq.html#q19

> A string-contains filter takes less than 1 second.

Guile's string-contains function uses a naive O(nk) implementation,
where 'n' is the length of string s1 and 'k' is the length of string
s2. If it was implemented using the Knuth-Morris-Pratt algorithm, it
could cost only O(n+k). So, there is some scope for improvement here.
In fact, a comment on line 2007 of libguile/srfi-13.c in the guile
source tree makes this very point.

> I need to measure the time SQL takes for a regexp match.

sqlite, by default, does not come with regexp support. You might have
to load some external library. See
https://www.sqlite.org/lang_expr.html#the_like_glob_regexp_and_match_operators

--8<---------------cut here---------------start------------->8---
The REGEXP operator is a special syntax for the regexp() user
function. No regexp() user function is defined by default and so use
of the REGEXP operator will normally result in an error message. If an
application-defined SQL function named "regexp" is added at run-time,
then the "X REGEXP Y" operator will be implemented as a call to
"regexp(Y,X)".
--8<---------------cut here---------------end--------------->8---

Regards,
Arun
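
P.S. In case it helps, here is a rough sketch of the two sqlite points
above: grouping all inserts into a single transaction, and registering
an application-defined regexp() function so that the REGEXP operator
works. It is written in Python rather than Guile, only because
Python's standard sqlite3 module keeps the example self-contained. The
table name "files", its columns, and the sample paths are made up for
illustration and are not taken from your indexer.

```python
import re
import sqlite3


def regexp(pattern, value):
    """Backs the REGEXP operator.

    SQLite evaluates "X REGEXP Y" as regexp(Y, X), so the pattern
    arrives as the first argument and the tested value as the second.
    """
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0


def build_index(conn, paths):
    """Insert every path inside one transaction (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files (id INTEGER PRIMARY KEY, path TEXT)"
    )
    # "with conn" commits once when the block exits normally and rolls
    # back on an exception, so all the inserts share a single transaction.
    with conn:
        conn.executemany(
            "INSERT INTO files (path) VALUES (?)",
            ((p,) for p in paths),
        )


# An in-memory database keeps the sketch free of side effects.
conn = sqlite3.connect(":memory:")

# Without this call, any query using REGEXP fails with an error.
conn.create_function("regexp", 2, regexp)

# Made-up sample data.
build_index(conn, ["a.scm", "b.txt", "c.scm"])

matches = [
    row[0]
    for row in conn.execute(
        "SELECT path FROM files WHERE path REGEXP ? ORDER BY id",
        (r"\.scm$",),
    )
]
```

The same idea should carry over to whatever sqlite binding you are
using from Guile: issue BEGIN once before the loop of inserts and
COMMIT once after it, instead of letting every insert run in its own
transaction.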