Hi,

Ludovic Courtès writes:

> Hi Timothy,
>
> Timothy Sample skribis:
>
>> A quick reading of RFC 3986 suggests that the host part of a URI can be
>> an IP address (version 4 or 6) or a registered name.  It gives the
>> following rules for registered names:
>>
>>     reg-name    = *( unreserved / pct-encoded / sub-delims )
>>     unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
>>     pct-encoded = "%" HEXDIG HEXDIG
>>     sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
>>                 / "*" / "+" / "," / ";" / "="
>>
>> Here, “ALPHA”, “DIGIT”, and “HEXDIG” are specified in RFC 2234, and are
>> just the ASCII ranges you might expect (except for that “HEXDIG” only
>> allows uppercase letters).
>
> Do you think you could turn that into a patch for Guile?  I’d happily
> apply it.  :-)
>
> It looks like both [[:alnum:]] & co. and ranges would be
> locale-dependent, so my understanding is that we’ll have to list all the
> characters explicitly, right?

Here’s a patch for Guile that uses explicit lists of characters in the
‘(web uri)’ module instead of character ranges.  It includes two tests
that are pretty verbose, but seem to do the trick.

I have a bit more background on the problem, mostly coming from a Glibc
bug report.  It turns out that the problem is well known upstream, and
avoiding character ranges is the recommended approach for now.  Some
other GNU tools have adopted what is being called the “Rational Range
Interpretation”.  AIUI, this means they use the underlying encoding
numbers for ranges (I checked the source, but I’m only mostly sure I
read it right).  It looks like the Glibc folks are unsure how to proceed
on this (but are maybe slightly leaning towards the “rational”
approach).

It’s all a pretty big mess, really.  I was hoping there would be some
obvious thing that would fix the problem more generally.  Short of
pulling in the Gnulib regex code or writing something in Scheme, it
looks like Guile is stuck where it is now.
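For anyone who wants to see the shape of the explicit-list approach without opening the patch, here is a rough Python sketch.  It is not the Guile code, only an illustration of the reg-name rule quoted above built from spelled-out character lists.  The names (char_class, valid_reg_name, and so on) are made up for this example, and accepting lowercase hex digits in percent-escapes is an assumption of the sketch.  Python's own regex ranges are not locale-dependent for strings, so this does not reproduce the bug; it only shows what the workaround looks like.

```python
import re

# Every character is listed explicitly, so the pattern never relies on
# how a range such as a-z is interpreted.
ALPHA = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGIT = "0123456789"
# Assumption of this sketch: lowercase hex digits are accepted too.
HEXDIG = DIGIT + "ABCDEF" + "abcdef"
UNRESERVED = ALPHA + DIGIT + "-._~"
SUB_DELIMS = "!$&'()*+,;="


def char_class(chars):
    """Build a bracket expression from an explicit list of characters."""
    return "[" + "".join(re.escape(c) for c in chars) + "]"


# reg-name = *( unreserved / pct-encoded / sub-delims )
REG_NAME = re.compile(
    "(?:"
    + char_class(UNRESERVED + SUB_DELIMS)
    + "|%" + char_class(HEXDIG) + "{2}"
    + ")*"
)


def valid_reg_name(host):
    """Return True if the whole string matches the reg-name rule."""
    return REG_NAME.fullmatch(host) is not None
```

With this, valid_reg_name("example.com") holds, while a host containing a space or a non-ASCII letter is rejected because those characters are simply not in any of the lists.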

I’m unsure if the changes are considered “trivial” from a copyright
perspective.  It’s pretty close, but I think programmers tend to
underestimate here.  I’ve started the FSF copyright assignment process
either way, since this is likely not my last Guile patch.  :)

-- 
Tim