Thien-Thi Nguyen wrote:
> YAMAMOTO Mitsuharu writes:
>
>> This change breaks the following case:
>>
>>   (concat
>>    "file://localhost"
>>    (mapconcat 'url-hexify-string
>>               (split-string
>>                (encode-coding-string "/SOME/NONASCII/FILE/NAME"
>>                                      (or file-name-coding-system
>>                                          default-file-name-coding-system))
>>                "/")
>>               "/"))
>>
>> Maybe suppress encoding with UTF-8 for unibyte strings?
>
> if the result of this expression is to be used as a URI, then that means
> the change exposes improper use of `url-hexify-string'; according to the
> RFC (as i understand it) URIs require utf-8.

There is a recent RFC that mandates utf-8 encoding for URIs, but
previous RFCs either said nothing, or specified Latin-1, so there are
many implementations that do not use utf-8. We need some way to
interoperate with such implementations.

> if we want `url-hexify-string' to handle "URI-like" transformations
> (i.e., not strictly produce URI-conformant results), we can add an
> optional arg MAKE-UNIBYTE that specifies a function to do the conversion
> to unibyte. in most cases, i guess that would be `string-as-unibyte',
> but i don't know for sure.

Alternatively, we could add an optional arg ENCODING, for specifying an
encoding other than utf-8. That might be a cleaner interface than
requiring the user to make the string unibyte before passing it to
url-hexify-string.
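
To make the ENCODING idea concrete, here is a rough sketch of how it might
look. This is not the code in url-util.el: the function name
`my-url-hexify-string', the exact set of unreserved characters, and the
choice to pass unibyte strings through untouched are all assumptions made
for illustration.

```elisp
;; Hypothetical sketch only; `my-url-hexify-string' and its ENCODING
;; argument are illustrative names, not an existing API.
(defun my-url-hexify-string (string &optional encoding)
  "Percent-encode STRING, encoding it with ENCODING first (default utf-8).
A unibyte STRING is assumed to be already encoded and is used as-is."
  (mapconcat
   (lambda (byte)
     (if (or (and (>= byte ?a) (<= byte ?z))
             (and (>= byte ?A) (<= byte ?Z))
             (and (>= byte ?0) (<= byte ?9))
             (memq byte '(?- ?_ ?. ?~)))
         ;; Unreserved ASCII characters are copied through unchanged.
         (char-to-string byte)
       ;; Every other byte becomes a %XX escape.
       (format "%%%02X" byte)))
   ;; Only multibyte strings are encoded; unibyte input is left alone.
   (if (multibyte-string-p string)
       (encode-coding-string string (or encoding 'utf-8))
     string)
   ""))

(my-url-hexify-string "\u00e9")            ; => "%C3%A9"
(my-url-hexify-string "\u00e9" 'latin-1)   ; => "%E9"
```

With this shape, a caller who has already run `encode-coding-string' (as in
the file-name example quoted above) gets the bytes hexified unchanged, while
a caller talking to a Latin-1 implementation can just pass the coding system.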