* eww doesn't decode %AA%BB%CC URL names
From: Eli Zaretskii @ 2015-08-18 14:26 UTC
To: emacs-devel

When I visit a URL in eww and press 'd' on a link like this:

  https://ru.wikipedia.org/wiki/%D0%A1%D0%B5%D1%80%D0%B4%D1%86%D0%B5

Emacs creates a file whose name is made of those hex-encoded
characters as you see them in this mail.  Shouldn't we decode them?
Firefox does.

* Re: eww doesn't decode %AA%BB%CC URL names
From: Lars Ingebrigtsen @ 2015-12-24 17:40 UTC
To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> When I visit a URL in eww and press 'd' on a link like this:
>
>   https://ru.wikipedia.org/wiki/%D0%A1%D0%B5%D1%80%D0%B4%D1%86%D0%B5
>
> Emacs creates a file whose name is made of those hex-encoded
> characters as you see them in this mail.  Shouldn't we decode them?
> Firefox does.

We should.  Let's see...

(url-unhex-string "%D0%A1%D0%B5%D1%80%D0%B4%D1%86%D0%B5")
=> "\320\241\320\265\321\200\320\264\321\206\320\265"

Uhm...

(decode-coding-string (url-unhex-string
		       "%D0%A1%D0%B5%D1%80%D0%B4%D1%86%D0%B5")
		      'utf-8)
=> "Сердце"

Right.  What charset do we choose?  I guess using the charset of the
document we're in doesn't make much sense (because it's linking to
something off-site which may be in a different charset)...

Perhaps just run a `detect-coding-string' on it?

Or!  We've just downloaded the file, after all, and the charset of the
file itself may tell us what the charset of the name is...  On the
other hand, probably not.  (For instance, a PDF with a Cyrillic name
would probably still just be reported by the web server as being
binary.)

`detect-coding-string' it is, I guess, unless anybody has a better
idea?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

* Re: eww doesn't decode %AA%BB%CC URL names
From: Yuri Khan @ 2015-12-24 18:07 UTC
To: Lars Ingebrigtsen; +Cc: Eli Zaretskii, Emacs developers

On Thu, Dec 24, 2015 at 11:40 PM, Lars Ingebrigtsen <larsi@gnus.org> wrote:

> (decode-coding-string (url-unhex-string
>                        "%D0%A1%D0%B5%D1%80%D0%B4%D1%86%D0%B5")
>                       'utf-8)
> => "Сердце"
>
> Right.  What charset do we choose?  I guess using the charset of the
> document we're in doesn't make much sense (because it's linking to
> something off-site which may be in a different charset)...

By RFC 3986, percent-encoded URLs SHOULD use UTF-8 encoding.  If the
URL does not decode into a valid UTF-8 string, it is ok to fall back
to a heuristic, though.

* Re: eww doesn't decode %AA%BB%CC URL names
From: Eli Zaretskii @ 2015-12-24 19:03 UTC
To: Yuri Khan; +Cc: larsi, emacs-devel

> From: Yuri Khan <yuri.v.khan@gmail.com>
> Date: Fri, 25 Dec 2015 00:07:40 +0600
> Cc: Eli Zaretskii <eliz@gnu.org>, Emacs developers <emacs-devel@gnu.org>
>
> On Thu, Dec 24, 2015 at 11:40 PM, Lars Ingebrigtsen <larsi@gnus.org> wrote:
> > (decode-coding-string (url-unhex-string
> >                        "%D0%A1%D0%B5%D1%80%D0%B4%D1%86%D0%B5")
> >                       'utf-8)
> > => "Сердце"
> >
> > Right.  What charset do we choose?  I guess using the charset of the
> > document we're in doesn't make much sense (because it's linking to
> > something off-site which may be in a different charset)...
>
> By RFC 3986, percent-encoded URLs SHOULD use UTF-8 encoding.  If the
> URL does not decode into a valid UTF-8 string, it is ok to fall back
> to a heuristic, though.

Yes, I think this is a good policy, thanks.  Bonus points for
implementing the command in a way that it will be able to accept user
choice of the encoding via "C-x RET c", like file operations do.

* Re: eww doesn't decode %AA%BB%CC URL names
From: Lars Ingebrigtsen @ 2015-12-24 19:18 UTC
To: Eli Zaretskii; +Cc: emacs-devel, Yuri Khan

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Yuri Khan <yuri.v.khan@gmail.com>
>> Date: Fri, 25 Dec 2015 00:07:40 +0600
>> Cc: Eli Zaretskii <eliz@gnu.org>, Emacs developers <emacs-devel@gnu.org>
>>
>> On Thu, Dec 24, 2015 at 11:40 PM, Lars Ingebrigtsen <larsi@gnus.org> wrote:
>> > (decode-coding-string (url-unhex-string
>> >                        "%D0%A1%D0%B5%D1%80%D0%B4%D1%86%D0%B5")
>> >                       'utf-8)
>> > => "Сердце"
>> >
>> > Right.  What charset do we choose?  I guess using the charset of the
>> > document we're in doesn't make much sense (because it's linking to
>> > something off-site which may be in a different charset)...
>>
>> By RFC 3986, percent-encoded URLs SHOULD use UTF-8 encoding.  If the
>> URL does not decode into a valid UTF-8 string, it is ok to fall back
>> to a heuristic, though.

That's basically just (car (decode-coding-string ...)), though, since
it'll return utf-8 first if that's a possible charset, won't it?

> Yes, I think this is a good policy, thanks.  Bonus points for
> implementing the command in a way that it will be able to accept user
> choice of the encoding via "C-x RET c", like file operations do.

Let's see... that function basically just binds
`coding-system-for-{read,write}' and then calls the command
interactively?

Do the commands just look at those variables, and if they're bound,
then they use that coding system instead?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

* Re: eww doesn't decode %AA%BB%CC URL names
From: Eli Zaretskii @ 2015-12-24 19:34 UTC
To: Lars Ingebrigtsen; +Cc: emacs-devel, yuri.v.khan

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Yuri Khan <yuri.v.khan@gmail.com>, emacs-devel@gnu.org
> Date: Thu, 24 Dec 2015 20:18:47 +0100
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> >> From: Yuri Khan <yuri.v.khan@gmail.com>
> >> Date: Fri, 25 Dec 2015 00:07:40 +0600
> >> Cc: Eli Zaretskii <eliz@gnu.org>, Emacs developers <emacs-devel@gnu.org>
> >>
> >> On Thu, Dec 24, 2015 at 11:40 PM, Lars Ingebrigtsen <larsi@gnus.org> wrote:
> >> > (decode-coding-string (url-unhex-string
> >> >                        "%D0%A1%D0%B5%D1%80%D0%B4%D1%86%D0%B5")
> >> >                       'utf-8)
> >> > => "Сердце"
> >> >
> >> > Right.  What charset do we choose?  I guess using the charset of the
> >> > document we're in doesn't make much sense (because it's linking to
> >> > something off-site which may be in a different charset)...
> >>
> >> By RFC 3986, percent-encoded URLs SHOULD use UTF-8 encoding.  If the
> >> URL does not decode into a valid UTF-8 string, it is ok to fall back
> >> to a heuristic, though.
>
> That's basically just (car (decode-coding-string ...))

I believe you meant detect-coding-string.

> though, since it'll return utf-8 first if that's a possible charset,
> won't it?

You cannot rely on it returning UTF-8, that depends on coding
priorities (that are subject to customizations) and other things.

I think you should use UTF-8 literally as the first choice.

> > Yes, I think this is a good policy, thanks.  Bonus points for
> > implementing the command in a way that it will be able to accept user
> > choice of the encoding via "C-x RET c", like file operations do.
>
> Let's see... that function basically just binds
> `coding-system-for-{read,write}' and then calls the command
> interactively?

Yes.

> Do the commands just look at those variables, and if they're bound,
> then they use that coding system instead?

Yes, they use these in preference to everything else, something like
this:

  (let ((coding (or coding-system-for-read
                    document-encoding
                    locale-coding-system
                    ...)))
    (decode-coding-string ... coding))

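The precedence described in the message above could be sketched roughly as follows. This is only an illustration, not eww's code: `my-decode-url-name' is a hypothetical helper, and falling back to `utf-8' when the user has chosen nothing is an assumption taken from the RFC 3986 advice earlier in the thread.

```elisp
;; A minimal sketch; `my-decode-url-name' is a hypothetical name.
(require 'url-util)   ; provides `url-unhex-string'

(defun my-decode-url-name (name)
  "Decode percent-encoded NAME, preferring a user-chosen coding system.
`coding-system-for-read' is non-nil while a command invoked through
C-x RET c is running; otherwise fall back to UTF-8 (an assumption)."
  (decode-coding-string (url-unhex-string name)
                        (or coding-system-for-read 'utf-8)))
```

With nothing bound, the Wikipedia name from the first message decodes as UTF-8; binding `coding-system-for-read' around the call (which is what C-x RET c does for the next command) overrides that choice.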
* Re: eww doesn't decode %AA%BB%CC URL names
From: Lars Ingebrigtsen @ 2015-12-24 19:55 UTC
To: Eli Zaretskii; +Cc: emacs-devel, yuri.v.khan

Eli Zaretskii <eliz@gnu.org> writes:

>> That's basically just (car (decode-coding-string ...))
>
> I believe you meant detect-coding-string.

Yup.  :-)

>> though, since it'll return utf-8 first if that's a possible charset,
>> won't it?
>
> You cannot rely on it returning UTF-8, that depends on coding
> priorities (that are subject to customizations) and other things.
>
> I think you should use UTF-8 literally as the first choice.

Right.  How do I check whether the bytes are a valid utf-8 sequence,
though?  I thought I remembered something called
`valid-something-something-p', but I can't find it now...

> Yes, they use these in preference to everything else, something like
> this:
>
>   (let ((coding (or coding-system-for-read
>                     document-encoding
>                     locale-coding-system
>                     ...)))
>     (decode-coding-string ... coding))

Okidoke.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

* Re: eww doesn't decode %AA%BB%CC URL names
From: Eli Zaretskii @ 2015-12-24 20:40 UTC
To: Lars Ingebrigtsen; +Cc: emacs-devel, yuri.v.khan

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: yuri.v.khan@gmail.com, emacs-devel@gnu.org
> Date: Thu, 24 Dec 2015 20:55:13 +0100
>
> > I think you should use UTF-8 literally as the first choice.
>
> Right.  How do I check whether the bytes are a valid utf-8 sequence,
> though?  I thought I remembered something called
> `valid-something-something-p', but I can't find it now...

I think you can run find-charset-string on the decoded string, and if
the result is just (unicode), you can be sure.

* Re: eww doesn't decode %AA%BB%CC URL names
From: Lars Ingebrigtsen @ 2015-12-24 20:49 UTC
To: Eli Zaretskii; +Cc: yuri.v.khan, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> I think you can run find-charset-string on the decoded string, and if
> the result is just (unicode), you can be sure.

Yeah, that should do the trick.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

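A rough sketch of the "UTF-8 first, heuristic second" policy, using `find-charset-string' on the decoded text as suggested above. Note one deliberate substitution: rather than comparing the result against exactly `(unicode)', this variant looks for the `eight-bit' charset, on the assumption that bytes which are not valid UTF-8 survive decoding as raw eight-bit characters while the exact charset list for valid text may vary with charset priorities. Both function names are hypothetical.

```elisp
;; A minimal sketch; both helper names are hypothetical.
(require 'url-util)   ; provides `url-unhex-string'

(defun my-url-name-utf-8-p (bytes)
  "Return non-nil if the unibyte string BYTES decodes cleanly as UTF-8.
Assumption: invalid bytes come out of the decoder as raw bytes, which
`find-charset-string' reports under the `eight-bit' charset."
  (not (memq 'eight-bit
             (find-charset-string
              (decode-coding-string bytes 'utf-8)))))

(defun my-guess-url-name-coding (bytes)
  "Pick a coding system for BYTES: literal UTF-8 first, then a guess."
  (if (my-url-name-utf-8-p bytes)
      'utf-8
    ;; Fall back to the heuristic only when UTF-8 does not fit.
    (car (detect-coding-string bytes))))
```

Trying `utf-8' explicitly before calling `detect-coding-string' keeps the result independent of the user's coding-system priorities, which is the concern raised earlier in the thread.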
* Re: eww doesn't decode %AA%BB%CC URL names
From: Lars Ingebrigtsen @ 2015-12-24 20:43 UTC
To: emacs-devel

Hm!  I have an unexpected complication here.

If I eval the following:

(write-region (point) (point-max) "/home/larsi/Downloads/Сердце")

Then I get a file name that consists of five spaces.  That seems awfully
weird.  I may have configured something somewhere that says that Emacs
should create file names in latin-1...  Hm...

(set-language-environment "Latin-1")

Which I would guess isn't uncommon.  Making an all-blank file name here
is somewhat unacceptable, I think.  So how should this be handled?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

* Re: eww doesn't decode %AA%BB%CC URL names
From: Eli Zaretskii @ 2015-12-24 21:00 UTC
To: Lars Ingebrigtsen; +Cc: emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Thu, 24 Dec 2015 21:43:17 +0100
>
> If I eval the following:
>
> (write-region (point) (point-max) "/home/larsi/Downloads/Сердце")
>
> Then I get a file name that consists of five spaces.  That seems awfully
> weird.  I may have configured something somewhere that says that Emacs
> should create file names in latin-1...  Hm...
>
> (set-language-environment "Latin-1")
>
> Which I would guess isn't uncommon.

I hope not.  Those who do that completely screw up their file-name
encoding stuff.

> Making an all-blank file name here is somewhat unacceptable, I
> think.  So how should this be handled?

Not sure which problem you are trying to solve.  But my crystal ball
says you need to

  (let ((file-name-coding-system default-file))
    (write-region (point) (point-max) "/home/larsi/Downloads/Сердце"))

because most GNU/Linux systems use UTF-8 codeset by default.

* Re: eww doesn't decode %AA%BB%CC URL names
From: Lars Ingebrigtsen @ 2015-12-24 21:04 UTC
To: emacs-devel

After spelunking down into `set-language-environment', it seems like
it's the setting of `default-file-name-coding-system' that's the problem
here:

(encode-coding-string
 (decode-coding-string
  (url-unhex-string "%D0%A1%D0%B5%D1%80%D0%B4%D1%86%D0%B5")
  'utf-8)
 default-file-name-coding-system)
=> " "

So I guess the file name should remain those percentages if it can't be
encoded using that...  but how do I check that, then?  :-)

Charsets are hard!

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

* Re: eww doesn't decode %AA%BB%CC URL names
From: Eli Zaretskii @ 2015-12-24 21:11 UTC
To: Lars Ingebrigtsen; +Cc: emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Thu, 24 Dec 2015 22:04:08 +0100
>
> After spelunking down into `set-language-environment', it seems like
> it's the setting of `default-file-name-coding-system' that's the problem
> here:
>
> (encode-coding-string
>  (decode-coding-string
>   (url-unhex-string "%D0%A1%D0%B5%D1%80%D0%B4%D1%86%D0%B5")
>   'utf-8)
>  default-file-name-coding-system)
> => " "
>
> So I guess the file name should remain those percentages if it can't be
> encoded using that...  but how do I check that, then?  :-)

If you want to check that STRING can be encoded in CODING, do this:

  (member CODING (find-coding-systems-string STRING))

and see if the result is non-nil.

For file names, you should do this test with file-name-coding-system,
if that's non-nil, else with default-file-name-coding-system.

> Charsets are hard!

Subtle.

* Re: eww doesn't decode %AA%BB%CC URL names
From: Eli Zaretskii @ 2015-12-24 21:16 UTC
To: larsi; +Cc: emacs-devel

> Date: Thu, 24 Dec 2015 23:11:20 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
>
>   (member CODING (find-coding-systems-string STRING))
     ^^^^^^
Sorry, memq, of course.

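The encodability test with the `memq' correction applied could look like the sketch below. The function name is hypothetical, and the special case for `(undecided)' is included because `find-coding-systems-string' returns exactly that list for pure-ASCII text, meaning any coding system will do.

```elisp
;; A minimal sketch; `my-encodable-p' is a hypothetical name.
(defun my-encodable-p (string coding)
  "Return non-nil if STRING can be encoded safely in CODING.
CODING is expected to be a base coding system such as `utf-8'."
  (let ((candidates (find-coding-systems-string string)))
    (or (equal candidates '(undecided))   ; pure ASCII: anything works
        ;; Coding systems are symbols, so `memq' is sufficient.
        (memq coding candidates))))
```

For example, `(my-encodable-p "Сердце" 'utf-8)' should be non-nil, while the same call with `iso-latin-1' should return nil, which matches the Latin-1 file-name problem described above.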
* Re: eww doesn't decode %AA%BB%CC URL names
From: Lars Ingebrigtsen @ 2015-12-24 21:17 UTC
To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> If you want to check that STRING can be encoded in CODING, do this:
>
>   (member CODING (find-coding-systems-string STRING))
>
> and see if the result is non-nil.

Hm:

(find-coding-systems-string "a")
=> (undecided)

(find-coding-systems-string "Сердце")
=> (chinese-iso-8bit japanese-shift-jis iso-2022-jp utf-8
    korean-iso-8bit euc-jis-2004 japanese-iso-8bit iso-2022-jp-2004
    cp855 windows-1251 koi8-t koi8-u cp866 koi8-u cyrillic-koi8
    cyrillic-iso-8bit chinese-gb18030 chinese-gbk chinese-big5-hkscs
    chinese-hz utf-7 iso-2022-kr iso-2022-jp-2 iso-2022-cn-ext
    iso-2022-cn utf-16 utf-16be-with-signature
    utf-16le-with-signature utf-16be utf-16le
    compound-text-with-extensions compound-text iso-2022-7bit
    utf-8-auto utf-8-with-signature emacs-mule raw-text
    iso-2022-8bit-ss2 iso-2022-7bit-lock eucjp-ms korean-cp949
    japanese-shift-jis-2004 japanese-iso-7bit-1978-irv
    japanese-cp932 pt154 mik cp1125 cyrillic-alternativnyj
    utf-7-imap utf-8-emacs prefer-utf-8 no-conversion
    ctext-no-compositions iso-2022-7bit-lock-ss2 iso-2022-7bit-ss2)

Wowza.

Ok, I think I should now be able to create the function in question.
Thanks for all the help.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

* Re: eww doesn't decode %AA%BB%CC URL names
From: Lars Ingebrigtsen @ 2015-12-24 21:28 UTC
To: Eli Zaretskii; +Cc: emacs-devel

Lars Ingebrigtsen <larsi@gnus.org> writes:

> (find-coding-systems-string "Сердце")
> => (chinese-iso-8bit japanese-shift-jis iso-2022-jp utf-8
>     korean-iso-8bit euc-jis-2004 japanese-iso-8bit iso-2022-jp-2004
>     cp855 windows-1251 koi8-t koi8-u cp866 koi8-u cyrillic-koi8
>     cyrillic-iso-8bit chinese-gb18030 chinese-gbk chinese-big5-hkscs
>     chinese-hz utf-7 iso-2022-kr iso-2022-jp-2 iso-2022-cn-ext
>     iso-2022-cn utf-16 utf-16be-with-signature
>     utf-16le-with-signature utf-16be utf-16le
>     compound-text-with-extensions compound-text iso-2022-7bit
>     utf-8-auto utf-8-with-signature emacs-mule raw-text
>     iso-2022-8bit-ss2 iso-2022-7bit-lock eucjp-ms korean-cp949
>     japanese-shift-jis-2004 japanese-iso-7bit-1978-irv
>     japanese-cp932 pt154 mik cp1125 cyrillic-alternativnyj
>     utf-7-imap utf-8-emacs prefer-utf-8 no-conversion
>     ctext-no-compositions iso-2022-7bit-lock-ss2 iso-2022-7bit-ss2)

Darn!  If I start emacs -Q, I get

default-file-name-coding-system
=> utf-8-unix

And that isn't on that monstrous list up there...  Is that a bug in
`find-coding-systems-string'?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

* Re: eww doesn't decode %AA%BB%CC URL names
From: Eli Zaretskii @ 2015-12-25 7:24 UTC
To: Lars Ingebrigtsen; +Cc: emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: emacs-devel@gnu.org
> Date: Thu, 24 Dec 2015 22:28:48 +0100
>
> Darn!  If I start emacs -Q, I get
>
> default-file-name-coding-system
> => utf-8-unix
>
> And that isn't on that monstrous list up there...  Is that a bug in
> `find-coding-systems-string'?

No, it's another "issue" when dealing with coding systems.  To avoid
this, use

  (coding-system-base default-file-name-coding-system)

instead of just default-file-name-coding-system, and the same with
file-name-coding-system.

(The "-unix" suffix controls conversion of end-of-line, which is not
relevant for encoding the characters in the file name.)

* Re: eww doesn't decode %AA%BB%CC URL names
From: Lars Ingebrigtsen @ 2015-12-25 7:32 UTC
To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> No, it's another "issue" when dealing with coding systems.  To avoid
> this, use
>
>   (coding-system-base default-file-name-coding-system)
>
> instead of just default-file-name-coding-system, and the same with
> file-name-coding-system.

That did the trick.  emacs -Q now saves that Russian-looking file name
using utf-8 into the Download directory.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

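Putting the pieces from this subthread together, the file-name side might be sketched as below. This is an illustration under assumptions, not the function that was committed: the helper name is hypothetical, the name is decoded as UTF-8 unconditionally to keep the example short, and returning the original percent-encoded name when encoding is impossible follows the idea floated earlier that the name "should remain those percentages".

```elisp
;; A minimal sketch; `my-url-name-to-file-name' is a hypothetical name.
(require 'url-util)   ; provides `url-unhex-string'

(defun my-url-name-to-file-name (name)
  "Return a decoded form of percent-encoded NAME usable as a file name.
If the decoded text cannot be encoded in the file-name coding system,
return NAME unchanged, still percent-encoded."
  (let* ((decoded (decode-coding-string (url-unhex-string name) 'utf-8))
         (candidates (find-coding-systems-string decoded))
         ;; Drop the end-of-line variant, e.g. `utf-8-unix' -> `utf-8',
         ;; so the symbol can be found in CANDIDATES.
         (target (coding-system-base
                  (or file-name-coding-system
                      default-file-name-coding-system))))
    (if (or (equal candidates '(undecided))   ; pure ASCII
            (memq target candidates))
        decoded
      name)))
```

With a UTF-8 file-name coding system this should yield the readable Cyrillic name; under a Latin-1 setup it should hand back the percent-encoded string instead of producing a blank file name.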
* Re: eww doesn't decode %AA%BB%CC URL names
From: Eli Zaretskii @ 2015-12-25 7:17 UTC
To: Lars Ingebrigtsen; +Cc: emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: emacs-devel@gnu.org
> Date: Thu, 24 Dec 2015 22:17:10 +0100
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > If you want to check that STRING can be encoded in CODING, do this:
> >
> >   (member CODING (find-coding-systems-string STRING))
> >
> > and see if the result is non-nil.
>
> Hm:
>
> (find-coding-systems-string "a")
> => (undecided)

This is normal for pure ASCII.  If the return value is just that, then
CODING, any CODING, can do the job.

> (find-coding-systems-string "Сердце")
> => (chinese-iso-8bit japanese-shift-jis iso-2022-jp utf-8
>     korean-iso-8bit euc-jis-2004 japanese-iso-8bit iso-2022-jp-2004
>     cp855 windows-1251 koi8-t koi8-u cp866 koi8-u cyrillic-koi8
>     cyrillic-iso-8bit chinese-gb18030 chinese-gbk chinese-big5-hkscs
>     chinese-hz utf-7 iso-2022-kr iso-2022-jp-2 iso-2022-cn-ext
>     iso-2022-cn utf-16 utf-16be-with-signature
>     utf-16le-with-signature utf-16be utf-16le
>     compound-text-with-extensions compound-text iso-2022-7bit
>     utf-8-auto utf-8-with-signature emacs-mule raw-text
>     iso-2022-8bit-ss2 iso-2022-7bit-lock eucjp-ms korean-cp949
>     japanese-shift-jis-2004 japanese-iso-7bit-1978-irv
>     japanese-cp932 pt154 mik cp1125 cyrillic-alternativnyj
>     utf-7-imap utf-8-emacs prefer-utf-8 no-conversion
>     ctext-no-compositions iso-2022-7bit-lock-ss2 iso-2022-7bit-ss2)
>
> Wowza.

Yeah.

> Ok, I think I should now be able to create the function in question.
> Thanks for all the help.  :-)

You are welcome.