* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Leo Liu @ 2016-06-12 2:22 UTC
To: 23750

I have been trying to debug an issue in TernJs¹ on and off for a few
months now and it seems the cause is some nasty bug in Emacs 25. Could
someone follow the steps detailed in
https://github.com/ternjs/tern/issues/719 to reproduce the issue?

I have verified that the bug is not in Tern but Emacs i.e. under some
circumstances emacs's URL package strips some chars in the request body
which, in this case, leads to unbalanced parentheses in the JSON doc.

Leo

Footnotes:
¹ https://github.com/ternjs/tern/issues/719
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Dmitry Gutov @ 2016-06-13 15:02 UTC
To: Leo Liu, 23750; +Cc: Stefan Monnier

On 06/12/2016 05:22 AM, Leo Liu wrote:

> ¹ https://github.com/ternjs/tern/issues/719

Investigation shows that the problem occurs when url-http-data is
multibyte and (length url-http-data) differs from
(length (string-as-unibyte url-http-data)), because we send a wrong
value in Content-length.

Changing url-http-create-request like this will make the problem more
obvious for anyone else that hits it, patch attached.

Stefan, did you have a particular situation in mind where this might be
bad, when you wrote the FIXME?

[-- Attachment #2: url-http-unibyte.diff --]

diff --git a/lisp/url/url-http.el b/lisp/url/url-http.el
index 5832e92..f7ec640 100644
--- a/lisp/url/url-http.el
+++ b/lisp/url/url-http.el
@@ -278,14 +278,10 @@ url-http-create-request
     ;; We used to concat directly, but if one of the strings happens
     ;; to being multibyte (even if it only contains pure ASCII) then
     ;; every string gets converted with `string-MAKE-multibyte' which
-    ;; turns the 127-255 codes into things like latin-1 accented chars
-    ;; (it would work right if it used `string-TO-multibyte' instead).
+    ;; turns the 127-255 codes into things like latin-1 accented chars.
     ;; So to avoid the problem we force every string to be unibyte.
     (mapconcat
-     ;; FIXME: Instead of `string-AS-unibyte' we'd want
-     ;; `string-to-unibyte', so as to properly signal an error if one
-     ;; of the strings contains a multibyte char.
-     'string-as-unibyte
+     'string-to-unibyte
      (delq nil
       (list
        ;; The request
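The mismatch described in the message above can be sketched in a few lines of Emacs Lisp. This is only an illustration, not code from url-http.el: `my/char-and-byte-lengths` is a hypothetical helper, and UTF-8 is assumed as the encoding used on the wire.

```elisp
;; Hypothetical helper (not part of Emacs): compare the character count
;; of a string with the number of bytes it occupies once UTF-8 encoded.
(defun my/char-and-byte-lengths (s)
  "Return (CHARS . BYTES) for S, where BYTES is the UTF-8 encoded size."
  (cons (length s)
        (length (encode-coding-string s 'utf-8))))

;; "caf\u00e9" is a multibyte string of four characters; the last one
;; needs two bytes in UTF-8, so the two counts disagree.
(my/char-and-byte-lengths "caf\u00e9")
```

A Content-length header computed from the first number while the body is sent as the second number of bytes is one byte short per such character, which fits the truncated request bodies reported in the Tern issue.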
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Stefan Monnier @ 2016-06-13 17:55 UTC
To: Dmitry Gutov; +Cc: 23750, Leo Liu

>> ¹ https://github.com/ternjs/tern/issues/719

> Investigation shows that the problem occurs when url-http-data is
> multibyte and (length url-http-data) differs from
> (length (string-as-unibyte url-http-data)), because we send a wrong
> value in Content-length.

> Changing url-http-create-request like this will make the problem more
> obvious for anyone else that hits it, patch attached.

> Stefan, did you have a particular situation in mind where this might
> be bad, when you wrote the FIXME?

No, nothing in particular. Just that `string-as-unibyte` is generally
synonymous with "the author is confused about how coding systems work",
aka "trouble".

Stefan
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Dmitry Gutov @ 2016-06-13 19:26 UTC
To: Stefan Monnier; +Cc: 23750, Leo Liu

On 06/13/2016 08:55 PM, Stefan Monnier wrote:

> No, nothing in particular. Just that `string-as-unibyte` is generally
> synonymous with "the author is confused about how coding systems
> work", aka "trouble".

You were also the author in this case. The same commit added both the
use of string-as-unibyte and the FIXME comment.
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Stefan Monnier @ 2016-06-14 0:30 UTC
To: Dmitry Gutov; +Cc: 23750, Leo Liu

>> No, nothing in particular. Just that `string-as-unibyte` is generally
>> synonymous with "the author is confused about how coding systems
>> work", aka "trouble".

> You were also the author in this case. The same commit added both the
> use of string-as-unibyte and the FIXME comment.

Can't remember why I did so. My best guess is that I tried to mimic
some earlier behavior.

Stefan
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Dmitry Gutov @ 2016-06-19 18:14 UTC
To: Stefan Monnier; +Cc: 23750, Leo Liu

On 06/14/2016 03:30 AM, Stefan Monnier wrote:

> Can't remember why I did so. My best guess is that I tried to mimic
> some earlier behavior.

OK, thanks anyway. I've pushed the patch to master as
2ede29575fa22eb7c265117d7511cff9fe02c606.

Eli, could we have it in emacs-25 as well? It's not critical, but it
should make the life of our users easier by flagging problems with the
usage of url-http earlier, in a more appropriate place, with an error,
rather than leaving it up to them to deduce why their HTTP server
truncates the request body.

While the truncation bug itself is quite old, it's been exacerbated in
Emacs 25 by my own work to make json.el faster: one side-effect is that
it doesn't \u-quote multibyte characters anymore, or at least not all
of them.

FWIW, I've been running with it applied to emacs-25 for the past week
with no problems.
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Eli Zaretskii @ 2016-06-19 18:25 UTC
To: Dmitry Gutov; +Cc: 23750, monnier, sdl.web

> Cc: 23750@debbugs.gnu.org, Leo Liu <sdl.web@gmail.com>,
>  Eli Zaretskii <eliz@gnu.org>
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Sun, 19 Jun 2016 21:14:55 +0300
>
> On 06/14/2016 03:30 AM, Stefan Monnier wrote:
>
> > Can't remember why I did so. My best guess is that I tried to mimic
> > some earlier behavior.
>
> OK, thanks anyway. I've pushed the patch to master as
> 2ede29575fa22eb7c265117d7511cff9fe02c606.
>
> Eli, could we have it in emacs-25 as well? It's not critical, but it
> should make the life of our users easier by flagging problems with the
> usage of url-http earlier, in a more appropriate place, with an error,
> rather than leaving it up to them to deduce why their HTTP server
> truncates the request body.

I'd need a very detailed description of the bug, and why this
particular solution was used. IME, neither string-to-unibyte nor
string-as-unibyte should ever be used in applications, their use is
more often than not a sign of some basic misunderstanding of text
encoding. For starters, how come 8-bit bytes wind up in that
function, and what do they stand for?
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: John Wiegley @ 2016-06-19 18:30 UTC
To: Eli Zaretskii; +Cc: 23750, Dmitry Gutov, sdl.web, monnier

>>>>> Eli Zaretskii <eliz@gnu.org> writes:

>> Eli, could we have it in emacs-25 as well? It's not critical, but it
>> should make the life of our users easier by flagging problems with
>> the usage of url-http earlier, in a more appropriate place, with an
>> error, rather than leaving it up to them to deduce why their HTTP
>> server truncates the request body.

Bear in mind that 25.2 can be released as soon after as we want it to.
If anything is "optional" at this point in time, it should be deferred.

We shouldn't try to race anything into the release, just because we
think users will then have to live with some minor inferior behavior
for a long time after. The description above certainly does not sound
like something that needs to happen for 25.1.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Dmitry Gutov @ 2016-06-19 18:45 UTC
To: John Wiegley, Eli Zaretskii; +Cc: 23750, sdl.web, monnier

On 06/19/2016 09:30 PM, John Wiegley wrote:

> Bear in mind that 25.2 can be released as soon after as we want it to.
> If anything is "optional" at this point in time, it should be
> deferred.

Let's apply the few outstanding patches and release 25.2 the next day,
then? Traditionally, releases are separated by at least several months,
even ones with no big changes.

> We shouldn't try to race anything into the release, just because we
> think users will then have to live with some minor inferior behavior
> for a long time after. The description above certainly does not sound
> like something that needs to happen for 25.1.

Just to be clear: the patch doesn't change the behavior of any working
code. It just catches a particular kind of bug earlier than it would
manifest through a cryptic behavior.

Behavior which is non-trivial to debug, and thus adds to the already
non-trivial effort required of a person writing an advanced language
support code (using an external daemon talking over HTTP is fairly
common for this these days).
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: John Wiegley @ 2016-06-19 19:56 UTC
To: Dmitry Gutov; +Cc: 23750, sdl.web, monnier

>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

> Just to be clear: the patch doesn't change the behavior of any working
> code. It just catches a particular kind of bug earlier than it would
> manifest through a cryptic behavior.
>
> Behavior which is non-trivial to debug, and thus adds to the already
> non-trivial effort required of a person writing an advanced language
> support code (using an external daemon talking over HTTP is fairly
> common for this these days).

I get that. But right now, if it doesn't *have* to happen, it should
wait.

We're thinking about cutting the release candidate in just a few days,
pending one issue that Eli is looking into. Any change -- and I mean
_any_ change -- has the potential to introduce unforeseen effects that
could delay us further.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Dmitry Gutov @ 2016-06-19 20:05 UTC
To: John Wiegley; +Cc: 23750, sdl.web, monnier

On 06/19/2016 10:56 PM, John Wiegley wrote:

> We're thinking about cutting the release candidate in just a few days,
> pending one issue that Eli is looking into. Any change -- and I mean
> _any_ change -- has the potential to introduce unforeseen effects that
> could delay us further.

By how much?

Even if that change causes problems (which is unlikely), we'd only have
to revert it, and, unless other issues have come in the meantime, we
could build and release Emacs 25.1 right then, more or less.

It's not like a regression there has a significant potential to obscure
other problems. We've tested the current state of the URL package
pretty well by now anyway.
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: John Wiegley @ 2016-06-19 21:07 UTC
To: Dmitry Gutov; +Cc: 23750, sdl.web, monnier

>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

> By how much?
>
> Even if that change causes problems (which is unlikely), we'd only
> have to revert it, and, unless other issues have come in the meantime,
> we could build and release Emacs 25.1 right then, more or less.

A day comes when a line has to be drawn in the sand, otherwise we could
nickel and dime ourselves into the next century. That line is drawn;
the time for 25.1 is at hand.

Let's start thinking about 25.2 as we think about these types of
improvements, and how we might accelerate its release so it happens in
1-2 months time. There can be many 25.x's, without disrupting the
feature work happening on master.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Glenn Morris @ 2016-06-20 1:28 UTC
To: John Wiegley; +Cc: 23750, Dmitry Gutov, sdl.web, monnier

John Wiegley wrote:

> There can be many 25.x's, without disrupting the feature work
> happening on master.

Then why is master STILL advertising itself as the forerunner to 25.2?
Why are we closing a bunch of bugs as "fixed in 25.2" if they won't be
fixed till 26.1?
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: John Wiegley @ 2016-06-20 4:22 UTC
To: Glenn Morris; +Cc: 23750, Dmitry Gutov, sdl.web, monnier

>>>>> Glenn Morris <rgm@gnu.org> writes:

> Then why is master STILL advertising itself as the forerunner to 25.2?
> Why are we closing a bunch of bugs as "fixed in 25.2" if they won't be
> fixed till 26.1?

I guess to avoid having the reported version number in bug reports keep
jumping around? Master is really working toward 26.1 at this point.

Once we start working on 25.2, we should cherry-pick over all the fixes
for bugs that are marked "fixed in 25.2". Otherwise, they should be
marked "fixed in 26.1".

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Lars Ingebrigtsen @ 2016-06-20 12:39 UTC
To: John Wiegley; +Cc: 23750, Dmitry Gutov, sdl.web, monnier

John Wiegley <jwiegley@gmail.com> writes:

>>>>>> Glenn Morris <rgm@gnu.org> writes:
>
>> Then why is master STILL advertising itself as the forerunner to
>> 25.2? Why are we closing a bunch of bugs as "fixed in 25.2" if they
>> won't be fixed till 26.1?
>
> I guess to avoid having the reported version number in bug reports
> keep jumping around? Master is really working toward 26.1 at this
> point.
>
> Once we start working on 25.2, we should cherry-pick over all the
> fixes for bugs that are marked "fixed in 25.2". Otherwise, they should
> be marked "fixed in 26.1".

Most bugs fixed in master are marked "fixed in 25.2" (since that is
what master is announcing itself as being the forerunner to), so that
doesn't make much sense, I'm afraid.

Which is what Glenn is telling us, once again. I really don't
understand why master hasn't been changed to say that it's the
forerunner to 26.1.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: John Wiegley @ 2016-07-01 20:49 UTC
To: Lars Ingebrigtsen; +Cc: 23750, Dmitry Gutov, sdl.web, monnier

>>>>> Lars Ingebrigtsen <larsi@gnus.org> writes:

> Most bugs fixed in master are marked "fixed in 25.2" (since that is
> what master is announcing itself as being the forerunner to), so that
> doesn't make much sense, I'm afraid.
>
> Which is what Glenn is telling us, once again. I really don't
> understand why master hasn't been changed to say that it's the
> forerunner to 26.1.

The last time we had our long discussion about what the various
branches mean, the conclusion was that emacs-25 is for the next
release, and master is for all other work. Most people did NOT want
master to be toward the next release (25.2), as that leaves nowhere for
changes meant for 26 only.

However, this also leaves nowhere for fixes to go that are only for
25.2. But since no additional branches were desired, the compromise was
that both types of changes will go into master, and we will backport
certain changes into emacs-25 toward 25.2 after the release.

Marking a bug as "fixed in 25.2" seems wrong to me, because it implies
a guarantee that the fix will get cherry picked into emacs-25 after
25.1 is released, although I highly doubt this will happen for every
such fix. There is just too much work to be done.

What we should do is mark every commit intended for 25.2 in a way that
lets us find them all automatically after the release, with a link to
the bugs they fix so that we can safely state "fixed in 25.2". Since
this hasn't happened, I imagine it will be a very manual process, and
will be missing several of those fixes.

This is why I personally argued for 3 branches, but it's not what the
people doing the real work wanted, so this is what we have. After 25.1,
we'll just have to see what happens to emacs-25 and to the bug-tracker.
I imagine several of the "fixed in 25.2" bugs will need to be adjusted
to "fixed in 26.1".

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Eli Zaretskii @ 2016-06-20 14:42 UTC
To: John Wiegley; +Cc: 23750, dgutov, sdl.web, monnier

> From: John Wiegley <jwiegley@gmail.com>
> Date: Sun, 19 Jun 2016 21:22:25 -0700
> Cc: 23750@debbugs.gnu.org, Dmitry Gutov <dgutov@yandex.ru>,
>  sdl.web@gmail.com, monnier@IRO.UMontreal.CA
>
> Once we start working on 25.2, we should cherry-pick over all the
> fixes for bugs that are marked "fixed in 25.2".

I don't think this is practical. The only practical way is to cut a
new release branch off master.
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Glenn Morris @ 2016-06-23 17:14 UTC
To: John Wiegley; +Cc: 23750, Dmitry Gutov, sdl.web, monnier

John Wiegley wrote:

>> Then why is master STILL advertising itself as the forerunner to
>> 25.2? Why are we closing a bunch of bugs as "fixed in 25.2" if they
>> won't be fixed till 26.1?
>
> I guess to avoid having the reported version number in bug reports
> keep jumping around? Master is really working toward 26.1 at this
> point.

This doesn't make any sense to me.
(And why are you guessing? Isn't there a plan?)

> Once we start working on 25.2, we should cherry-pick over all the
> fixes for bugs that are marked "fixed in 25.2". Otherwise, they should
> be marked "fixed in 26.1".

I don't think that will work well, but good luck with it.
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Glenn Morris @ 2016-06-20 1:26 UTC
To: John Wiegley; +Cc: 23750, Dmitry Gutov, sdl.web, monnier

John Wiegley wrote:

> We're thinking about cutting the release candidate in just a few days

Please see admin/release-process for some tasks that should happen
before that.
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Dmitry Gutov @ 2016-06-20 2:58 UTC
To: John Wiegley; +Cc: 23750, sdl.web, monnier

On 06/19/2016 10:56 PM, John Wiegley wrote:

> We're thinking about cutting the release candidate in just a few days,
> pending one issue that Eli is looking into.

Do you mean bug#23779? I wouldn't call it critical (judging by the
number of years it went unreported), and it's not a regression, so it
doesn't make a lot of sense to fix it without taking care of the bug
that resulted in it being reported (bug#23769).
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Dmitry Gutov @ 2016-06-19 18:36 UTC
To: Eli Zaretskii; +Cc: 23750, monnier, sdl.web

On 06/19/2016 09:25 PM, Eli Zaretskii wrote:

> I'd need a very detailed description of the bug, and why this
> particular solution was used.

This particular bug came from this:

  "Content-length: " (number-to-string (length url-http-data))

Which gives a wrong value when url-http-data is multibyte (it should be
the length in bytes). So then, the HTTP server on the other side saw
the wrong body length and truncated the body when reading the request.
Or something along these lines.

> IME, neither string-to-unibyte nor string-as-unibyte should ever be
> used in applications, their use is more often than not a sign of some
> basic misunderstanding of text encoding. For starters, how come 8-bit
> bytes wind up in that function, and what do they stand for?

Some 8-bit encoding of the HTTP request body.

Anyway, yes, the hope is that the programmer uses something like
encode-coding-string to produce that value (and picks the encoding, and
indicates it in the appropriate HTTP header). Then string-to-unibyte
will simply be a no-op. But we need to catch the case when they don't,
and this seems to be the easiest way to do this.
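A caller-side sketch of the approach hoped for in the message above: encode the body explicitly and name the encoding in a header before handing it to the url package. The names `my/encode-request-body` and `my/post-json` are hypothetical, and the choice of UTF-8 and of the Content-Type value is an assumption; `url-request-method`, `url-request-extra-headers`, `url-request-data` and `url-retrieve` are the url package's own.

```elisp
(require 'url)

;; Hypothetical helper: turn text into the bytes that will go on the wire.
(defun my/encode-request-body (text)
  "Return TEXT encoded as UTF-8, i.e. as a unibyte string."
  (encode-coding-string text 'utf-8))

;; Hypothetical wrapper around `url-retrieve' for JSON POST requests.
(defun my/post-json (url json-text callback)
  "POST JSON-TEXT to URL as UTF-8 bytes; CALLBACK is passed to `url-retrieve'."
  (let ((url-request-method "POST")
        ;; Tell the server which encoding the body uses (assumed value).
        (url-request-extra-headers
         '(("Content-Type" . "application/json; charset=utf-8")))
        ;; The body is already bytes, so its `length' is its byte count.
        (url-request-data (my/encode-request-body json-text)))
    (url-retrieve url callback)))
```

With the body encoded up front, the character count and the byte count of the data coincide, so a length computed with `length` matches what is actually sent.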
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Leo Liu @ 2016-06-20 0:15 UTC
To: 23750

On 2016-06-19 21:36 +0300, Dmitry Gutov wrote:

> This particular bug came from this:
>
>   "Content-length: " (number-to-string (length url-http-data))
>
> Which gives wrong value when url-http-data is multibyte (it should be
> length in bytes). So then, the HTTP server on the other side saw the
> wrong body length and truncated the body when reading the request.

As Dmitry mentioned earlier json-encode in 25.1 produces multibyte
strings and makes it easier to hit this bug when consuming JSON API's.
There are three parties that are suspicious: 1) JSON API server 2)
JSON.el 3) URL. It took me a while to realise it's URL's fault IOW the
bug isn't easy to debug. This is somewhat related to changes brought in
by 25.1.

Leo
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Eli Zaretskii @ 2016-06-20 14:39 UTC
To: Leo Liu; +Cc: 23750

> From: Leo Liu <sdl.web@gmail.com>
> Date: Mon, 20 Jun 2016 08:15:26 +0800
>
> > This particular bug came from this:
> >
> >   "Content-length: " (number-to-string (length url-http-data))
> >
> > Which gives wrong value when url-http-data is multibyte (it should
> > be length in bytes). So then, the HTTP server on the other side saw
> > the wrong body length and truncated the body when reading the
> > request.
>
> As Dmitry mentioned earlier json-encode in 25.1 produces multibyte
> strings and makes it easier to hit this bug when consuming JSON API's.
> There are three parties that are suspicious: 1) JSON API server 2)
> JSON.el 3) URL. It took me a while to realise it's URL's fault IOW the
> bug isn't easy to debug. This is somewhat related to changes brought
> in by 25.1.

I understand that url-http expects unibyte strings. So my suggestion
is to test that, and signal an error if the requirement is violated,
with an error message text that could be understood by users and
developers. Alternatively, we could encode multibyte strings in UTF-8,
if we want to attempt to silently cope with such strings.

In any case, using string-*-unibyte functions for that is not needed,
and I'm quite sure their use in this case is a left-over from an era
long gone.
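The first suggestion above (test, then signal a readable error) could look roughly like the sketch below. `my/check-unibyte-body` is a hypothetical name and the message wording is an assumption; this is not what url-http.el does.

```elisp
;; Hypothetical validation helper, sketched after the suggestion above.
(defun my/check-unibyte-body (data)
  "Return DATA if it is nil or a unibyte string; otherwise signal an error."
  (when (and (stringp data) (multibyte-string-p data))
    (error "Multibyte text in HTTP request body; encode it first, e.g. with `encode-coding-string'"))
  data)
```

One design point to keep in mind: `multibyte-string-p` also returns non-nil for a multibyte string that happens to contain only ASCII, so this check is stricter than `string-to-unibyte`, which accepts such strings and only rejects characters outside the ASCII and raw-byte ranges.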
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Eli Zaretskii @ 2016-06-20 2:40 UTC
To: Dmitry Gutov; +Cc: 23750, monnier, sdl.web

> Cc: 23750@debbugs.gnu.org, monnier@IRO.UMontreal.CA, sdl.web@gmail.com
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Sun, 19 Jun 2016 21:36:25 +0300
>
> This particular bug came from this:
>
>   "Content-length: " (number-to-string (length url-http-data))
>
> Which gives a wrong value when url-http-data is multibyte (it should
> be the length in bytes). So then, the HTTP server on the other side
> saw the wrong body length and truncated the body when reading the
> request. Or something along these lines.

So this is not a bug in Emacs, but a diagnostic facility to let bugs
in applications be discovered?

> > IME, neither string-to-unibyte nor string-as-unibyte should ever be
> > used in applications, their use is more often than not a sign of
> > some basic misunderstanding of text encoding. For starters, how
> > come 8-bit bytes wind up in that function, and what do they stand
> > for?
>
> Some 8-bit encoding of the HTTP request body.
>
> Anyway, yes, the hope is that the programmer uses something like
> encode-coding-string to produce that value (and picks the encoding,
> and indicates it in the appropriate HTTP header). Then
> string-to-unibyte will simply be a no-op. But we need to catch the
> case when they don't, and this seems to be the easiest way to do this.

If this is what you need, why not simply test the payload for being a
unibyte string? There is a function, multibyte-string-p, for that.
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Dmitry Gutov @ 2016-06-20 2:51 UTC
To: Eli Zaretskii; +Cc: 23750, monnier, sdl.web

On 06/20/2016 05:40 AM, Eli Zaretskii wrote:

> So this is not a bug in Emacs, but a diagnostic facility to let bugs
> in applications be discovered?

It's a bug. Accepting invalid input and behaving badly with it is
definitely a bug.

> If this is what you need, why not simply test the payload for being a
> unibyte string? There is a function, multibyte-string-p, for that.

There are a lot of variables to test (see the comment above the
mapconcat call).

I'm fine either way, but my patch changes two characters, and yours
will be longer. And you'll have to come up with the error message(s).
* bug#23750: 25.0.95; bug in url-retrieve or json.el
From: Eli Zaretskii @ 2016-06-20 14:38 UTC
To: Dmitry Gutov; +Cc: 23750, monnier, sdl.web

> Cc: 23750@debbugs.gnu.org, monnier@IRO.UMontreal.CA, sdl.web@gmail.com
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Mon, 20 Jun 2016 05:51:06 +0300

This all sounds like my response is not welcome, but in that case why
did you ask the question?

Anyway:

> So this is not a bug in Emacs, but a diagnostic facility to let bugs
> in applications be discovered?
>
> It's a bug. Accepting invalid input and behaving badly with it is
> definitely a bug.

No, the bug is where the invalid input is generated in the first
place. Each API has its contract; if you violate the contract, you
invoke undefined behavior.

> If this is what you need, why not simply test the payload for being a
> unibyte string? There is a function, multibyte-string-p, for that.
>
> There are a lot of variables to test (see the comment above the
> mapconcat call).

Looks like mapc will be able to deal with that. Or just use concat,
and test the result with multibyte-string-p before sending. Or encode
it with UTF-8, if it is not unibyte already.

Btw, I don't think the comment which explains why we started using
mapconcat is accurate these days. It was written before the move to
Unicode in Emacs 23, but we stopped converting raw bytes into Latin-1
characters in Emacs 23 and later. So maybe we should just go back to
using concat (with erroring out, if the result is multibyte, and/or
maybe with replacing 'length' with 'string-bytes').

Bottom line: like I said, there should be no reason to use
string-*-unibyte in modern Emacs code on the url-http level or higher
(maybe not at all). Its use is a sign of some basic misunderstanding,
or a bug elsewhere, or a remnant of old problems that no longer exist.
So I think we should reconsider the solution on master as well.

> I'm fine either way, but my patch changes two characters, and yours
> will be longer.

I don't think the quality of a change should be judged by the number
of characters in the patch. That is a very strange criterion, to say
the least. It would mean, for example, that changes with comments are
worse than changes without comments, or that saving newlines in C code
(which makes the code less readable) is a virtue.

> And you'll have to come up with the error message(s).

Are you saying you like the error message from string-to-unibyte?

  Cannot convert 123th character to unibyte

Doesn't really strike me as something that a user or an average
developer will understand. I thought you wanted something more
human-readable, like

  Invalid multibyte text in HTTP request %s
* bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-06-20 14:38 ` Eli Zaretskii @ 2016-06-20 14:54 ` Dmitry Gutov 2016-06-20 15:03 ` Eli Zaretskii 2016-06-20 17:16 ` Dmitry Gutov 1 sibling, 1 reply; 125+ messages in thread From: Dmitry Gutov @ 2016-06-20 14:54 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 23750, monnier, sdl.web On 06/20/2016 05:38 PM, Eli Zaretskii wrote: > This all sounds like my response is not welcome, but in that case why > did you ask the question? I was kind of hoping for "yes, let's get it into 25.1!"? :) > No, the bug is where the invalid input is generated in the first > place. Each API has its contract; if you violate the contract, you > invoke undefined behavior. It's a bug in the API, or bad API, if you will. It needs stricter contract, and the submitted patch added it. Or to look at it another way, the current contract allows url-http-data to be multibyte, because the requirement to the contrary is not documented anywhere that I can see. The variable is simply undocumented. >> If this is what you need, why not simply test the payload for being a >> unibyte string? There a function, multibyte-string-p, for that. >> >> There are a lot of variables to test (see the comment above the mapconcat call). > > Looks like mapc will be able to deal with that. Or just use concat, > and test the result with multibyte-string-p before sending. Or encode > it with UTF-8, if it is not unibyte already. I don't know if we want to be that permissive that we'll encode to UTF-8 silently. > Btw, I don't think the comment which explains why we started using > mapconcat is accurate these days. It was written before the move to > Unicode in Emacs 23, but we stopped converting raw bytes into Latin-1 > characters in Emacs 23 and later. So maybe we should just go back to > using concat (with erroring out, if the result is multibyte, and/or > maybe with replacing 'length' with 'string-bytes'). 
Better error out: the payload's encoding is something only the caller should be concerned with. Unless we're fine with the users assuming that Emacs's internal encoding is close enough to UTF-8. > Bottom line: like I said, there should be no reason to use > string-*-unibyte in modern Emacs code on the url-http level or higher > (maybe not at all). Its use is a sign of some basic misunderstanding, > or a bug elsewhere, or remnant of old problems that no longer exist. > So I think we should reconsider the solution on master as well. I don't mind. Would you advocate for having this fix on emacs-25 if I implement it the way you described? >> And you'll have to come up with the error message(s). > > Are you saying you like the error message from string-to-unibyte? > > Cannot convert 123th character to unibyte It's an order of magnitude better than what was before (no error and silent corruption), but yes, there is space for improvement. ^ permalink raw reply [flat|nested] 125+ messages in thread
* bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-06-20 14:54 ` Dmitry Gutov @ 2016-06-20 15:03 ` Eli Zaretskii 0 siblings, 0 replies; 125+ messages in thread From: Eli Zaretskii @ 2016-06-20 15:03 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 23750, monnier, sdl.web > Cc: 23750@debbugs.gnu.org, monnier@IRO.UMontreal.CA, sdl.web@gmail.com > From: Dmitry Gutov <dgutov@yandex.ru> > Date: Mon, 20 Jun 2016 17:54:23 +0300 > > On 06/20/2016 05:38 PM, Eli Zaretskii wrote: > > > This all sounds like my response is not welcome, but in that case why > > did you ask the question? > > I was kind of hoping for "yes, let's get it into 25.1!"? :) I'm not that kind of guy, as you know ;-) > > Bottom line: like I said, there should be no reason to use > > string-*-unibyte in modern Emacs code on the url-http level or higher > > (maybe not at all). Its use is a sign of some basic misunderstanding, > > or a bug elsewhere, or remnant of old problems that no longer exist. > > So I think we should reconsider the solution on master as well. > > I don't mind. Would you advocate for having this fix on emacs-25 if I > implement it the way you described? A single test and an error message is safe enough to go to emacs-25, yes. ^ permalink raw reply [flat|nested] 125+ messages in thread
* bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-06-20 14:38 ` Eli Zaretskii 2016-06-20 14:54 ` Dmitry Gutov @ 2016-06-20 17:16 ` Dmitry Gutov 2016-06-20 20:17 ` Eli Zaretskii 1 sibling, 1 reply; 125+ messages in thread From: Dmitry Gutov @ 2016-06-20 17:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 23750, monnier, sdl.web On 06/20/2016 05:38 PM, Eli Zaretskii wrote: > Or just use concat, > and test the result with multibyte-string-p before sending. Actually, here's a reason why we might prefer not to replace string-as/to-unibyte with multibyte-string-p: string-to-unibyte works fine if the string's contents only contain ASCII/8-bit characters, even if the string itself is multibyte. But multibyte-string-p returns non-nil for such strings anyway. So doing as you suggest might make some (arguably not well-written) programs fail, which otherwise could function fine, provided they only operate on ASCII strings. And having a multibyte string with ASCII-only contents is fairly common when the string is produced with buffer-substring from a source code buffer. While it might be good to discourage this kind of programming practice (that doesn't handle non-ASCII text properly), it seems like this would be better for master rather than the impending release. WDYT? ^ permalink raw reply [flat|nested] 125+ messages in thread
* bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-06-20 17:16 ` Dmitry Gutov @ 2016-06-20 20:17 ` Eli Zaretskii 2016-06-20 20:27 ` Dmitry Gutov 0 siblings, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-06-20 20:17 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 23750, monnier, sdl.web > Cc: 23750@debbugs.gnu.org, monnier@IRO.UMontreal.CA, sdl.web@gmail.com > From: Dmitry Gutov <dgutov@yandex.ru> > Date: Mon, 20 Jun 2016 20:16:37 +0300 > > On 06/20/2016 05:38 PM, Eli Zaretskii wrote: > > > Or just use concat, > > and test the result with multibyte-string-p before sending. > > Actually, here's a reason why we might prefer not to replace > string-as/to-unibyte with multibyte-string-p: string-to-unibyte works > fine if the string's contents only contain ASCII/8-bit characters, even > if the string itself is multibyte. But multibyte-string-p returns non-nil > for such strings anyway. We can replace the call to multibyte-string-p with a comparison of what 'length' and 'string-bytes' return. That should overcome this issue. ^ permalink raw reply [flat|nested] 125+ messages in thread
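The comparison suggested in the message above can be sketched as a small predicate. This is only an illustration: the helper name my-http-single-byte-string-p is hypothetical and is not part of url-http.el. It relies on the fact that a string whose byte count equals its character count contains no character that needs more than one byte in Emacs's internal representation, whether or not the string object itself is flagged multibyte.

```elisp
;; Hypothetical helper for illustration; not part of url-http.el.
(defun my-http-single-byte-string-p (string)
  "Return non-nil if every character of STRING occupies exactly one byte."
  (= (string-bytes string) (length string)))

;; Unibyte ASCII literal: accepted.
(my-http-single-byte-string-p "GET / HTTP/1.1")                       ; => t
;; Multibyte string object with ASCII-only contents: still accepted,
;; unlike a plain `multibyte-string-p' test.
(my-http-single-byte-string-p (string-to-multibyte "GET / HTTP/1.1")) ; => t
;; Two hiragana characters (U+307B U+3052): rejected.
(my-http-single-byte-string-p (string #x307B #x3052))                 ; => nil
```

Unlike string-to-unibyte, a check of this shape leaves the wording of the error entirely to the caller.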
* bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-06-20 20:17 ` Eli Zaretskii @ 2016-06-20 20:27 ` Dmitry Gutov 2016-06-21 2:30 ` Eli Zaretskii 0 siblings, 1 reply; 125+ messages in thread From: Dmitry Gutov @ 2016-06-20 20:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 23750, monnier, sdl.web On 06/20/2016 11:17 PM, Eli Zaretskii wrote: > We can replace the call to multibyte-string-p with a comparison of > what 'length' and 'string-bytes' return. That should overcome this > issue. Why not just call string-to-unibyte? Do you expect different results? ^ permalink raw reply [flat|nested] 125+ messages in thread
* bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-06-20 20:27 ` Dmitry Gutov @ 2016-06-21 2:30 ` Eli Zaretskii 2016-06-21 13:51 ` Dmitry Gutov 0 siblings, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-06-21 2:30 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 23750, monnier, sdl.web > Cc: 23750@debbugs.gnu.org, monnier@IRO.UMontreal.CA, sdl.web@gmail.com > From: Dmitry Gutov <dgutov@yandex.ru> > Date: Mon, 20 Jun 2016 23:27:01 +0300 > > On 06/20/2016 11:17 PM, Eli Zaretskii wrote: > > > We can replace the call to multibyte-string-p with a comparison of > > what 'length' and 'string-bytes' return. That should overcome this > > issue. > > Why not just call string-to-unibyte? Because (a) I don't want to see that function in our sources, ever, and (b) you don't have any control on the error message it produces, which is not appropriate for application-level checks. ^ permalink raw reply [flat|nested] 125+ messages in thread
* bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-06-21 2:30 ` Eli Zaretskii @ 2016-06-21 13:51 ` Dmitry Gutov 2016-06-21 15:18 ` Eli Zaretskii 0 siblings, 1 reply; 125+ messages in thread From: Dmitry Gutov @ 2016-06-21 13:51 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 23750, monnier, sdl.web [-- Attachment #1: Type: text/plain, Size: 368 bytes --] On 06/21/2016 05:30 AM, Eli Zaretskii wrote: > Because (a) I don't want to see that function in our sources, ever, > and (b) you don't have any control on the error message it produces, > which is not appropriate for application-level checks. Please take a look at the attachment. OK to install? I recall John saying we shouldn't push any more changes to emacs-25. [-- Attachment #2: url-http-multibyte.diff --] [-- Type: text/x-patch, Size: 1588 bytes --] diff --git a/lisp/url/url-http.el b/lisp/url/url-http.el index 5832e92..7156e6f 100644 --- a/lisp/url/url-http.el +++ b/lisp/url/url-http.el @@ -275,19 +275,7 @@ url-http-create-request ;; allows us to elide null lines directly, at the cost of making ;; the layout less clear. (setq request - ;; We used to concat directly, but if one of the strings happens - ;; to being multibyte (even if it only contains pure ASCII) then - ;; every string gets converted with `string-MAKE-multibyte' which - ;; turns the 127-255 codes into things like latin-1 accented chars - ;; (it would work right if it used `string-TO-multibyte' instead). - ;; So to avoid the problem we force every string to be unibyte. - (mapconcat - ;; FIXME: Instead of `string-AS-unibyte' we'd want - ;; `string-to-unibyte', so as to properly signal an error if one - ;; of the strings contains a multibyte char. 
- 'string-as-unibyte - (delq nil - (list + (concat ;; The request (or url-http-method "GET") " " (if using-proxy (url-recreate-url url-http-target-url) real-fname) @@ -365,7 +353,10 @@ url-http-create-request "\r\n" ;; Any data url-http-data)) - "")) + ;; Bug#23750 + (unless (= (string-bytes request) + (length request)) + (error "Multibyte text in HTTP request: %s" request)) (url-http-debug "Request is: \n%s" request) request)) ^ permalink raw reply related [flat|nested] 125+ messages in thread
* bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-06-21 13:51 ` Dmitry Gutov @ 2016-06-21 15:18 ` Eli Zaretskii 2016-06-22 1:08 ` John Wiegley 0 siblings, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-06-21 15:18 UTC (permalink / raw) To: Dmitry Gutov, John Wiegley; +Cc: 23750, monnier, sdl.web > Cc: 23750@debbugs.gnu.org, monnier@IRO.UMontreal.CA, sdl.web@gmail.com > From: Dmitry Gutov <dgutov@yandex.ru> > Date: Tue, 21 Jun 2016 16:51:59 +0300 > > > Because (a) I don't want to see that function in our sources, ever, > > and (b) you don't have any control on the error message it produces, > > which is not appropriate for application-level checks. > > Please take a look at the attachment. OK to install? Yes, but let's wait for John. > I recall John saying we shouldn't push any more changes to emacs-25. He did? John, this change is IMO safe for emacs-25. Is it OK to push there? Thanks. ^ permalink raw reply [flat|nested] 125+ messages in thread
* bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-06-21 15:18 ` Eli Zaretskii @ 2016-06-22 1:08 ` John Wiegley 2016-06-22 2:36 ` Eli Zaretskii 2016-06-22 18:21 ` Dmitry Gutov 0 siblings, 2 replies; 125+ messages in thread From: John Wiegley @ 2016-06-22 1:08 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 23750, Dmitry Gutov, sdl.web, monnier [-- Attachment #1: Type: text/plain, Size: 335 bytes --] >>>>> Eli Zaretskii <eliz@gnu.org> writes: > He did? John, this change is IMO safe for emacs-25. Is it OK to push there? If you think it's safe, Eli, then I'm good with it. -- John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 629 bytes --] ^ permalink raw reply [flat|nested] 125+ messages in thread
* bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-06-22 1:08 ` John Wiegley @ 2016-06-22 2:36 ` Eli Zaretskii 2016-06-22 18:21 ` Dmitry Gutov 1 sibling, 0 replies; 125+ messages in thread From: Eli Zaretskii @ 2016-06-22 2:36 UTC (permalink / raw) To: John Wiegley; +Cc: 23750, dgutov, sdl.web, monnier > From: John Wiegley <jwiegley@gmail.com> > Cc: Dmitry Gutov <dgutov@yandex.ru>, 23750@debbugs.gnu.org, monnier@IRO.UMontreal.CA, sdl.web@gmail.com > Date: Tue, 21 Jun 2016 18:08:44 -0700 > > >>>>> Eli Zaretskii <eliz@gnu.org> writes: > > > He did? John, this change is IMO safe for emacs-25. Is it OK to push there? > > If you think it's safe, Eli, then I'm good with it. OK, thanks. ^ permalink raw reply [flat|nested] 125+ messages in thread
* bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-06-22 1:08 ` John Wiegley 2016-06-22 2:36 ` Eli Zaretskii @ 2016-06-22 18:21 ` Dmitry Gutov 1 sibling, 0 replies; 125+ messages in thread From: Dmitry Gutov @ 2016-06-22 18:21 UTC (permalink / raw) To: John Wiegley, Eli Zaretskii; +Cc: 23750-done, sdl.web, monnier On 06/22/2016 04:08 AM, John Wiegley wrote: > If you think it's safe, Eli, then I'm good with it. Thanks! Pushed, and closing. ^ permalink raw reply [flat|nested] 125+ messages in thread
* bug#23750: 25.0.95; bug in url-retrieve or json.el @ 2016-11-29 8:22 Kentaro NAKAZAWA 2016-11-29 9:54 ` Andreas Schwab 0 siblings, 1 reply; 125+ messages in thread From: Kentaro NAKAZAWA @ 2016-11-29 8:22 UTC (permalink / raw) To: dgutov, emacs-devel Why can not I use multibyte text for http requests? The following correct http request will fail. (require 'json) (let* ((content "ほげ <- VALID utf-8 Japanese multibyte text") (url "https://api.github.com/gists") (url-request-method "POST") (url-request-data (json-encode `(("description" . "test") ("public" . false) ("files" . (("test.txt" . (("content" . ,content))))))))) (with-current-buffer (url-retrieve-synchronously url) (buffer-string))) => url-http-create-request: Multibyte text in HTTP request: POST /gists HTTP/1.1 Please apply the following patch. --- url-http.el.orig 2016-09-15 17:16:04.000000000 +0900 +++ url-http.el 2016-11-29 17:10:57.018703500 +0900 @@ -351,16 +351,12 @@ (if url-http-data (concat "Content-length: " (number-to-string - (length url-http-data)) + (string-bytes url-http-data)) "\r\n")) ;; End request "\r\n" ;; Any data url-http-data)) - ;; Bug#23750 - (unless (= (string-bytes request) - (length request)) - (error "Multibyte text in HTTP request: %s" request)) (url-http-debug "Request is: \n%s" request) request)) ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 8:22 Kentaro NAKAZAWA @ 2016-11-29 9:54 ` Andreas Schwab 2016-11-29 10:06 ` Kentaro NAKAZAWA 0 siblings, 1 reply; 125+ messages in thread From: Andreas Schwab @ 2016-11-29 9:54 UTC (permalink / raw) To: Kentaro NAKAZAWA; +Cc: emacs-devel, dgutov On Nov 29 2016, Kentaro NAKAZAWA <kentaro.nakazawa@nifty.com> wrote: > Why can not I use multibyte text for http requests? You need to encode it. Andreas. -- Andreas Schwab, SUSE Labs, schwab@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different." ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 9:54 ` Andreas Schwab @ 2016-11-29 10:06 ` Kentaro NAKAZAWA 2016-11-29 10:08 ` Dmitry Gutov 0 siblings, 1 reply; 125+ messages in thread From: Kentaro NAKAZAWA @ 2016-11-29 10:06 UTC (permalink / raw) To: Andreas Schwab; +Cc: emacs-devel, dgutov On 2016/11/29 18:54, Andreas Schwab wrote: > You need to encode it. The text is encoded with utf-8. The correct utf-8 text also contains multibyte text. (Multibyte text is (/= (string-bytes text) (length text)) => t) How can I correctly POST multibyte text? ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 10:06 ` Kentaro NAKAZAWA @ 2016-11-29 10:08 ` Dmitry Gutov 2016-11-29 10:23 ` Kentaro NAKAZAWA 0 siblings, 1 reply; 125+ messages in thread From: Dmitry Gutov @ 2016-11-29 10:08 UTC (permalink / raw) To: Kentaro NAKAZAWA, Andreas Schwab; +Cc: emacs-devel On 29.11.2016 12:06, Kentaro NAKAZAWA wrote: > The text is encoded with utf-8. > The correct utf-8 text also contains multibyte text. > (Multibyte text is (/= (string-bytes text) (length text)) => t) > > How can I correctly POST multibyte text? You encode it to a unibyte string using encode-coding-string. ^ permalink raw reply [flat|nested] 125+ messages in thread
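As a rough illustration of that advice (a sketch, not code from the thread): encode-coding-string turns Emacs-internal text into a unibyte string of bytes in the requested coding system, so `length' on the encoded result is the byte count an HTTP body needs. The function name my-utf8-body-info is made up for the example.

```elisp
;; Hypothetical helper for illustration only.
(defun my-utf8-body-info (text)
  "Return facts about TEXT before and after encoding it as UTF-8."
  (let ((bytes (encode-coding-string text 'utf-8)))
    (list (multibyte-string-p text)    ; is the input Emacs-internal text?
          (multibyte-string-p bytes)   ; is the encoded result multibyte?
          (length text)                ; number of characters
          (length bytes))))            ; number of bytes on the wire

;; Two hiragana characters, three UTF-8 bytes each.
(my-utf8-body-info (string #x307B #x3052))
;; => (t nil 2 6)
```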
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 10:08 ` Dmitry Gutov @ 2016-11-29 10:23 ` Kentaro NAKAZAWA 2016-11-29 10:34 ` Lars Ingebrigtsen 0 siblings, 1 reply; 125+ messages in thread From: Kentaro NAKAZAWA @ 2016-11-29 10:23 UTC (permalink / raw) To: Dmitry Gutov, Andreas Schwab; +Cc: emacs-devel On 2016/11/29 19:08, Dmitry Gutov wrote: > You encode it to a unibyte string using encode-coding-string. (let* ((content (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'us-ascii)) => The following text was POSTed. ?? <- VALID utf-8 Japanese multibyte text ^^Two question marks (let* ((content (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'raw-text)) => url-http-create-request: Multibyte text in HTTP request: POST /gists HTTP/1.1 I tried various things but I do not know how to do it ... ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 10:23 ` Kentaro NAKAZAWA @ 2016-11-29 10:34 ` Lars Ingebrigtsen 2016-11-29 10:38 ` Kentaro NAKAZAWA 0 siblings, 1 reply; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-11-29 10:34 UTC (permalink / raw) To: Kentaro NAKAZAWA; +Cc: emacs-devel Kentaro NAKAZAWA <kentaro.nakazawa@nifty.com> writes: > (let* ((content (encode-coding-string > "ほげ <- VALID utf-8 Japanese multibyte text" > 'us-ascii)) Use (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8) -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 10:34 ` Lars Ingebrigtsen @ 2016-11-29 10:38 ` Kentaro NAKAZAWA 2016-11-29 10:42 ` Lars Ingebrigtsen 2016-11-29 10:50 ` Dmitry Gutov 0 siblings, 2 replies; 125+ messages in thread From: Kentaro NAKAZAWA @ 2016-11-29 10:38 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: emacs-devel On 2016/11/29 19:34, Lars Ingebrigtsen wrote: > (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8) => url-http-create-request: Multibyte text in HTTP request: POST /gists HTTP/1.1 It is the same result. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 10:38 ` Kentaro NAKAZAWA @ 2016-11-29 10:42 ` Lars Ingebrigtsen 2016-11-29 10:48 ` Kentaro NAKAZAWA 2016-11-29 10:49 ` Dmitry Gutov 2016-11-29 10:50 ` Dmitry Gutov 1 sibling, 2 replies; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-11-29 10:42 UTC (permalink / raw) To: Kentaro NAKAZAWA; +Cc: emacs-devel Kentaro NAKAZAWA <kentaro.nakazawa@nifty.com> writes: > On 2016/11/29 19:34, Lars Ingebrigtsen wrote: > >> (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8) > > => url-http-create-request: Multibyte text in HTTP request: POST /gists > HTTP/1.1 > > It is the same result. Uhm... how about (string-as-unibyte (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8)) -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 10:42 ` Lars Ingebrigtsen @ 2016-11-29 10:48 ` Kentaro NAKAZAWA 2016-11-29 10:49 ` Dmitry Gutov 1 sibling, 0 replies; 125+ messages in thread From: Kentaro NAKAZAWA @ 2016-11-29 10:48 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: emacs-devel On 2016/11/29 19:42, Lars Ingebrigtsen wrote: > (string-as-unibyte > (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8)) => url-http-create-request: Multibyte text in HTTP request: POST /gists HTTP/1.1 This is also the same result... ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 10:42 ` Lars Ingebrigtsen 2016-11-29 10:48 ` Kentaro NAKAZAWA @ 2016-11-29 10:49 ` Dmitry Gutov 1 sibling, 0 replies; 125+ messages in thread From: Dmitry Gutov @ 2016-11-29 10:49 UTC (permalink / raw) To: Lars Ingebrigtsen, Kentaro NAKAZAWA; +Cc: emacs-devel On 29.11.2016 12:42, Lars Ingebrigtsen wrote: > (string-as-unibyte > (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8)) That shouldn't be necessary. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 10:38 ` Kentaro NAKAZAWA 2016-11-29 10:42 ` Lars Ingebrigtsen @ 2016-11-29 10:50 ` Dmitry Gutov 2016-11-29 10:55 ` Kentaro NAKAZAWA 1 sibling, 1 reply; 125+ messages in thread From: Dmitry Gutov @ 2016-11-29 10:50 UTC (permalink / raw) To: Kentaro NAKAZAWA, Lars Ingebrigtsen; +Cc: emacs-devel On 29.11.2016 12:38, Kentaro NAKAZAWA wrote: > On 2016/11/29 19:34, Lars Ingebrigtsen wrote: > >> (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8) > > => url-http-create-request: Multibyte text in HTTP request: POST /gists > HTTP/1.1 > > It is the same result. Do you have a full example to reproduce this? ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 10:50 ` Dmitry Gutov @ 2016-11-29 10:55 ` Kentaro NAKAZAWA 2016-11-29 10:59 ` Dmitry Gutov 0 siblings, 1 reply; 125+ messages in thread From: Kentaro NAKAZAWA @ 2016-11-29 10:55 UTC (permalink / raw) To: Dmitry Gutov, Lars Ingebrigtsen; +Cc: emacs-devel On 2016/11/29 19:50, Dmitry Gutov wrote: > Do you have a full example to reproduce this? (require 'json) (let* ((content "ほげ <- VALID utf-8 Japanese multibyte text") (url "https://api.github.com/gists") (url-request-method "POST") (url-request-data (json-encode `(("description" . "test") ("public" . false) ("files" . (("test.txt" . (("content" . ,content))))))))) (with-current-buffer (url-retrieve-synchronously url) (buffer-string))) Evaluate the above in *scratch*; it posts to a private anonymous gist. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 10:55 ` Kentaro NAKAZAWA @ 2016-11-29 10:59 ` Dmitry Gutov 2016-11-29 11:03 ` Kentaro NAKAZAWA 0 siblings, 1 reply; 125+ messages in thread From: Dmitry Gutov @ 2016-11-29 10:59 UTC (permalink / raw) To: Kentaro NAKAZAWA, Lars Ingebrigtsen; +Cc: emacs-devel On 29.11.2016 12:55, Kentaro NAKAZAWA wrote: > On 2016/11/29 19:50, Dmitry Gutov wrote: > >> Do you have a full example to reproduce this? > > (require 'json) > (let* ((content "ほげ <- VALID utf-8 Japanese multibyte text") > (url "https://api.github.com/gists") > (url-request-method "POST") > (url-request-data > (json-encode > `(("description" . "test") > ("public" . false) > ("files" . (("test.txt" . (("content" . ,content))))))))) > (with-current-buffer (url-retrieve-synchronously url) > (buffer-string))) Where is the encode-coding-string call? ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 10:59 ` Dmitry Gutov @ 2016-11-29 11:03 ` Kentaro NAKAZAWA 2016-11-29 11:05 ` Dmitry Gutov 0 siblings, 1 reply; 125+ messages in thread From: Kentaro NAKAZAWA @ 2016-11-29 11:03 UTC (permalink / raw) To: Dmitry Gutov, Lars Ingebrigtsen; +Cc: emacs-devel On 2016/11/29 19:59, Dmitry Gutov wrote: > Where is the encode-coding-string call? Sorry, this is it. (let* ((content (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8)) (url "https://api.github.com/gists") (url-request-method "POST") (url-request-data (json-encode `(("description" . "test") ("public" . false) ("files" . (("test.txt" . (("content" . ,content))))))))) (with-current-buffer (url-retrieve-synchronously url) (buffer-string))) ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 11:03 ` Kentaro NAKAZAWA @ 2016-11-29 11:05 ` Dmitry Gutov 2016-11-29 11:12 ` Kentaro NAKAZAWA 2016-11-29 17:23 ` Eli Zaretskii 0 siblings, 2 replies; 125+ messages in thread From: Dmitry Gutov @ 2016-11-29 11:05 UTC (permalink / raw) To: Kentaro NAKAZAWA, Lars Ingebrigtsen; +Cc: emacs-devel On 29.11.2016 13:03, Kentaro NAKAZAWA wrote: > (let* ((content (encode-coding-string > "ほげ <- VALID utf-8 Japanese multibyte text" > 'utf-8)) > (url "https://api.github.com/gists") > (url-request-method "POST") > (url-request-data > (json-encode > `(("description" . "test") > ("public" . false) > ("files" . (("test.txt" . (("content" . ,content))))))))) > (with-current-buffer (url-retrieve-synchronously url) > (buffer-string))) json-encode returns a multibyte string. Try this: (let* ((content "ほげ <- VALID utf-8 Japanese multibyte text") (url "https://api.github.com/gists") (url-request-method "POST") (url-request-data (encode-coding-string (json-encode `(("description" . "test") ("public" . false) ("files" . (("test.txt" . (("content" . ,content))))))) 'utf-8))) (with-current-buffer (url-retrieve-synchronously url) (buffer-string))) ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 11:05 ` Dmitry Gutov @ 2016-11-29 11:12 ` Kentaro NAKAZAWA 2016-11-29 17:23 ` Eli Zaretskii 1 sibling, 0 replies; 125+ messages in thread From: Kentaro NAKAZAWA @ 2016-11-29 11:12 UTC (permalink / raw) To: Dmitry Gutov, Lars Ingebrigtsen; +Cc: emacs-devel On 2016/11/29 20:05, Dmitry Gutov wrote: > json-encode returns a multibyte string. Try this: It worked! Thank you for telling me the correct code! I confirmed the correct result below. (let* ((content "ほげ <- VALID utf-8 Japanese multibyte text") (url "https://api.github.com/gists") (url-request-method "POST") (url-request-data (encode-coding-string (json-encode `(("description" . "test") ("public" . false) ("files" . (("test.txt" . (("content" . ,content))))))) 'utf-8))) (with-current-buffer (url-retrieve-synchronously url) (when (url-http-parse-headers) (search-forward-regexp "\n\\s-*\n" nil t) (browse-url (cdr (assoc 'html_url (json-read))))))) ^ permalink raw reply [flat|nested] 125+ messages in thread
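The working recipe above boils down to one rule: serialize first, then encode the whole serialized string. A minimal sketch that can be checked without any network access follows; the helper name my-json-request-body is hypothetical, and only json-encode and encode-coding-string are real functions here.

```elisp
(require 'json)

;; Hypothetical helper: serialize OBJECT, then encode the result for the wire.
(defun my-json-request-body (object)
  "Return OBJECT serialized as JSON and encoded as a unibyte UTF-8 string."
  (encode-coding-string (json-encode object) 'utf-8))

(let ((body (my-json-request-body
             `(("description" . ,(string #x307B #x3052))))))
  ;; The encoded body is unibyte, so its byte count equals its length
  ;; and it passes the Bug#23750 check in `url-http-create-request'.
  (and (not (multibyte-string-p body))
       (= (string-bytes body) (length body))))
;; => t
```

A caller would bind url-request-data to the value of such a helper instead of to the raw json-encode result.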
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 11:05 ` Dmitry Gutov 2016-11-29 11:12 ` Kentaro NAKAZAWA @ 2016-11-29 17:23 ` Eli Zaretskii 2016-11-29 23:09 ` Philipp Stephani 1 sibling, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-11-29 17:23 UTC (permalink / raw) To: Dmitry Gutov; +Cc: larsi, kentaro.nakazawa, emacs-devel > From: Dmitry Gutov <dgutov@yandex.ru> > Date: Tue, 29 Nov 2016 13:05:39 +0200 > Cc: emacs-devel@gnu.org > > On 29.11.2016 13:03, Kentaro NAKAZAWA wrote: > > > (let* ((content (encode-coding-string > > "ほげ <- VALID utf-8 Japanese multibyte text" > > 'utf-8)) > > (url "https://api.github.com/gists") > > (url-request-method "POST") > > (url-request-data > > (json-encode > > `(("description" . "test") > > ("public" . false) > > ("files" . (("test.txt" . (("content" . ,content))))))))) > > (with-current-buffer (url-retrieve-synchronously url) > > (buffer-string))) > > json-encode returns a multibyte string. Any idea why? Is it again that 'concat' misfeature, when one of the strings is pure-ASCII, but happens to be multibyte? Maybe we should do something about that. Thanks. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 17:23 ` Eli Zaretskii @ 2016-11-29 23:09 ` Philipp Stephani 2016-11-29 23:18 ` Philipp Stephani ` (2 more replies) 0 siblings, 3 replies; 125+ messages in thread From: Philipp Stephani @ 2016-11-29 23:09 UTC (permalink / raw) To: Eli Zaretskii, Dmitry Gutov; +Cc: larsi, kentaro.nakazawa, emacs-devel [-- Attachment #1: Type: text/plain, Size: 1622 bytes --] Eli Zaretskii <eliz@gnu.org> schrieb am Di., 29. Nov. 2016 um 18:24 Uhr: > > From: Dmitry Gutov <dgutov@yandex.ru> > > Date: Tue, 29 Nov 2016 13:05:39 +0200 > > Cc: emacs-devel@gnu.org > > > > On 29.11.2016 13:03, Kentaro NAKAZAWA wrote: > > > > > (let* ((content (encode-coding-string > > > "ほげ <- VALID utf-8 Japanese multibyte text" > > > 'utf-8)) > > > (url "https://api.github.com/gists") > > > (url-request-method "POST") > > > (url-request-data > > > (json-encode > > > `(("description" . "test") > > > ("public" . false) > > > ("files" . (("test.txt" . (("content" . ,content))))))))) > > > (with-current-buffer (url-retrieve-synchronously url) > > > (buffer-string))) > > > > json-encode returns a multibyte string. > > Any idea why? Because (symbol-name 'false) returns a multibyte string. I guess the ultimate reason is that the reader always creates multibyte strings for symbol names. > Is it again that 'concat' misfeature, when one of the > strings is pure-ASCII, but happens to be multibyte? Why is it a misfeature? I'd expect a concatenation of multibyte and unibyte strings to either implicitly upgrade to a multibyte string (as in Python 2) or raise a signal (as in Python 3). That url-retrieve breaks in this case is unfortunate, but I guess we can't do much about it without breaking other stuff. Maybe the behavior regarding unibyte and multibyte strings (e.g. what kinds of strings the reader and `concat' generate) should simply be documented.
[-- Attachment #2: Type: text/html, Size: 2984 bytes --] ^ permalink raw reply [flat|nested] 125+ messages in thread
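To make the concat behavior under discussion concrete, here is a small sketch (not from the thread; the function name my-concat-mix-info is hypothetical and only built-in functions are used). Concatenating a multibyte string with a unibyte one yields a multibyte result, and the raw bytes from the unibyte part are kept as raw-byte characters, each of which takes two bytes internally. That is why already-encoded content that gets mixed with a multibyte string trips the length versus string-bytes test.

```elisp
;; Hypothetical helper for illustration only.
(defun my-concat-mix-info ()
  "Show what `concat' does with a multibyte and a unibyte argument."
  (let* ((ascii-mb (string-to-multibyte "content: "))          ; multibyte, ASCII-only
         (utf8 (encode-coding-string (string #x307B) 'utf-8))  ; 3 raw bytes, unibyte
         (mixed (concat ascii-mb utf8)))
    (list (multibyte-string-p ascii-mb)             ; t
          (multibyte-string-p utf8)                 ; nil
          (multibyte-string-p mixed)                ; t: the result is upgraded
          (= (string-bytes mixed) (length mixed))))) ; nil: raw bytes now cost 2 bytes each

(my-concat-mix-info)
;; => (t nil t nil)
```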
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 23:09 ` Philipp Stephani @ 2016-11-29 23:18 ` Philipp Stephani 2016-11-30 15:11 ` Eli Zaretskii 2016-11-30 0:16 ` Dmitry Gutov 2016-11-30 15:06 ` Eli Zaretskii 2 siblings, 1 reply; 125+ messages in thread From: Philipp Stephani @ 2016-11-29 23:18 UTC (permalink / raw) To: Eli Zaretskii, Dmitry Gutov; +Cc: larsi, kentaro.nakazawa, emacs-devel [-- Attachment #1: Type: text/plain, Size: 327 bytes --] Philipp Stephani <p.stephani2@gmail.com> schrieb am Mi., 30. Nov. 2016 um 00:09 Uhr: > That url-retrieve breaks in this case is unfortunate, but I guess we can't > do much about it without breaking other stuff. > Ah, I guess the URL functions could simply call string-to-unibyte, that should do the right thing in all cases. [-- Attachment #2: Type: text/html, Size: 706 bytes --] ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 23:18 ` Philipp Stephani @ 2016-11-30 15:11 ` Eli Zaretskii 2016-11-30 15:20 ` Lars Ingebrigtsen 0 siblings, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-11-30 15:11 UTC (permalink / raw) To: Philipp Stephani; +Cc: larsi, emacs-devel, kentaro.nakazawa, dgutov > From: Philipp Stephani <p.stephani2@gmail.com> > Date: Tue, 29 Nov 2016 23:18:21 +0000 > Cc: larsi@gnus.org, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org > > Ah, I guess the URL functions could simply call string-to-unibyte, that should do the right thing in all cases. That would bring back the problem which caused us to introduce the test which triggered this bug report. string-to-unibyte can produce results that might surprise naïve users, and it also can signal an error whose text is not fit for showing it to users. We are trying to avoid using that function, for these very reasons. ^ permalink raw reply [flat|nested] 125+ messages in thread
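A minimal sketch of the behaviour under discussion, for readers who want to try it: `string-to-unibyte' accepts a multibyte string that holds only ASCII, and signals an error when the string contains real non-ASCII characters. The helper name `my/try-to-unibyte' is hypothetical and is not part of url.el; it only wraps the call so that the error can be observed without entering the debugger.

```elisp
;; Hypothetical helper (not part of url.el): call `string-to-unibyte'
;; and report a failure as the symbol `failed' instead of signalling.
(defun my/try-to-unibyte (s)
  "Return S converted by `string-to-unibyte', or `failed' if it signals."
  (condition-case nil
      (string-to-unibyte s)
    (error 'failed)))

;; A multibyte string that contains only ASCII converts cleanly.
(my/try-to-unibyte (string-to-multibyte "abc"))   ; a unibyte "abc"

;; A string with non-ASCII characters makes `string-to-unibyte' signal,
;; so the helper returns `failed'.
(my/try-to-unibyte "ほげ")
```

Whether an error of this kind is acceptable at the url.el boundary is exactly the point in dispute in the messages above; the sketch takes no side on that.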
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 15:11 ` Eli Zaretskii @ 2016-11-30 15:20 ` Lars Ingebrigtsen 2016-11-30 15:43 ` Eli Zaretskii 0 siblings, 1 reply; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-11-30 15:20 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Philipp Stephani, emacs-devel, kentaro.nakazawa, dgutov Eli Zaretskii <eliz@gnu.org> writes: > We are trying to avoid using that function, for these very reasons. Indeed. The entire url-retrieve interface is more than a little broken in many small ways. In the next-generation URL library interface (the `with-url' thing discussed intermittently the past few years) I think it would make sense to supply the caller with a method to say what charset you want stuff like this to be encoded with. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 15:20 ` Lars Ingebrigtsen @ 2016-11-30 15:43 ` Eli Zaretskii 2016-11-30 15:46 ` Lars Ingebrigtsen 0 siblings, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-11-30 15:43 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: Philipp Stephani <p.stephani2@gmail.com>, dgutov@yandex.ru, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org > Date: Wed, 30 Nov 2016 16:20:20 +0100 > > In the next-generation URL library interface (the `with-url' thing > discussed intermittently the past few years) I think it would make sense > to supply the caller with a method to say what charset you want stuff > like this to be encoded with. Would they ever want anything except utf-8? ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 15:43 ` Eli Zaretskii @ 2016-11-30 15:46 ` Lars Ingebrigtsen 0 siblings, 0 replies; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-11-30 15:46 UTC (permalink / raw) To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov Eli Zaretskii <eliz@gnu.org> writes: > Would they ever want anything except utf-8? Standard HTTP values should be URL-encoded (or similar) anyway, so non-URL-encoded values are for pretty non-standard use. So I would expect people to create interfaces in whatever charset they happen to think of. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 23:09 ` Philipp Stephani 2016-11-29 23:18 ` Philipp Stephani @ 2016-11-30 0:16 ` Dmitry Gutov 2016-11-30 15:13 ` Eli Zaretskii 2016-11-30 15:06 ` Eli Zaretskii 2 siblings, 1 reply; 125+ messages in thread From: Dmitry Gutov @ 2016-11-30 0:16 UTC (permalink / raw) To: Philipp Stephani, Eli Zaretskii; +Cc: larsi, kentaro.nakazawa, emacs-devel On 30.11.2016 01:09, Philipp Stephani wrote: > Because (symbol-name 'false) returns a multibyte string. I guess the ultimate reason is that the reader always creates multibyte strings for symbol names. Yes. For the same reason, (json-encode-alist '((a . "abc"))) also returns a multibyte string. And we're likely to see symbols as keys a lot. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 0:16 ` Dmitry Gutov @ 2016-11-30 15:13 ` Eli Zaretskii 2016-11-30 15:17 ` Dmitry Gutov 0 siblings, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-11-30 15:13 UTC (permalink / raw) To: Dmitry Gutov; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa > Cc: larsi@gnus.org, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org > From: Dmitry Gutov <dgutov@yandex.ru> > Date: Wed, 30 Nov 2016 02:16:36 +0200 > > On 30.11.2016 01:09, Philipp Stephani wrote: > > > Because (symbol-name 'false) returns a multibyte string. I guess the ultimate reason is that the reader always creates multibyte strings for symbol names. > > Yes. For the same reason, > > (json-encode-alist '((a . "abc"))) > > also returns a multibyte string. And we're likely to see symbols as keys > a lot. Can we do something about that in json-encode-* functions? ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 15:13 ` Eli Zaretskii @ 2016-11-30 15:17 ` Dmitry Gutov 2016-11-30 15:32 ` Stefan Monnier 2016-11-30 15:42 ` Eli Zaretskii 0 siblings, 2 replies; 125+ messages in thread From: Dmitry Gutov @ 2016-11-30 15:17 UTC (permalink / raw) To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa On 30.11.2016 17:13, Eli Zaretskii wrote: > Can we do something about that in json-encode-* functions? json-encode uses the previously mentioned symbol-name, which returns multibyte values. What would we do about that? ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 15:17 ` Dmitry Gutov @ 2016-11-30 15:32 ` Stefan Monnier 2016-11-30 15:42 ` Eli Zaretskii 1 sibling, 0 replies; 125+ messages in thread From: Stefan Monnier @ 2016-11-30 15:32 UTC (permalink / raw) To: emacs-devel >> Can we do something about that in json-encode-* functions? > json-encode uses the previously mentioned symbol-name, which returns > multibyte values. What would we do about that? We need to encode the symbol name since it's a plain string which can contain non-ASCII chars. Stefan ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 15:17 ` Dmitry Gutov 2016-11-30 15:32 ` Stefan Monnier @ 2016-11-30 15:42 ` Eli Zaretskii 2016-11-30 15:45 ` Dmitry Gutov 1 sibling, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-11-30 15:42 UTC (permalink / raw) To: Dmitry Gutov; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa > Cc: p.stephani2@gmail.com, larsi@gnus.org, kentaro.nakazawa@nifty.com, > emacs-devel@gnu.org > From: Dmitry Gutov <dgutov@yandex.ru> > Date: Wed, 30 Nov 2016 17:17:18 +0200 > > On 30.11.2016 17:13, Eli Zaretskii wrote: > > > Can we do something about that in json-encode-* functions? > > json-encode uses the previously mentioned symbol-name, which returns > multibyte values. What would we do about that? Check that the value returned by symbol-name is pure-ASCII, and if so, make it unibyte? ^ permalink raw reply [flat|nested] 125+ messages in thread
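The suggestion above (check for pure ASCII, then make the string unibyte) could look roughly like the following. This is only a sketch under assumptions: `my/ascii-to-unibyte' is a hypothetical name, and nothing here is taken from json.el. It relies on the fact that in a multibyte string every non-ASCII character occupies more than one byte, so the character count equals the byte count only when all characters are ASCII.

```elisp
;; Hypothetical helper, not part of json.el or url.el.
(defun my/ascii-to-unibyte (s)
  "Return a unibyte copy of S if S is multibyte but pure ASCII.
Otherwise return S unchanged."
  (if (and (multibyte-string-p s)
           ;; Character count equals byte count only for pure ASCII.
           (= (length s) (string-bytes s)))
      (string-to-unibyte s)
    s))

;; Example use on a symbol name, as discussed in the thread.
(my/ascii-to-unibyte (symbol-name 'false))
```

Strings that contain non-ASCII characters pass through untouched, so a caller would still need to encode them explicitly before sending.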
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 15:42 ` Eli Zaretskii @ 2016-11-30 15:45 ` Dmitry Gutov 2016-11-30 15:48 ` Lars Ingebrigtsen 2016-11-30 16:23 ` Eli Zaretskii 0 siblings, 2 replies; 125+ messages in thread From: Dmitry Gutov @ 2016-11-30 15:45 UTC (permalink / raw) To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa On 30.11.2016 17:42, Eli Zaretskii wrote: >> json-encode uses the previously mentioned symbol-name, which returns >> multibyte values. What would we do about that? > > Check that the value returned by symbol-name is pure-ASCII, and if so, > make it unibyte? In json-encode? Should it really deal with that concern explicitly? I could understand an idea along the lines of "use a different algorithm", but calling encode-coding-string inside json-encode sounds odd. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 15:45 ` Dmitry Gutov @ 2016-11-30 15:48 ` Lars Ingebrigtsen 2016-11-30 16:25 ` Eli Zaretskii 2016-12-28 18:22 ` Philipp Stephani 2016-11-30 16:23 ` Eli Zaretskii 1 sibling, 2 replies; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-11-30 15:48 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Eli Zaretskii, emacs-devel, p.stephani2, kentaro.nakazawa Dmitry Gutov <dgutov@yandex.ru> writes: > In json-encode? Should it really deal with that concern explicitly? > > I could understand an idea along the lines of "use a different > algorithm", but calling encode-coding-string inside json-encode sounds > odd. Yes, this is not a json.el problem at all. It does the correct thing, and shouldn't be changed. It's just url.el being lacking in features, as usual. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 15:48 ` Lars Ingebrigtsen @ 2016-11-30 16:25 ` Eli Zaretskii 2016-11-30 16:27 ` Lars Ingebrigtsen 2016-11-30 18:23 ` Philipp Stephani 2016-12-28 18:22 ` Philipp Stephani 1 sibling, 2 replies; 125+ messages in thread From: Eli Zaretskii @ 2016-11-30 16:25 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: Eli Zaretskii <eliz@gnu.org>, p.stephani2@gmail.com, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org > Date: Wed, 30 Nov 2016 16:48:09 +0100 > > Yes, this is not a json.el problem at all. It does the correct thing, > and shouldn't be changed. ??? Why should any code care whether a pure-ASCII string is marked as unibyte or as multibyte? Both are "correct". ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 16:25 ` Eli Zaretskii @ 2016-11-30 16:27 ` Lars Ingebrigtsen 2016-11-30 16:42 ` Eli Zaretskii 2016-11-30 18:23 ` Philipp Stephani 1 sibling, 1 reply; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-11-30 16:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov Eli Zaretskii <eliz@gnu.org> writes: >> Yes, this is not a json.el problem at all. It does the correct thing, >> and shouldn't be changed. > > ??? Why should any code care whether a pure-ASCII string is marked as > unibyte or as multibyte? Both are "correct". That's right -- why should any code care? Yet url.el does. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 16:27 ` Lars Ingebrigtsen @ 2016-11-30 16:42 ` Eli Zaretskii 2016-11-30 18:25 ` Philipp Stephani 0 siblings, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-11-30 16:42 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: dgutov@yandex.ru, p.stephani2@gmail.com, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org > Date: Wed, 30 Nov 2016 17:27:05 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> Yes, this is not a json.el problem at all. It does the correct thing, > >> and shouldn't be changed. > > > > ??? Why should any code care whether a pure-ASCII string is marked as > > unibyte or as multibyte? Both are "correct". > > That's right -- why should any code care? Yet url.el does. No, it doesn't, not if the string is plain ASCII. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 16:42 ` Eli Zaretskii @ 2016-11-30 18:25 ` Philipp Stephani 2016-11-30 18:48 ` Eli Zaretskii 0 siblings, 1 reply; 125+ messages in thread From: Philipp Stephani @ 2016-11-30 18:25 UTC (permalink / raw) To: Eli Zaretskii, Lars Ingebrigtsen; +Cc: dgutov, kentaro.nakazawa, emacs-devel [-- Attachment #1: Type: text/plain, Size: 934 bytes --] Eli Zaretskii <eliz@gnu.org> wrote on Wed, 30 Nov 2016 at 17:42: > > From: Lars Ingebrigtsen <larsi@gnus.org> > > Cc: dgutov@yandex.ru, p.stephani2@gmail.com, > kentaro.nakazawa@nifty.com, emacs-devel@gnu.org > > Date: Wed, 30 Nov 2016 17:27:05 +0100 > > > > Eli Zaretskii <eliz@gnu.org> writes: > > > > >> Yes, this is not a json.el problem at all. It does the correct thing, > > >> and shouldn't be changed. > > > > > > ??? Why should any code care whether a pure-ASCII string is marked as > > > unibyte or as multibyte? Both are "correct". > > > > That's right -- why should any code care? Yet url.el does. > > No, it doesn't, not if the string is plain ASCII. > > But in that case it isn't, it's morally a byte array. What Emacs lacks is good support for byte arrays. For HTTP, process-send-string shouldn't need to deal with encoding or EOL conversion, it should just accept a byte array and send that, unmodified. [-- Attachment #2: Type: text/html, Size: 2099 bytes --] ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 18:25 ` Philipp Stephani @ 2016-11-30 18:48 ` Eli Zaretskii 2016-12-28 18:18 ` Philipp Stephani 0 siblings, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-11-30 18:48 UTC (permalink / raw) To: Philipp Stephani; +Cc: larsi, dgutov, kentaro.nakazawa, emacs-devel > From: Philipp Stephani <p.stephani2@gmail.com> > Date: Wed, 30 Nov 2016 18:25:09 +0000 > Cc: emacs-devel@gnu.org, kentaro.nakazawa@nifty.com, dgutov@yandex.ru > > > That's right -- why should any code care? Yet url.el does. > > No, it doesn't, not if the string is plain ASCII. > > But in that case it isn't, it's morally a byte array. Yes, because the internal representation of characters in Emacs is a superset of UTF-8. > What Emacs lacks is good support for byte arrays. Unibyte strings are byte arrays. What do you think we lack in that regard? > For HTTP, process-send-string shouldn't need to deal > with encoding or EOL conversion, it should just accept a byte array and send that, unmodified. I disagree. Handling unibyte strings is a nuisance, so Emacs allows most applications be oblivious about them, and just handle human-readable text. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 18:48 ` Eli Zaretskii @ 2016-12-28 18:18 ` Philipp Stephani 2016-12-28 18:34 ` Eli Zaretskii 0 siblings, 1 reply; 125+ messages in thread From: Philipp Stephani @ 2016-12-28 18:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, dgutov, kentaro.nakazawa, emacs-devel [-- Attachment #1: Type: text/plain, Size: 1685 bytes --] Eli Zaretskii <eliz@gnu.org> wrote on Wed, 30 Nov 2016 at 19:48: > > From: Philipp Stephani <p.stephani2@gmail.com> > > Date: Wed, 30 Nov 2016 18:25:09 +0000 > > Cc: emacs-devel@gnu.org, kentaro.nakazawa@nifty.com, dgutov@yandex.ru > > > > > That's right -- why should any code care? Yet url.el does. > > > > No, it doesn't, not if the string is plain ASCII. > > > > But in that case it isn't, it's morally a byte array. > > Yes, because the internal representation of characters in Emacs is a > superset of UTF-8. > That has nothing to do with characters. A byte array is conceptually different from a character string. > > > What Emacs lacks is good support for byte arrays. > > Unibyte strings are byte arrays. What do you think we lack in that regard? > If unibyte strings should be used for byte arrays, then the URL functions should indeed signal an error whenever url-request-data is a multibyte string, as HTTP requests are conceptually byte arrays, not character strings. > > > For HTTP, process-send-string shouldn't need to deal > > with encoding or EOL conversion, it should just accept a byte array and > send that, unmodified. > > I disagree. Handling unibyte strings is a nuisance, so Emacs allows > most applications be oblivious about them, and just handle > human-readable text. > That is the wrong approach (byte arrays and character strings are fundamentally different types, and mixing them together only causes pain), and it cannot work when implementing network protocols. HTTP requests are *not* human-readable text, they are byte arrays. 
Attempting to handle Unicode strings can't work because we wouldn't know the number of encoded bytes. [-- Attachment #2: Type: text/html, Size: 3100 bytes --] ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-28 18:18 ` Philipp Stephani @ 2016-12-28 18:34 ` Eli Zaretskii 2016-12-28 18:45 ` Philipp Stephani 0 siblings, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-12-28 18:34 UTC (permalink / raw) To: Philipp Stephani; +Cc: larsi, dgutov, kentaro.nakazawa, emacs-devel > From: Philipp Stephani <p.stephani2@gmail.com> > Date: Wed, 28 Dec 2016 18:18:25 +0000 > Cc: larsi@gnus.org, emacs-devel@gnu.org, kentaro.nakazawa@nifty.com, > dgutov@yandex.ru > > > > That's right -- why should any code care? Yet url.el does. > > > > No, it doesn't, not if the string is plain ASCII. > > > > But in that case it isn't, it's morally a byte array. > > Yes, because the internal representation of characters in Emacs is a > superset of UTF-8. > > That has nothing to do with characters. A byte array is conceptually different from a character string. In Emacs, they are both implemented using very similar objects. > > What Emacs lacks is good support for byte arrays. > > Unibyte strings are byte arrays. What do you think we lack in that regard? > > If unibyte strings should be used for byte arrays, then the URL functions should indeed signal an error > whenever url-request-data is a multibyte string, as HTTP requests are conceptually byte arrays, not character > strings. Which is what we do now. > > For HTTP, process-send-string shouldn't need to deal > > with encoding or EOL conversion, it should just accept a byte array and send that, unmodified. > > I disagree. Handling unibyte strings is a nuisance, so Emacs allows > most applications be oblivious about them, and just handle > human-readable text. > > That is the wrong approach (byte arrays and character strings are fundamentally different types, and mixing > them together only causes pain), and it cannot work when implementing network protocols. HTTP requests > are *not* human-readable text, they are byte arrays. 
Attempting to handle Unicode strings can't work because > we wouldn't know the number of encoded bytes. You are arguing against a long and quite painful history of non-ASCII strings in Emacs. What we have now is based on a lot of experience and at least two very large refactoring jobs. Going back would be a very bad idea indeed, as we've been there already, and users didn't like that. Some of us are old enough to remember the notorious \201 bytes creeping into text files and mail messages, due to that. Never again. Our experience is that we should keep use of unibyte strings in Lisp application code to the absolute minimum, ideally zero. Once we arrived at that conclusion, we've been living happily ever after. This minor issue we are discussing here is certainly not worth repeating past mistakes for which we paid plenty in sweat and blood. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-28 18:34 ` Eli Zaretskii @ 2016-12-28 18:45 ` Philipp Stephani 2016-12-28 18:55 ` Eli Zaretskii 2016-12-28 19:03 ` Andreas Schwab 0 siblings, 2 replies; 125+ messages in thread From: Philipp Stephani @ 2016-12-28 18:45 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, dgutov, kentaro.nakazawa, emacs-devel [-- Attachment #1: Type: text/plain, Size: 3366 bytes --] Eli Zaretskii <eliz@gnu.org> wrote on Wed, 28 Dec 2016 at 19:35: > > From: Philipp Stephani <p.stephani2@gmail.com> > > Date: Wed, 28 Dec 2016 18:18:25 +0000 > > Cc: larsi@gnus.org, emacs-devel@gnu.org, kentaro.nakazawa@nifty.com, > > dgutov@yandex.ru > > > > > > That's right -- why should any code care? Yet url.el does. > > > > > > No, it doesn't, not if the string is plain ASCII. > > > > > > But in that case it isn't, it's morally a byte array. > > > > Yes, because the internal representation of characters in Emacs is a > > superset of UTF-8. > > > > That has nothing to do with characters. A byte array is conceptually > different from a character string. > > In Emacs, they are both implemented using very similar objects. > Yes, that's why I said "conceptually different". The concepts may be different, but the implementation might still be the same. > > > > What Emacs lacks is good support for byte arrays. > > > > Unibyte strings are byte arrays. What do you think we lack in that > regard? > > > > If unibyte strings should be used for byte arrays, then the URL > functions should indeed signal an error > > whenever url-request-data is a multibyte string, as HTTP requests are > conceptually byte arrays, not character > > strings. > > Which is what we do now. > There is no such check for url-request-data. There's an overall check for the complete request, but that also doesn't check for unibyte-ness. 
> > > > For HTTP, process-send-string shouldn't need to deal > > > with encoding or EOL conversion, it should just accept a byte array > and send that, unmodified. > > > > I disagree. Handling unibyte strings is a nuisance, so Emacs allows > > most applications be oblivious about them, and just handle > > human-readable text. > > > > That is the wrong approach (byte arrays and character strings are > fundamentally different types, and mixing > > them together only causes pain), and it cannot work when implementing > network protocols. HTTP requests > > are *not* human-readable text, they are byte arrays. Attempting to > handle Unicode strings can't work because > > we wouldn't know the number of encoded bytes. > > You are arguing against a long and quite painful history of non-ASCII > strings in Emacs. What we have now is based on a lot of experience > and at least two very large refactoring jobs. Going back would be a > very bad idea indeed, as we've been there already, and users didn't > like that. Some of us are old enough to remember the notorious \201 > bytes creeping into text files and mail messages, due to that. Never > again. > I'm not suggesting going back, too much would be broken. > > Our experience is that we should keep use of unibyte strings in Lisp > application code to the absolute minimum, ideally zero. Once we > arrived at that conclusion, we've been living happily ever after. > This minor issue we are discussing here is certainly not worth > repeating past mistakes for which we paid plenty in sweat and blood. > If you want unibyte strings to represent octet streams, then unibyte strings must be usable in application code, because octet streams are a concept that exists in reality, and applications must be able to support them in some way. If you don't want unibyte strings, then you need to provide some different way to represent octet streams. 
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-28 18:45 ` Philipp Stephani @ 2016-12-28 18:55 ` Eli Zaretskii 2016-12-28 19:03 ` Andreas Schwab 1 sibling, 0 replies; 125+ messages in thread From: Eli Zaretskii @ 2016-12-28 18:55 UTC (permalink / raw) To: Philipp Stephani; +Cc: larsi, dgutov, kentaro.nakazawa, emacs-devel > From: Philipp Stephani <p.stephani2@gmail.com> > Date: Wed, 28 Dec 2016 18:45:43 +0000 > Cc: larsi@gnus.org, emacs-devel@gnu.org, kentaro.nakazawa@nifty.com, > dgutov@yandex.ru > > > That has nothing to do with characters. A byte array is conceptually different from a character string. > > In Emacs, they are both implemented using very similar objects. > > Yes, that's why I said "conceptually different". The concepts may be different, but the implementation > might still be the same. If the implementation is the same, then concepts are not very different to begin with, and the abstraction will sooner or later leak into applications. > Our experience is that we should keep use of unibyte strings in Lisp > application code to the absolute minimum, ideally zero. Once we > arrived at that conclusion, we've been living happily ever after. > This minor issue we are discussing here is certainly not worth > repeating past mistakes for which we paid plenty in sweat and blood. > > If you want unibyte strings to represent octet streams, then unibyte strings must be usable in application > code They are usable, but using them requires knowledge and proficiency that's unusual with many Lisp developers, and it also has some unpleasant pitfalls. > because octet streams are a concept that exists in reality, and applications must be able to support > them in some way. If you don't want unibyte strings, then you need to provide some different way to represent > octet streams. We use unibyte strings where we must, and otherwise prefer multibyte ones. 
In most cases the unibyte strings exist in Emacs internals, so that Lisp applications will not have to deal with them. This case is one of the few exceptions. If you are still unconvinced and think that we need some separate representation for byte arrays, consider this: when Emacs starts, it takes some time until it bootstraps itself enough to learn how to decode non-ASCII strings, such as file names. Until then, all file names are unibyte strings, and Emacs still must handle them correctly, because otherwise it would be impossible to build or start it in a directory that includes non-ASCII characters. This and other similar subtleties are the reason why using anything but a string for raw byte arrays is not a good idea, IMO and IME. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-28 18:45 ` Philipp Stephani 2016-12-28 18:55 ` Eli Zaretskii @ 2016-12-28 19:03 ` Andreas Schwab 1 sibling, 0 replies; 125+ messages in thread From: Andreas Schwab @ 2016-12-28 19:03 UTC (permalink / raw) To: Philipp Stephani Cc: Eli Zaretskii, emacs-devel, kentaro.nakazawa, larsi, dgutov On Dec 28 2016, Philipp Stephani <p.stephani2@gmail.com> wrote: > If you want unibyte strings to represent octet streams, then unibyte > strings must be usable in application code, because octet streams are a > concept that exists in reality, and applications must be able to support > them in some way. If you don't want unibyte strings, then you need to > provide some different way to represent octet streams. Octet streams are basically encoded strings, and we use unibyte strings for encoded strings. That's the only place where unibyte strings should be used in Emacs. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 16:25 ` Eli Zaretskii 2016-11-30 16:27 ` Lars Ingebrigtsen @ 2016-11-30 18:23 ` Philipp Stephani 2016-11-30 18:44 ` Eli Zaretskii 1 sibling, 1 reply; 125+ messages in thread From: Philipp Stephani @ 2016-11-30 18:23 UTC (permalink / raw) To: Eli Zaretskii, Lars Ingebrigtsen; +Cc: emacs-devel, kentaro.nakazawa, dgutov [-- Attachment #1: Type: text/plain, Size: 875 bytes --] Eli Zaretskii <eliz@gnu.org> wrote on Wed, 30 Nov 2016 at 17:25: > > From: Lars Ingebrigtsen <larsi@gnus.org> > > Cc: Eli Zaretskii <eliz@gnu.org>, p.stephani2@gmail.com, > kentaro.nakazawa@nifty.com, emacs-devel@gnu.org > > Date: Wed, 30 Nov 2016 16:48:09 +0100 > > > > Yes, this is not a json.el problem at all. It does the correct thing, > > and shouldn't be changed. > > ??? Why should any code care whether a pure-ASCII string is marked as > unibyte or as multibyte? Both are "correct". > I guess the problem is that process-send-string cares. If it didn't, we wouldn't have the problem. For URL, we'd need functions like (byte-array-length s) = (length (string-to-unibyte s)) (process-send-bytes s) = (process-send-string (string-to-unibyte s)) (conceptually; process-send-string also does EOL conversion, which should never be done for HTTP bodies.) [-- Attachment #2: Type: text/html, Size: 1802 bytes --] ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 18:23 ` Philipp Stephani @ 2016-11-30 18:44 ` Eli Zaretskii 2016-12-28 18:09 ` Philipp Stephani 0 siblings, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-11-30 18:44 UTC (permalink / raw) To: Philipp Stephani; +Cc: larsi, emacs-devel, kentaro.nakazawa, dgutov > From: Philipp Stephani <p.stephani2@gmail.com> > Date: Wed, 30 Nov 2016 18:23:14 +0000 > Cc: dgutov@yandex.ru, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org > > > Yes, this is not a json.el problem at all. It does the correct thing, > > and shouldn't be changed. > > ??? Why should any code care whether a pure-ASCII string is marked as > unibyte or as multibyte? Both are "correct". > > I guess the problem is that process-send-string cares. If it didn't, we wouldn't have the problem. I don't think I follow. The error we are talking about is signaled from url-http-create-request, not from process-send-string. > For URL, we'd need functions like > (byte-array-length s) = (length (string-to-unibyte s)) Why do you need this? string-to-unibyte is well-defined only for unibyte or ASCII strings (if we forget the raw bytes for a moment), so length will do. > (process-send-bytes s) = (process-send-string (string-to-unibyte s)) Why is this needed? process-send-string already encodes its argument, which produces a unibyte string. > (conceptually; process-send-string also does EOL conversion, which should never be done for HTTP > bodies.) I don't understand why. There are protocols that require CR-LF, no? ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 18:44 ` Eli Zaretskii @ 2016-12-28 18:09 ` Philipp Stephani 2016-12-28 18:27 ` Eli Zaretskii 0 siblings, 1 reply; 125+ messages in thread From: Philipp Stephani @ 2016-12-28 18:09 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel, kentaro.nakazawa, dgutov [-- Attachment #1: Type: text/plain, Size: 2051 bytes --] Eli Zaretskii <eliz@gnu.org> wrote on Wed, 30 Nov 2016 at 19:45: > > From: Philipp Stephani <p.stephani2@gmail.com> > > Date: Wed, 30 Nov 2016 18:23:14 +0000 > > Cc: dgutov@yandex.ru, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org > > > > > Yes, this is not a json.el problem at all. It does the correct thing, > > > and shouldn't be changed. > > > > ??? Why should any code care whether a pure-ASCII string is marked as > > unibyte or as multibyte? Both are "correct". > > > > I guess the problem is that process-send-string cares. If it didn't, we > wouldn't have the problem. > > I don't think I follow. The error we are talking about is signaled > from url-http-create-request, not from process-send-string. > Yes, but url-http-create-request only cares about unibyte strings because the request it creates is passed to process-send-string, which special-cases unibyte strings. > > > For URL, we'd need functions like > > (byte-array-length s) = (length (string-to-unibyte s)) > > Why do you need this? string-to-unibyte is well-defined only for > unibyte or ASCII strings (if we forget the raw bytes for a moment), so > length will do. > We need it because we have to send the byte length in a header. We can't just use (length s) because it would silently give a wrong result. > > > (process-send-bytes s) = (process-send-string (string-to-unibyte s)) > > Why is this needed? process-send-string already encodes its argument, > which produces a unibyte string. > We can't give a multibyte string to process-send-string, because we have to pass the length in bytes in a header first. 
Therefore we have to encode any string before passing it to process-send-string. > > > (conceptually; process-send-string also does EOL conversion, which > should never be done for HTTP > > bodies.) > > I don't understand why. There are protocols that require CR-LF, no? > > Yes, but HTTP request/response bodies should just be byte arrays and no conversion whatsoever should happen. After all, the body could be a binary data format. [-- Attachment #2: Type: text/html, Size: 3840 bytes --] ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-28 18:09 ` Philipp Stephani @ 2016-12-28 18:27 ` Eli Zaretskii 2016-12-28 18:35 ` Philipp Stephani 0 siblings, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-12-28 18:27 UTC (permalink / raw) To: Philipp Stephani; +Cc: larsi, emacs-devel, kentaro.nakazawa, dgutov > From: Philipp Stephani <p.stephani2@gmail.com> > Date: Wed, 28 Dec 2016 18:09:52 +0000 > Cc: larsi@gnus.org, dgutov@yandex.ru, kentaro.nakazawa@nifty.com, > emacs-devel@gnu.org > > Eli Zaretskii <eliz@gnu.org> wrote on Wed, 30 Nov 2016 at 19:45: > > > From: Philipp Stephani <p.stephani2@gmail.com> > > Date: Wed, 30 Nov 2016 18:23:14 +0000 > > Cc: dgutov@yandex.ru, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org > > > > > Yes, this is not a json.el problem at all. It does the correct thing, > > > and shouldn't be changed. > > > > ??? Why should any code care whether a pure-ASCII string is marked as > > unibyte or as multibyte? Both are "correct". > > > > I guess the problem is that process-send-string cares. If it didn't, we wouldn't have the problem. > > I don't think I follow. The error we are talking about is signaled > from url-http-create-request, not from process-send-string. > > Yes, but url-http-create-request only cares about unibyte strings because the request it creates is passed to > process-send-string, which special-cases unibyte strings. How do you see that process-send-string special-cases unibyte strings? > > For URL, we'd need functions like > > (byte-array-length s) = (length (string-to-unibyte s)) > > Why do you need this? string-to-unibyte is well-defined only for > unibyte or ASCII strings (if we forget the raw bytes for a moment), so > length will do. > > We need it because we have to send the byte length in a header. We can't just use (length s) because it > would silently give a wrong result. We are miscommunicating. 
string-to-unibyte can only meaningfully be called on a pure-ASCII string, and for pure-ASCII strings 'length' will count bytes. So I see no need for 'byte-array-length' if its implementation is as you indicated. > > (process-send-bytes s) = (process-send-string (string-to-unibyte s)) > > Why is this needed? process-send-string already encodes its argument, > which produces a unibyte string. > > We can't give a multibyte string to process-send-string, because we have to pass the length in bytes in a > header first. Therefore we have to encode any string before passing it to process-send-string. Once you encoded the string, why do you need anything except calling process-send-string? ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-28 18:27 ` Eli Zaretskii @ 2016-12-28 18:35 ` Philipp Stephani 2016-12-28 18:45 ` Eli Zaretskii 0 siblings, 1 reply; 125+ messages in thread From: Philipp Stephani @ 2016-12-28 18:35 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, dgutov, kentaro.nakazawa, emacs-devel Eli Zaretskii <eliz@gnu.org> wrote on Wed, 28 Dec 2016 at 19:28: > > From: Philipp Stephani <p.stephani2@gmail.com> > > Date: Wed, 28 Dec 2016 18:09:52 +0000 > > Cc: larsi@gnus.org, dgutov@yandex.ru, kentaro.nakazawa@nifty.com, > > emacs-devel@gnu.org > > > > Eli Zaretskii <eliz@gnu.org> wrote on Wed, 30 Nov 2016 at 19:45: > > > > > From: Philipp Stephani <p.stephani2@gmail.com> > > > Date: Wed, 30 Nov 2016 18:23:14 +0000 > > > Cc: dgutov@yandex.ru, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org > > > > > > > Yes, this is not a json.el problem at all. It does the correct > thing, > > > > and shouldn't be changed. > > > > > > ??? Why should any code care whether a pure-ASCII string is marked as > > > unibyte or as multibyte? Both are "correct". > > > > > > I guess the problem is that process-send-string cares. If it didn't, > we wouldn't have the problem. > > > > I don't think I follow. The error we are talking about is signaled > > from url-http-create-request, not from process-send-string. > > > > Yes, but url-http-create-request only cares about unibyte strings > because the request it creates is passed to > > process-send-string, which special-cases unibyte strings. > > How do you see that process-send-string special-cases unibyte strings? > The send_process function has two branches, one for unibyte, one for multibyte. > > > > For URL, we'd need functions like > > > (byte-array-length s) = (length (string-to-unibyte s)) > > > > Why do you need this? 
string-to-unibyte is well-defined only for > > unibyte or ASCII strings (if we forget the raw bytes for a moment), so > > length will do. > > > > We need it because we have to send the byte length in a header. We can't > just use (length s) because it > > would silently give a wrong result. > > We are miscommunicating. string-to-unibyte can only meaningfully be > called on a pure-ASCII string, and for pure-ASCII strings 'length' > will count bytes. So I see no need for 'byte-array-length' if its > implementation is as you indicated. > That depends on how you want to represent byte arrays/octet streams in Emacs. If you want to represent them using unibyte strings, then you indeed only need `length'. But some earlier messages sounded like you wanted to represent byte arrays either using unibyte strings or byte-only multibyte strings. In that case `string-to-unibyte' is necessary. > > > > (process-send-bytes s) = (process-send-string (string-to-unibyte s)) > > > > Why is this needed? process-send-string already encodes its argument, > > which produces a unibyte string. > > > > We can't give a multibyte string to process-send-string, because we have > to pass the length in bytes in a > > header first. Therefore we have to encode any string before passing it > to process-send-string. > > Once you encoded the string, why do you need anything except calling > process-send-string? > > The byte size should be added as a Content-length HTTP header. If url-request-data is a unibyte string, that's not a problem (except for the newline conversion behavior in send_string), you can just use `length'. But if it's a multibyte string, you need to encode first to find the byte length. ^ permalink raw reply [flat|nested] 125+ messages in thread
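To make the character-count versus byte-count distinction in this message concrete, here is a minimal sketch. It assumes UTF-8 as the wire encoding (the thread does not settle on one), and `my-http-body-byte-length' is a hypothetical helper, not part of url.el:

```elisp
;; Hypothetical helper for illustration; not part of url.el.
(defun my-http-body-byte-length (body)
  "Return the number of bytes BODY would occupy in an HTTP request.
A multibyte BODY is encoded as UTF-8 first (an assumption of this
sketch); a unibyte BODY is taken to be already-encoded bytes."
  (length (if (multibyte-string-p body)
              (encode-coding-string body 'utf-8)
            body)))

;; "na\u00efve" has 5 characters, but U+00EF takes 2 bytes in UTF-8.
(let ((text "na\u00efve"))
  (list (length text)                       ; characters
        (my-http-body-byte-length text)))   ; bytes
;; => (5 6)
```

Sending 5 as the Content-length here would understate the body by one byte, which is the kind of mismatch described at the start of this thread.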
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-28 18:35 ` Philipp Stephani @ 2016-12-28 18:45 ` Eli Zaretskii 0 siblings, 0 replies; 125+ messages in thread From: Eli Zaretskii @ 2016-12-28 18:45 UTC (permalink / raw) To: Philipp Stephani; +Cc: larsi, dgutov, kentaro.nakazawa, emacs-devel > From: Philipp Stephani <p.stephani2@gmail.com> > Date: Wed, 28 Dec 2016 18:35:58 +0000 > Cc: larsi@gnus.org, emacs-devel@gnu.org, kentaro.nakazawa@nifty.com, > dgutov@yandex.ru > > How do you see that process-send-string special-cases unibyte strings? > > The send_process function has two branches, one for unibyte, one for multibyte. That's not special-casing. That's polymorphism, if you like: Emacs silently does TRT for both. > We are miscommunicating. string-to-unibyte can only meaningfully be > called on a pure-ASCII string, and for pure-ASCII strings 'length' > will count bytes. So I see no need for 'byte-array-length' if its > implementation is as you indicated. > > That depends on how you want to represent byte arrays/octet streams in Emacs. If you want to represent > them using unibyte strings, then you indeed only need `length'. But some earlier messages sounded like you > wanted to represent byte arrays either using unibyte strings or byte-only multibyte strings. In that case > `string-to-unibyte' is necessary. No, it's not. Multibyte strings that include raw bytes are converted to single bytes when you encode them. > Once you encoded the string, why do you need anything except calling > process-send-string? > > The byte size should be added as a Content-length HTTP header. If url-request-data is a unibyte string, that's > not a problem (except for the newline conversion behavior in send_string), you can just use `length'. But if it's > a multibyte string, you need to encode first to find the byte length. I thought we've just agreed that multibyte strings there should not be allowed. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 15:48 ` Lars Ingebrigtsen 2016-11-30 16:25 ` Eli Zaretskii @ 2016-12-28 18:22 ` Philipp Stephani 2016-12-28 18:57 ` Lars Ingebrigtsen 1 sibling, 1 reply; 125+ messages in thread From: Philipp Stephani @ 2016-12-28 18:22 UTC (permalink / raw) To: Lars Ingebrigtsen, Dmitry Gutov Cc: Eli Zaretskii, kentaro.nakazawa, emacs-devel Lars Ingebrigtsen <larsi@gnus.org> wrote on Wed, 30 Nov 2016 at 16:48: > Dmitry Gutov <dgutov@yandex.ru> writes: > > > In json-encode? Should it really deal with that concern explicitly? > > > > I could understand an idea along the lines of "use a different > > algorithm", but calling encode-coding-string inside json-encode sounds > > odd. > > Yes, this is not a json.el problem at all. It does the correct thing, > and shouldn't be changed. > Agreed. Neither symbol-function nor concat nor the JSON function does anything wrong here. > > It's just url.el being lacking in features, as usual. > > > I don't think url.el needs to grow features for encoding; after all, Emacs already has functions for that. I'd rather add an explicit check for unibyte-ness of url-request-data and document that url-request-data must be a unibyte string or nil. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-28 18:22 ` Philipp Stephani @ 2016-12-28 18:57 ` Lars Ingebrigtsen 2016-12-30 0:07 ` Richard Stallman 0 siblings, 1 reply; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-12-28 18:57 UTC (permalink / raw) To: Philipp Stephani Cc: Eli Zaretskii, emacs-devel, kentaro.nakazawa, Dmitry Gutov Philipp Stephani <p.stephani2@gmail.com> writes: > I don't think url.el needs to grow features for encoding; after all, Emacs > already has functions for that. I'd rather add an explicit check for > unibyte-ness of url-request-data and document that url-request-data must be > a unibyte string or nil. Nah. If you want to do something here, just compute the correct length header (as previously discussed), and virtually all callers will be happy. I've started working on a `with-url' functionality that'll replace the current mess. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-28 18:57 ` Lars Ingebrigtsen @ 2016-12-30 0:07 ` Richard Stallman 2016-12-30 14:15 ` Lars Ingebrigtsen 0 siblings, 1 reply; 125+ messages in thread From: Richard Stallman @ 2016-12-30 0:07 UTC (permalink / raw) To: Lars Ingebrigtsen Cc: p.stephani2, dgutov, kentaro.nakazawa, eliz, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > I've started working on a `with-url' functionality that'll replace the > current mess. The name `with-url' suggests that Emacs has some sort of "current URL", and that this macro temporarily specifies some particular URL as current. That's not the case, is it? So the name `with-url' doesn't fit what it does. (What does it do?) We should change the name to something that fits what it does. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-30 0:07 ` Richard Stallman @ 2016-12-30 14:15 ` Lars Ingebrigtsen 2016-12-30 16:59 ` Eli Zaretskii 2016-12-30 21:38 ` Richard Stallman 0 siblings, 2 replies; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-12-30 14:15 UTC (permalink / raw) To: Richard Stallman; +Cc: p.stephani2, dgutov, kentaro.nakazawa, emacs-devel Richard Stallman <rms@gnu.org> writes: > The name `with-url' suggests that Emacs has some sort of "current URL", > and that this macro temporarily specifies some particular URL as current. > > That's not the case, is it? So the name `with-url' doesn't fit > what it does. (What does it do?) It's like `with-temp-buffer' and its cousins: It generates a new buffer, executes the body in that buffer, and kills the buffer when the form finishes. The contents of the buffer come from the specified URL, of course. See the recent discussion of with-url on emacs-devel. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-30 14:15 ` Lars Ingebrigtsen @ 2016-12-30 16:59 ` Eli Zaretskii 2017-01-21 15:39 ` Lars Ingebrigtsen 2016-12-30 21:38 ` Richard Stallman 1 sibling, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-12-30 16:59 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, rms, dgutov > From: Lars Ingebrigtsen <larsi@gnus.org> > Date: Fri, 30 Dec 2016 15:15:26 +0100 > Cc: p.stephani2@gmail.com, dgutov@yandex.ru, kentaro.nakazawa@nifty.com, > emacs-devel@gnu.org > > Richard Stallman <rms@gnu.org> writes: > > > The name `with-url' suggests that Emacs has some sort of "current URL", > > and that this macro temporarily specifies some particular URL as current. > > > > That's not the case, is it? So the name `with-url' doesn't fit > > what it does. (What does it do?) > > It's like `with-temp-buffer' and its cousins: It generates a new > buffer, executes the body in that buffer, and kills the buffer when the > form finishes. How about 'with-fetched-url', then? ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-30 16:59 ` Eli Zaretskii @ 2017-01-21 15:39 ` Lars Ingebrigtsen 2017-01-21 15:56 ` Eli Zaretskii 0 siblings, 1 reply; 125+ messages in thread From: Lars Ingebrigtsen @ 2017-01-21 15:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, rms, dgutov Eli Zaretskii <eliz@gnu.org> writes: >> It's like `with-temp-buffer' and its cousins: It generates a new >> buffer, executes the body in that buffer, and kills the buffer when the >> form finishes. > > How about 'with-fetched-url', then? Hm... I'm not sure it gives us more clarity. It should really be `with-content-fetched-from-specified-url', but that's a bit long, right? So I think `with-url' is fine for anybody who's working with these things. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2017-01-21 15:39 ` Lars Ingebrigtsen @ 2017-01-21 15:56 ` Eli Zaretskii 2017-01-21 16:30 ` Lars Ingebrigtsen 0 siblings, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2017-01-21 15:56 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: p.stephani2, dgutov, kentaro.nakazawa, rms, emacs-devel > From: Lars Ingebrigtsen <larsi@gnus.org> > Date: Sat, 21 Jan 2017 16:39:12 +0100 > Cc: p.stephani2@gmail.com, emacs-devel@gnu.org, kentaro.nakazawa@nifty.com, > rms@gnu.org, dgutov@yandex.ru > > > How about 'with-fetched-url', then? > > Hm... I'm not sure it gives us more clarity. It should really be > `with-content-fetched-from-specified-url', but that's a bit long, right? > So I think `with-url' is fine for anybody who's working with these > things. Both Richard and myself came up with almost identical comments on with-url, so I hope you will reconsider. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2017-01-21 15:56 ` Eli Zaretskii @ 2017-01-21 16:30 ` Lars Ingebrigtsen 2017-01-21 22:58 ` Stefan Monnier 0 siblings, 1 reply; 125+ messages in thread From: Lars Ingebrigtsen @ 2017-01-21 16:30 UTC (permalink / raw) To: Eli Zaretskii; +Cc: p.stephani2, dgutov, kentaro.nakazawa, rms, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > Both Richard and myself came up with almost identical comments on > with-url, so I hope you will reconsider. Perhaps we could have a vote. The contenders are `with-url', `with-fetched-url', `with-url-contents' and `with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url'. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2017-01-21 16:30 ` Lars Ingebrigtsen @ 2017-01-21 22:58 ` Stefan Monnier 2017-01-24 20:04 ` Lars Ingebrigtsen 0 siblings, 1 reply; 125+ messages in thread From: Stefan Monnier @ 2017-01-21 22:58 UTC (permalink / raw) To: emacs-devel >>>>> "Lars" == Lars Ingebrigtsen <larsi@gnus.org> writes: > Eli Zaretskii <eliz@gnu.org> writes: >> Both Richard and myself came up with almost identical comments on >> with-url, so I hope you will reconsider. > Perhaps we could have a vote. The contenders are `with-url', > `with-fetched-url', `with-url-contents' and > `with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url'. I vote against with-url and with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url. The other two seem fine, Stefan ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2017-01-21 22:58 ` Stefan Monnier @ 2017-01-24 20:04 ` Lars Ingebrigtsen 2017-01-28 9:52 ` Elias Mårtenson 0 siblings, 1 reply; 125+ messages in thread From: Lars Ingebrigtsen @ 2017-01-24 20:04 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel Stefan Monnier <monnier@iro.umontreal.ca> writes: >> Perhaps we could have a vote. The contenders are `with-url', >> `with-fetched-url', `with-url-contents' and >> `with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url'. > > I vote against with-url and > with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url. > The other two seem fine, OK, then we have 1 vote for `with-url', 1.5 votes for `with-fetched-url' and `with-url-contents' each, and zero for `with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url'. The competition is heating up! -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2017-01-24 20:04 ` Lars Ingebrigtsen @ 2017-01-28 9:52 ` Elias Mårtenson 2017-01-28 14:16 ` Lars Ingebrigtsen 0 siblings, 1 reply; 125+ messages in thread From: Elias Mårtenson @ 2017-01-28 9:52 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: Stefan Monnier, emacs-devel Who is allowed to vote? I consider with-url to be less than ideal and not very clear. with-url-contents is a lot better. Regards, Elias On 25 Jan 2017 4:06 AM, "Lars Ingebrigtsen" <larsi@gnus.org> wrote: Stefan Monnier <monnier@iro.umontreal.ca> writes: >> Perhaps we could have a vote. The contenders are `with-url', >> `with-fetched-url', `with-url-contents' and >> `with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url'. > > I vote against with-url and > with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url. > The other two seem fine, OK, then we have 1 vote for `with-url', 1.5 votes for `with-fetched-url' and `with-url-contents' each, and zero for `with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url'. The competition is heating up! -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2017-01-28 9:52 ` Elias Mårtenson @ 2017-01-28 14:16 ` Lars Ingebrigtsen 0 siblings, 0 replies; 125+ messages in thread From: Lars Ingebrigtsen @ 2017-01-28 14:16 UTC (permalink / raw) To: Elias Mårtenson; +Cc: Stefan Monnier, emacs-devel Elias Mårtenson <lokedhs@gmail.com> writes: > Who is allowed to vote? I consider with-url to be less than ideal and not very > clear. with-url-contents is a lot better. OK, `with-url-contents' is now the clear leader here with 2.5 votes! -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-30 14:15 ` Lars Ingebrigtsen 2016-12-30 16:59 ` Eli Zaretskii @ 2016-12-30 21:38 ` Richard Stallman 1 sibling, 0 replies; 125+ messages in thread From: Richard Stallman @ 2016-12-30 21:38 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > That's not the case, is it? So the name `with-url' doesn't fit > > what it does. (What does it do?) > It's like `with-temp-buffer' and its cousins: It generates a new > buffer, executes the body in that buffer, and kills the buffer when the > form finishes. It sounds useful, but the name isn't clear. Let's call it `with-url-contents'; that fits what it does. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 15:45 ` Dmitry Gutov 2016-11-30 15:48 ` Lars Ingebrigtsen @ 2016-11-30 16:23 ` Eli Zaretskii 2016-12-01 0:30 ` Dmitry Gutov 1 sibling, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-11-30 16:23 UTC (permalink / raw) To: Dmitry Gutov; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa > Cc: p.stephani2@gmail.com, larsi@gnus.org, kentaro.nakazawa@nifty.com, > emacs-devel@gnu.org > From: Dmitry Gutov <dgutov@yandex.ru> > Date: Wed, 30 Nov 2016 17:45:25 +0200 > > On 30.11.2016 17:42, Eli Zaretskii wrote: > > >> json-encode uses the previously mentioned symbol-name, which returns > >> multibyte values. What would we do about that? > > > > Check that the value returned by symbol-name is pure-ASCII, and if so, > > make it unibyte? > > In json-encode? Should it really deal with that concern explicitly? Since both the original issue and this one are at least indirectly caused by json.el, it might make sense. > I could understand an idea along the lines of "use a different > algorithm", but calling encode-coding-string inside json-encode sounds odd. I didn't mean encode-coding-string, I meant string-make-unibyte, which for a pure-ASCII string doesn't touch the contents. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 16:23 ` Eli Zaretskii @ 2016-12-01 0:30 ` Dmitry Gutov 2016-12-01 17:17 ` Eli Zaretskii 2016-12-28 18:25 ` Philipp Stephani 0 siblings, 2 replies; 125+ messages in thread From: Dmitry Gutov @ 2016-12-01 0:30 UTC (permalink / raw) To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa On 30.11.2016 18:23, Eli Zaretskii wrote: > Since both the original issue and this one are at least indirectly > caused by json.el, it might make sense. Triggered, more like. JSON is a frequently-used format, but there are others. And the same problems will remain when e.g. plain text is used. > I didn't mean encode-coding-string, I meant string-make-unibyte, which > for a pure-ASCII string doesn't touch the contents. Either way, I don't think it's a great idea. Quite the opposite: by allowing the programmer to avoid calling `encode-coding-string' in more cases, we'll just make the problem in their code harder to find, until some user of that code really does need to transfer multibyte content. Further, now that Emacs 25 is out, and we are allowed to have more breaking changes in Emacs 26, I think we should change the check at the end of url-http-create-request to just use multibyte-string-p. Barring some unforeseen consequences, this will solidify the requirement that the caller needs to deal with encoding explicitly in all cases, before passing the request body to the transport level. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-01 0:30 ` Dmitry Gutov @ 2016-12-01 17:17 ` Eli Zaretskii 2016-12-02 13:18 ` Dmitry Gutov 2016-12-28 18:25 ` Philipp Stephani 1 sibling, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-12-01 17:17 UTC (permalink / raw) To: Dmitry Gutov; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa > Cc: p.stephani2@gmail.com, larsi@gnus.org, kentaro.nakazawa@nifty.com, > emacs-devel@gnu.org > From: Dmitry Gutov <dgutov@yandex.ru> > Date: Thu, 1 Dec 2016 02:30:15 +0200 > > On 30.11.2016 18:23, Eli Zaretskii wrote: > > > Since both the original issue and this one are at least indirectly > > caused by json.el, it might make sense. > > Triggered, more like. Nothing wrong with that. If some issue isn't a bug, but gets in the way of a broad class of applications, it is okay to silently DTRT for that class only, in some central place that serves the class. > Either way, I don't think it's a great idea. Quite the opposite: by > allowing the programmer to avoid calling `encode-coding-string' in more > cases, we'll just make the problem in their code harder to find, until > some user of that code really does need to transfer multibyte content. I don't think we will win any hearts by nagging application programmers when we could silently DTRT ourselves. > Further, now that Emacs 25 is out, and we are allowed to have more > breaking changes in Emacs 26, I think we should change the check at the > end of url-http-create-request to just use multibyte-string-p. > > Barring some unforeseen consequences, this will solidify the requirement > that the caller needs to deal with encoding explicitly in all cases, > before passing the request body to the transport level. Can you show me a patch to that effect, or point me to where it was posted in the past? I'm afraid I no longer remember those details. Thanks. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-01 17:17 ` Eli Zaretskii @ 2016-12-02 13:18 ` Dmitry Gutov 2016-12-02 14:24 ` Eli Zaretskii 2016-12-02 15:29 ` Lars Ingebrigtsen 0 siblings, 2 replies; 125+ messages in thread From: Dmitry Gutov @ 2016-12-02 13:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa On 01.12.2016 19:17, Eli Zaretskii wrote: > Nothing wrong with that. If some issue isn't a bug, but gets in the > way of a broad class of applications, I don't think it's useful to extract applications that use JSON+HTTP with ASCII-only payloads into a separate class. Most of the time (or at least very often) it depends on the user, what kind of payload gets sent (with multibyte characters or not). > it is okay to silently DTRT for > that class only, in some central place that serves the class. Those central places are coding.c and url/url-*.el. Not sure what can be done there, though. > I don't think we will win any hearts by nagging application > programmers when we could silently DTRT ourselves. We can win the hearts of some users, long term, by making the API such that it's harder to do the wrong thing. You yourself suggested multibyte-string-p originally, and I suggested the current more permissive approach more or less because the new release was very close: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=23750#83 > Can you show me a patch to that effect, or point me to where it was > posted in the past? I'm afraid I no longer remember those details. 
Something like this: diff --git a/lisp/url/url-http.el b/lisp/url/url-http.el index e0e080e..affd5c2 100644 --- a/lisp/url/url-http.el +++ b/lisp/url/url-http.el @@ -358,9 +358,8 @@ url-http-create-request ;; Any data url-http-data)) ;; Bug#23750 - (unless (= (string-bytes request) - (length request)) - (error "Multibyte text in HTTP request: %s" request)) + (when (multibyte-string-p request) + (error "Multibyte text in HTTP request: %s, please translate any multibyte components to unibyte using `encode-coding-string'" request)) (url-http-debug "Request is: \n%s" request) request)) ^ permalink raw reply related [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 13:18 ` Dmitry Gutov @ 2016-12-02 14:24 ` Eli Zaretskii 2016-12-02 14:35 ` Dmitry Gutov 2016-12-02 14:53 ` Yuri Khan 2016-12-02 15:29 ` Lars Ingebrigtsen 1 sibling, 2 replies; 125+ messages in thread From: Eli Zaretskii @ 2016-12-02 14:24 UTC (permalink / raw) To: Dmitry Gutov; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa > Cc: p.stephani2@gmail.com, larsi@gnus.org, kentaro.nakazawa@nifty.com, > emacs-devel@gnu.org > From: Dmitry Gutov <dgutov@yandex.ru> > Date: Fri, 2 Dec 2016 15:18:48 +0200 > > > it is okay to silently DTRT for > > that class only, in some central place that serves the class. > > Those central places are coding.c and url/url-*.el. That's not what I meant (and coding.c is definitely not the place), but let's leave this alone. > diff --git a/lisp/url/url-http.el b/lisp/url/url-http.el > index e0e080e..affd5c2 100644 > --- a/lisp/url/url-http.el > +++ b/lisp/url/url-http.el > @@ -358,9 +358,8 @@ url-http-create-request > ;; Any data > url-http-data)) > ;; Bug#23750 > - (unless (= (string-bytes request) > - (length request)) > - (error "Multibyte text in HTTP request: %s" request)) > + (when (multibyte-string-p request) > + (error "Multibyte text in HTTP request: %s, please translate any > multibyte components to unibyte using `encode-coding-string'" request)) > (url-http-debug "Request is: \n%s" request) > request)) This will also reject pure-ASCII strings that just happen to be multibyte, although there will be no problem with such an HTTP request. Do we really want to disallow that use case? ^ permalink raw reply [flat|nested] 125+ messages in thread
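The difference between the two checks can be seen on a few sample strings. This is a sketch only: `my-old-check-accepts-p' and `my-new-check-accepts-p' are hypothetical names that restate the conditions from the patch quoted above, not functions in url-http.el:

```elisp
;; Hypothetical predicates restating the two checks discussed in the
;; patch; they are not defined in url-http.el.
(defun my-old-check-accepts-p (request)
  "Non-nil if REQUEST passes the byte-count check."
  (= (string-bytes request) (length request)))

(defun my-new-check-accepts-p (request)
  "Non-nil if REQUEST passes the `multibyte-string-p' check."
  (not (multibyte-string-p request)))

(let ((ascii-unibyte   "abc")
      (ascii-multibyte (string-to-multibyte "abc"))
      (non-ascii       "\u00e9")
      (encoded         (encode-coding-string "\u00e9" 'utf-8)))
  ;; Each result pair is (OLD-CHECK NEW-CHECK).
  (mapcar (lambda (s)
            (list (my-old-check-accepts-p s)
                  (my-new-check-accepts-p s)))
          (list ascii-unibyte ascii-multibyte non-ascii encoded)))
;; => ((t t) (t nil) (nil nil) (t t))
```

Only the second case, a pure-ASCII string that happens to be multibyte, is treated differently by the two checks; that is the use case this message asks about.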
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 14:24 ` Eli Zaretskii @ 2016-12-02 14:35 ` Dmitry Gutov 2016-12-02 15:20 ` Eli Zaretskii 2016-12-02 14:53 ` Yuri Khan 1 sibling, 1 reply; 125+ messages in thread From: Dmitry Gutov @ 2016-12-02 14:35 UTC (permalink / raw) To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa On 02.12.2016 16:24, Eli Zaretskii wrote: > This will also reject pure-ASCII strings that just happen to be > multibyte, although there will be no problem with such an HTTP > request. Do we really want to disallow that use case? That's the whole point of the patch. I think I've explained why in the previous message. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 14:35 ` Dmitry Gutov @ 2016-12-02 15:20 ` Eli Zaretskii 0 siblings, 0 replies; 125+ messages in thread From: Eli Zaretskii @ 2016-12-02 15:20 UTC (permalink / raw) To: Dmitry Gutov; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa > Cc: p.stephani2@gmail.com, larsi@gnus.org, kentaro.nakazawa@nifty.com, > emacs-devel@gnu.org > From: Dmitry Gutov <dgutov@yandex.ru> > Date: Fri, 2 Dec 2016 16:35:32 +0200 > > On 02.12.2016 16:24, Eli Zaretskii wrote: > > > This will also reject pure-ASCII strings that just happen to be > > multibyte, although there will be no problem with such an HTTP > > request. Do we really want to disallow that use case? > > That's the whole point of the patch. I think I've explained why in the > previous message. Fine, let's try. Thanks. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 14:24 ` Eli Zaretskii 2016-12-02 14:35 ` Dmitry Gutov @ 2016-12-02 14:53 ` Yuri Khan 2016-12-02 15:45 ` Eli Zaretskii 2016-12-02 15:51 ` Lars Ingebrigtsen 1 sibling, 2 replies; 125+ messages in thread From: Yuri Khan @ 2016-12-02 14:53 UTC (permalink / raw) To: Eli Zaretskii Cc: Philipp Stephani, Emacs developers, kentaro.nakazawa, Lars Magne Ingebrigtsen, Dmitry Gutov On Fri, Dec 2, 2016 at 9:24 PM, Eli Zaretskii <eliz@gnu.org> wrote: >> + (when (multibyte-string-p request) >> + (error "Multibyte text in HTTP request: %s, please translate any >> multibyte components to unibyte using `encode-coding-string'" request)) >> (url-http-debug "Request is: \n%s" request) >> request)) > > This will also reject pure-ASCII strings that just happen to be > multibyte, although there will be no problem with such an HTTP > request. Do we really want to disallow that use case? It is really unfortunate that we talk about ASCII strings, unibyte strings, multibyte strings, as if that was a meaningful classification. The real dichotomy is between text (aka strings) and MIME-type-tagged byte arrays. In order to send a string over HTTP, one must encode it to a byte array and tag it as "text/plain; charset=utf-8" or "text/html; charset=utf-8" or application/json (no charset parameter because json must always be encoded in one of utf-* for transmission). Conversely, a byte array received over HTTP can, MIME type allowing, be decoded into a string. The fact that there exist strings for which encoding and decoding are identity transforms should be regarded only as an implementation detail. Attempts by libraries and frameworks to silently DTRT for this subset lead to applications neglecting to properly encode or tag strings, leading, in turn, to breakage in presence of multilingual text. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 14:53 ` Yuri Khan @ 2016-12-02 15:45 ` Eli Zaretskii 2016-12-02 15:51 ` Lars Ingebrigtsen 1 sibling, 0 replies; 125+ messages in thread From: Eli Zaretskii @ 2016-12-02 15:45 UTC (permalink / raw) To: Yuri Khan; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, larsi, dgutov > From: Yuri Khan <yuri.v.khan@gmail.com> > Date: Fri, 2 Dec 2016 21:53:16 +0700 > Cc: Dmitry Gutov <dgutov@yandex.ru>, Philipp Stephani <p.stephani2@gmail.com>, > It is really unfortunate that we talk about ASCII strings, unibyte > strings, multibyte strings, as if that was a meaningful > classification. It is meaningful when you work on Emacs code. > The real dichotomy is between text (aka strings) and MIME-type-tagged > byte arrays. That might be so in the context of HTTP, but in general, byte arrays ("raw bytes" in Emacs parlance) are not limited to MIME types. Moreover, there are very frequent use cases where Emacs code needs to work with a byte array whose type is unknown, or even cannot be known at all, because it doesn't come with any meta-data of any kind. > In order to send a string over HTTP, one must encode it > to a byte array and tag it as "text/plain; charset=utf-8" or > "text/html; charset=utf-8" or application/json (no charset parameter > because json must always be encoded in one of utf-* for transmission). > Conversely, a byte array received over HTTP can, MIME type allowing, > decoded into a string. > > The fact that there exist strings for which encoding and decoding are > identity transforms should be regarded only as an implementation > detail. You are talking generalities here, whereas this discussion is about Emacs-specific internal issues. In Emacs, a plain-ASCII string is indistinguishable from a "byte array" whose bytes are all below 128. They have the same representation. 
To muddy the water even more, a plain-ASCII string can be "marked" as multibyte (again, internally), but it should be clear that such a "mark" has no meaning at all for ASCII text. From the Lisp application POV, whether a plain-ASCII string it receives or processes is marked as unibyte or multibyte is entirely random. So if some ASCII text is accepted by an Emacs API involved in sending HTTP requests, while an identical ASCII string is rejected, it could be a source of surprises and bug reports. That is the core of the issues discussed here. > Attempts by libraries and frameworks to silently DTRT for this > subset lead to applications neglecting to properly encode or tag > strings, leading, in turn, to breakage in presence of multilingual > text. Based on Emacs experience of dealing with multibyte text and its encoding/decoding, the conclusion was that it is better to silently DTRT where we can be sure we know how. Making a point of educating users by harsh measures such as signaling errors where Emacs could easily proceed, is generally not welcome. We will see if this case is any different. ^ permalink raw reply [flat|nested] 125+ messages in thread
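The distinction described above can be seen directly in Lisp. What follows is a minimal sketch, not code from url-http.el: the `demo-` names are hypothetical, and the two predicates only mirror the byte-count check being removed and the `multibyte-string-p' check being proposed in the quoted patch.

```elisp
;; Hypothetical helpers; only the built-in functions they call are real.
(defconst demo-unibyte (string-to-unibyte "GET / HTTP/1.1")
  "ASCII request line stored as a unibyte string.")

(defconst demo-multibyte (string-to-multibyte demo-unibyte)
  "The same ASCII text, but flagged as multibyte.")

(defun demo-byte-count-check-rejects-p (request)
  "Non-nil if comparing `string-bytes' with `length' would reject REQUEST."
  (/= (string-bytes request) (length request)))

(defun demo-multibyte-check-rejects-p (request)
  "Non-nil if a plain `multibyte-string-p' test would reject REQUEST."
  (multibyte-string-p request))

;; Both strings hold the same characters ...
(string= demo-unibyte demo-multibyte)
;; ... and the byte-count check accepts the multibyte copy ...
(demo-byte-count-check-rejects-p demo-multibyte)
;; ... while the `multibyte-string-p' check rejects it.
(demo-multibyte-check-rejects-p demo-multibyte)
```

The only difference between the two strings is the internal flag, which is why the stricter check can surprise a caller who never handled non-ASCII text.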
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 14:53 ` Yuri Khan 2016-12-02 15:45 ` Eli Zaretskii @ 2016-12-02 15:51 ` Lars Ingebrigtsen 2016-12-02 15:58 ` Eli Zaretskii 1 sibling, 1 reply; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-12-02 15:51 UTC (permalink / raw) To: Yuri Khan Cc: Eli Zaretskii, Dmitry Gutov, kentaro.nakazawa, Philipp Stephani, Emacs developers Yuri Khan <yuri.v.khan@gmail.com> writes: > The real dichotomy is between text (aka strings) and MIME-type-tagged > byte arrays. To nit-pick (this is emacs-devel, after all): "Byte array" isn't very meaningful, either. The standards talk about octet streams. :-) But you're right, of course: This function has a string-based interface, which is pretty meaningless, since no protocols (well, extremely few) deal with characters -- they only deal with octet streams. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 15:51 ` Lars Ingebrigtsen @ 2016-12-02 15:58 ` Eli Zaretskii 0 siblings, 0 replies; 125+ messages in thread From: Eli Zaretskii @ 2016-12-02 15:58 UTC (permalink / raw) To: Lars Ingebrigtsen Cc: p.stephani2, emacs-devel, dgutov, kentaro.nakazawa, yuri.v.khan > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: Eli Zaretskii <eliz@gnu.org>, Philipp Stephani <p.stephani2@gmail.com>, Emacs developers <emacs-devel@gnu.org>, kentaro.nakazawa@nifty.com, Dmitry Gutov <dgutov@yandex.ru> > Date: Fri, 02 Dec 2016 16:51:28 +0100 > > But you're right, of course: This function has a string-based interface, > which is pretty meaningless, since no protocols (well, extremely few) > deal with characters -- they only deal with octet streams. The Emacs implementation of an octet stream is a unibyte string. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 13:18 ` Dmitry Gutov 2016-12-02 14:24 ` Eli Zaretskii @ 2016-12-02 15:29 ` Lars Ingebrigtsen 2016-12-02 15:32 ` Dmitry Gutov 1 sibling, 1 reply; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-12-02 15:29 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Eli Zaretskii, kentaro.nakazawa, p.stephani2, emacs-devel Dmitry Gutov <dgutov@yandex.ru> writes: > - (unless (= (string-bytes request) > - (length request)) > - (error "Multibyte text in HTTP request: %s" request)) > + (when (multibyte-string-p request) > + (error "Multibyte text in HTTP request: %s, please translate This is going to break many current callers. Most people aren't doing anything as weird as trying to transmit non-ASCII text via any of these headers (it's a very uncommon thing to do), but are just passing in normal Emacs strings (containing nothing but ASCII, as is proper). These will all fail if you do this, for no real gain. Sorry to keep harping on about this, but the current url-* interface is inadequate. We should leave it be and move on to create a new, well-defined url-fetching interface. I hope to get time to do that during my next holiday, which should be in February. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 15:29 ` Lars Ingebrigtsen @ 2016-12-02 15:32 ` Dmitry Gutov 2016-12-02 15:48 ` Lars Ingebrigtsen 0 siblings, 1 reply; 125+ messages in thread From: Dmitry Gutov @ 2016-12-02 15:32 UTC (permalink / raw) To: Lars Ingebrigtsen Cc: Eli Zaretskii, kentaro.nakazawa, p.stephani2, emacs-devel On 02.12.2016 17:29, Lars Ingebrigtsen wrote: > This is going to break many current callers. Most people aren't doing > anything as weird as trying to transmit non-ASCII text via any of these > headers (it's a very uncommon thing to do), but are just passing in > normal Emacs strings (containing nothing but ASCII, as is proper). Do you have some examples? > These will all fail if you do this, for no real gain. That's debatable. > Sorry to keep harping on about this, but the current url-* interface is > inadequate. We should leave it be and move on to create a new, > well-defined url-fetching interface. I'm sure a well-defined interface will need to have a required "encoding" step, or an argument somewhere, at least. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 15:32 ` Dmitry Gutov @ 2016-12-02 15:48 ` Lars Ingebrigtsen 2016-12-02 15:56 ` Dmitry Gutov 0 siblings, 1 reply; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-12-02 15:48 UTC (permalink / raw) To: Dmitry Gutov; +Cc: p.stephani2, Eli Zaretskii, kentaro.nakazawa, emacs-devel Dmitry Gutov <dgutov@yandex.ru> writes: > On 02.12.2016 17:29, Lars Ingebrigtsen wrote: > >> This is going to break many current callers. Most people aren't doing >> anything as weird as trying to transmit non-ASCII text via any of these >> headers (it's a very uncommon thing to do), but are just passing in >> normal Emacs strings (containing nothing but ASCII, as is proper). > > Do you have some examples? (multibyte-string-p (symbol-name 'a)) => t > I'm sure a well-defined interface will need to have a required > "encoding" step, or an argument somewhere, at least. Yes, of course. The interface will allow the caller to specify the charset of the data. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 15:48 ` Lars Ingebrigtsen @ 2016-12-02 15:56 ` Dmitry Gutov 2016-12-02 16:02 ` Lars Ingebrigtsen 0 siblings, 1 reply; 125+ messages in thread From: Dmitry Gutov @ 2016-12-02 15:56 UTC (permalink / raw) To: Lars Ingebrigtsen Cc: p.stephani2, Eli Zaretskii, kentaro.nakazawa, emacs-devel On 02.12.2016 17:48, Lars Ingebrigtsen wrote: > Dmitry Gutov <dgutov@yandex.ru> writes: > >> On 02.12.2016 17:29, Lars Ingebrigtsen wrote: >> >>> This is going to break many current callers. Most people aren't doing >>> anything as weird as trying to transmit non-ASCII text via any of these >>> headers (it's a very uncommon thing to do), but are just passing in >>> normal Emacs strings (containing nothing but ASCII, as is proper). >> >> Do you have some examples? > > (multibyte-string-p (symbol-name 'a)) > => t Examples of things "most people" are doing "trying to transmit" "nothing but ASCII" using the URL package, please. >> I'm sure a well-defined interface will need to have a required >> "encoding" step, or an argument somewhere, at least. > > Yes, of course. The interface will allow the caller to specify the > charset of the data. And at least make it clear that the parameter will default to UTF-8, right? ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 15:56 ` Dmitry Gutov @ 2016-12-02 16:02 ` Lars Ingebrigtsen 2016-12-02 16:06 ` Dmitry Gutov 0 siblings, 1 reply; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-12-02 16:02 UTC (permalink / raw) To: Dmitry Gutov; +Cc: p.stephani2, Eli Zaretskii, kentaro.nakazawa, emacs-devel Dmitry Gutov <dgutov@yandex.ru> writes: > Examples of things "most people" are doing "trying to transmit" > "nothing but ASCII" using the URL package, please. I'm not sure what you want an example of. That most people try to transmit nothing but ASCII? That they may end up with multibyte ASCII strings without having "meaning" to (because it should make no difference)? The first thing is trivially true, and the second I think is also pretty much self-evident: (multibyte-string-p (buffer-substring (point) (- (point) 10))) => t > And at least make it clear that the parameter will default to UTF-8, right? Of course. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 16:02 ` Lars Ingebrigtsen @ 2016-12-02 16:06 ` Dmitry Gutov 2016-12-02 16:31 ` Lars Ingebrigtsen 0 siblings, 1 reply; 125+ messages in thread From: Dmitry Gutov @ 2016-12-02 16:06 UTC (permalink / raw) To: Lars Ingebrigtsen Cc: p.stephani2, Eli Zaretskii, kentaro.nakazawa, emacs-devel On 02.12.2016 18:02, Lars Ingebrigtsen wrote: >> Examples of things "most people" are doing "trying to transmit" >> "nothing but ASCII" using the URL package, please. > > I'm not sure what you want an example of. That most people try to > transmit nothing but ASCII? Yes. > That they may end up with multibyte ASCII > strings without having "meaning" to (because it should make no > difference)? > The first thing is trivially true, and the second I think is also pretty > much self-evident: > > (multibyte-string-p (buffer-substring (point) (- (point) 10))) > => t It's absolutely not a given that most applications or libraries that people write with Elisp will end up sending ASCII-only text. Especially if those applications are then available publicly for other people to use. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 16:06 ` Dmitry Gutov @ 2016-12-02 16:31 ` Lars Ingebrigtsen 2016-12-02 23:13 ` Dmitry Gutov 0 siblings, 1 reply; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-12-02 16:31 UTC (permalink / raw) To: Dmitry Gutov; +Cc: p.stephani2, Eli Zaretskii, kentaro.nakazawa, emacs-devel Dmitry Gutov <dgutov@yandex.ru> writes: >> I'm not sure what you want an example of. That most people try to >> transmit nothing but ASCII? > > Yes. Normal web applications require that you URL-encode (or similar) any data you send to them. These encodings are ASCII only. Here's a typical example of how this is used: (let ((url-request-method "POST") (url-request-extra-headers (list (cons "Content-Type" (concat "multipart/form-data; boundary=" boundary)))) (url-request-data (mm-url-encode-multipart-form-data values boundary))) The output from mm-url-encode-multipart-form-data is ASCII, and is typically multibyte. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 16:31 ` Lars Ingebrigtsen @ 2016-12-02 23:13 ` Dmitry Gutov 2016-12-03 0:37 ` Lars Ingebrigtsen 0 siblings, 1 reply; 125+ messages in thread From: Dmitry Gutov @ 2016-12-02 23:13 UTC (permalink / raw) To: Lars Ingebrigtsen Cc: p.stephani2, Eli Zaretskii, kentaro.nakazawa, emacs-devel On 02.12.2016 18:31, Lars Ingebrigtsen wrote: > Normal web applications require that you URL-encode (or similar) any > data you send to them. These encodings are ASCII only. > > Here's a typical example of how this is used: > > (let ((url-request-method "POST") > (url-request-extra-headers > (list (cons "Content-Type" > (concat "multipart/form-data; boundary=" > boundary)))) > (url-request-data > (mm-url-encode-multipart-form-data values boundary))) Thanks! > The output from mm-url-encode-multipart-form-data is ASCII, and is > typically multibyte. If we make the proposed change, this function will violate the contract on url-request-data (if the described above is its main use case). Luckily, this function is part of Emacs, so we can fix it in the same patch. ^ permalink raw reply [flat|nested] 125+ messages in thread
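Under the stricter contract discussed above, a caller would encode the body before binding `url-request-data', as the error message in the patch suggests. The following is only a sketch of that step, assuming UTF-8 is the wanted charset: `demo-prepare-request-data' is a hypothetical helper, the JSON text is made up, and no request is actually sent.

```elisp
(require 'url-vars)   ; defines `url-request-method' and friends

(defun demo-prepare-request-data (text)
  "Return TEXT encoded as UTF-8 octets, i.e. as a unibyte string.
Hypothetical helper, not part of the URL package."
  (encode-coding-string text 'utf-8))

(let ((url-request-method "POST")
      (url-request-extra-headers
       '(("Content-Type" . "application/json")))
      (url-request-data
       (demo-prepare-request-data "{\"name\": \"\u00e9\"}")))
  ;; A real caller would invoke `url-retrieve' at this point; the sketch
  ;; only inspects the body that would be sent.
  (list (multibyte-string-p url-request-data)   ; nil: the body is unibyte
        (length url-request-data)))             ; number of octets
```

Because the encoded body is unibyte, its `length' is the octet count, which is the value Content-Length needs.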
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-02 23:13 ` Dmitry Gutov @ 2016-12-03 0:37 ` Lars Ingebrigtsen 2016-12-03 1:27 ` Dmitry Gutov 2016-12-03 8:12 ` Eli Zaretskii 0 siblings, 2 replies; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-12-03 0:37 UTC (permalink / raw) To: Dmitry Gutov; +Cc: p.stephani2, emacs-devel, Eli Zaretskii, kentaro.nakazawa Dmitry Gutov <dgutov@yandex.ru> writes: > If we make the proposed change, this function will violate the > contract on url-request-data (if the described above is its main use > case). > > Luckily, this function is part of Emacs, so we can fix it in the same patch. I'm sorry, I'm not sure how to respond to this without making accusations of a bad faith response on your part. This is a function with an ill-defined interface, but virtually all callers here understand what the interface is ("don't put anything into the body that isn't ASCII"). Even if wonkily defined, this works for virtually all callers, in-tree or not. You're proposing a change that would make virtually all these usages of this (ill-defined) function fail. The real fix for this extremely obscure problem is 1) to remove the `error' call you introduced in Emacs 25.1, and 2) make the Content-Length header reflect the number of octets transferred instead of the number of bytes in the URL string. This would have moved the number of successful calls to `url-retrieve' from (I'm guesstimating) 99.9995% to 99.999995%, and people who wanted to send iso8859-1 text to web servers would still fail. But these people are pretty rare. Your proposal would move the number of successful calls to `url-retrieve' with a body to around 0%. At this point I'm not sure what else to say. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-03 0:37 ` Lars Ingebrigtsen @ 2016-12-03 1:27 ` Dmitry Gutov 2016-12-03 8:12 ` Eli Zaretskii 1 sibling, 0 replies; 125+ messages in thread From: Dmitry Gutov @ 2016-12-03 1:27 UTC (permalink / raw) To: Lars Ingebrigtsen Cc: p.stephani2, emacs-devel, Eli Zaretskii, kentaro.nakazawa On 03.12.2016 02:37, Lars Ingebrigtsen wrote: > I'm sorry, I'm not sure how to respond to this without making > accusations of a bad faith response on your part. All I'm trying to do here is to introduce a more meaningful, stronger typing. See Yuri's comment on why that can be important. I don't really know if the benefits really outweigh the inconvenience, but the only example you gave so far can be trivially solved from our side. That leaves clients that perform "url encoding" manually using their own code, but there might be none of them, for all I know. IME, JSON encoding is more popular than that, and those users are affected already. > This is a function with an ill-defined interface, but virtually all > callers here understand what the interface is ("don't put anything into > the body that isn't ASCII"). Even if wonkily defined, this works for > virtually all callers, in-tree or not. > You're proposing a change that would make virtually all these usages of > this (ill-defined) function fail. True. > The real fix for this extremely obscure problem is 1) to remove the > `error' call you introduced in Emacs 25.1, and 2) make the > Content-Length header reflect the number of octets transferred instead > of the number of bytes in the URL string. This would have moved the > number of successful calls to `url-retrieve' from (I'm guesstimating) > 99.9995% to 99.999995%, and people who wanted to send iso8859-1 text to > web servers would still fail. But these people are pretty rare. > > Your proposal would move the number of successful calls to > `url-retrieve' with a body to around 0%. Not true. 
All current users of json.el, at least, who have updated their code for Emacs 25, won't be affected. And I imagine they represent a significant fraction of `url-retrieve' users. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-03 0:37 ` Lars Ingebrigtsen 2016-12-03 1:27 ` Dmitry Gutov @ 2016-12-03 8:12 ` Eli Zaretskii 2016-12-03 10:01 ` Lars Ingebrigtsen 1 sibling, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-12-03 8:12 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: p.stephani2@gmail.com, Eli Zaretskii <eliz@gnu.org>, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org > Date: Sat, 03 Dec 2016 01:37:19 +0100 > > I'm sorry, I'm not sure how to respond to this without making > accusations of a bad faith response on your part. Please don't. There's no bad faith on anyone's side here. > make the Content-Length header reflect the number of octets > transferred instead of the number of bytes in the URL string. How do you propose to compute the number of transferred octets, given that the URL request payload is a string? ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-03 8:12 ` Eli Zaretskii @ 2016-12-03 10:01 ` Lars Ingebrigtsen 2016-12-03 16:00 ` Stefan Monnier 0 siblings, 1 reply; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-12-03 10:01 UTC (permalink / raw) To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov Eli Zaretskii <eliz@gnu.org> writes: > How do you propose to compute the number of transferred octets, given > that the URL request payload is a string? Just use `string-bytes' instead of `length'. This happens to work since almost all web services expect utf-8, and our strings happen to be utf-8, too. (The few callers that are sending a different charset already presumably know to encode their data, or their applications would be failing already.) Yes, it's yucky, but this is an ill-defined function. And we should emphasise backwards compatibility instead of breaking people's code, I think. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-03 10:01 ` Lars Ingebrigtsen @ 2016-12-03 16:00 ` Stefan Monnier 2016-12-03 20:01 ` Lars Ingebrigtsen 0 siblings, 1 reply; 125+ messages in thread From: Stefan Monnier @ 2016-12-03 16:00 UTC (permalink / raw) To: emacs-devel > Just use `string-bytes' instead of `length'. IIRC the problem with that is if the string is the result of concatenating a unibyte and a multibyte string, in which case the string may only contain bytes (and hence `length` gives the right result) yet `string-bytes` and `length` will return different results (because the ≥128 bytes are encoded as 2 bytes in the multibyte representation). Stefan ^ permalink raw reply [flat|nested] 125+ messages in thread
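The counting problem described above can be reproduced with a few built-in calls. This is a small illustration rather than anything taken from url-http.el; `demo-length-report' is a hypothetical name, and the sample text is a single "é" written as the escape \u00e9.

```elisp
(defun demo-length-report ()
  "Return (length, string-bytes) pairs for three related strings.
Hypothetical helper used only to illustrate the discussion."
  (let* ((text "\u00e9")                             ; one non-ASCII character
         (octets (encode-coding-string text 'utf-8)) ; unibyte, two octets
         ;; Concatenating with a multibyte string makes the result
         ;; multibyte, so each octet >= 128 is stored as two bytes.
         (mixed (concat octets (string-to-multibyte "a"))))
    (list (length text) (string-bytes text)
          (length octets) (string-bytes octets)
          (length mixed) (string-bytes mixed))))

(demo-length-report)
;; → (1 2 2 2 3 5)
```

For the unencoded text, `string-bytes' happens to match the UTF-8 octet count (2), which is the case the `string-bytes' suggestion relies on. For the mixed string, three octets would go over the wire, `length' reports 3, and `string-bytes' reports 5.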
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-03 16:00 ` Stefan Monnier @ 2016-12-03 20:01 ` Lars Ingebrigtsen 2016-12-03 20:57 ` Andreas Schwab 0 siblings, 1 reply; 125+ messages in thread From: Lars Ingebrigtsen @ 2016-12-03 20:01 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel Stefan Monnier <monnier@iro.umontreal.ca> writes: > IIRC the problem with that is if the string is the result of > concatenating a unibyte and a multibyte string, in which case the string > may only contain bytes (and hence `length` gives the right result) yet > `string-bytes` and `length` will return different results (because the > ≥128 bytes are encoded as 2 bytes in the multibyte representation). Hm... I see... I think... :-) Can `string-bytes' return a different number than (with-temp-buffer (set-buffer-multibyte nil) (insert string) (buffer-size)) ? In any case, this latter is what we want, because those are the octets that will be transmitted to the server. Unless there's another subtlety I'm not aware of, which seems likely. :-) -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-03 20:01 ` Lars Ingebrigtsen @ 2016-12-03 20:57 ` Andreas Schwab 0 siblings, 0 replies; 125+ messages in thread
From: Andreas Schwab @ 2016-12-03 20:57 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: Stefan Monnier, emacs-devel

On Dec 03 2016, Lars Ingebrigtsen <larsi@gnus.org> wrote:

> Can `string-bytes' return a different number than
>
> (with-temp-buffer
>   (set-buffer-multibyte nil)
>   (insert string)
>   (buffer-size))
>
> ?

ELISP> (string-bytes "\200")
1 (#o1, #x1, ?\C-a)
ELISP> (string-bytes (string-make-multibyte "\200"))
2 (#o2, #x2, ?\C-b)
ELISP> (let ((string "\200"))
         (with-temp-buffer
           (set-buffer-multibyte nil)
           (insert string)
           (buffer-size)))
1 (#o1, #x1, ?\C-a)
ELISP> (let ((string (string-make-multibyte "\200")))
         (with-temp-buffer
           (set-buffer-multibyte nil)
           (insert string)
           (buffer-size)))
1 (#o1, #x1, ?\C-a)
ELISP>

Andreas.

-- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different."

^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-12-01 0:30 ` Dmitry Gutov 2016-12-01 17:17 ` Eli Zaretskii @ 2016-12-28 18:25 ` Philipp Stephani 1 sibling, 0 replies; 125+ messages in thread From: Philipp Stephani @ 2016-12-28 18:25 UTC (permalink / raw) To: Dmitry Gutov, Eli Zaretskii; +Cc: larsi, kentaro.nakazawa, emacs-devel Dmitry Gutov <dgutov@yandex.ru> wrote on Thu, 1 Dec 2016 at 01:30: > > Further, now that Emacs 25 is out, and we are allowed to have more > breaking changes in Emacs 26, I think we should change the check at the > end of url-http-create-request to just use multibyte-string-p. > > I think that's a good idea. (The check should also be moved to the front and documented, but those are minor nits.) ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-29 23:09 ` Philipp Stephani 2016-11-29 23:18 ` Philipp Stephani 2016-11-30 0:16 ` Dmitry Gutov @ 2016-11-30 15:06 ` Eli Zaretskii 2016-11-30 15:31 ` Stefan Monnier 2 siblings, 1 reply; 125+ messages in thread From: Eli Zaretskii @ 2016-11-30 15:06 UTC (permalink / raw) To: Philipp Stephani; +Cc: larsi, emacs-devel, kentaro.nakazawa, dgutov > From: Philipp Stephani <p.stephani2@gmail.com> > Date: Tue, 29 Nov 2016 23:09:57 +0000 > Cc: larsi@gnus.org, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org > > > json-encode returns a multibyte string. > > Any idea why? > > Because (symbol-name 'false) returns a multibyte string. I guess the ultimate reason is that the reader always > creates multibyte strings for symbol names. I'm not sure I understand how symbol-name comes into play here. Can you help me understand this? > Is it again that 'concat' misfeature, when one of the > strings is pure-ASCII, but happens to be multibyte? > > Why is it a misfeature? Because a pure-ASCII string doesn't need to be multibyte, it only becomes that by accident. The net result is that this misfeature gets in the way when you want to produce a unibyte string by concatenating an encoded string and some ASCII text. > I'd expect a concatenation of multibyte and unibyte strings to either implicitly upgrade > to a multibyte string (as in Python 2) or raise a signal (as in Python 3). But when all the strings are either unibyte or pure-ASCII, we could produce a unibyte string without losing anything. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el 2016-11-30 15:06 ` Eli Zaretskii @ 2016-11-30 15:31 ` Stefan Monnier 0 siblings, 0 replies; 125+ messages in thread From: Stefan Monnier @ 2016-11-30 15:31 UTC (permalink / raw) To: emacs-devel > But when all the strings are either unibyte or pure-ASCII, we could > produce a unibyte string without losing anything. Actually, technically, if we take a multibyte string which only contains pure-ASCII and convert it to unibyte, we lose information: with a multibyte string, we can compare the `size` and the `size_byte` fields, and if they're equal we know we have a pure-ASCII string, whereas with a unibyte string, we'd have to scan the whole string looking for a byte >= 128 to determine that it's pure-ASCII. So maybe the change should be that when concat has to combine a unibyte string and a multibyte string, it should first look to see if the multibyte string has `size == size_byte` and if so, generate a unibyte string. Stefan ^ permalink raw reply [flat|nested] 125+ messages in thread
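The idea above concerns the C implementation of `concat', but its effect can be approximated in Lisp. The sketch below is only an illustration under that assumption: `demo-concat-prefer-unibyte' is a hypothetical wrapper, and comparing `length' with `string-bytes' stands in for the `size == size_byte` test on the C side.

```elisp
(defun demo-concat-prefer-unibyte (&rest strings)
  "Concatenate STRINGS, demoting pure-ASCII multibyte strings first.
Hypothetical wrapper; the real `concat' is left unchanged."
  (apply #'concat
         (mapcar (lambda (s)
                   ;; In a multibyte string, equal character and byte
                   ;; counts mean every character is ASCII.
                   (if (and (multibyte-string-p s)
                            (= (length s) (string-bytes s)))
                       (string-to-unibyte s)
                     s))
                 strings)))

(let ((octets (encode-coding-string "\u00e9" 'utf-8)) ; unibyte, two octets
      (ascii (string-to-multibyte "a")))              ; ASCII, but multibyte
  (list (multibyte-string-p (concat octets ascii))
        (multibyte-string-p (demo-concat-prefer-unibyte octets ascii))))
;; → (t nil)
```

Strings that really contain multibyte characters are passed through untouched, so the result is still multibyte whenever any argument needs it.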