unofficial mirror of emacs-devel@gnu.org 
* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
       [not found] ` <E1aGuJQ-0004Wv-Lz@vcs.savannah.gnu.org>
@ 2016-03-10  1:03   ` Thomas Fitzsimmons
  2016-03-10  9:30     ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: Thomas Fitzsimmons @ 2016-03-10  1:03 UTC (permalink / raw)
  To: emacs-devel; +Cc: Andreas Schwab

Hi Andreas,

Andreas Schwab <schwab@gnu.org> writes:

> branch: emacs-25
> commit b6b47af82f6c7d960388ec46dd8ab371c2e34de4
> Author: Andreas Schwab <schwab@linux-m68k.org>
> Commit: Andreas Schwab <schwab@linux-m68k.org>
>
>     Properly encode/decode base64Binary data in SOAP
>     
>     	* lisp/net/soap-client.el (soap-encode-xs-basic-type): Encode
>     	base64Binary value as utf-8.
>     	(soap-decode-xs-basic-type): Decode base64Binary value as utf-8.
> ---
>  lisp/net/soap-client.el |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lisp/net/soap-client.el b/lisp/net/soap-client.el
> index f8cdaa9..7402464 100644
> --- a/lisp/net/soap-client.el
> +++ b/lisp/net/soap-client.el
> @@ -538,7 +538,7 @@ This is a specialization of `soap-encode-value' for
>                 (base64Binary
>                  (unless (stringp value)
>                    (error "Not a string value for base64Binary"))
> -                (base64-encode-string value))
> +                (base64-encode-string (encode-coding-string value 'utf-8)))
>  
>                 (otherwise
>                  (error "Don't know how to encode %s for type %s"
> @@ -682,7 +682,7 @@ This is a specialization of `soap-decode-type' for
>                 decimal byte float double duration)
>           (string-to-number (car contents)))
>          (boolean (string= (downcase (car contents)) "true"))
> -        (base64Binary (base64-decode-string (car contents)))
> +        (base64Binary (decode-coding-string (base64-decode-string (car contents)) 'utf-8))
>          (anyType (soap-decode-any-type node))
>          (Array (soap-decode-array node))))))
>  

I'm trying to merge Emacs master to the emacs-soap-client repository
(and back) to release soap-client 3.1.0.  This patch is causing some
test suite failures.  (Unfortunately the test suite is private because
we're not sure if we can redistribute the WSDL files therein.)  Here is
an example fragment that fails with your patch:

<originator xsi:type="xsd:base64Binary">Wm9sdMOhbiBLcmFqY3NvdmljcyA8enVsdGhhbmtAZ21haWwuY29tPg==</originator>

The test suite expects this to decode as:

Zolt\303\241n Krajcsovics [...]

but gets:

Zoltán Krajcsovics [...]

base64Binary is supposed to be for binary data; I don't think it's safe
to assume the content is a UTF-8-encoded string for encoding or
decoding, in general.
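
To make the difference concrete, here is a minimal sketch (not part of
soap-client; `my-base64binary-views' is a hypothetical helper name) that
returns both interpretations of a base64Binary payload: the raw bytes,
and those bytes decoded on the assumption that they are UTF-8 text.

```elisp
(defun my-base64binary-views (b64)
  "Return (BYTES . TEXT) for the base64-encoded string B64.
BYTES is the unibyte result of `base64-decode-string'.  TEXT is
BYTES decoded as UTF-8, which is only meaningful if the sender
really did put UTF-8 text in the field (an assumption)."
  (let ((bytes (base64-decode-string b64)))
    (cons bytes (decode-coding-string bytes 'utf-8))))

;; Using the <originator> payload quoted above:
(let ((views (my-base64binary-views
              "Wm9sdMOhbiBLcmFqY3NvdmljcyA8enVsdGhhbmtAZ21haWwuY29tPg==")))
  (list (multibyte-string-p (car views))    ; raw bytes
        (multibyte-string-p (cdr views))))  ; decoded text
;; => (nil t)
```

The first element corresponds to what the test suite expects, the
second to what it gets with the patch applied.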

Do you have an example SOAP message that your patch fixes, and can the
surrounding code do the UTF-8 encoding/decoding instead of soap-client
itself?

Thanks,
Thomas



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-10  1:03   ` emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP Thomas Fitzsimmons
@ 2016-03-10  9:30     ` Andreas Schwab
  2016-03-11  3:29       ` emacs-25 b6b47af: " Thomas Fitzsimmons
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2016-03-10  9:30 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: emacs-devel

Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:

> <originator xsi:type="xsd:base64Binary">Wm9sdMOhbiBLcmFqY3NvdmljcyA8enVsdGhhbmtAZ21haWwuY29tPg==</originator>
>
> The test suite expects this to decode as:
>
> Zolt\303\241n Krajcsovics [...]

Leaving it undecoded is definitely wrong.  Undecoded strings should
never be used inside Emacs, except for interaction with external
sources.

> Do you have an example SOAP message that your patch fixes, and can the
> surrounding code do the UTF-8 encoding/decoding instead of soap-client
> itself?

(debbugs-get-status 22285)

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-10  9:30     ` Andreas Schwab
@ 2016-03-11  3:29       ` Thomas Fitzsimmons
  2016-03-11  8:35         ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: Thomas Fitzsimmons @ 2016-03-11  3:29 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: emacs-devel

Andreas Schwab <schwab@linux-m68k.org> writes:

> Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:
>
>> <originator xsi:type="xsd:base64Binary">Wm9sdMOhbiBLcmFqY3NvdmljcyA8enVsdGhhbmtAZ21haWwuY29tPg==</originator>
>>
>> The test suite expects this to decode as:
>>
>> Zolt\303\241n Krajcsovics [...]
>
> Leaving it undecoded is definitely wrong.  Undecoded strings should
> never be used inside Emacs, except for interaction with external
> sources.

Agreed, but I disagree that soap-client itself should be doing the
decoding.  SOAP messages are sent to and received from external sources.
In general, an elisp function receiving base64-encoded data from an
external source, via a soap-client call, will want to see it first as a
unibyte string of bytes, in case it's image data or some other
non-string data.  If the soap-client caller expects it to be a UTF-8
encoded string, the caller can attempt the string decoding.

>> Do you have an example SOAP message that your patch fixes, and can the
>> surrounding code do the UTF-8 encoding/decoding instead of soap-client
>> itself?
>
> (debbugs-get-status 22285)

OK, thanks.  The originator field there is base64 binary.  Without your
patch:

(multibyte-string-p (cdr (assoc 'originator (car (debbugs-get-status 22285)))))
=> nil
(length (cdr (assq 'originator (car (debbugs-get-status 22285)))))
=> 51

With your patch:

(multibyte-string-p (cdr (assoc 'originator (car (debbugs-get-status 22285)))))
=> t
(length (cdr (assq 'originator (car (debbugs-get-status 22285)))))
=> 50

Before your patch, what problem did you see?  Did some caller of
debbugs-get-status show raw characters somewhere, i.e., did it assume
originator was an already-decoded UTF-8 string?  If so, I think that
caller should be fixed (and soap-client should go back to its old
behavior of leaving the string undecoded).

To see what other callers do, I tried (debbugs-org-bugs 22285); it
decodes the originator field itself:

  [...]
  (originator (when (cdr (assq 'originator status))
                (decode-coding-string
                 (cdr (assq 'originator status)) 'utf-8)))
  [...]

which correctly shows the "LATIN SMALL LETTER E WITH ACUTE" in the
*Org Bugs* buffer with the old soap-client behavior.

Thomas



* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-11  3:29       ` emacs-25 b6b47af: " Thomas Fitzsimmons
@ 2016-03-11  8:35         ` Andreas Schwab
  2016-03-11 13:49           ` Alex Harsanyi
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2016-03-11  8:35 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: emacs-devel

Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:

> Before your patch, what problem did you see?

An undecoded string.

> To see what other callers do, I tried (debbugs-org-bugs 22285); it
> decodes the originator field itself:
>
>   [...]
>   (originator (when (cdr (assq 'originator status))
>                 (decode-coding-string
>                  (cdr (assq 'originator status)) 'utf-8)))

I expect that to be done at the transport level.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-11  8:35         ` Andreas Schwab
@ 2016-03-11 13:49           ` Alex Harsanyi
  2016-03-11 14:09             ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: Alex Harsanyi @ 2016-03-11 13:49 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Thomas Fitzsimmons, emacs-devel

2016-03-11 16:35 GMT+08:00 Andreas Schwab <schwab@linux-m68k.org>:
> Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:
>
>> Before your patch, what problem did you see?
>
> An undecoded string.
>
>> To see what other callers do, I tried (debbugs-org-bugs 22285); it
>> decodes the originator field itself:
>>
>>   [...]
>>   (originator (when (cdr (assq 'originator status))
>>                 (decode-coding-string
>>                  (cdr (assq 'originator status)) 'utf-8)))
>
> I expect that to be done at the transport level.

Unfortunately, there is not enough information to decode the string at
the transport level.  What if the base64 data contains a PNG image or
some other binary data?
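
As a rough illustration of why only the caller can decide, here is a
sketch in which the caller inspects the bytes before choosing an
interpretation.  The function name is hypothetical, and treating every
non-PNG payload as UTF-8 text is an assumption made for the example, not
anything soap-client does.

```elisp
(defun my-interpret-base64binary (bytes)
  "Classify the unibyte string BYTES returned for a base64Binary field.
Return (png . BYTES) if BYTES starts with the 8-byte PNG signature,
otherwise (text . STRING) with BYTES decoded as UTF-8.  Treating
everything else as UTF-8 text is an assumption made for illustration."
  (if (string-prefix-p "\211PNG\r\n\032\n" bytes)
      ;; Image data: hand the bytes back untouched.
      (cons 'png bytes)
    ;; Anything else: assume the sender meant UTF-8 text.
    (cons 'text (decode-coding-string bytes 'utf-8))))
```

A generic SOAP layer has no such knowledge of the field, which is why
the decision has to live with code that knows what the service returns.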

Best Regards,
Alex.



* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-11 13:49           ` Alex Harsanyi
@ 2016-03-11 14:09             ` Andreas Schwab
  2016-03-11 16:48               ` Stefan Monnier
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2016-03-11 14:09 UTC (permalink / raw)
  To: Alex Harsanyi; +Cc: Thomas Fitzsimmons, emacs-devel

Alex Harsanyi <alexharsanyi@gmail.com> writes:

> Unfortunately, there is not enough information to decode the string at
> the transport level.

Then provide an interface to decode it.  In Emacs you always see the
decoded contents unless you explicitly ask for not doing it.  Returning
half-decoded content is the wrong way to do it.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-11 14:09             ` Andreas Schwab
@ 2016-03-11 16:48               ` Stefan Monnier
  2016-03-11 16:59                 ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: Stefan Monnier @ 2016-03-11 16:48 UTC (permalink / raw)
  To: emacs-devel

> Then provide an interface to decode it.  In Emacs you always see the
> decoded contents unless you explicitly ask for not doing it.  Returning
> half-decoded content is the wrong way to do it.

AFAIK it's not half-decoded, it's just never decoded because the
transport format only deals with bytes.


        Stefan




* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-11 16:48               ` Stefan Monnier
@ 2016-03-11 16:59                 ` Andreas Schwab
  2016-03-11 22:27                   ` Stefan Monnier
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2016-03-11 16:59 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Then provide an interface to decode it.  In Emacs you always see the
>> decoded contents unless you explicitly ask for not doing it.  Returning
>> half-decoded content is the wrong way to do it.
>
> AFAIK it's not half-decoded, it's just never decoded because the
> transport format only deals with bytes.

The interface does not return bytes, it returns a (half-)decoded lisp
form.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-11 16:59                 ` Andreas Schwab
@ 2016-03-11 22:27                   ` Stefan Monnier
  2016-03-13  3:52                     ` Thomas Fitzsimmons
  0 siblings, 1 reply; 58+ messages in thread
From: Stefan Monnier @ 2016-03-11 22:27 UTC (permalink / raw)
  To: emacs-devel

> The interface does not return bytes, it returns a (half-)decoded lisp
> form.

My understanding is that it returns a structure whose leaves are
all byte-strings because.

In this sense, they're all "undecoded".  And the meaning of those
byte-strings is not specified by the generic format, so the generic
parser can't know if or how to decode them.


        Stefan




* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-11 22:27                   ` Stefan Monnier
@ 2016-03-13  3:52                     ` Thomas Fitzsimmons
  2016-03-13 15:15                       ` Stefan Monnier
  2016-03-13 16:02                       ` Eli Zaretskii
  0 siblings, 2 replies; 58+ messages in thread
From: Thomas Fitzsimmons @ 2016-03-13  3:52 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> The interface does not return bytes, it returns a (half-)decoded lisp
>> form.
>
> My understanding is that it returns a structure whose leaves are
> all byte-strings because.

String values are returned as multibyte, e.g.:

(multibyte-string-p (cdr (assq 'severity (car (debbugs-get-status 22285)))))
=> t

because parsing happens in a temporary buffer where
enable-multibyte-characters is set, and the string value is returned
unchanged.

> In this sense, they're all "undecoded".  And the meaning of those
> byte-strings is not specified by the generic format, so the generic
> parser can't know if or how to decode them.

This is true for base64Binary values.

I'd like to change the base64Binary behavior back, but we can document
this nuance of the API.  Also, on the encoding side I think we should
force the caller to provide unibyte strings for base64Binary values.
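
Under that proposal, a caller holding text rather than bytes would
encode it before passing it in.  A minimal sketch, assuming the service
expects UTF-8 in the field; `my-encode-base64binary' is a hypothetical
stand-alone function that mirrors the proposed check, not the actual
soap-client code path:

```elisp
(defun my-encode-base64binary (value)
  "Base64-encode VALUE, insisting that it is a unibyte string.
Signal an error for multibyte strings so that the caller has to
choose a coding system explicitly."
  (unless (and (stringp value)
               (not (multibyte-string-p value)))
    (error "Not a unibyte string value for base64Binary"))
  (base64-encode-string value))

;; The caller picks the coding system (UTF-8 here is an assumption):
(my-encode-base64binary
 (encode-coding-string "Cl\u00e9ment" 'utf-8))
;; => "Q2zDqW1lbnQ="
```

Passing the multibyte string directly would signal the error instead of
silently guessing an encoding.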

Is the attached patch OK for master and emacs-25?

Thomas


[-- Attachment #2: emacs-soap-client-document-base64-handling.patch --]
[-- Type: text/x-patch, Size: 1824 bytes --]

diff --git a/lisp/net/soap-client.el b/lisp/net/soap-client.el
index 7402464..bee3a12 100644
--- a/lisp/net/soap-client.el
+++ b/lisp/net/soap-client.el
@@ -536,9 +536,10 @@ soap-encode-xs-basic-type
                 (number-to-string value))
 
                (base64Binary
-                (unless (stringp value)
-                  (error "Not a string value for base64Binary"))
-                (base64-encode-string (encode-coding-string value 'utf-8)))
+                (unless (and (stringp value)
+                             (not (multibyte-string-p value)))
+                  (error "Not a unibyte string value for base64Binary"))
+                (base64-encode-string value))
 
                (otherwise
                 (error "Don't know how to encode %s for type %s"
@@ -682,7 +683,7 @@ soap-decode-xs-basic-type
                decimal byte float double duration)
          (string-to-number (car contents)))
         (boolean (string= (downcase (car contents)) "true"))
-        (base64Binary (decode-coding-string (base64-decode-string (car contents)) 'utf-8))
+        (base64Binary (base64-decode-string (car contents)))
         (anyType (soap-decode-any-type node))
         (Array (soap-decode-array node))))))
 
@@ -3096,7 +3097,9 @@ soap-invoke
 NOTE: The SOAP service provider should document the available
 operations and their parameters for the service.  You can also
 use the `soap-inspect' function to browse the available
-operations in a WSDL document."
+operations in a WSDL document.  `soap-invoke' base64-decodes
+base64Binary return values into unibyte strings; these
+byte-strings require further interpretation by the caller."
   (apply #'soap-invoke-internal nil nil wsdl service operation-name parameters))
 
 (defun soap-invoke-async (callback cbargs wsdl service operation-name

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-13  3:52                     ` Thomas Fitzsimmons
@ 2016-03-13 15:15                       ` Stefan Monnier
  2016-03-13 18:09                         ` Thomas Fitzsimmons
  2016-03-13 16:02                       ` Eli Zaretskii
  1 sibling, 1 reply; 58+ messages in thread
From: Stefan Monnier @ 2016-03-13 15:15 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: emacs-devel

>>> The interface does not return bytes, it returns a (half-)decoded lisp
>>> form.
>> My understanding is that it returns a structure whose leaves are
>> all byte-strings because.
> String values are returned as multibyte, e.g.:
> (multibyte-string-p (cdr (assq 'severity (car (debbugs-get-status 22285)))))
> => t
> because parsing happens in a temporary buffer where
> enable-multibyte-characters is set, and the string value is returned
> unchanged.

If these are undecoded, then I'd consider it a bug to return them as
multibyte, indeed.  The temp buffer should probably be made unibyte.

> this nuance of the API.  Also, on the encoding side I think we should
> force the caller to provide unibyte strings for base64Binary values.

I'd let base64-encode-string make that decision, but it's just my own
preference of bikeshed's color.

> Is the attached patch OK for master and emacs-25?

Looks OK to me for master.  I'll let others figure out if it's safe
enough for `emacs-25'.


        Stefan



* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-13  3:52                     ` Thomas Fitzsimmons
  2016-03-13 15:15                       ` Stefan Monnier
@ 2016-03-13 16:02                       ` Eli Zaretskii
  2016-03-13 17:57                         ` Thomas Fitzsimmons
  1 sibling, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2016-03-13 16:02 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: monnier, emacs-devel

> From: Thomas Fitzsimmons <fitzsim@fitzsim.org>
> Date: Sat, 12 Mar 2016 22:52:25 -0500
> Cc: emacs-devel@gnu.org
> 
> String values are returned as multibyte, e.g.:
> 
> (multibyte-string-p (cdr (assq 'severity (car (debbugs-get-status 22285)))))
> => t

They should be unibyte, I agree with Stefan.  If you return multibyte
strings, they should be decoded.

> Is the attached patch OK for master and emacs-25?

Doesn't it bring back the bug which caused Andreas to make the change
you want to undo?



* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-13 16:02                       ` Eli Zaretskii
@ 2016-03-13 17:57                         ` Thomas Fitzsimmons
  2016-03-13 18:30                           ` Eli Zaretskii
  0 siblings, 1 reply; 58+ messages in thread
From: Thomas Fitzsimmons @ 2016-03-13 17:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Thomas Fitzsimmons <fitzsim@fitzsim.org>
>> Date: Sat, 12 Mar 2016 22:52:25 -0500
>> Cc: emacs-devel@gnu.org
>> 
>> String values are returned as multibyte, e.g.:
>> 
>> (multibyte-string-p (cdr (assq 'severity (car (debbugs-get-status 22285)))))
>> => t
>> because parsing happens in a temporary buffer where
>> enable-multibyte-characters is set, and the string value is returned
>> unchanged.
>
> They should be unibyte, I agree with Stefan.  If you return multibyte
> strings, they should be decoded.

Sorry, I wasn't being precise when I said "unchanged".  The result from
the server goes through several steps before soap-client returns the
multibyte string result:

   (defun soap-parse-server-response ()
     "Error-check and parse the XML contents of the current buffer."
     (let ((mime-part (mm-dissect-buffer t t)))
       (unless mime-part
         (error "Failed to decode response from server"))
       (unless (equal (car (mm-handle-type mime-part)) "text/xml")
         (error "Server response is not an XML document"))
       (with-temp-buffer
         (mm-insert-part mime-part)
         (prog1
             (car (xml-parse-region (point-min) (point-max)))
           (kill-buffer)
           (mm-destroy-part mime-part)))))

mm-insert-part does:

   (string-to-multibyte (mm-get-part handle no-cache))

In cases where the caller is expecting an xsd:string, the idea is for
soap-client to return a native Emacs string, for the caller's
convenience.  I guess soap-client assumes that the mm and xml packages
will do the right thing to convert XML string values into Emacs's
internal format.

>> Is the attached patch OK for master and emacs-25?
>
> Doesn't it bring back the bug which caused Andreas to make the change
> you want to undo?

It brings back the behavior of soap-client returning base64-decoded
xsd:base64Binary values as unibyte strings.  The debate on this thread
is about whether that behavior is buggy or not.  But yes, I want to
revert Andreas's change on both master and emacs-25 branches, because I
don't consider the old behavior buggy.

Thomas



* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-13 15:15                       ` Stefan Monnier
@ 2016-03-13 18:09                         ` Thomas Fitzsimmons
  0 siblings, 0 replies; 58+ messages in thread
From: Thomas Fitzsimmons @ 2016-03-13 18:09 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>>> The interface does not return bytes, it returns a (half-)decoded lisp
>>>> form.
>>> My understanding is that it returns a structure whose leaves are
>>> all byte-strings because.
>> String values are returned as multibyte, e.g.:
>> (multibyte-string-p (cdr (assq 'severity (car (debbugs-get-status 22285)))))
>> => t
>> because parsing happens in a temporary buffer where
>> enable-multibyte-characters is set, and the string value is returned
>> unchanged.
>
> If these are undecoded, then I'd consider it a bug to return them as
> multibyte, indeed.  The temp buffer should probably be made unibyte.

See my response to Eli.

>> this nuance of the API.  Also, on the encoding side I think we should
>> force the caller to provide unibyte strings for base64Binary values.
>
> I'd let base64-encode-string make that decision, but it's just my own
> preference of bikeshed's color.

OK.

>> Is the attached patch OK for master and emacs-25?
>
> Looks OK to me for master.  I'll let others figure out if it's safe
> enough for `emacs-25'.

Andreas committed his change to base64Binary handling to both master and
emacs-25.  Whatever the decision regarding reverting it, those two
branches should have the same behavior.

Thomas



* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-13 17:57                         ` Thomas Fitzsimmons
@ 2016-03-13 18:30                           ` Eli Zaretskii
  2016-03-13 19:54                             ` Thomas Fitzsimmons
  0 siblings, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2016-03-13 18:30 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: monnier, emacs-devel

> From: Thomas Fitzsimmons <fitzsim@fitzsim.org>
> Cc: monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Sun, 13 Mar 2016 13:57:32 -0400
> 
>    (defun soap-parse-server-response ()
>      "Error-check and parse the XML contents of the current buffer."
>      (let ((mime-part (mm-dissect-buffer t t)))
>        (unless mime-part
>          (error "Failed to decode response from server"))
>        (unless (equal (car (mm-handle-type mime-part)) "text/xml")
>          (error "Server response is not an XML document"))
>        (with-temp-buffer
>          (mm-insert-part mime-part)
>          (prog1
>              (car (xml-parse-region (point-min) (point-max)))
>            (kill-buffer)
>            (mm-destroy-part mime-part)))))
> 
> mm-insert-part does:
> 
>    (string-to-multibyte (mm-get-part handle no-cache))

Why does it do that?  string-to-multibyte is one of those functions
that should never be used.

> In cases where the caller is expecting an xsd:string, the idea is for
> soap-client to return a native Emacs string, for the caller's
> convenience.

But that's not what string-to-multibyte does.
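
For readers following along, a small sketch of the difference (the
function name is hypothetical; the sample bytes are the UTF-8 encoding
of a name containing U+00E9): `string-to-multibyte' keeps every byte as
its own character, whereas decoding combines the two UTF-8 bytes into a
single character.

```elisp
(defun my-widen-vs-decode (bytes)
  "Return (WIDENED DECODED) for the unibyte string BYTES.
WIDENED is BYTES passed through `string-to-multibyte'; DECODED is
BYTES decoded as UTF-8."
  (list (string-to-multibyte bytes)
        (decode-coding-string bytes 'utf-8)))

(mapcar #'length (my-widen-vs-decode "Cl\303\251ment"))
;; => (8 7)
```

Both results are multibyte strings, but only the second one is text in
Emacs's internal representation.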

> I guess soap-client assumes that the mm and xml packages will do the
> right thing to convert XML string values into Emacs's internal
> format.

I'm not sure we are not mis-communicating: conversion into internal
format is what decoding does.  Whereas you just said a few messages
upthread that you thought strings should be returned undecoded,
i.e. as binary streams of bytes.  What am I missing?

> >> Is the attached patch OK for master and emacs-25?
> >
> > Doesn't it bring back the bug which caused Andreas to make the change
> > you want to undo?
> 
> It brings back the behavior of soap-client returning base64-decoded
> xsd:base64Binary values as unibyte strings.

I'm confused: you've just demonstrated that it returns them as
multibyte strings with raw bytes in their multibyte encoding.

> The debate on this thread is about whether that behavior is buggy or
> not.  But yes, I want to revert Andreas's change on both master and
> emacs-25 branches, because I don't consider the old behavior buggy.

That'll bring the bug in the debbugs package back, I think.  Once
again, if you want to return undecoded strings, they should at the
very least be unibyte, not multibyte.  Apologies if I'm too confused
to talk intelligently about this.



* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-13 18:30                           ` Eli Zaretskii
@ 2016-03-13 19:54                             ` Thomas Fitzsimmons
  2016-03-13 20:19                               ` Eli Zaretskii
  0 siblings, 1 reply; 58+ messages in thread
From: Thomas Fitzsimmons @ 2016-03-13 19:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Thomas Fitzsimmons <fitzsim@fitzsim.org>
>> Cc: monnier@iro.umontreal.ca,  emacs-devel@gnu.org
>> Date: Sun, 13 Mar 2016 13:57:32 -0400
>> 
>>    (defun soap-parse-server-response ()
>>      "Error-check and parse the XML contents of the current buffer."
>>      (let ((mime-part (mm-dissect-buffer t t)))
>>        (unless mime-part
>>          (error "Failed to decode response from server"))
>>        (unless (equal (car (mm-handle-type mime-part)) "text/xml")
>>          (error "Server response is not an XML document"))
>>        (with-temp-buffer
>>          (mm-insert-part mime-part)
>>          (prog1
>>              (car (xml-parse-region (point-min) (point-max)))
>>            (kill-buffer)
>>            (mm-destroy-part mime-part)))))
>> 
>> mm-insert-part does:
>> 
>>    (string-to-multibyte (mm-get-part handle no-cache))
>
> Why does it do that?  string-to-multibyte is one of those functions
> that should never be used.

I don't know.  This is the first I've looked at the mm code.  I'll have
to do more investigation here, apparently.

>> In cases where the caller is expecting an xsd:string, the idea is for
>> soap-client to return a native Emacs string, for the caller's
>> convenience.
>
> But that's not what string-to-multibyte does.
>
>> I guess soap-client assumes that the mm and xml packages will do the
>> right thing to convert XML string values into Emacs's internal
>> format.
>
> I'm not sure we are not mis-communicating: conversion into internal
> format is what decoding does.  Whereas you just said a few messages
> upthread that you thought strings should be returned undecoded,
> i.e. as binary streams of bytes.  What am I missing?

The discussion expanded from being about xsd:base64Binary, to being
about all strings returned by soap-client (see below).  Upthread I was
saying only that xsd:base64Binary values should be returned undecoded.
I wasn't commenting on how other XSD string values (xsd:string, etc.)
should be returned.

>> >> Is the attached patch OK for master and emacs-25?
>> >
>> > Doesn't it bring back the bug which caused Andreas to make the change
>> > you want to undo?
>> 
>> It brings back the behavior of soap-client returning base64-decoded
>> xsd:base64Binary values as unibyte strings.
>
> I'm confused: you've just demonstrated that it returns them as
> multibyte strings with raw bytes in their multibyte encoding.
>
>> The debate on this thread is about whether that behavior is buggy or
>> not.  But yes, I want to revert Andreas's change on both master and
>> emacs-25 branches, because I don't consider the old behavior buggy.
>
> That'll bring the bug in the debbugs package back, I think.  Once
> again, if you want to return undecoded strings, they should at the
> very least be unibyte, not multibyte.  Apologies if I'm too confused
> to talk intelligently about this.

Apologies for helping lead to confusion; it's good to have you reviewing
soap-client's design.

The discussion expanded from being about how to handle xsd:base64Binary
values only (Andreas's patch), to about how soap-client handles all
strings (including xsd:string, etc.).  It could be that how soap-client
handles all strings is broken, since it appears to be relying on
string-to-multibyte which you're saying should never be used.  However,
soap-client's decoding has been good enough that no one has complained
about string handling in general up until now.  But I'll review the design
with Alex to see if we can avoid calling string-to-multibyte via mm.

Maybe I can give an example with XML fragments returned by the server,
to show how I think soap-client should handle xsd:base64Binary values.

The debbugs server will respond with:

<?xml version="1.0" encoding="UTF-8"?>
[...]
<severity xsi:type="xsd:string">normal</severity>
[...]
<originator xsi:type="xsd:base64Binary">Q2zDqW1lbnQgUGl0LS1DbGF1ZGVsIDxjbGVtZW50LnBpdGNsYXVkZWxAbGl2ZS5jb20+</originator>
[...]

soap-client will parse those results into a structure that it returns to
the caller:

([...]
 (severity . "<string1>")
 [...]
 (originator . "<string2>")
 [...])

I think <string2> should be unibyte, because xsd:base64Binary represents
binary data, not necessarily a string.  It was unibyte before Andreas's
patch.  His patch changed it to be multibyte, by assuming the binary
data is a UTF-8 string and decoding it into Emacs's internal format.

What <string1> should be (unibyte or multibyte) and how it should be
produced (decoded) is the broader discussion.  I don't know enough to
have an opinion on that yet, other than it seems to have been working to
treat it as multibyte up until now.  Again, I'll have to talk to Alex
about this.

Thomas



* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-13 19:54                             ` Thomas Fitzsimmons
@ 2016-03-13 20:19                               ` Eli Zaretskii
  2016-03-13 21:17                                 ` Thomas Fitzsimmons
  0 siblings, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2016-03-13 20:19 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: monnier, emacs-devel

> From: Thomas Fitzsimmons <fitzsim@fitzsim.org>
> Cc: monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Sun, 13 Mar 2016 15:54:34 -0400
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> From: Thomas Fitzsimmons <fitzsim@fitzsim.org>
> >> Cc: monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> >> Date: Sun, 13 Mar 2016 13:57:32 -0400
> >> 
> >>    (defun soap-parse-server-response ()
> >>      "Error-check and parse the XML contents of the current buffer."
> >>      (let ((mime-part (mm-dissect-buffer t t)))
> >>        (unless mime-part
> >>          (error "Failed to decode response from server"))
> >>        (unless (equal (car (mm-handle-type mime-part)) "text/xml")
> >>          (error "Server response is not an XML document"))
> >>        (with-temp-buffer
> >>          (mm-insert-part mime-part)
> >>          (prog1
> >>              (car (xml-parse-region (point-min) (point-max)))
> >>            (kill-buffer)
> >>            (mm-destroy-part mime-part)))))
> >> 
> >> mm-insert-part does:
> >> 
> >>    (string-to-multibyte (mm-get-part handle no-cache))
> >
> > Why does it do that?  string-to-multibyte is one of those functions
> > that should never be used.
> 
> I don't know.  This is the first I've looked at the mm code.  I'll have
> to do more investigation here, apparently.

IME, mm-decode has a lot of baggage from distant past, when it had to
deal with bugs in Emacs, with incompatibilities between Emacs and
XEmacs, and from its own misconceptions in the area of encoding and
decoding text.  Its code should be carefully reviewed for correctness.
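
A minimal sketch of why `string-to-multibyte' is not a substitute for decoding, assuming the input holds UTF-8 octets. `my-compare-conversions' is a hypothetical name used only here.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-compare-conversions (bytes)
  "Return (RAW-LENGTH DECODED-LENGTH) for the unibyte string BYTES.
RAW-LENGTH counts characters after `string-to-multibyte', which keeps
each octet as a separate raw-byte character.  DECODED-LENGTH counts
characters after decoding BYTES as UTF-8."
  (list (length (string-to-multibyte bytes))
        (length (decode-coding-string bytes 'utf-8))))

;; "\303\251" is the two-octet UTF-8 encoding of e-acute.
(my-compare-conversions "\303\251")  ; => (2 1)
```

The first result is a multibyte string that still contains undecoded bytes, which is the kind of value that is awkward to handle later.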

> Upthread I was saying only that xsd:base64Binary values should be
> returned undecoded.

But it currently doesn't, AFAIU, since this:

  (cdr (assq 'severity (car (debbugs-get-status 22285))))

is returned as a multibyte string.  So I guess you are saying that
soap-client needs some changes to return xsd:base64Binary data as
unibyte strings?  Or is "severity" an xsd:string?

> I wasn't commenting on how other XSD string values (xsd:string, etc.)
> should be returned.

In general, if it's known to be a text string, and its encoding is
specified by the document, or can be deduced otherwise with a 100%
certainty, I'd recommend to return decoded strings.

> Maybe I can give an example with XML fragments returned by the server,
> to show how I think soap-client should handle xsd:base64Binary values.
> 
> The debbugs server will respond with:
> 
> <?xml version="1.0" encoding="UTF-8"?>
> [...]
> <severity xsi:type="xsd:string">normal</severity>
> [...]
> <originator xsi:type="xsd:base64Binary">Q2zDqW1lbnQgUGl0LS1DbGF1ZGVsIDxjbGVtZW50LnBpdGNsYXVkZWxAbGl2ZS5jb20+</originator>
> [...]
> 
> soap-client will parse those results into a structure that it returns to
> the caller:
> 
> ([...]
>  (severity . "<string1>")
>  [...]
>  (originator . "<string2>")
>  [...])
> 
> I think <string2> should be unibyte, because xsd:base64Binary represents
> binary data, not necessarily a string.

Btw, why is "originator" not a string? why xsd:base64Binary?  It's a
name of a human (or some other entity), so it's clearly text, no?

> What <string1> should be (unibyte or multibyte) and how it should be
> produced (decoded) is the broader discussion.  I don't know enough to
> have an opinion on that yet, other than it seems to have been working to
> treat it as multibyte up until now.  Again, I'll have to talk to Alex
> about this.

If you can reliably decode it, then multibyte and decoded is better.
I'd also say that if it's known that xsd:base64Binary is a string in
disguise, it should also be decoded.  IOW, whenever the data is a
string, it is better to decode it, I agree with Andreas here: unibyte
strings that represent text are a PITA without a good justification.

Thanks.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-13 20:19                               ` Eli Zaretskii
@ 2016-03-13 21:17                                 ` Thomas Fitzsimmons
  2016-03-14  3:30                                   ` Eli Zaretskii
  2016-03-14  8:02                                   ` Michael Albinus
  0 siblings, 2 replies; 58+ messages in thread
From: Thomas Fitzsimmons @ 2016-03-13 21:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Thomas Fitzsimmons <fitzsim@fitzsim.org>
>> Cc: monnier@iro.umontreal.ca,  emacs-devel@gnu.org
>> Date: Sun, 13 Mar 2016 15:54:34 -0400
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> >> From: Thomas Fitzsimmons <fitzsim@fitzsim.org>
>> >> Cc: monnier@iro.umontreal.ca,  emacs-devel@gnu.org
>> >> Date: Sun, 13 Mar 2016 13:57:32 -0400
>> >> 
>> >>    (defun soap-parse-server-response ()
>> >>      "Error-check and parse the XML contents of the current buffer."
>> >>      (let ((mime-part (mm-dissect-buffer t t)))
>> >>        (unless mime-part
>> >>          (error "Failed to decode response from server"))
>> >>        (unless (equal (car (mm-handle-type mime-part)) "text/xml")
>> >>          (error "Server response is not an XML document"))
>> >>        (with-temp-buffer
>> >>          (mm-insert-part mime-part)
>> >>          (prog1
>> >>              (car (xml-parse-region (point-min) (point-max)))
>> >>            (kill-buffer)
>> >>            (mm-destroy-part mime-part)))))
>> >> 
>> >> mm-insert-part does:
>> >> 
>> >>    (string-to-multibyte (mm-get-part handle no-cache))
>> >
>> > Why does it do that?  string-to-multibyte is one of those functions
>> > that should never be used.
>> 
>> I don't know.  This is the first I've looked at the mm code.  I'll have
>> to do more investigation here, apparently.
>
> IME, mm-decode has a lot of baggage from distant past, when it had to
> deal with bugs in Emacs, with incompatibilities between Emacs and
> XEmacs, and from its own misconceptions in the area of encoding and
> decoding text.  Its code should be carefully reviewed for correctness.

OK.

>> Upthread I was saying only that xsd:base64Binary values should be
>> returned undecoded.
>
> But it currently doesn't, AFAIU, since this:
>
>   (cdr (assq 'severity (car (debbugs-get-status 22285))))
>
> is returned as a multibyte string.

"severity" is an xsd:string.

> So I guess you are saying that soap-client needs some changes to
> return xsd:base64Binary data as unibyte strings?

Yes, because Andreas's patch changed the behavior.  But the "severity"
example doesn't demonstrate this.  This does, because originator is an
xsd:base64Binary:

(cdr (assq 'originator (car (debbugs-get-status 22285))))

Before Andreas's patch that was unibyte, now, with Andreas's patch, it's
multibyte.

> Or is "severity" an xsd:string?

Yes.

>> I wasn't commenting on how other XSD string values (xsd:string, etc.)
>> should be returned.
>
> In general, if it's known to be a text string, and its encoding is
> specified by the document, or can be deduced otherwise with a 100%
> certainty, I'd recommend to return decoded strings.

OK.  That's what soap-client currently tries to do.  But as you've
pointed out, we need to review what it does for correctness.

>> Maybe I can give an example with XML fragments returned by the server,
>> to show how I think soap-client should handle xsd:base64Binary values.
>> 
>> The debbugs server will respond with:
>> 
>> <?xml version="1.0" encoding="UTF-8"?>
>> [...]
>> <severity xsi:type="xsd:string">normal</severity>
>> [...]
>> <originator xsi:type="xsd:base64Binary">Q2zDqW1lbnQgUGl0LS1DbGF1ZGVsIDxjbGVtZW50LnBpdGNsYXVkZWxAbGl2ZS5jb20+</originator>
>> [...]
>> 
>> soap-client will parse those results into a structure that it returns to
>> the caller:
>> 
>> ([...]
>>  (severity . "<string1>")
>>  [...]
>>  (originator . "<string2>")
>>  [...])
>> 
>> I think <string2> should be unibyte, because xsd:base64Binary represents
>> binary data, not necessarily a string.
>
> Btw, why is "originator" not a string? why xsd:base64Binary?  It's a
> name of a human (or some other entity), so it's clearly text, no?

Good question, I don't know.  Maybe Michael could comment here, since
this was a Debbugs decision.

>> What <string1> should be (unibyte or multibyte) and how it should be
>> produced (decoded) is the broader discussion.  I don't know enough to
>> have an opinion on that yet, other than it seems to have been working to
>> treat it as multibyte up until now.  Again, I'll have to talk to Alex
>> about this.
>
> If you can reliably decode it, then multibyte and decoded is better.
> I'd also say that if it's known that xsd:base64Binary is a string in
> disguise, it should also be decoded.

OK.  But there is no way for soap-client itself to know what the content
of an xsd:base64Binary value is.  The caller of soap-client will know
what it is, and so it should be up to the caller to interpret the bytes.
That's what Debbugs does with "originator"; it decodes them assuming
they represent a UTF-8 string.

> IOW, whenever the data is a string, it is better to decode it, I agree
> with Andreas here: unibyte strings that represent text are a PITA
> without a good justification.

I agree with you and Andreas in general.  However an xsd:base64Binary
value is a special case because it does not always represent text.
Therefore, I disagree with Andreas's patch; I've been trying to provide
good justification for not assuming that xsd:base64Binary values
represent text, because they don't always represent text, and there's no
way for soap-client to know when they do.

Thomas



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-13 21:17                                 ` Thomas Fitzsimmons
@ 2016-03-14  3:30                                   ` Eli Zaretskii
  2016-03-14  8:49                                     ` Andreas Schwab
  2016-03-14  8:02                                   ` Michael Albinus
  1 sibling, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2016-03-14  3:30 UTC (permalink / raw)
  To: Thomas Fitzsimmons, Andreas Schwab; +Cc: monnier, emacs-devel

> From: Thomas Fitzsimmons <fitzsim@fitzsim.org>
> Cc: monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Sun, 13 Mar 2016 17:17:04 -0400
> 
> I agree with you and Andreas in general.  However an xsd:base64Binary
> value is a special case because it does not always represent text.
> Therefore, I disagree with Andreas's patch; I've been trying to provide
> good justification for not assuming that xsd:base64Binary values
> represent text, because they don't always represent text, and there's no
> way for soap-client to know when they do.

Andreas, is it possible to move the decoding of originator from
soap-client to debbugs?  Or were there other problems you found, in
addition to decoding the originator?



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-13 21:17                                 ` Thomas Fitzsimmons
  2016-03-14  3:30                                   ` Eli Zaretskii
@ 2016-03-14  8:02                                   ` Michael Albinus
  2016-03-14 12:39                                     ` Stefan Monnier
  2016-03-14 17:48                                     ` Eli Zaretskii
  1 sibling, 2 replies; 58+ messages in thread
From: Michael Albinus @ 2016-03-14  8:02 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: Eli Zaretskii, monnier, emacs-devel

Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:

>> Btw, why is "originator" not a string? why xsd:base64Binary?  It's a
>> name of a human (or some other entity), so it's clearly text, no?
>
> Good question, I don't know.  Maybe Michael could comment here, since
> this was a Debbugs decision.

Debbugs.wsdl does not care about the argument types, it regards them as
xsd:anyType. Proper decoding is left to debbugs.el.

On the server side, debbugs.gnu.org, a perl script using SOAP::Lite is
responsible for encoding the attributes. IIUC, the internal package
SOAP::Serializer decides depending on the contents of a string, whether
it shall be xsd:base64Binary, or not. See:

    $self->typelookup({
           'base64Binary' =>
              [10, sub {$_[0] =~ /[^\x09\x0a\x0d\x20-\x7f]/ }, 'as_base64Binary'],
    ...

> Thomas

Best regards, Michael.
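
For readers who do not read Perl, the test in the quoted typelookup entry can be approximated in Emacs Lisp. This is a sketch under the assumption that the regexp is applied to the whole value; `my-soap-lite-binary-p' is a hypothetical name, not part of SOAP::Lite or soap-client.

```elisp
;; Hypothetical helper mirroring the quoted Perl regexp.
(defun my-soap-lite-binary-p (string)
  "Return t if STRING has a character outside tab, LF, CR and 0x20-0x7f.
Such a value would be sent as xsd:base64Binary by the quoted rule."
  (and (string-match-p "[^\t\n\r -\x7f]" string) t))

(my-soap-lite-binary-p "normal")        ; => nil, sent as xsd:string
(my-soap-lite-binary-p "Cl\u00e9ment")  ; => t, sent as xsd:base64Binary
```

So the type chosen for a field such as originator depends on the characters in each individual value, not on the field.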



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14  3:30                                   ` Eli Zaretskii
@ 2016-03-14  8:49                                     ` Andreas Schwab
  2016-03-14  9:15                                       ` Michael Albinus
  2016-03-14  9:23                                       ` Thomas Fitzsimmons
  0 siblings, 2 replies; 58+ messages in thread
From: Andreas Schwab @ 2016-03-14  8:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Thomas Fitzsimmons, monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> Andreas, is it possible to move the decoding of originator from
> soap-client to debbugs?

That needs to be answered by its author.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14  8:49                                     ` Andreas Schwab
@ 2016-03-14  9:15                                       ` Michael Albinus
  2016-03-14 11:56                                         ` Stefan Monnier
                                                           ` (2 more replies)
  2016-03-14  9:23                                       ` Thomas Fitzsimmons
  1 sibling, 3 replies; 58+ messages in thread
From: Michael Albinus @ 2016-03-14  9:15 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Eli Zaretskii, Thomas Fitzsimmons, monnier, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>> Andreas, is it possible to move the decoding of originator from
>> soap-client to debbugs?
>
> That needs to be answered by its author.

As explained in my other message, originator (and other attributes)
could be sent by the debbugs server as either xsd:string or
xsd:base64Binary. debbugs.el does not know which encoding has been
applied, it must rely on consistent soap-client.el decoding.

> Andreas.

Best regards, Michael.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14  8:49                                     ` Andreas Schwab
  2016-03-14  9:15                                       ` Michael Albinus
@ 2016-03-14  9:23                                       ` Thomas Fitzsimmons
  2016-03-14  9:58                                         ` Andreas Schwab
  1 sibling, 1 reply; 58+ messages in thread
From: Thomas Fitzsimmons @ 2016-03-14  9:23 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Eli Zaretskii, monnier, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>> Andreas, is it possible to move the decoding of originator from
>> soap-client to debbugs?
>
> That needs to be answered by its author.

Is there a Debbugs bug (other than the presence of a unibyte string in
debbugs results) that you were trying to fix with your patch to
soap-client?  Like, did you notice the Debbugs user interface
misbehaving, showing undecoded characters to the user, or anything like
that?  Any further information you could provide would be helpful here.

Thanks,
Thomas



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14  9:23                                       ` Thomas Fitzsimmons
@ 2016-03-14  9:58                                         ` Andreas Schwab
  0 siblings, 0 replies; 58+ messages in thread
From: Andreas Schwab @ 2016-03-14  9:58 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: Eli Zaretskii, monnier, emacs-devel

See this thread: <http://permalink.gmane.org/gmane.emacs.devel/197339>

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14  9:15                                       ` Michael Albinus
@ 2016-03-14 11:56                                         ` Stefan Monnier
  2016-03-14 12:18                                           ` Alex Harsanyi
  2016-03-14 11:58                                         ` Alex Harsanyi
  2016-03-14 17:49                                         ` Eli Zaretskii
  2 siblings, 1 reply; 58+ messages in thread
From: Stefan Monnier @ 2016-03-14 11:56 UTC (permalink / raw)
  To: Michael Albinus
  Cc: Andreas Schwab, Eli Zaretskii, Thomas Fitzsimmons, emacs-devel

> As explained in my other message, originator (and other attributes)
> could be sent by the debbugs server as either xsd:string or
> xsd:base64Binary.

Why?  Doesn't the schema specify these constraints?
I mean, isn't this an issue that should be solved between the
soap-client client (i.e. debbugs.el) and the SOAP server?

And of course, in case both are possible for historical reasons and the
debbugs.el isn't told which version of the schema is used, debbugs.el
can check the multibyteness of the string.


        Stefan



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14  9:15                                       ` Michael Albinus
  2016-03-14 11:56                                         ` Stefan Monnier
@ 2016-03-14 11:58                                         ` Alex Harsanyi
  2016-03-14 12:38                                           ` Michael Albinus
  2016-03-14 17:53                                           ` Eli Zaretskii
  2016-03-14 17:49                                         ` Eli Zaretskii
  2 siblings, 2 replies; 58+ messages in thread
From: Alex Harsanyi @ 2016-03-14 11:58 UTC (permalink / raw)
  To: Michael Albinus
  Cc: Andreas Schwab, Eli Zaretskii, Thomas Fitzsimmons, monnier,
	emacs-devel

2016-03-14 17:15 GMT+08:00 Michael Albinus <michael.albinus@gmx.de>:
> Andreas Schwab <schwab@suse.de> writes:
>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>>> Andreas, is it possible to move the decoding of originator from
>>> soap-client to debbugs?
>>
>> That needs to be answered by its author.
>
> As explained in my other message, originator (and other attributes)
> could be sent by the debbugs server as either xsd:string or
> xsd:base64Binary. debbugs.el does not know which encoding has been
> applied, it must rely on consistent soap-client.el decoding.

The decoding in soap-client is consistent, even though not in the way
you would like it :-)

  * if it is told that a value is a string, it will return a string,
  * if it is told that a value is a byte array it will return a byte
    array (unibyte string)

To use another example, if the SOAP server chooses to send all
parameters as strings:

    <example xsi:type="xsd:string">1234</example>

soap-client will decode them as strings, even though to the human
reader it is "obvious" that it is a number:

   '(example . "1234")

I think the problem here is that the debbugs server encodes utf8
values as base64, even though the message envelope is utf8 XML and
could handle them as normal strings.   debbugs.el does not want to
know about this, so expects strings to be strings.  soap-client.el is
caught in the middle.

I would also like to reiterate that base64 encoding can be used for
other things, such as images, and it would not be appropriate to
decode those as utf8 (not to mention that such a decoding might fail).

Best Regards,
Alex.
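
The type-driven behavior described above can be sketched as a small dispatcher. This is not the soap-client implementation, only an illustration under the assumption that the xsi:type has already been reduced to a symbol; `my-decode-by-xsi-type' is a hypothetical name.

```elisp
;; Hypothetical dispatcher, for illustration only.
(defun my-decode-by-xsi-type (type text)
  "Decode TEXT, the character data of an element, according to TYPE.
TYPE is a symbol such as `string', `int', `boolean' or `base64Binary'."
  (cond ((eq type 'string) text)
        ((memq type '(int long short)) (string-to-number text))
        ((eq type 'boolean) (string= (downcase text) "true"))
        ;; Binary data comes back as raw octets (a unibyte string).
        ((eq type 'base64Binary) (base64-decode-string text))
        (t (error "Don't know how to decode type %s" type))))

(my-decode-by-xsi-type 'string "1234")  ; => "1234"
(my-decode-by-xsi-type 'int "1234")     ; => 1234
```

The declared type, not the appearance of the content, decides the Lisp value.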

>
>> Andreas.
>
> Best regards, Michael.
>



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 11:56                                         ` Stefan Monnier
@ 2016-03-14 12:18                                           ` Alex Harsanyi
  0 siblings, 0 replies; 58+ messages in thread
From: Alex Harsanyi @ 2016-03-14 12:18 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Andreas Schwab, Eli Zaretskii, Thomas Fitzsimmons,
	Michael Albinus, emacs-devel

2016-03-14 19:56 GMT+08:00 Stefan Monnier <monnier@iro.umontreal.ca>:
>> As explained in my other message, originator (and other attributes)
>> could be sent by the debbugs server as either xsd:string or
>> xsd:base64Binary.
>
> Why?  Doesn't the schema specify these constraints?

The debbugs schema (WSDL document) specifies that parameters can be of
"any" type.  This means that the server encodes type information with
each parameter (the "xsi:type" attributes) and soap-client will use
this info to decode the contents of the XML tag into lisp data.

Unfortunately, the Perl SOAP server works without a schema (it will
simply encode a list of values into an XML document based on their
type).  Since the SOAP server does not read the WSDL schema, changing
it will not help us.

> I mean, isn't this an issue that should be solved between the
> soap-client client (i.e. debbugs.el) and the SOAP server?

Yes, this is what Thomas and I are arguing for :-)

Best Regards,
Alex.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 11:58                                         ` Alex Harsanyi
@ 2016-03-14 12:38                                           ` Michael Albinus
  2016-03-14 13:18                                             ` Alex Harsanyi
  2016-03-14 13:26                                             ` Stefan Monnier
  2016-03-14 17:53                                           ` Eli Zaretskii
  1 sibling, 2 replies; 58+ messages in thread
From: Michael Albinus @ 2016-03-14 12:38 UTC (permalink / raw)
  To: Alex Harsanyi
  Cc: Andreas Schwab, Eli Zaretskii, Thomas Fitzsimmons, monnier,
	emacs-devel

Alex Harsanyi <alexharsanyi@gmail.com> writes:

> I think the problem here is that the debbugs server encodes utf8
> values as base64, even though the message envelope is utf8 XML and
> could handle them as normal strings.   debbugs.el does not want to
> know about this, so expects strings to be strings.  soap-client.el is
> caught in the middle.

Well, if debbugs.el would get an indication from soap-client.el, whether
a string was encoded as xsd:string or xsd:base64Binary, it could decode
the latter values itself. Less convenient, but so what.

> I would also like to reiterate that base64 encoding can be used for
> other things, such as images, and it would not be appropriate to
> decode those as utf8 (not to mention that such a decoding might fail).

Is there a way to tell soap-client.el, what to do with base64 encoded values?
Something like a user-defined function, or alike?

> Best Regards,
> Alex.

Best regards, Michael.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14  8:02                                   ` Michael Albinus
@ 2016-03-14 12:39                                     ` Stefan Monnier
  2016-03-14 17:55                                       ` Eli Zaretskii
  2016-03-14 17:48                                     ` Eli Zaretskii
  1 sibling, 1 reply; 58+ messages in thread
From: Stefan Monnier @ 2016-03-14 12:39 UTC (permalink / raw)
  To: emacs-devel

> On the server side, debbugs.gnu.org, a perl script using SOAP::Lite is
> responsible for encoding the attributes. IIUC, the internal package
> SOAP::Serializer decides depending on the contents of a string, whether
> it shall be xsd:base64Binary, or not. See:

>     $self->typelookup({
>            'base64Binary' =>
>               [10, sub {$_[0] =~ /[^\x09\x0a\x0d\x20-\x7f]/ }, 'as_base64Binary'],
>     ...

So on the receiving side, debbugs.el should similarly check
multibyteness of the string to decide what to do with it, I think.


        Stefan
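
A sketch of such a check on the caller's side, assuming soap-client returns xsd:base64Binary values as unibyte strings (its behavior before the patch) and that every such value from the debbugs server is UTF-8 text. `my-debbugs-text' is a hypothetical name, not a function in debbugs.el.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-debbugs-text (value)
  "Return VALUE as text, decoding it as UTF-8 if it is a unibyte string.
Multibyte strings and non-string values are returned unchanged."
  (if (and (stringp value) (not (multibyte-string-p value)))
      (decode-coding-string value 'utf-8)
    value))

;; Raw octets, as they would arrive for a base64Binary originator.
(my-debbugs-text (base64-decode-string "Q2zDqW1lbnQ="))
```

Values that are already multibyte pass through untouched, so the helper would be safe to apply to every field.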




^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 12:38                                           ` Michael Albinus
@ 2016-03-14 13:18                                             ` Alex Harsanyi
  2016-03-14 13:30                                               ` Michael Albinus
  2016-03-14 13:26                                             ` Stefan Monnier
  1 sibling, 1 reply; 58+ messages in thread
From: Alex Harsanyi @ 2016-03-14 13:18 UTC (permalink / raw)
  To: Michael Albinus
  Cc: Andreas Schwab, Eli Zaretskii, Thomas Fitzsimmons, Stefan Monnier,
	emacs-devel

2016-03-14 20:38 GMT+08:00 Michael Albinus <michael.albinus@gmx.de>:
> Alex Harsanyi <alexharsanyi@gmail.com> writes:
>
>> I think the problem here is that the debbugs server encodes utf8
>> values as base64, even though the message envelope is utf8 XML and
>> could handle them as normal strings.   debbugs.el does not want to
>> know about this, so expects strings to be strings.  soap-client.el is
>> caught in the middle.
>
> Well, if debbugs.el would get an indication from soap-client.el, whether
> a string was encoded as xsd:string or xsd:base64Binary, it could decode
> the latter values itself. Less convenient, but so what.

We were discussing with Thomas to have soap-client.el return the
base64 string "as is" and let the caller decode and process it.
Unfortunately, there is no base64-string-p function and this value
would look like a (multibyte) string.  Perhaps we could return (cons
'base64 value), or a more appropriate type for a "array of bytes"
concept.  I'm open to suggestions.

>
>> I would also like to reiterate that base64 encoding can be used for
>> other things, such as images, and it would not be appropriate to
>> decode those as utf8 (not to mention that such a decoding might fail).
>
> Is there a way to tell soap-client.el, what to do with base64 encoded values?
> Something like a user-defined function, or alike?

This would not work in the general case except when all base64 encoded
values are the same underlying type (like utf8).

Although, given that debbugs.el is the only soap-client client that is
affected by this, we can add a simple "soap-base64-handler" function
to do what needs doing, and we can always extend that interface later
as more use cases emerge.

Best Regards,
Alex.
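
The tagged-value idea could look roughly like this. It is a sketch only, assuming the payload is kept as the undecoded base64 text; all `my-' names are hypothetical and nothing here exists in soap-client.el.

```elisp
;; Hypothetical helpers, for illustration only.
(defun my-tag-base64 (text)
  "Return (base64 . TEXT), marking TEXT as an undecoded base64 payload."
  (cons 'base64 text))

(defun my-base64-value-p (value)
  "Return non-nil if VALUE was produced by `my-tag-base64'."
  (eq (car-safe value) 'base64))

(defun my-base64-value-as-utf-8 (value)
  "Decode the payload of the tagged VALUE, assuming it is UTF-8 text."
  (decode-coding-string (base64-decode-string (cdr value)) 'utf-8))

(let ((value (my-tag-base64 "Q2zDqW1lbnQ=")))
  (when (my-base64-value-p value)
    (my-base64-value-as-utf-8 value)))
```

The tag lets a caller tell such values apart from ordinary strings without inspecting their contents.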



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 12:38                                           ` Michael Albinus
  2016-03-14 13:18                                             ` Alex Harsanyi
@ 2016-03-14 13:26                                             ` Stefan Monnier
  2016-03-14 14:12                                               ` Michael Albinus
  1 sibling, 1 reply; 58+ messages in thread
From: Stefan Monnier @ 2016-03-14 13:26 UTC (permalink / raw)
  To: emacs-devel

> Well, if debbugs.el would get an indication from soap-client.el, whether
> a string was encoded as xsd:string or xsd:base64Binary, it could decode
> the latter values itself.

The multibyteness of the string should be exactly that indication.

> Less convenient, but so what.

Agree, but I don't think soap-client.el can do much more.  It's due to
a problem on the server side.

> Is there a way to tell soap-client.el, what to do with base64 encoded values?

Just a note: "base64" here might be a convenient word to use, but it
focuses on the wrong part of the issue.  soap-client already does the
base64 decoding.
The issue is that the server sends "base64Binary" where the important
word is "Binary".


        Stefan




^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 13:18                                             ` Alex Harsanyi
@ 2016-03-14 13:30                                               ` Michael Albinus
  0 siblings, 0 replies; 58+ messages in thread
From: Michael Albinus @ 2016-03-14 13:30 UTC (permalink / raw)
  To: Alex Harsanyi
  Cc: Andreas Schwab, Eli Zaretskii, Thomas Fitzsimmons, Stefan Monnier,
	emacs-devel

Alex Harsanyi <alexharsanyi@gmail.com> writes:

> We were discussing with Thomas to have soap-client.el return the
> base64 string "as is" and let the caller decode and process it.
> Unfortunately, there is no base64-string-p function and this value
> would look like a (multibyte) string.  Perhaps we could return (cons
> 'base64 value), or a more appropriate type for a "array of bytes"
> concept.  I'm open to suggestions.

Array of bytes sounds appropriate to me. This is similar to D-Bus, which
uses an array of bytes in case it cannot marshal the data into a proper type.

>>> I would also like to reiterate that base64 encoding can be used for
>>> other things, such as images, and it would not be appropriate to
>>> decode those as utf8 (not to mention that such a decoding might fail).
>>
>> Is there a way to tell soap-client.el, what to do with base64 encoded values?
>> Something like a user-defined function, or alike?
>
> This would not work in the general case except when all base64 encoded
> values are the same underlying type (like utf8).

Your default could behave as you like. It would be the responsibility of
the calling library to replace the default by a proper
function. debbugs.el would know, that all base64 encoded data must be
utf8.

> Although, given that debbugs.el is the only soap-client client that is
> affected by this, we can add a simple "soap-base64-handler" function
> to do what needs doing, and we can always extend that interface later
> as more use cases emerge.

Would be OK to me also.

> Best Regards,
> Alex.

Best regards, Michael.
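
One possible shape for such a handler, as a sketch: a variable holding a function that the calling library rebinds. The names `my-soap-base64-handler' and `my-soap-decode-base64' are hypothetical, and the default of returning the raw octets is an assumption, not a decision made in this thread.

```elisp
;; Hypothetical hook, for illustration only.
(defvar my-soap-base64-handler #'identity
  "Function applied to the decoded octets of a base64Binary value.
The default returns the octets unchanged, as a unibyte string.")

(defun my-soap-decode-base64 (text)
  "Decode the base64 TEXT and pass the octets to the current handler."
  (funcall my-soap-base64-handler (base64-decode-string text)))

;; A caller that knows its payloads are UTF-8 text rebinds the handler.
(let ((my-soap-base64-handler
       (lambda (bytes) (decode-coding-string bytes 'utf-8))))
  (my-soap-decode-base64 "Q2zDqW1lbnQ="))
```

Callers that exchange images or other binary data would keep the default and receive the bytes untouched.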



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 13:26                                             ` Stefan Monnier
@ 2016-03-14 14:12                                               ` Michael Albinus
  2016-03-14 14:58                                                 ` Thomas Fitzsimmons
  0 siblings, 1 reply; 58+ messages in thread
From: Michael Albinus @ 2016-03-14 14:12 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Well, if debbugs.el would get an indication from soap-client.el, whether
>> a string was encoded as xsd:string or xsd:base64Binary, it could decode
>> the latter values itself.
>
> The multibyteness of the string should be exactly that indication.

It isn't, at least in its current implementation. Let's take our example
(debbugs-get-status 22285). The debbugs server returns (beside other
data)

<severity xsi:type="xsd:string">normal</severity>
<originator xsi:type="xsd:base64Binary">Q2zDqW1lbnQgUGl0LS1DbGF1ZGVsIDxjbGVtZW50LnBpdGNsYXVkZWxAbGl2ZS5jb20+</originator>

Testing multibyteness returns

(multibyte-string-p (cdr (assq 'severity (car (debbugs-get-status 22285)))))
t
(multibyte-string-p (cdr (assq 'originator (car (debbugs-get-status 22285)))))
t

>         Stefan

Best regards, Michael.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 14:12                                               ` Michael Albinus
@ 2016-03-14 14:58                                                 ` Thomas Fitzsimmons
  2016-03-14 15:56                                                   ` Michael Albinus
  2016-03-14 17:58                                                   ` Eli Zaretskii
  0 siblings, 2 replies; 58+ messages in thread
From: Thomas Fitzsimmons @ 2016-03-14 14:58 UTC (permalink / raw)
  To: Michael Albinus; +Cc: Stefan Monnier, emacs-devel

Michael Albinus <michael.albinus@gmx.de> writes:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>>> Well, if debbugs.el would get an indication from soap-client.el, whether
>>> a string was encoded as xsd:string or xsd:base64Binary, it could decode
>>> the latter values itself.
>>
>> The multibyteness of the string should be exactly that indication.
>
> It isn't, at least in its current implementation. Let's take our example
> (debbugs-get-status 22285). The debbugs server returns (beside other
> data)
>
> <severity xsi:type="xsd:string">normal</severity>
> <originator xsi:type="xsd:base64Binary">Q2zDqW1lbnQgUGl0LS1DbGF1ZGVsIDxjbGVtZW50LnBpdGNsYXVkZWxAbGl2ZS5jb20+</originator>
>
> Testing multibyteness returns
>
> (multibyte-string-p (cdr (assq 'severity (car (debbugs-get-status 22285)))))
> t
> (multibyte-string-p (cdr (assq 'originator (car (debbugs-get-status 22285)))))
> t

Yes, that's because of Andreas's soap-client patch
(b6b47af82f6c7d960388ec46dd8ab371c2e34de4), the patch under discussion
that I'd like to revert.  Without that patch, the originator check would
return nil.

Andreas has pointed me to the issue he applied it to fix:

http://permalink.gmane.org/gmane.emacs.devel/197339

My plan is to replicate that without the soap-client patch, and try to
fix it a different way, in Debbugs.  From that thread, I think this will
show it:

(async-get (async-start
            `(lambda ()
               (load ,(locate-library "debbugs"))
               (debbugs-get-status 22285))))

I'll work on this hopefully this evening unless you beat me to it.

Thomas



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 14:58                                                 ` Thomas Fitzsimmons
@ 2016-03-14 15:56                                                   ` Michael Albinus
  2016-03-14 17:58                                                   ` Eli Zaretskii
  1 sibling, 0 replies; 58+ messages in thread
From: Michael Albinus @ 2016-03-14 15:56 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: Stefan Monnier, emacs-devel

Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:

> I'll work on this hopefully this evening unless you beat me to it.

I probably won't work on this tonight. But we could share the work: you
revert the patch in soap-client.el, and I'll try to adapt debbugs.el
tomorrow during the day. All strings for which `multibyte-string-p'
returns nil must be decoded, right?

> Thomas

Best regards, Michael.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14  8:02                                   ` Michael Albinus
  2016-03-14 12:39                                     ` Stefan Monnier
@ 2016-03-14 17:48                                     ` Eli Zaretskii
  2016-03-14 18:42                                       ` Michael Albinus
  1 sibling, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2016-03-14 17:48 UTC (permalink / raw)
  To: Michael Albinus; +Cc: fitzsim, monnier, emacs-devel

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: Eli Zaretskii <eliz@gnu.org>,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Mon, 14 Mar 2016 09:02:48 +0100
> 
> On the server side, debbugs.gnu.org, a perl script using SOAP::Lite is
> responsible for encoding the attributes. IIUC, the internal package
> SOAP::Serializer decides depending on the contents of a string, whether
> it shall be xsd:base64Binary, or not. See:
> 
>     $self->typelookup({
>            'base64Binary' =>
>               [10, sub {$_[0] =~ /[^\x09\x0a\x0d\x20-\x7f]/ }, 'as_base64Binary'],

That doesn't look right: it disallows UTF-8 encoded text in a document
whose encoding is announced to be UTF-8.  But I guess we won't be able
to change that.
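
For readers who don't read Perl, the character-class test quoted above
can be mirrored in Emacs Lisp.  This is only a sketch:
`my-soap-lite-would-base64-p' is a hypothetical name, not part of
debbugs.el or soap-client.el, and it assumes the quoted regexp is the
whole of the server-side test.

```elisp
;; Hypothetical helper mirroring the Perl character class
;; [^\x09\x0a\x0d\x20-\x7f]: non-nil when STRING contains any
;; character other than TAB, LF, CR, or the range SPC..DEL.
(defun my-soap-lite-would-base64-p (string)
  "Return t if STRING has a character outside the quoted class."
  (and (string-match-p "[^\t\n\r -\x7f]" string) t))

(my-soap-lite-would-base64-p "normal")   ; => nil
(my-soap-lite-would-base64-p "Clément")  ; => t
```

Under that assumption, any value with a single non-ASCII character
would be sent as xsd:base64Binary even though it is ordinary text.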



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14  9:15                                       ` Michael Albinus
  2016-03-14 11:56                                         ` Stefan Monnier
  2016-03-14 11:58                                         ` Alex Harsanyi
@ 2016-03-14 17:49                                         ` Eli Zaretskii
  2016-03-14 18:44                                           ` Michael Albinus
  2 siblings, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2016-03-14 17:49 UTC (permalink / raw)
  To: Michael Albinus; +Cc: schwab, fitzsim, monnier, emacs-devel

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Thomas Fitzsimmons <fitzsim@fitzsim.org>,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Mon, 14 Mar 2016 10:15:12 +0100
> 
> As explained in my other message, originator (and other attributes)
> could be sent by the debbugs server as either xsd:string or
> xsd:base64Binary. debbugs.el does not know which encoding has been
> applied, it must trust on consistent soap-client.el decoding.

Does debbugs know the value is a text string?



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 11:58                                         ` Alex Harsanyi
  2016-03-14 12:38                                           ` Michael Albinus
@ 2016-03-14 17:53                                           ` Eli Zaretskii
  2016-03-14 18:47                                             ` Michael Albinus
  1 sibling, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2016-03-14 17:53 UTC (permalink / raw)
  To: Alex Harsanyi; +Cc: schwab, fitzsim, michael.albinus, monnier, emacs-devel

> Date: Mon, 14 Mar 2016 19:58:21 +0800
> From: Alex Harsanyi <alexharsanyi@gmail.com>
> Cc: Andreas Schwab <schwab@suse.de>, Eli Zaretskii <eliz@gnu.org>, 
> 	Thomas Fitzsimmons <fitzsim@fitzsim.org>, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> The decoding in soap-client is consistent, even though not in the way
> you would like it :-)
> 
>   * if it is told that a value is a string, it will return a string,
>   * if it is told that a value is a byte array it will return a byte
> array (unibyte string)

Told by whom?  By debbugs.el, by the WSDL, by the debbugs server, by
something else?  (Sorry, I know almost nothing about the debbugs or
SOAP.)

> I think the problem here is that the debbugs server encodes utf8
> values as base64, even though the message envelope is utf8 XML and
> could handle them as normal strings.   debbugs.el does not want to
> know about this, so expects strings to be strings.  soap-client.el is
> caught in the middle.

Which ones of the involved parties know, or can know, that the
originator is a string?

> I would also like to reiterate that base64 encoding can be used for
> other things, such as images, and it would not be appropriate to
> decode those as utf8

It should be easy to detect truly binary byte streams: e.g., look for
null bytes.

> (not to mention that such a decoding might fail).

It cannot fail in Emacs, AFAIK.
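
The null-byte heuristic mentioned above could look roughly like this.
It is a sketch only: `my-bytes-look-binary-p' is a hypothetical name,
and the absence of null bytes does not prove that the bytes are text.

```elisp
;; Hypothetical helper, not part of soap-client.el or debbugs.el.
(defun my-bytes-look-binary-p (bytes)
  "Return t if the string BYTES contains a null byte."
  ;; `append' with a nil tail turns the string into a list of
  ;; character codes, which `memq' can search for 0.
  (and (memq 0 (append bytes nil)) t))

(my-bytes-look-binary-p "PNG\0data")   ; => t
(my-bytes-look-binary-p "plain text")  ; => nil
```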



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 12:39                                     ` Stefan Monnier
@ 2016-03-14 17:55                                       ` Eli Zaretskii
  0 siblings, 0 replies; 58+ messages in thread
From: Eli Zaretskii @ 2016-03-14 17:55 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 14 Mar 2016 08:39:06 -0400
> 
> > On the server side, debbugs.gnu.org, a perl script using SOAP::Lite is
> > responsible for encoding the attributes. IIUC, the internal package
> > SOAP::Serializer decides depending on the contents of a string, whether
> > it shall be xsd:base64Binary, or not. See:
> 
> >     $self->typelookup({
> >            'base64Binary' =>
> >               [10, sub {$_[0] =~ /[^\x09\x0a\x0d\x20-\x7f]/ }, 'as_base64Binary'],
> >     ...
> 
> So on the receiving side, debbugs.el should similarly check
> multibyteness of the string to decide what to do with it, I think.

??? The multibyteness of a string is an illusion created by Emacs, as
you well know.  There's nothing in the byte stream that is inherently
multibyte or unibyte.  The only thing that the multibyteness of the
string could tell us is what kind of processing did soap-client.el do
to produce the string, that's all.

So I'm unsure what you had in mind when you wrote this.  I'm probably
missing something, but what?



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 14:58                                                 ` Thomas Fitzsimmons
  2016-03-14 15:56                                                   ` Michael Albinus
@ 2016-03-14 17:58                                                   ` Eli Zaretskii
  2016-03-15  1:56                                                     ` Thomas Fitzsimmons
  1 sibling, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2016-03-14 17:58 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: michael.albinus, monnier, emacs-devel

> From: Thomas Fitzsimmons <fitzsim@fitzsim.org>
> Date: Mon, 14 Mar 2016 10:58:44 -0400
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> 
> My plan is to replicate that without the soap-client patch, and try to
> fix it a different way, in Debbugs.  From that thread, I think this will
> show it:
> 
> (async-get (async-start
>             `(lambda ()
>                (load ,(locate-library "debbugs"))
>                (debbugs-get-status 22285))))
> 
> I'll work on this hopefully this evening unless you beat me to it.

Thanks.  But it would be good to discuss what you think should be done
in debbugs.el, before you actually do that.  The important part is to
figure out how debbugs.el can know whether a byte stream is a text
string.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 17:48                                     ` Eli Zaretskii
@ 2016-03-14 18:42                                       ` Michael Albinus
  0 siblings, 0 replies; 58+ messages in thread
From: Michael Albinus @ 2016-03-14 18:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: fitzsim, monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Michael Albinus <michael.albinus@gmx.de>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
>> Date: Mon, 14 Mar 2016 09:02:48 +0100
>> 
>> On the server side, debbugs.gnu.org, a perl script using SOAP::Lite is
>> responsible for encoding the attributes. IIUC, the internal package
>> SOAP::Serializer decides depending on the contents of a string, whether
>> it shall be xsd:base64Binary, or not. See:
>> 
>>     $self->typelookup({
>>            'base64Binary' =>
>>               [10, sub {$_[0] =~ /[^\x09\x0a\x0d\x20-\x7f]/ }, 'as_base64Binary'],
>
> That doesn't look right: it disallows UTF-8 encoded text in a document
> whose encoding is announced to be UTF-8.  But I guess we won't be able
> to change that.

No. You would need to file a bug report against SOAP-Lite. I haven't
checked whether this is solved already. Judging by the existing bug
report titles, there are already some concerning UTF-8. See
<https://rt.cpan.org/Public/Dist/Display.html?Name=SOAP-Lite>.

Best regards, Michael.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 17:49                                         ` Eli Zaretskii
@ 2016-03-14 18:44                                           ` Michael Albinus
  2016-03-14 18:52                                             ` Eli Zaretskii
  0 siblings, 1 reply; 58+ messages in thread
From: Michael Albinus @ 2016-03-14 18:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: schwab, fitzsim, monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> As explained in my other message, originator (and other attributes)
>> could be sent by the debbugs server as either xsd:string or
>> xsd:base64Binary. debbugs.el does not know which encoding has been
>> applied, it must trust on consistent soap-client.el decoding.
>
> Does debbugs know the value is a text string?

debbugs.el knows for every attribute whether it is a string or
something else. There aren't that many attributes; see `debbugs-get-status'.

Best regards, Michael.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 17:53                                           ` Eli Zaretskii
@ 2016-03-14 18:47                                             ` Michael Albinus
  2016-03-14 18:57                                               ` Eli Zaretskii
  0 siblings, 1 reply; 58+ messages in thread
From: Michael Albinus @ 2016-03-14 18:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: schwab, Alex Harsanyi, fitzsim, monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> The decoding in soap-client is consistent, even though not in the way
>> you would like it :-)
>> 
>>   * if it is told that a value is a string, it will return a string,
>>   * if it is told that a value is a byte array it will return a byte
>> array (unibyte string)
>
> Told by whom?  By debbugs.el, by the WSDL, by the debbugs server, by
> something else?  (Sorry, I know almost nothing about the debbugs or
> SOAP.)

By the XML data returned by the debbugs (SOAP) server.

>> I think the problem here is that the debbugs server encodes utf8
>> values as base64, even though the message envelope is utf8 XML and
>> could handle them as normal strings.   debbugs.el does not want to
>> know about this, so expects strings to be strings.  soap-client.el is
>> caught in the middle.
>
> Which ones of the involved parties know, or can know, that the
> originator is a string?

The debbugs server could know. But it uses the simple-minded SOAP-Lite
Perl library, which encodes everything that doesn't look like ASCII as
base64Binary.

Best regards, Michael.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 18:44                                           ` Michael Albinus
@ 2016-03-14 18:52                                             ` Eli Zaretskii
  2016-03-14 19:05                                               ` Michael Albinus
  0 siblings, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2016-03-14 18:52 UTC (permalink / raw)
  To: Michael Albinus; +Cc: schwab, fitzsim, monnier, emacs-devel

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: schwab@suse.de,  fitzsim@fitzsim.org,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Mon, 14 Mar 2016 19:44:25 +0100
> 
> > Does debbugs know the value is a text string?
> 
> debbugs.el knows for every attribute, whether it is a string or
> something else. There're not so many attributes, see `debbugs-get-status'.

If debbugs.el knows it is a text string, then debbugs.el could decode
it (provided that soap-client.el gives you a unibyte undecoded string
that came out of base64 decoding).
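
As a minimal sketch of that two-step decoding, the originator value
quoted earlier in this thread can be decoded by hand.  It assumes that
soap-client.el hands over the raw result of `base64-decode-string' and
that the bytes are UTF-8; the variable names are illustrative only.

```elisp
(let* ((wire "Q2zDqW1lbnQgUGl0LS1DbGF1ZGVsIDxjbGVtZW50LnBpdGNsYXVkZWxAbGl2ZS5jb20+")
       ;; Step 1: base64 decoding yields a unibyte string of raw bytes.
       (bytes (base64-decode-string wire))
       ;; Step 2: the caller, knowing the value is text, decodes the
       ;; bytes as UTF-8 and gets a multibyte string.
       (text (decode-coding-string bytes 'utf-8)))
  (list (multibyte-string-p bytes)
        (multibyte-string-p text)
        text))
;; => (nil t "Clément Pit--Claudel <clement.pitclaudel@live.com>")
```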



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 18:47                                             ` Michael Albinus
@ 2016-03-14 18:57                                               ` Eli Zaretskii
  0 siblings, 0 replies; 58+ messages in thread
From: Eli Zaretskii @ 2016-03-14 18:57 UTC (permalink / raw)
  To: Michael Albinus; +Cc: schwab, alexharsanyi, fitzsim, monnier, emacs-devel

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: Alex Harsanyi <alexharsanyi@gmail.com>,  schwab@suse.de,  fitzsim@fitzsim.org,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Mon, 14 Mar 2016 19:47:16 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> The decoding in soap-client is consistent, even though not in the way
> >> you would like it :-)
> >> 
> >>   * if it is told that a value is a string, it will return a string,
> >>   * if it is told that a value is a byte array it will return a byte
> >> array (unibyte string)
> >
> > Told by whom?  By debbugs.el, by the WSDL, by the debbugs server, by
> > something else?  (Sorry, I know almost nothing about the debbugs or
> > SOAP.)
> 
> By the XML data returned by the debbugs (SOAP) server.

Then this doesn't really help us, since the server lies to us.

Thanks.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 18:52                                             ` Eli Zaretskii
@ 2016-03-14 19:05                                               ` Michael Albinus
  0 siblings, 0 replies; 58+ messages in thread
From: Michael Albinus @ 2016-03-14 19:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: schwab, fitzsim, monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> > Does debbugs know the value is a text string?
>> 
>> debbugs.el knows for every attribute, whether it is a string or
>> something else. There're not so many attributes, see `debbugs-get-status'.
>
> If debbugs.el knows it is a text string, then debbugs.el could decode
> it (provided that soap-client.el gives you a unibyte undecoded string
> that came out of base64 decoding).

That's the plan. See the discussion with Thomas.

Best regards, Michael.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-14 17:58                                                   ` Eli Zaretskii
@ 2016-03-15  1:56                                                     ` Thomas Fitzsimmons
  2016-03-15  7:45                                                       ` Andreas Schwab
                                                                         ` (2 more replies)
  0 siblings, 3 replies; 58+ messages in thread
From: Thomas Fitzsimmons @ 2016-03-15  1:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: michael.albinus, monnier, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1892 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Thomas Fitzsimmons <fitzsim@fitzsim.org>
>> Date: Mon, 14 Mar 2016 10:58:44 -0400
>> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
>> 
>> My plan is to replicate that without the soap-client patch, and try to
>> fix it a different way, in Debbugs.  From that thread, I think this will
>> show it:
>> 
>> (async-get (async-start
>>             `(lambda ()
>>                (load ,(locate-library "debbugs"))
>>                (debbugs-get-status 22285))))
>> 
>> I'll work on this hopefully this evening unless you beat me to it.
>
> Thanks.  But it would be good to discuss what you think should be done
> in debbugs.el, before you actually do that.  The important part is to
> figure out how can debbugs.el know whether a byte stream is a text
> string.

With the soap-client patch reverted, this debbugs.el patch fixes the
above test case, producing a correct multibyte string for originator.

This is just one example, obviously.  The new function should be applied
to all values returned by the Debbugs server that may contain multibyte
UTF-8 characters.  I'll leave this part to Michael.

Also, there's still something not quite right about async, for strings
that don't contain extended characters:

(multibyte-string-p (cdr (assq 'severity (car (async-get (async-start
            `(lambda ()
               (load ,(locate-library "debbugs"))
               (debbugs-get-status 22285))))))))
=> nil

(multibyte-string-p (cdr (assq 'severity (car (debbugs-get-status 22285)))))
=> t

but this looks like a side effect of async.

I'll wait to revert the soap-client patch on master and emacs-25 until
Debbugs has a new release to fix this, to avoid temporarily re-breaking
Debbugs.  (However, I'd really like to get this done before Emacs 25.1
so that it doesn't go out with an incompatible soap-client API.)

Thomas


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: emacs-debbugs-soap-value-to-string.patch --]
[-- Type: text/x-patch, Size: 2038 bytes --]

diff --git a/packages/debbugs/debbugs.el b/packages/debbugs/debbugs.el
index f145280..e4c9667 100644
--- a/packages/debbugs/debbugs.el
+++ b/packages/debbugs/debbugs.el
@@ -264,6 +264,27 @@ (defun debbugs-newest-bugs (amount)
   "Return the list of bug numbers, according to AMOUNT (a number) latest bugs."
   (sort (car (soap-invoke debbugs-wsdl debbugs-port "newest_bugs" amount)) '<))
 
+(defun debbugs-convert-soap-value-to-string (string-value)
+  "If STRING-VALUE is unibyte, decode its contents as a UTF-8 string.
+If STRING-VALUE is a multibyte string, then `soap-client'
+received an xsd:string for this value, and will have decoded it
+already.
+
+If STRING-VALUE is a unibyte string, then `soap-client' received
+an xsd:base64Binary, and ran `base64-decode-string' on it to
+produce a unibyte string of bytes.
+
+For some reason, the Debbugs server code base64-encodes strings
+that contain UTF-8 characters, and returns them as
+xsd:base64Binary, instead of just returning them as xsd:string.
+Therefore, when STRING-VALUE is a unibyte string, we assume its
+bytes represent a UTF-8 string and decode them accordingly."
+  (if (stringp string-value)
+      (if (not (multibyte-string-p string-value))
+	  (decode-coding-string string-value 'utf-8)
+	string-value)
+    (error "Invalid string value")))
+
 (defun debbugs-get-status (&rest bug-numbers)
   "Return a list of status entries for the bugs identified by BUG-NUMBERS.
 
@@ -421,6 +442,11 @@ (defun debbugs-get-status (&rest bug-numbers)
 	    (when (stringp (cdr y))
 	      (setcdr y (mapcar
 			 'string-to-number (split-string (cdr y) " " t)))))
+	  ;; "originator" may be an xsd:base64Binary value containing
+	  ;; a UTF-8-encoded string.
+	  (dolist (attribute '(originator))
+	    (setq y (assoc attribute (cdr (assoc 'value x))))
+	    (setcdr y (debbugs-convert-soap-value-to-string (cdr y))))
 	  ;; "package" is a string, containing comma separated
 	  ;; package names.  "keywords" and "tags" are strings,
 	  ;; containing blank separated package names.

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-15  1:56                                                     ` Thomas Fitzsimmons
@ 2016-03-15  7:45                                                       ` Andreas Schwab
  2016-03-15  7:57                                                         ` Michael Albinus
  2016-03-15  7:49                                                       ` Andreas Schwab
  2016-03-15  8:08                                                       ` Michael Albinus
  2 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2016-03-15  7:45 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: Eli Zaretskii, michael.albinus, monnier, emacs-devel

Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:

> @@ -421,6 +442,11 @@ (defun debbugs-get-status (&rest bug-numbers)
>  	    (when (stringp (cdr y))
>  	      (setcdr y (mapcar
>  			 'string-to-number (split-string (cdr y) " " t)))))
> +	  ;; "originator" may be an xsd:base64Binary value containing
> +	  ;; a UTF-8-encoded string.

What about all the other strings in the result?

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-15  1:56                                                     ` Thomas Fitzsimmons
  2016-03-15  7:45                                                       ` Andreas Schwab
@ 2016-03-15  7:49                                                       ` Andreas Schwab
  2016-03-15  8:08                                                       ` Michael Albinus
  2 siblings, 0 replies; 58+ messages in thread
From: Andreas Schwab @ 2016-03-15  7:49 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: Eli Zaretskii, michael.albinus, monnier, emacs-devel

Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:

> Also, there's still something not quite right about async, for strings
> that don't contain extended characters:

(multibyte-string-p (read (format "%S" (string-to-multibyte "foo")))) => nil

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-15  7:45                                                       ` Andreas Schwab
@ 2016-03-15  7:57                                                         ` Michael Albinus
  0 siblings, 0 replies; 58+ messages in thread
From: Michael Albinus @ 2016-03-15  7:57 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Eli Zaretskii, Thomas Fitzsimmons, monnier, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

> Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:
>
>> @@ -421,6 +442,11 @@ (defun debbugs-get-status (&rest bug-numbers)
>>  	    (when (stringp (cdr y))
>>  	      (setcdr y (mapcar
>>  			 'string-to-number (split-string (cdr y) " " t)))))
>> +	  ;; "originator" may be an xsd:base64Binary value containing
>> +	  ;; a UTF-8-encoded string.
>
> What about all the other strings in the result?

I've added them.

> Andreas.

Best regards, Michael.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-15  1:56                                                     ` Thomas Fitzsimmons
  2016-03-15  7:45                                                       ` Andreas Schwab
  2016-03-15  7:49                                                       ` Andreas Schwab
@ 2016-03-15  8:08                                                       ` Michael Albinus
  2016-03-15 14:39                                                         ` Thomas Fitzsimmons
  2 siblings, 1 reply; 58+ messages in thread
From: Michael Albinus @ 2016-03-15  8:08 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: Eli Zaretskii, monnier, emacs-devel

Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:

> With the soap-client patch reverted, this debbugs.el patch fixes the
> above test case, producing a correct multibyte string for originator.

Thanks. I've committed it in your name.

> This is just one example, obviously.  The new function should be applied
> to all values returned by the Debbugs server that may contain multibyte
> UTF-8 characters.  I'll leave this part to Michael.

Done. The other affected attributes are "subject", "owner" and
"summary"; I've added them.

> Also, there's still something not quite right about async, for strings
> that don't contain extended characters:
>
> (multibyte-string-p (cdr (assq 'severity (car (async-get (async-start
>             `(lambda ()
>                (load ,(locate-library "debbugs"))
>                (debbugs-get-status 22285))))))))
> => nil
>
> (multibyte-string-p (cdr (assq 'severity (car (debbugs-get-status 22285)))))
> => t
>
> but this looks like a side effect of async.

Hmm, don't know. I would like to get rid of async. If
`soap-invoke-async' exists (soap-client >= 3.0), that function is
preferred over `async-start'/`async-get'.

> I'll wait to revert the soap-client patch on master and emacs-25 until
> Debbugs has a new release to fix this, to avoid temporarily re-breaking
> Debbugs.  (However, I'd really like to get this done before Emacs 25.1
> so that it doesn't go out with an incompatible soap-client API.)

I've bumped debbugs' version to 0.9.1; it should be released soon by the
ELPA release script.

> Thomas

Best regards, Michael.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
  2016-03-15  8:08                                                       ` Michael Albinus
@ 2016-03-15 14:39                                                         ` Thomas Fitzsimmons
  2016-03-15 15:04                                                           ` Michael Albinus
  2016-03-15 17:39                                                           ` Eli Zaretskii
  0 siblings, 2 replies; 58+ messages in thread
From: Thomas Fitzsimmons @ 2016-03-15 14:39 UTC (permalink / raw)
  To: Michael Albinus; +Cc: Eli Zaretskii, monnier, emacs-devel

Michael Albinus <michael.albinus@gmx.de> writes:

> Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:
>
>> With the soap-client patch reverted, this debbugs.el patch fixes the
>> above test case, producing a correct multibyte string for originator.
>
> Thanks. I've committed it in your name.
>
>> This is just one example, obviously.  The new function should be applied
>> to all values returned by the Debbugs server that may contain multibyte
>> UTF-8 characters.  I'll leave this part to Michael.
>
> Done. The other affected attributes are "subject", "owner" and
> "summary"; I've added them.
>
>> Also, there's still something not quite right about async, for strings
>> that don't contain extended characters:
>>
>> (multibyte-string-p (cdr (assq 'severity (car (async-get (async-start
>>             `(lambda ()
>>                (load ,(locate-library "debbugs"))
>>                (debbugs-get-status 22285))))))))
>> => nil
>>
>> (multibyte-string-p (cdr (assq 'severity (car (debbugs-get-status 22285)))))
>> => t
>>
>> but this looks like a side effect of async.
>
> Hmm, don't know.

I think Andreas explained it with his code snippet.  IIUC, async does a
`read' on the sexp from the child process, and read returns a unibyte
string if there are no non-ASCII characters in the string.  If there are
non-ASCII characters, then read returns a multibyte string:

(let ((pair (read "(\"á\" \"a\")")))
  (list (multibyte-string-p (car pair))
	(multibyte-string-p (cadr pair))))
=> (t nil)

I don't think this will affect debbugs.el's functionality though, since
it will always decode fields that it knows might contain non-ASCII.

The only difference will be that ASCII-only strings coming back from
debbugs-over-async will be unibyte, whereas ASCII-only strings coming
back from in-process debbugs.el will be multibyte, because:

(multibyte-string-p
 (async-get (async-start
	     `(lambda ()
		(decode-coding-string "a" 'utf-8)))))
=> nil

(multibyte-string-p (decode-coding-string "a" 'utf-8))
=> t

As long as no users of the debbugs.el APIs key off the multibyteness of
the strings, they'll be fine.  I guess this is a quirk of async: it
strips the multibyteness of strings that are multibyte but ASCII-only in
the inferior.
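
The print/read round trip described above can be reproduced without
async at all.  This sketch assumes, as suggested earlier in the thread,
that the child's result is printed and then `read' back in the parent;
it is not taken from async's source.

```elisp
(let* ((original (string-to-multibyte "normal"))
       ;; Print the multibyte ASCII string and read it back, as a
       ;; stand-in for what happens across the process boundary.
       (round-tripped (read (prin1-to-string original))))
  (list (multibyte-string-p original)
        (multibyte-string-p round-tripped)
        ;; The contents still compare equal, so callers that compare
        ;; with `string=' are unaffected.
        (string= original round-tripped)))
;; => (t nil t)
```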

> I would like to get rid of async. If `soap-invoke-async' exists
> (soap-client >= 3.0), that function is preferred over
> `async-start'/`async-get'.

The latest soap-client version is also available in GNU ELPA.  Can you
just Package-Requires it, and always rely on the latest version that has
`soap-invoke-async'?

>> I'll wait to revert the soap-client patch on master and emacs-25 until
>> Debbugs has a new release to fix this, to avoid temporarily re-breaking
>> Debbugs.  (However, I'd really like to get this done before Emacs 25.1
>> so that it doesn't go out with an incompatible soap-client API.)
>
> I've increased debbugs' version to 0.9.1, shall be released soon by the
> elpa release script.

Thanks; I should be able to finish the soap-client merge and release
3.1.0 this evening.  Then maybe Debbugs 0.9.2 can Package-Requires
soap-client 3.1.0?

Thomas




* Re: emacs-25 b6b47AF: Properly encode/decode base64Binary data in SOAP
  2016-03-15 14:39                                                         ` Thomas Fitzsimmons
@ 2016-03-15 15:04                                                           ` Michael Albinus
  2016-03-17 14:23                                                             ` Thomas Fitzsimmons
  2016-03-15 17:39                                                           ` Eli Zaretskii
  1 sibling, 1 reply; 58+ messages in thread
From: Michael Albinus @ 2016-03-15 15:04 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: Eli Zaretskii, monnier, emacs-devel

Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:

> The latest soap-client version is also available in GNU ELPA.  Can you
> just Package-Requires it, and always rely on the latest version that has
> `soap-invoke-async'?

Will do. The last time I touched debbugs, it was not yet available in ELPA.

> Thanks; I should be able to finish the soap-client merge and release
> 3.1.0 this evening.  Then maybe Debbugs 0.9.2 can Package-Requires
> soap-client 3.1.0?

Yep. Tomorrow, when soap-client 3.1.0 is available.

> Thomas

Best regards, Michael.




* Re: emacs-25 b6b47AF: Properly encode/decode base64Binary data in SOAP
  2016-03-15 14:39                                                         ` Thomas Fitzsimmons
  2016-03-15 15:04                                                           ` Michael Albinus
@ 2016-03-15 17:39                                                           ` Eli Zaretskii
  1 sibling, 0 replies; 58+ messages in thread
From: Eli Zaretskii @ 2016-03-15 17:39 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: michael.albinus, monnier, emacs-devel

> From: Thomas Fitzsimmons <fitzsim@fitzsim.org>
> Cc: Eli Zaretskii <eliz@gnu.org>,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Tue, 15 Mar 2016 10:39:02 -0400
> 
> I think Andreas explained it with his code snippet.  IIUC, async does a
> `read' on the sexp from the child process, and read returns a unibyte
> string if there are no non-ASCII characters in the string.  If there are
> non-ASCII characters, then read returns a multibyte string:
> 
> (let ((pair (read "(\"á\" \"a\")")))
>   (list (multibyte-string-p (car pair))
> 	(multibyte-string-p (cadr pair))))
> => (t nil)

Yes.

> I don't think this will affect debbugs.el's functionality though, since
> it will always decode fields that it knows might contain non-ASCII.

Decoding a pure-ASCII string is harmless.

> The only difference will be that ASCII-only strings coming back from
> debbugs-over-async will be unibyte, whereas ASCII-only strings coming
> back from in-process debbugs.el will be multibyte, because:
> 
> (multibyte-string-p
>  (async-get (async-start
> 	     `(lambda ()
> 		(decode-coding-string "a" 'utf-8)))))
> => nil
> 
> (multibyte-string-p (decode-coding-string "a" 'utf-8))
> => t
> 
> As long as no users of the debbugs.el APIs key off multibytedness of the
> strings, they'll be fine.  I guess this is a quirk of async, that it
> strips the multibytedness of strings that are multibyte-ASCII-only in
> the inferior.

I don't think it's something async does, nor is it a quirk.  Depending
on the APIs used, Emacs might decide to produce a unibyte string if
the text is pure ASCII.  That is harmless and should never cause any
problems.
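
A minimal sketch of why the representation does not matter for ASCII
text, using only standard primitives (`my/ascii-representation-demo' is
a hypothetical name chosen for illustration):

```elisp
;; Hypothetical demo function; every call below is a standard Emacs primitive.
(defun my/ascii-representation-demo ()
  "Compare a unibyte and a multibyte copy of the same ASCII text."
  (let ((uni "a")                           ; ASCII literal: unibyte
        (multi (string-to-multibyte "a")))  ; same text, multibyte
    (list (multibyte-string-p uni)
          (multibyte-string-p multi)
          (string= uni multi)
          (equal uni multi)
          ;; Decoding pure ASCII leaves the text unchanged.
          (string= uni (decode-coding-string uni 'utf-8)))))

(my/ascii-representation-demo)
;; => (nil t t t t)
```

The two strings differ only in the flag reported by
`multibyte-string-p'; `string=' and `equal' treat them as the same.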

Thanks.




* Re: emacs-25 b6b47AF: Properly encode/decode base64Binary data in SOAP
  2016-03-15 15:04                                                           ` Michael Albinus
@ 2016-03-17 14:23                                                             ` Thomas Fitzsimmons
  2016-03-17 15:39                                                               ` Thomas Fitzsimmons
  2016-03-17 15:46                                                               ` Stefan Monnier
  0 siblings, 2 replies; 58+ messages in thread
From: Thomas Fitzsimmons @ 2016-03-17 14:23 UTC (permalink / raw)
  To: Michael Albinus; +Cc: Eli Zaretskii, monnier, emacs-devel

Michael Albinus <michael.albinus@gmx.de> writes:

> Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:
>
>> The latest soap-client version is also available in GNU ELPA.  Can you
>> just Package-Requires it, and always rely on the latest version that has
>> `soap-invoke-async'?
>
> Will do. The last time I touched debbugs, it was not yet available in ELPA.
>
>> Thanks; I should be able to finish the soap-client merge and release
>> 3.1.0 this evening.  Then maybe Debbugs 0.9.2 can Package-Requires
>> soap-client 3.1.0?
>
> Yep. Tomorrow, when soap-client 3.1.0 is available.

I synced soap-client 3.1.1 to the Emacs master branch.  But it didn't
get published in the GNU ELPA repository last night.  (I also pushed
Excorporate 0.7.2, which did get published and is now showing as
"incompatible" because of its soap-client 3.1.1 dependency.)

Should updates to :core externals automatically get published to GNU
ELPA?  Does anyone know what I missed or how to debug this?

Thanks,
Thomas




* Re: emacs-25 b6b47AF: Properly encode/decode base64Binary data in SOAP
  2016-03-17 14:23                                                             ` Thomas Fitzsimmons
@ 2016-03-17 15:39                                                               ` Thomas Fitzsimmons
  2016-03-17 19:24                                                                 ` Michael Albinus
  2016-03-17 15:46                                                               ` Stefan Monnier
  1 sibling, 1 reply; 58+ messages in thread
From: Thomas Fitzsimmons @ 2016-03-17 15:39 UTC (permalink / raw)
  To: Michael Albinus; +Cc: Eli Zaretskii, monnier, emacs-devel

Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:

> Michael Albinus <michael.albinus@gmx.de> writes:
>
>> Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:
>>
>>> The latest soap-client version is also available in GNU ELPA.  Can you
>>> just Package-Requires it, and always rely on the latest version that has
>>> `soap-invoke-async'?
>>
>> Will do. The last time I touched debbugs, it was not yet available in ELPA.
>>
>>> Thanks; I should be able to finish the soap-client merge and release
>>> 3.1.0 this evening.  Then maybe Debbugs 0.9.2 can Package-Requires
>>> soap-client 3.1.0?
>>
>> Yep. Tomorrow, when soap-client 3.1.0 is available.
>
> I synced soap-client 3.1.1 to the Emacs master branch.  But it didn't
> get published in the GNU ELPA repository last night.  (I also pushed
> Excorporate 0.7.2, which did get published and is now showing as
> "incompatible" because of its soap-client 3.1.1 dependency.)
>
> Should updates to :core externals automatically get published to GNU
> ELPA?  Does anyone know what I missed or how to debug this?

soap-client 3.1.1 is there now.  Not quite sure what I was seeing; maybe
my list-packages test was invalid.  In any case, you should be unblocked
for Debbugs 0.9.2.
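
One way to double-check what the archive advertises, independent of the
list-packages display, is sketched below.  `my/archive-version' is a
hypothetical helper; it assumes Emacs 24.4 or later, where
`package-archive-contents' holds `package-desc' structures, and it only
inspects the locally cached archive contents:

```elisp
(require 'package)

;; Hypothetical helper: report the version of PKG that the locally
;; cached archive contents advertise.  It does not touch the network;
;; run M-x package-refresh-contents first to update the cache.
(defun my/archive-version (pkg)
  "Return the version string of PKG in `package-archive-contents', or nil.
If several archives carry PKG, only the first listed entry is used."
  (let ((desc (cadr (assq pkg package-archive-contents))))
    (when desc
      (package-version-join (package-desc-version desc)))))

(my/archive-version 'soap-client)
```

A nil result means the cached contents do not list the package at all,
which would point at a stale cache rather than at the archive.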

Thanks,
Thomas




* Re: emacs-25 b6b47AF: Properly encode/decode base64Binary data in SOAP
  2016-03-17 14:23                                                             ` Thomas Fitzsimmons
  2016-03-17 15:39                                                               ` Thomas Fitzsimmons
@ 2016-03-17 15:46                                                               ` Stefan Monnier
  1 sibling, 0 replies; 58+ messages in thread
From: Stefan Monnier @ 2016-03-17 15:46 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: Eli Zaretskii, Michael Albinus, emacs-devel

> Should updates to :core externals automatically get published to GNU
> ELPA?

Yes (at least those installed in "master").

> Does anyone know what I missed or how to debug this?

I just changed the scripts to give better debug output, so hopefully
next time I'll see what the problem was.  In the meantime, I re-ran the
script and it did build the soap-client package, so the immediate
problem is fixed.


        Stefan




* Re: emacs-25 b6b47AF: Properly encode/decode base64Binary data in SOAP
  2016-03-17 15:39                                                               ` Thomas Fitzsimmons
@ 2016-03-17 19:24                                                                 ` Michael Albinus
  0 siblings, 0 replies; 58+ messages in thread
From: Michael Albinus @ 2016-03-17 19:24 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: Eli Zaretskii, monnier, emacs-devel

Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:

Hi Thomas,

> soap-client 3.1.1 is there now.  Not quite sure what I was seeing; maybe
> my list-packages test was invalid.  In any case, you should be unblocked
> for Debbugs 0.9.2.

Thanks. I'll perform some further tests before releasing debbugs
0.9.2. Likely, I'll release it after the Easter break.

> Thanks,
> Thomas

Best regards, Michael.




end of thread, other threads:[~2016-03-17 19:24 UTC | newest]

Thread overview: 58+ messages
     [not found] <20160106200404.17375.71733@vcs.savannah.gnu.org>
     [not found] ` <E1aGuJQ-0004Wv-Lz@vcs.savannah.gnu.org>
2016-03-10  1:03   ` emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP Thomas Fitzsimmons
2016-03-10  9:30     ` Andreas Schwab
2016-03-11  3:29       ` emacs-25 b6b47AF: " Thomas Fitzsimmons
2016-03-11  8:35         ` Andreas Schwab
2016-03-11 13:49           ` Alex Harsanyi
2016-03-11 14:09             ` Andreas Schwab
2016-03-11 16:48               ` Stefan Monnier
2016-03-11 16:59                 ` Andreas Schwab
2016-03-11 22:27                   ` Stefan Monnier
2016-03-13  3:52                     ` Thomas Fitzsimmons
2016-03-13 15:15                       ` Stefan Monnier
2016-03-13 18:09                         ` Thomas Fitzsimmons
2016-03-13 16:02                       ` Eli Zaretskii
2016-03-13 17:57                         ` Thomas Fitzsimmons
2016-03-13 18:30                           ` Eli Zaretskii
2016-03-13 19:54                             ` Thomas Fitzsimmons
2016-03-13 20:19                               ` Eli Zaretskii
2016-03-13 21:17                                 ` Thomas Fitzsimmons
2016-03-14  3:30                                   ` Eli Zaretskii
2016-03-14  8:49                                     ` Andreas Schwab
2016-03-14  9:15                                       ` Michael Albinus
2016-03-14 11:56                                         ` Stefan Monnier
2016-03-14 12:18                                           ` Alex Harsanyi
2016-03-14 11:58                                         ` Alex Harsanyi
2016-03-14 12:38                                           ` Michael Albinus
2016-03-14 13:18                                             ` Alex Harsanyi
2016-03-14 13:30                                               ` Michael Albinus
2016-03-14 13:26                                             ` Stefan Monnier
2016-03-14 14:12                                               ` Michael Albinus
2016-03-14 14:58                                                 ` Thomas Fitzsimmons
2016-03-14 15:56                                                   ` Michael Albinus
2016-03-14 17:58                                                   ` Eli Zaretskii
2016-03-15  1:56                                                     ` Thomas Fitzsimmons
2016-03-15  7:45                                                       ` Andreas Schwab
2016-03-15  7:57                                                         ` Michael Albinus
2016-03-15  7:49                                                       ` Andreas Schwab
2016-03-15  8:08                                                       ` Michael Albinus
2016-03-15 14:39                                                         ` Thomas Fitzsimmons
2016-03-15 15:04                                                           ` Michael Albinus
2016-03-17 14:23                                                             ` Thomas Fitzsimmons
2016-03-17 15:39                                                               ` Thomas Fitzsimmons
2016-03-17 19:24                                                                 ` Michael Albinus
2016-03-17 15:46                                                               ` Stefan Monnier
2016-03-15 17:39                                                           ` Eli Zaretskii
2016-03-14 17:53                                           ` Eli Zaretskii
2016-03-14 18:47                                             ` Michael Albinus
2016-03-14 18:57                                               ` Eli Zaretskii
2016-03-14 17:49                                         ` Eli Zaretskii
2016-03-14 18:44                                           ` Michael Albinus
2016-03-14 18:52                                             ` Eli Zaretskii
2016-03-14 19:05                                               ` Michael Albinus
2016-03-14  9:23                                       ` Thomas Fitzsimmons
2016-03-14  9:58                                         ` Andreas Schwab
2016-03-14  8:02                                   ` Michael Albinus
2016-03-14 12:39                                     ` Stefan Monnier
2016-03-14 17:55                                       ` Eli Zaretskii
2016-03-14 17:48                                     ` Eli Zaretskii
2016-03-14 18:42                                       ` Michael Albinus

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
