From: Thomas Fitzsimmons
Newsgroups: gmane.emacs.devel
Subject: Re: emacs-25 b6b47af: Properly encode/decode base64Binary data in SOAP
Date: Sun, 13 Mar 2016 15:54:34 -0400
To: Eli Zaretskii
Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
In-Reply-To: <83oaaidydx.fsf@gnu.org> (Eli Zaretskii's message of "Sun, 13 Mar 2016 20:30:02 +0200")

Eli Zaretskii writes:

>> From: Thomas Fitzsimmons
>> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
>> Date: Sun, 13 Mar 2016 13:57:32 -0400
>>
>>   (defun soap-parse-server-response ()
>>     "Error-check and parse the XML contents of the current buffer."
>>     (let ((mime-part (mm-dissect-buffer t t)))
>>       (unless mime-part
>>         (error "Failed to decode response from server"))
>>       (unless (equal (car (mm-handle-type mime-part)) "text/xml")
>>         (error "Server response is not an XML document"))
>>       (with-temp-buffer
>>         (mm-insert-part mime-part)
>>         (prog1
>>             (car (xml-parse-region (point-min) (point-max)))
>>           (kill-buffer)
>>           (mm-destroy-part mime-part)))))
>>
>> mm-insert-part does:
>>
>>   (string-to-multibyte (mm-get-part handle no-cache))
>
> Why does it do that? string-to-multibyte is one of those functions
> that should never be used.

I don't know. This is the first I've looked at the mm code. I'll have
to do more investigation here, apparently.

>> In cases where the caller is expecting an xsd:string, the idea is for
>> soap-client to return a native Emacs string, for the caller's
>> convenience.
>
> But that's not what string-to-multibyte does.
>
>> I guess soap-client assumes that the mm and xml packages will do the
>> right thing to convert XML string values into Emacs's internal
>> format.
>
> I'm not sure we are not mis-communicating: conversion into internal
> format is what decoding does. Whereas you just said a few messages
> upthread that you thought strings should be returned undecoded,
> i.e. as binary streams of bytes. What am I missing?

The discussion expanded from being about xsd:base64Binary, to being
about all strings returned by soap-client (see below). Upthread I was
saying only that xsd:base64Binary values should be returned undecoded.
I wasn't commenting on how other XSD string values (xsd:string, etc.)
should be returned.

>> >> Is the attached patch OK for master and emacs-25?
>> >
>> > Doesn't it bring back the bug which caused Andreas to make the change
>> > you want to undo?
>>
>> It brings back the behavior of soap-client returning base64-decoded
>> xsd:base64Binary values as unibyte strings.
>
> I'm confused: you've just demonstrated that it returns them as
> multibyte strings with raw bytes in their multibyte encoding.
>
>> The debate on this thread is about whether that behavior is buggy or
>> not. But yes, I want to revert Andreas's change on both master and
>> emacs-25 branches, because I don't consider the old behavior buggy.
>
> That'll bring the bug in the debbugs package back, I think. Once
> again, if you want to return undecoded strings, they should at the
> very least be unibyte, not multibyte. Apologies if I'm too confused
> to talk intelligently about this.

Apologies for contributing to the confusion; it's good to have you
reviewing soap-client's design. The discussion expanded from being
about how to handle xsd:base64Binary values only (Andreas's patch), to
being about how soap-client handles all strings (including xsd:string,
etc.).
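
For reference, the difference between string-to-multibyte and decoding
can be seen on a two-byte input. This is only a sketch for
illustration, not code from soap-client or mm: the function name
demo-multibyte-vs-decode is hypothetical, and the two bytes are simply
the UTF-8 encoding of U+00E9.

```elisp
;; Hypothetical helper, for illustration only; not part of soap-client or mm.
(defun demo-multibyte-vs-decode ()
  "Compare `string-to-multibyte' with `decode-coding-string' on raw bytes."
  (let* ((raw (unibyte-string #xC3 #xA9))            ; UTF-8 bytes of U+00E9
         ;; Changes the representation only: each byte >= #x80 becomes a
         ;; raw-byte character, so the result still has two elements.
         (converted (string-to-multibyte raw))
         ;; Interprets the bytes as UTF-8 and yields one character.
         (decoded (decode-coding-string raw 'utf-8)))
    (list :raw-multibyte (multibyte-string-p raw)
          :converted-multibyte (multibyte-string-p converted)
          :converted-length (length converted)
          :decoded-length (length decoded)
          :decoded-is-e-acute (string= decoded "\u00e9"))))
```

Evaluating (demo-multibyte-vs-decode) should show that both results are
multibyte, but only the decoded one contains the character the bytes
stand for; the converted one still carries the two raw bytes.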
It could be that how soap-client handles all strings is broken, since
it appears to rely on string-to-multibyte, which you're saying should
never be used. However, soap-client's decoding has been good enough
that no one has complained about string handling in general up until
now. But I'll review the design with Alex to see if we can avoid
calling string-to-multibyte via mm.

Maybe I can give an example with XML fragments returned by the server,
to show how I think soap-client should handle xsd:base64Binary values.
The debbugs server will respond with:

  [...]
  normal
  [...]
  Q2zDqW1lbnQgUGl0LS1DbGF1ZGVsIDxjbGVtZW50LnBpdGNsYXVkZWxAbGl2ZS5jb20+
  [...]

soap-client will parse those results into a structure that it returns
to the caller:

  ([...]
   (severity . "")
   [...]
   (originator . "")
   [...])

I think the originator value should be unibyte, because
xsd:base64Binary represents binary data, not necessarily a string. It
was unibyte before Andreas's patch. His patch changed it to be
multibyte, by assuming the binary data is a UTF-8 string and decoding
it into Emacs's internal format.

What the severity value should be (unibyte or multibyte) and how it
should be produced (decoded) is the broader discussion. I don't know
enough to have an opinion on that yet, other than it seems to have been
working to treat it as multibyte up until now. Again, I'll have to
talk to Alex about this.

Thomas
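
A sketch of the two treatments of the base64 payload quoted above. It
assumes the decoded bytes happen to be UTF-8, which is the assumption
Andreas's patch makes and not something xsd:base64Binary guarantees.
The function name demo-base64binary-views is hypothetical and is not
part of soap-client.

```elisp
;; Hypothetical helper, for illustration only; not part of soap-client.
(defun demo-base64binary-views (payload)
  "Return facts about PAYLOAD, a base64 string, as bytes and as UTF-8 text."
  (let* (;; `base64-decode-string' returns a unibyte string of raw bytes.
         (bytes (base64-decode-string payload))
         ;; ASSUMPTION: the bytes are UTF-8 text.  Arbitrary binary data
         ;; carried in xsd:base64Binary need not be.
         (text (decode-coding-string bytes 'utf-8)))
    (list :bytes-multibyte (multibyte-string-p bytes)
          :byte-count (length bytes)
          :text-multibyte (multibyte-string-p text)
          :char-count (length text))))

(demo-base64binary-views
 "Q2zDqW1lbnQgUGl0LS1DbGF1ZGVsIDxjbGVtZW50LnBpdGNsYXVkZWxAbGl2ZS5jb20+")
```

With this payload the unibyte view has one more element than the
decoded view, because one character occupies two bytes in UTF-8. A
caller handed the unibyte string can still decode it with whatever
coding system it knows to be right; a caller handed the decoded string
has had that choice made for it.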