From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alain Schneble
Newsgroups: gmane.emacs.bugs
Subject: bug#22044: 24.4; url-expand.el and url-parse.el not conforming to RFC 3986
Date: Sat, 28 Nov 2015 22:53:39 +0100
Message-ID: <86fuzpyen0.fsf@realize.ch>
To: 22044@debbugs.gnu.org
Mime-Version: 1.0
Content-Type: text/plain

url-expand.el and url-parse.el do not seem to conform to RFC 3986
"Uniform Resource Identifier (URI): Generic Syntax" in some cases.  But
I assume they should.

Here is a list of issues found in url-expand-file-name and
url-generic-parse-url, respectively:

1. Resolving relative "fragment-only" URIs against a given absolute base
   URI (see RFC 3986, section 5 "Reference Resolution", and especially
   section 5.2.2 "Transform References"):

   (url-expand-file-name "#s" "http://a/b/c/d;p?q")
   => "#s" but should be "http://a/b/c/d;p?q#s"

   (url-expand-file-name "#bar" "http://host")
   => "#bar" but should be "http://host#bar"

   (url-expand-file-name "#bar" "http://host/")
   => "#bar" but should be "http://host/#bar"

   (url-expand-file-name "#bar" "http://host/foo")
   => "#bar" but should be "http://host/foo#bar"

2. Resolving relative "query-only" URIs against a given absolute base
   URI (see RFC 3986, same sections as mentioned in point 1):

   (url-expand-file-name "?y" "http://a/b/c/d;p?q")
   => "http://a/b/c/?y" but should be "http://a/b/c/d;p?y"

   (url-expand-file-name "?y" "http://a/b/c/d")
   => "http://a/b/c/?y" but should be "http://a/b/c/d?y"

3. Removing dot segments (see RFC 3986, section 5.2.4 "Remove Dot
   Segments"):

   (url-expand-file-name "/./g" "http://a/b/c/d;p?q")
   => "http://a/./g" but should be "http://a/g"

   (url-expand-file-name "/../g" "http://a/b/c/d;p?q")
   => "http://a/../g" but should be "http://a/g"

4. Empty fragment information is lost after parsing a URI:

   (equal (url-generic-parse-url "#")
          (url-parse-make-urlobj nil nil nil nil nil "" "" nil nil))
                                                        ^
   => nil but should be t
   (the fragment component is actually nil instead of an empty string)

   The same issue occurs with any URL that ends in a number sign (#):

   "/foo/bar#"
   "/foo/bar/#"
   "http://host#"
   "http://host?#"
   "http://host?query#"
   "http://host/#"
   "http://host/?#"
   "http://host/?query#"
   "http://host/foo#"
   "http://host/foo?#"
   "http://host/foo?query#"
   ... and so forth

   With the current behavior, the inverse function url-recreate-url
   cannot reconstruct exactly the same URI.  For example:

   (url-recreate-url (url-generic-parse-url "#"))
   => "" but should be "#"

Alain


In GNU Emacs 24.4.1 (i686-pc-mingw32)
 of 2014-10-24 on LEG570
Windowing system distributor `Microsoft Corp.', version 6.3.9600
Configured using:
 `configure --prefix=/c/usr'

Important settings:
  value of $LANG: DES
  locale-coding-system: cp1252
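
The cases listed in this report could be captured as ERT regression tests, so that a fix can be checked mechanically.  The following is only a minimal sketch: the test names are hypothetical and not part of Emacs, and the expected values are the ones this report derives from RFC 3986.  On an affected Emacs such as the 24.4.1 build described above, these tests would be expected to fail.

```elisp
;; Sketch only.  Test names are hypothetical; expected values are taken
;; from the report, which derives them from RFC 3986 sections 5.2.2
;; and 5.2.4.
(require 'ert)
(require 'url-expand)
(require 'url-parse)

(ert-deftest bug22044-fragment-only-reference ()
  "Issue 1: a fragment-only reference keeps the base path and query."
  (should (equal (url-expand-file-name "#s" "http://a/b/c/d;p?q")
                 "http://a/b/c/d;p?q#s"))
  (should (equal (url-expand-file-name "#bar" "http://host")
                 "http://host#bar"))
  (should (equal (url-expand-file-name "#bar" "http://host/")
                 "http://host/#bar"))
  (should (equal (url-expand-file-name "#bar" "http://host/foo")
                 "http://host/foo#bar")))

(ert-deftest bug22044-query-only-reference ()
  "Issue 2: a query-only reference keeps the base path."
  (should (equal (url-expand-file-name "?y" "http://a/b/c/d;p?q")
                 "http://a/b/c/d;p?y"))
  (should (equal (url-expand-file-name "?y" "http://a/b/c/d")
                 "http://a/b/c/d?y")))

(ert-deftest bug22044-remove-dot-segments ()
  "Issue 3: \".\" and \"..\" segments are removed from the result."
  (should (equal (url-expand-file-name "/./g" "http://a/b/c/d;p?q")
                 "http://a/g"))
  (should (equal (url-expand-file-name "/../g" "http://a/b/c/d;p?q")
                 "http://a/g")))

(ert-deftest bug22044-empty-fragment-round-trip ()
  "Issue 4: an empty fragment survives parsing and recreation."
  ;; The fragment is stored in the `target' slot of the parsed URL.
  (should (equal (url-target (url-generic-parse-url "#")) ""))
  (should (equal (url-recreate-url (url-generic-parse-url "#")) "#")))
```

Assuming the forms are saved in a file (the name bug22044-tests.el is made up for this example), they can be run non-interactively with:

  emacs -batch -l ert -l bug22044-tests.el -f ert-run-tests-batch-and-exit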