unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#27012: 26.0.50; eww does not generate Referer headers
@ 2017-05-21 21:41 Lars Ingebrigtsen
  2017-05-22 12:18 ` Richard Stallman
  0 siblings, 1 reply; 12+ messages in thread
From: Lars Ingebrigtsen @ 2017-05-21 21:41 UTC (permalink / raw)
  To: 27012


This makes it impossible to log in to services like

https://secure.last.fm/login

which results in


---
You are seeing this message because this HTTPS site requires a 'Referer
header' to be sent by your Web browser, but none was sent. This header is
required for security reasons, to ensure that your browser is not being hijacked
by third parties.
---


eww should always generate Referer headers when following links or
posting forms.


In GNU Emacs 26.0.50 (build 3, x86_64-pc-linux-gnu, GTK+ Version 3.14.5)
 of 2017-04-24 built on stories
Repository revision: a1f93c1dfa53dbe007faa09ab0c6e913e86e3ffe
Windowing system distributor 'The X.Org Foundation', version 11.0.11604000
System Description:	Debian GNU/Linux 8.7 (jessie)


-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#27012: 26.0.50; eww does not generate Referer headers
  2017-05-21 21:41 bug#27012: 26.0.50; eww does not generate Referer headers Lars Ingebrigtsen
@ 2017-05-22 12:18 ` Richard Stallman
  2017-05-22 12:28   ` Lars Ingebrigtsen
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Stallman @ 2017-05-22 12:18 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 27012

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > eww should always generate Referer headers when following links or
  > posting forms.

For users' privacy, we should not give real values for the referrer field,
except in special cases.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.






^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#27012: 26.0.50; eww does not generate Referer headers
  2017-05-22 12:18 ` Richard Stallman
@ 2017-05-22 12:28   ` Lars Ingebrigtsen
  2017-05-22 16:37     ` Glenn Morris
  2017-05-23  0:54     ` Richard Stallman
  0 siblings, 2 replies; 12+ messages in thread
From: Lars Ingebrigtsen @ 2017-05-22 12:28 UTC (permalink / raw)
  To: Richard Stallman; +Cc: 27012

Richard Stallman <rms@gnu.org> writes:

> For users' privacy, we should not give real values for the referrer field,
> except in special cases.

What are those special cases?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#27012: 26.0.50; eww does not generate Referer headers
  2017-05-22 12:28   ` Lars Ingebrigtsen
@ 2017-05-22 16:37     ` Glenn Morris
  2017-05-22 16:53       ` Lars Ingebrigtsen
  2017-05-23  0:54     ` Richard Stallman
  1 sibling, 1 reply; 12+ messages in thread
From: Glenn Morris @ 2017-05-22 16:37 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Richard Stallman, 27012


You could look at how GNU Icecat handles this, eg
network.http.referer.spoofSource. (I would guess every privacy issue eww
might encounter has already been considered by Icecat.)





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#27012: 26.0.50; eww does not generate Referer headers
  2017-05-22 16:37     ` Glenn Morris
@ 2017-05-22 16:53       ` Lars Ingebrigtsen
  2017-07-12 23:03         ` Peder O. Klingenberg
  0 siblings, 1 reply; 12+ messages in thread
From: Lars Ingebrigtsen @ 2017-05-22 16:53 UTC (permalink / raw)
  To: Glenn Morris; +Cc: Richard Stallman, 27012

Glenn Morris <rgm@gnu.org> writes:

> You could look at how GNU Icecat handles this, eg
> network.http.referer.spoofSource. (I would guess every privacy issue eww
> might encounter has already been considered by Icecat.)

network.http.referer.XOriginPolicy

    0 - always send referrer (default).
    1 - only send if base domains match.
    2 - only send if hosts match.

Adding something like this (and defaulting to 1) might make sense for
eww.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#27012: 26.0.50; eww does not generate Referer headers
  2017-05-22 12:28   ` Lars Ingebrigtsen
  2017-05-22 16:37     ` Glenn Morris
@ 2017-05-23  0:54     ` Richard Stallman
  1 sibling, 0 replies; 12+ messages in thread
From: Richard Stallman @ 2017-05-23  0:54 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 27012

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > For users' privacy, we should not give real values for the referrer field,
  > > except in special cases.

  > What are those special cases?

I know of one: when page FOO uses Cloudflare, the Cloudflare CAPTCHA
page insists on getting FOO as a referrer.

I suppose there are others, but I only know of that one.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.






^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#27012: 26.0.50; eww does not generate Referer headers
  2017-05-22 16:53       ` Lars Ingebrigtsen
@ 2017-07-12 23:03         ` Peder O. Klingenberg
  2017-07-13  7:16           ` Andreas Schwab
  0 siblings, 1 reply; 12+ messages in thread
From: Peder O. Klingenberg @ 2017-07-12 23:03 UTC (permalink / raw)
  To: 27012

[-- Attachment #1: Type: text/plain, Size: 1820 bytes --]

Lars Ingebrigtsen <larsi@gnus.org> writes:

>     0 - always send referrer (default).
>     1 - only send if base domains match.
>     2 - only send if hosts match.
>
> Adding something like this (and defaulting to 1) might make sense for
> eww.

I took a stab at implementing this.  It was trickier than I had
anticipated.  The URL library already had a mechanism of sorts for
adding Referer headers, but it was as an optional argument to a helper
function, and there was no way (short of code changes) of making the
entry points of url.el pass that argument to the helper.

Changing the signature of the url entry points and tracking down every
caller didn't seem attractive to me.  Instead, I reasoned that the
referring url is a property of the page currently being displayed, so a
buffer-local variable seemed natural.

That hit a snag because eww uses url queues, and the queue runner didn't
care what the current buffer was.  So some requests got referrers, some
didn't.  I fixed the queue mechanism so it always calls url-retrieve
from the same buffer that queued up the job.

With the basic mechanism for sending the Referer header then working, I
looked at limiting the distribution of it.  url-privacy-level already
existed, and had a basic on-off-knob for referrers, or "lastloc", as it's
called in that variable.  I left that alone, but added an additional
user option - url-lastloc-privacy, with possible values "none",
"domain-match" and "host-match" corresponding to the levels mentioned
above, and with "domain-match" the default.

So with this patch, eww sets up the buffer local url-current-lastloc
when the page renders.  Subsequent requests, be they automatic
requests for images or other resources, or user-invoked link-following,
get the correct Referer header if the privacy settings allow it.



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Make-eww-optionally-send-Referer-headers.patch --]
[-- Type: text/x-patch, Size: 14993 bytes --]

From 2ab0fc7b9067f81712714dcb62c40d08fe05ab7a Mon Sep 17 00:00:00 2001
From: "Peder O. Klingenberg" <peder@klingenberg.no>
Date: Thu, 13 Jul 2017 00:27:58 +0200
Subject: [PATCH] Make eww (optionally) send Referer-headers

* lisp/net/eww.el (eww-render): Set url-current-lastloc to the
url we are rendering, to get the referer header right on
subsequent requests.

* lisp/url/url-queue.el (url-queue): New struct member
context-buffer for keeping track of the context a queued job
started from.
(url-queue-retrieve): Store the current buffer in the queue
object.
(url-queue-start-retrieve): Make sure url-retrieve is called
in the context of the original buffer, if available.

* doc/misc/url.texi (Customization): Describe the new user
option url-lastloc-privacy.

* lisp/url/url-http.el (url-http--get-referer): New function
to determine which referer to send, if any, considering the
users privacy settings and the target url we are visiting.
(url-http-referer): New variable keeping track of the referer
computed by url-http--get-referer
(url-http-create-request): Use url-http-referer instead of the
optional argument to set up the referer header.  Leave
checking of privacy settings to url-http--get-referer.
(url-http): Set up url-http-referer by using
url-http--get-referer.

* lisp/url/url-util.el (url-domain): New function to determine
the domain of a given URL.

* lisp/url/url-vars.el (url-current-lastloc): New variable to
keep track of the desired "last location" (referer header).
(url-lastloc-privacy): New custom setting for more
fine-grained control over how lastloc (referer) is sent to
servers.

(Bug#27012)
---
 doc/misc/url.texi     | 14 ++++++++++++++
 etc/NEWS              | 16 ++++++++++++++++
 lisp/net/eww.el       |  7 +++++--
 lisp/url/url-http.el  | 52 +++++++++++++++++++++++++++++++++++++++------------
 lisp/url/url-queue.el | 18 +++++++++++-------
 lisp/url/url-util.el  | 29 ++++++++++++++++++++++++++++
 lisp/url/url-vars.el  | 28 ++++++++++++++++++++++++++-
 7 files changed, 142 insertions(+), 22 deletions(-)

diff --git a/doc/misc/url.texi b/doc/misc/url.texi
index a3c625edce..c7532c0c32 100644
--- a/doc/misc/url.texi
+++ b/doc/misc/url.texi
@@ -1303,6 +1303,20 @@ Customization
 @end defopt
 @defopt url-privacy-level
 @end defopt
+@defopt url-lastloc-privacy
+Provided @code{lastloc} is not prohibited by @code{url-privacy-level},
+this determines who we send our last location to.  @code{none} means
+we include our last location in every outgoing request.
+@code{domain-match} means we send it only if the domain of our last
+location matches the domain of the URI we are requesting.
+@code{host-match} means we only send our last location back to the
+same host.  The default is @code{domain-match}.
+
+Using @code{domain-match} for this option requires emacs to make one
+or more DNS requests each time a new host is contacted, to determine
+the domain of the host.  Results of these lookups are cached, so
+repeated visits do not require repeated domain lookups.
+@end defopt
 @defopt url-uncompressor-alist
 @end defopt
 @defopt url-passwd-entry-func
diff --git a/etc/NEWS b/etc/NEWS
index dc9393c87d..90f96ad550 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -653,6 +653,11 @@ replaced by the real images asynchronously, which will also now
 respect width/height HTML specs (unless they specify widths/heights
 bigger than the current window).
 
+---
+*** EWW now sends Referer headers
+Provided they are allowed by 'url-privacy-level' and
+'url-lastloc-privacy'.
+
 ** Ido
 
 *** The commands 'find-alternate-file-other-window',
@@ -871,6 +876,17 @@ domain.
 *** 'url-user-agent' now defaults to 'default', and the User-Agent
 string is computed dynamically based on 'url-privacy-level'.
 
+---
+*** url-http now uses a buffer local variable to determine the referer
+for a request.  Previously, there was an optional argument to
+'url-http-create-request' to set up a referer, but no callers were
+using it.  Now, callers can set up 'url-current-lastloc' in a buffer
+before calling 'url-retrieve'.
+
++++
+*** New user option 'url-lastloc-privacy' providing fine-grained
+control over who we send referer-headers to.
+
 ** VC and related modes
 
 ---
diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index fe31657914..e922b4f834 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -271,7 +271,7 @@ eww
     (insert (format "Loading %s..." url))
     (goto-char (point-min)))
   (url-retrieve url 'eww-render
-		(list url nil (current-buffer))))
+                (list url nil (current-buffer))))
 
 (defun eww--dwim-expand-url (url)
   (setq url (string-trim url))
@@ -359,7 +359,10 @@ eww-render
       ;; Save the https peer status.
       (plist-put eww-data :peer (plist-get status :peer))
       ;; Make buffer listings more informative.
-      (setq list-buffers-directory url))
+      (setq list-buffers-directory url)
+      ;; Let the URL library have a handle to the current URL for
+      ;; referer purposes.
+      (setq url-current-lastloc (url-generic-parse-url url)))
     (unwind-protect
 	(progn
 	  (cond
diff --git a/lisp/url/url-http.el b/lisp/url/url-http.el
index 06d32861b2..35a0f06d48 100644
--- a/lisp/url/url-http.el
+++ b/lisp/url/url-http.el
@@ -54,6 +54,7 @@ url-http-response-version
 (defvar url-http-target-url)
 (defvar url-http-transfer-encoding)
 (defvar url-show-status)
+(defvar url-http-referer)
 
 (require 'url-gw)
 (require 'url-parse)
@@ -238,6 +239,34 @@ url-http--user-agent-default-string
                                  emacs-info os-info))
                  " ")))
 
+(defun url-http--get-referer (url)
+  (url-http-debug "getting referer from buffer: buffer:%S target-url:%S lastloc:%S" (current-buffer) url url-current-lastloc)
+  (when url-current-lastloc
+    (if (not (url-p url-current-lastloc))
+        (setq url-current-lastloc (url-generic-parse-url url-current-lastloc)))
+    (let* ((referer url-current-lastloc)
+           (referer-string (url-recreate-url referer)))
+      (when (and (not (memq url-privacy-level '(low high paranoid)))
+                 (not (and (listp url-privacy-level)
+                           (memq 'lastloc url-privacy-level))))
+        ;; url-privacy-level allows referer.  But url-lastloc-privacy
+        ;; may restrict who we send it to.
+        (cl-case url-lastloc-privacy
+          (host-match
+           (let ((referer-host (url-host referer))
+                 (url-host (url-host url)))
+             (when (string= referer-host url-host)
+               referer-string)))
+          (domain-match
+           (let ((referer-domain (url-domain referer))
+                 (url-domain (url-domain url)))
+             (when (and referer-domain
+                        url-domain
+                        (string= referer-domain url-domain))
+               referer-string)))
+          (otherwise
+           referer-string))))))
+
 ;; Building an HTTP request
 (defun url-http-user-agent-string ()
   "Compute a User-Agent string.
@@ -254,8 +283,9 @@ url-http-user-agent-string
                 ((eq url-user-agent 'default) (url-http--user-agent-default-string))))))
     (if ua-string (format "User-Agent: %s\r\n" (string-trim ua-string)) "")))
 
-(defun url-http-create-request (&optional ref-url)
-  "Create an HTTP request for `url-http-target-url', referred to by REF-URL."
+(defun url-http-create-request ()
+  "Create an HTTP request for `url-http-target-url', using `url-http-referer'
+as the Referer-header (subject to `url-privacy-level'."
   (let* ((extra-headers)
 	 (request nil)
 	 (no-cache (cdr-safe (assoc "Pragma" url-http-extra-headers)))
@@ -274,7 +304,8 @@ url-http-create-request
 		 (url-get-authentication (or
 					  (and (boundp 'proxy-info)
 					       proxy-info)
-					  url-http-target-url) nil 'any nil))))
+					  url-http-target-url) nil 'any nil)))
+         (ref-url url-http-referer))
     (if (equal "" real-fname)
 	(setq real-fname "/"))
     (setq no-cache (and no-cache (string-match "no-cache" no-cache)))
@@ -288,12 +319,6 @@ url-http-create-request
 					   (string= ref-url "")))
 	(setq ref-url nil))
 
-    ;; We do not want to expose the referrer if the user is paranoid.
-    (if (or (memq url-privacy-level '(low high paranoid))
-	    (and (listp url-privacy-level)
-		 (memq 'lastloc url-privacy-level)))
-	(setq ref-url nil))
-
     ;; url-http-extra-headers contains an assoc-list of
     ;; header/value pairs that we need to put into the request.
     (setq extra-headers (mapconcat
@@ -1255,7 +1280,8 @@ url-http
          (mime-accept-string url-mime-accept-string)
 	 (buffer (or retry-buffer
 		     (generate-new-buffer
-                      (format " *http %s:%d*" (url-host url) (url-port url))))))
+                      (format " *http %s:%d*" (url-host url) (url-port url)))))
+         (referer (url-http--get-referer url)))
     (if (not connection)
 	;; Failed to open the connection for some reason
 	(progn
@@ -1290,7 +1316,8 @@ url-http
 		       url-http-no-retry
 		       url-http-connection-opened
                        url-mime-accept-string
-		       url-http-proxy))
+		       url-http-proxy
+                       url-http-referer))
 	  (set (make-local-variable var) nil))
 
 	(setq url-http-method (or url-request-method "GET")
@@ -1308,7 +1335,8 @@ url-http
 	      url-http-no-retry retry-buffer
 	      url-http-connection-opened nil
               url-mime-accept-string mime-accept-string
-	      url-http-proxy url-using-proxy)
+	      url-http-proxy url-using-proxy
+              url-http-referer referer)
 
 	(set-process-buffer connection buffer)
 	(set-process-filter connection 'url-http-generic-filter)
diff --git a/lisp/url/url-queue.el b/lisp/url/url-queue.el
index dd1699bd08..db0c16c793 100644
--- a/lisp/url/url-queue.el
+++ b/lisp/url/url-queue.el
@@ -52,7 +52,7 @@ url-queue-progress-timer
 (cl-defstruct url-queue
   url callback cbargs silentp
   buffer start-time pre-triggered
-  inhibit-cookiesp)
+  inhibit-cookiesp context-buffer)
 
 ;;;###autoload
 (defun url-queue-retrieve (url callback &optional cbargs silent inhibit-cookies)
@@ -67,7 +67,8 @@ url-queue-retrieve
 				      :callback callback
 				      :cbargs cbargs
 				      :silentp silent
-				      :inhibit-cookiesp inhibit-cookies))))
+				      :inhibit-cookiesp inhibit-cookies
+                                      :context-buffer (current-buffer)))))
   (url-queue-setup-runners))
 
 ;; To ensure asynch behavior, we start the required number of queue
@@ -147,11 +148,14 @@ url-queue-remove-jobs-from-host
 (defun url-queue-start-retrieve (job)
   (setf (url-queue-buffer job)
 	(ignore-errors
-	  (let ((url-request-noninteractive t))
-	    (url-retrieve (url-queue-url job)
-			  #'url-queue-callback-function (list job)
-			  (url-queue-silentp job)
-			  (url-queue-inhibit-cookiesp job))))))
+          (with-current-buffer (if (buffer-live-p (url-queue-context-buffer job))
+                                   (url-queue-context-buffer job)
+                                 (current-buffer))
+	   (let ((url-request-noninteractive t))
+             (url-retrieve (url-queue-url job)
+                           #'url-queue-callback-function (list job)
+                           (url-queue-silentp job)
+                           (url-queue-inhibit-cookiesp job)))))))
 
 (defun url-queue-prune-old-entries ()
   (let (dead-jobs)
diff --git a/lisp/url/url-util.el b/lisp/url/url-util.el
index a3c9655ebd..6be0a05314 100644
--- a/lisp/url/url-util.el
+++ b/lisp/url/url-util.el
@@ -627,6 +627,35 @@ url-make-private-file
          (error "Danger: `%s' is a symbolic link" file))
      (set-file-modes file #o0600))))
 
+(autoload 'dns-query "dns")
+
+(defvar url--domain-cache (make-hash-table :test 'equal :size 17)
+  "Cache to minimize dns lookups.")
+
+;;;###autoload
+(defun url-domain (url)
+  "Return the domain of the host of the url, or nil if url does
+not contain a registered name."
+  ;; Determining the domain of a name can not be done with simple
+  ;; textual manipulations.  a.b.c is either host a in domain b.c
+  ;; (www.google.com), or domain a.b.c with no separate host
+  ;; (bbc.co.uk).  Instead of guessing based on tld (which in any case
+  ;; may be inaccurate in the face of subdelegations), we look for
+  ;; domain delegations in DNS.
+  ;;
+  ;; Domain delegations change rarely enough that we won't bother with
+  ;; cache invalidation, I think.
+  (let* ((host-parts (split-string (url-host url) "\\."))
+         (result (gethash host-parts url--domain-cache 'not-found)))
+    (when (eq result 'not-found)
+      (setq result
+            (cl-loop for parts on host-parts
+                     for dom = (mapconcat #'identity parts ".")
+                     when (dns-query dom 'SOA)
+                     return dom))
+      (puthash host-parts result url--domain-cache))
+    result))
+
 (provide 'url-util)
 
 ;;; url-util.el ends here
diff --git a/lisp/url/url-vars.el b/lisp/url/url-vars.el
index f08779f695..41ed6d57d8 100644
--- a/lisp/url/url-vars.el
+++ b/lisp/url/url-vars.el
@@ -60,10 +60,18 @@ url-current-object
 (defvar url-current-mime-headers nil
   "A parsed representation of the MIME headers for the current URL.")
 
+(defvar url-current-lastloc nil
+  "A parsed representation of the URL to be considered as the last location.
+Use of this value on outbound connections is subject to
+`url-privacy-level' and `url-lastloc-privacy'.  This is never set
+by the url library, applications are expected to set this
+variable in buffers representing a displayed location.")
+
 (mapc 'make-variable-buffer-local
       '(
 	url-current-object
 	url-current-mime-headers
+        url-current-lastloc
 	))
 
 (defcustom url-honor-refresh-requests t
@@ -117,7 +125,7 @@ url-privacy-level
 email    -- the email address
 os       -- the operating system info
 emacs    -- the version of Emacs
-lastloc  -- the last location
+lastloc  -- the last location (see also `url-lastloc-privacy')
 agent    -- do not send the User-Agent string
 cookies  -- never accept HTTP cookies
 
@@ -150,6 +158,24 @@ url-privacy-level
 			   (const :tag "No cookies" :value cookie)))
   :group 'url)
 
+(defcustom url-lastloc-privacy 'domain-match
+  "Further restrictions on sending the last location.
+This value is only consulted if `url-privacy-level' permits
+sending last location in the first place.
+
+Valid values are:
+none          -- Always send last location.
+domain-match  -- Send last location if the new location is within the
+                 same domain
+host-match    -- Send last location if the new location is on the
+                 same host
+"
+  :version "26.1"
+  :type '(radio (const :tag "Always send" none)
+                (const :tag "Domains match" domain-match)
+                (const :tag "Hosts match" host-match))
+  :group 'url)
+
 (defvar url-inhibit-uncompression nil "Do not do decompression if non-nil.")
 
 (defcustom url-uncompressor-alist '((".z"  . "x-gzip")
-- 
2.11.0


[-- Attachment #3: Type: text/plain, Size: 16 bytes --]


--
...Peder...

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* bug#27012: 26.0.50; eww does not generate Referer headers
  2017-07-12 23:03         ` Peder O. Klingenberg
@ 2017-07-13  7:16           ` Andreas Schwab
  2017-07-13 14:02             ` Peder O. Klingenberg
  0 siblings, 1 reply; 12+ messages in thread
From: Andreas Schwab @ 2017-07-13  7:16 UTC (permalink / raw)
  To: Peder O. Klingenberg; +Cc: 27012

On Jul 13 2017, peder@klingenberg.no (Peder O. Klingenberg) wrote:

> With the basic mechanism for sending the Referer header then working, I
> looked at limiting the distribution of it.  url-privacy-level already
> existed, and had a basic on-off-knob for referrers, or "lastloc", as it's
> called in that variable.  I left that alone, but added an additional
> user option - url-lastloc-privacy, with possible values "none",

Perhaps url-lastloc-privacy-level, to rhyme with url-privacy-level.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#27012: 26.0.50; eww does not generate Referer headers
  2017-07-13  7:16           ` Andreas Schwab
@ 2017-07-13 14:02             ` Peder O. Klingenberg
  2018-04-13 13:09               ` Lars Ingebrigtsen
  0 siblings, 1 reply; 12+ messages in thread
From: Peder O. Klingenberg @ 2017-07-13 14:02 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: 27012

[-- Attachment #1: Type: text/plain, Size: 150 bytes --]

Andreas Schwab <schwab@suse.de> writes:

> Perhaps url-lastloc-privacy-level, to rhyme with url-privacy-level.

Sure, that's better.  Updated patch:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Make-eww-optionally-send-Referer-headers.patch --]
[-- Type: text/x-patch, Size: 15053 bytes --]

From 41f5017e49ff98ef5727eb1aa855848ea352b3c4 Mon Sep 17 00:00:00 2001
From: "Peder O. Klingenberg" <peder@klingenberg.no>
Date: Thu, 13 Jul 2017 00:27:58 +0200
Subject: [PATCH] Make eww (optionally) send Referer-headers

* lisp/net/eww.el (eww-render): Set url-current-lastloc to the
url we are rendering, to get the referer header right on
subsequent requests.

* lisp/url/url-queue.el (url-queue): New struct member
context-buffer for keeping track of the context a queued job
started from.
(url-queue-retrieve): Store the current buffer in the queue
object.
(url-queue-start-retrieve): Make sure url-retrieve is called
in the context of the original buffer, if available.

* doc/misc/url.texi (Customization): Describe the new user
option url-lastloc-privacy-level.

* lisp/url/url-http.el (url-http--get-referer): New function
to determine which referer to send, if any, considering the
users privacy settings and the target url we are visiting.
(url-http-referer): New variable keeping track of the referer
computed by url-http--get-referer
(url-http-create-request): Use url-http-referer instead of the
optional argument to set up the referer header.  Leave
checking of privacy settings to url-http--get-referer.
(url-http): Set up url-http-referer by using
url-http--get-referer.

* lisp/url/url-util.el (url-domain): New function to determine
the domain of a given URL.

* lisp/url/url-vars.el (url-current-lastloc): New variable to
keep track of the desired "last location" (referer header).
(url-lastloc-privacy-level): New custom setting for more
fine-grained control over how lastloc (referer) is sent to
servers.

(Bug#27012)
---
 doc/misc/url.texi     | 14 ++++++++++++++
 etc/NEWS              | 16 ++++++++++++++++
 lisp/net/eww.el       |  7 +++++--
 lisp/url/url-http.el  | 52 +++++++++++++++++++++++++++++++++++++++------------
 lisp/url/url-queue.el | 18 +++++++++++-------
 lisp/url/url-util.el  | 29 ++++++++++++++++++++++++++++
 lisp/url/url-vars.el  | 28 ++++++++++++++++++++++++++-
 7 files changed, 142 insertions(+), 22 deletions(-)

diff --git a/doc/misc/url.texi b/doc/misc/url.texi
index a3c625edce..4c9fab569a 100644
--- a/doc/misc/url.texi
+++ b/doc/misc/url.texi
@@ -1303,6 +1303,20 @@ Customization
 @end defopt
 @defopt url-privacy-level
 @end defopt
+@defopt url-lastloc-privacy-level
+Provided @code{lastloc} is not prohibited by @code{url-privacy-level},
+this determines who we send our last location to.  @code{none} means
+we include our last location in every outgoing request.
+@code{domain-match} means we send it only if the domain of our last
+location matches the domain of the URI we are requesting.
+@code{host-match} means we only send our last location back to the
+same host.  The default is @code{domain-match}.
+
+Using @code{domain-match} for this option requires emacs to make one
+or more DNS requests each time a new host is contacted, to determine
+the domain of the host.  Results of these lookups are cached, so
+repeated visits do not require repeated domain lookups.
+@end defopt
 @defopt url-uncompressor-alist
 @end defopt
 @defopt url-passwd-entry-func
diff --git a/etc/NEWS b/etc/NEWS
index dc9393c87d..7a4de80843 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -653,6 +653,11 @@ replaced by the real images asynchronously, which will also now
 respect width/height HTML specs (unless they specify widths/heights
 bigger than the current window).
 
+---
+*** EWW now sends Referer headers
+Provided they are allowed by 'url-privacy-level' and
+'url-lastloc-privacy-level'.
+
 ** Ido
 
 *** The commands 'find-alternate-file-other-window',
@@ -871,6 +876,17 @@ domain.
 *** 'url-user-agent' now defaults to 'default', and the User-Agent
 string is computed dynamically based on 'url-privacy-level'.
 
+---
+*** url-http now uses a buffer local variable to determine the referer
+for a request.  Previously, there was an optional argument to
+'url-http-create-request' to set up a referer, but no callers were
+using it.  Now, callers can set up 'url-current-lastloc' in a buffer
+before calling 'url-retrieve'.
+
++++
+*** New user option 'url-lastloc-privacy-level' providing fine-grained
+control over who we send referer-headers to.
+
 ** VC and related modes
 
 ---
diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index fe31657914..e922b4f834 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -271,7 +271,7 @@ eww
     (insert (format "Loading %s..." url))
     (goto-char (point-min)))
   (url-retrieve url 'eww-render
-		(list url nil (current-buffer))))
+                (list url nil (current-buffer))))
 
 (defun eww--dwim-expand-url (url)
   (setq url (string-trim url))
@@ -359,7 +359,10 @@ eww-render
       ;; Save the https peer status.
       (plist-put eww-data :peer (plist-get status :peer))
       ;; Make buffer listings more informative.
-      (setq list-buffers-directory url))
+      (setq list-buffers-directory url)
+      ;; Let the URL library have a handle to the current URL for
+      ;; referer purposes.
+      (setq url-current-lastloc (url-generic-parse-url url)))
     (unwind-protect
 	(progn
 	  (cond
diff --git a/lisp/url/url-http.el b/lisp/url/url-http.el
index 06d32861b2..55d725a477 100644
--- a/lisp/url/url-http.el
+++ b/lisp/url/url-http.el
@@ -54,6 +54,7 @@ url-http-response-version
 (defvar url-http-target-url)
 (defvar url-http-transfer-encoding)
 (defvar url-show-status)
+(defvar url-http-referer)
 
 (require 'url-gw)
 (require 'url-parse)
@@ -238,6 +239,34 @@ url-http--user-agent-default-string
                                  emacs-info os-info))
                  " ")))
 
+(defun url-http--get-referer (url)
+  (url-http-debug "getting referer from buffer: buffer:%S target-url:%S lastloc:%S" (current-buffer) url url-current-lastloc)
+  (when url-current-lastloc
+    (if (not (url-p url-current-lastloc))
+        (setq url-current-lastloc (url-generic-parse-url url-current-lastloc)))
+    (let* ((referer url-current-lastloc)
+           (referer-string (url-recreate-url referer)))
+      (when (and (not (memq url-privacy-level '(low high paranoid)))
+                 (not (and (listp url-privacy-level)
+                           (memq 'lastloc url-privacy-level))))
+        ;; url-privacy-level allows referer.  But url-lastloc-privacy-level
+        ;; may restrict who we send it to.
+        (cl-case url-lastloc-privacy-level
+          (host-match
+           (let ((referer-host (url-host referer))
+                 (url-host (url-host url)))
+             (when (string= referer-host url-host)
+               referer-string)))
+          (domain-match
+           (let ((referer-domain (url-domain referer))
+                 (url-domain (url-domain url)))
+             (when (and referer-domain
+                        url-domain
+                        (string= referer-domain url-domain))
+               referer-string)))
+          (otherwise
+           referer-string))))))
+
 ;; Building an HTTP request
 (defun url-http-user-agent-string ()
   "Compute a User-Agent string.
@@ -254,8 +283,9 @@ url-http-user-agent-string
                 ((eq url-user-agent 'default) (url-http--user-agent-default-string))))))
     (if ua-string (format "User-Agent: %s\r\n" (string-trim ua-string)) "")))
 
-(defun url-http-create-request (&optional ref-url)
-  "Create an HTTP request for `url-http-target-url', referred to by REF-URL."
+(defun url-http-create-request ()
+  "Create an HTTP request for `url-http-target-url', using `url-http-referer'
+as the Referer-header (subject to `url-privacy-level'."
   (let* ((extra-headers)
 	 (request nil)
 	 (no-cache (cdr-safe (assoc "Pragma" url-http-extra-headers)))
@@ -274,7 +304,8 @@ url-http-create-request
 		 (url-get-authentication (or
 					  (and (boundp 'proxy-info)
 					       proxy-info)
-					  url-http-target-url) nil 'any nil))))
+					  url-http-target-url) nil 'any nil)))
+         (ref-url url-http-referer))
     (if (equal "" real-fname)
 	(setq real-fname "/"))
     (setq no-cache (and no-cache (string-match "no-cache" no-cache)))
@@ -288,12 +319,6 @@ url-http-create-request
 					   (string= ref-url "")))
 	(setq ref-url nil))
 
-    ;; We do not want to expose the referrer if the user is paranoid.
-    (if (or (memq url-privacy-level '(low high paranoid))
-	    (and (listp url-privacy-level)
-		 (memq 'lastloc url-privacy-level)))
-	(setq ref-url nil))
-
     ;; url-http-extra-headers contains an assoc-list of
     ;; header/value pairs that we need to put into the request.
     (setq extra-headers (mapconcat
@@ -1255,7 +1280,8 @@ url-http
          (mime-accept-string url-mime-accept-string)
 	 (buffer (or retry-buffer
 		     (generate-new-buffer
-                      (format " *http %s:%d*" (url-host url) (url-port url))))))
+                      (format " *http %s:%d*" (url-host url) (url-port url)))))
+         (referer (url-http--get-referer url)))
     (if (not connection)
 	;; Failed to open the connection for some reason
 	(progn
@@ -1290,7 +1316,8 @@ url-http
 		       url-http-no-retry
 		       url-http-connection-opened
                        url-mime-accept-string
-		       url-http-proxy))
+		       url-http-proxy
+                       url-http-referer))
 	  (set (make-local-variable var) nil))
 
 	(setq url-http-method (or url-request-method "GET")
@@ -1308,7 +1335,8 @@ url-http
 	      url-http-no-retry retry-buffer
 	      url-http-connection-opened nil
               url-mime-accept-string mime-accept-string
-	      url-http-proxy url-using-proxy)
+	      url-http-proxy url-using-proxy
+              url-http-referer referer)
 
 	(set-process-buffer connection buffer)
 	(set-process-filter connection 'url-http-generic-filter)
diff --git a/lisp/url/url-queue.el b/lisp/url/url-queue.el
index dd1699bd08..db0c16c793 100644
--- a/lisp/url/url-queue.el
+++ b/lisp/url/url-queue.el
@@ -52,7 +52,7 @@ url-queue-progress-timer
 (cl-defstruct url-queue
   url callback cbargs silentp
   buffer start-time pre-triggered
-  inhibit-cookiesp)
+  inhibit-cookiesp context-buffer)
 
 ;;;###autoload
 (defun url-queue-retrieve (url callback &optional cbargs silent inhibit-cookies)
@@ -67,7 +67,8 @@ url-queue-retrieve
 				      :callback callback
 				      :cbargs cbargs
 				      :silentp silent
-				      :inhibit-cookiesp inhibit-cookies))))
+				      :inhibit-cookiesp inhibit-cookies
+                                      :context-buffer (current-buffer)))))
   (url-queue-setup-runners))
 
 ;; To ensure asynch behavior, we start the required number of queue
@@ -147,11 +148,14 @@ url-queue-remove-jobs-from-host
 (defun url-queue-start-retrieve (job)
   (setf (url-queue-buffer job)
 	(ignore-errors
-	  (let ((url-request-noninteractive t))
-	    (url-retrieve (url-queue-url job)
-			  #'url-queue-callback-function (list job)
-			  (url-queue-silentp job)
-			  (url-queue-inhibit-cookiesp job))))))
+          (with-current-buffer (if (buffer-live-p (url-queue-context-buffer job))
+                                   (url-queue-context-buffer job)
+                                 (current-buffer))
+	   (let ((url-request-noninteractive t))
+             (url-retrieve (url-queue-url job)
+                           #'url-queue-callback-function (list job)
+                           (url-queue-silentp job)
+                           (url-queue-inhibit-cookiesp job)))))))
 
 (defun url-queue-prune-old-entries ()
   (let (dead-jobs)
diff --git a/lisp/url/url-util.el b/lisp/url/url-util.el
index a3c9655ebd..6be0a05314 100644
--- a/lisp/url/url-util.el
+++ b/lisp/url/url-util.el
@@ -627,6 +627,35 @@ url-make-private-file
          (error "Danger: `%s' is a symbolic link" file))
      (set-file-modes file #o0600))))
 
+(autoload 'dns-query "dns")
+
+(defvar url--domain-cache (make-hash-table :test 'equal :size 17)
+  "Cache to minimize dns lookups.")
+
+;;;###autoload
+(defun url-domain (url)
+  "Return the domain of the host of the url, or nil if url does
+not contain a registered name."
+  ;; Determining the domain of a name can not be done with simple
+  ;; textual manipulations.  a.b.c is either host a in domain b.c
+  ;; (www.google.com), or domain a.b.c with no separate host
+  ;; (bbc.co.uk).  Instead of guessing based on tld (which in any case
+  ;; may be inaccurate in the face of subdelegations), we look for
+  ;; domain delegations in DNS.
+  ;;
+  ;; Domain delegations change rarely enough that we won't bother with
+  ;; cache invalidation, I think.
+  (let* ((host-parts (split-string (url-host url) "\\."))
+         (result (gethash host-parts url--domain-cache 'not-found)))
+    (when (eq result 'not-found)
+      (setq result
+            (cl-loop for parts on host-parts
+                     for dom = (mapconcat #'identity parts ".")
+                     when (dns-query dom 'SOA)
+                     return dom))
+      (puthash host-parts result url--domain-cache))
+    result))
+
 (provide 'url-util)
 
 ;;; url-util.el ends here
diff --git a/lisp/url/url-vars.el b/lisp/url/url-vars.el
index f08779f695..5629f974a0 100644
--- a/lisp/url/url-vars.el
+++ b/lisp/url/url-vars.el
@@ -60,10 +60,18 @@ url-current-object
 (defvar url-current-mime-headers nil
   "A parsed representation of the MIME headers for the current URL.")
 
+(defvar url-current-lastloc nil
+  "A parsed representation of the URL to be considered as the last location.
+Use of this value on outbound connections is subject to
+`url-privacy-level' and `url-lastloc-privacy-level'.  This is never set
+by the url library, applications are expected to set this
+variable in buffers representing a displayed location.")
+
 (mapc 'make-variable-buffer-local
       '(
 	url-current-object
 	url-current-mime-headers
+        url-current-lastloc
 	))
 
 (defcustom url-honor-refresh-requests t
@@ -117,7 +125,7 @@ url-privacy-level
 email    -- the email address
 os       -- the operating system info
 emacs    -- the version of Emacs
-lastloc  -- the last location
+lastloc  -- the last location (see also `url-lastloc-privacy-level')
 agent    -- do not send the User-Agent string
 cookies  -- never accept HTTP cookies
 
@@ -150,6 +158,24 @@ url-privacy-level
 			   (const :tag "No cookies" :value cookie)))
   :group 'url)
 
+(defcustom url-lastloc-privacy-level 'domain-match
+  "Further restrictions on sending the last location.
+This value is only consulted if `url-privacy-level' permits
+sending last location in the first place.
+
+Valid values are:
+none          -- Always send last location.
+domain-match  -- Send last location if the new location is within the
+                 same domain
+host-match    -- Send last location if the new location is on the
+                 same host
+"
+  :version "26.1"
+  :type '(radio (const :tag "Always send" none)
+                (const :tag "Domains match" domain-match)
+                (const :tag "Hosts match" host-match))
+  :group 'url)
+
 (defvar url-inhibit-uncompression nil "Do not do decompression if non-nil.")
 
 (defcustom url-uncompressor-alist '((".z"  . "x-gzip")
-- 
2.11.0


[-- Attachment #3: Type: text/plain, Size: 17 bytes --]


-- 
...Peder...

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* bug#27012: 26.0.50; eww does not generate Referer headers
  2017-07-13 14:02             ` Peder O. Klingenberg
@ 2018-04-13 13:09               ` Lars Ingebrigtsen
  2018-04-13 13:33                 ` Peder O. Klingenberg
  0 siblings, 1 reply; 12+ messages in thread
From: Lars Ingebrigtsen @ 2018-04-13 13:09 UTC (permalink / raw)
  To: Peder O. Klingenberg; +Cc: Andreas Schwab, 27012

peder@klingenberg.no (Peder O. Klingenberg) writes:

> Sure, that's better.  Updated patch:

Great!  Now I can log in on last.fm!  :-)

I've applied this to Emacs master.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#27012: 26.0.50; eww does not generate Referer headers
  2018-04-13 13:09               ` Lars Ingebrigtsen
@ 2018-04-13 13:33                 ` Peder O. Klingenberg
  2018-04-13 13:39                   ` Lars Ingebrigtsen
  0 siblings, 1 reply; 12+ messages in thread
From: Peder O. Klingenberg @ 2018-04-13 13:33 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Andreas Schwab, 27012

On Fri, Apr 13 2018 at 15:09, Lars Ingebrigtsen wrote:

> peder@klingenberg.no (Peder O. Klingenberg) writes:
>
>> Sure, that's better.  Updated patch:
>
> Great!  Now I can log in on last.fm!  :-)
>
> I've applied this to Emacs master.

Thanks.  But the patch has been sitting for a long time, so I think
maybe the :version of the defcustom needs to be bumped to 27?  I don't
think this is likely to be backported to emacs 26.

...Peder...
-- 
I wish a new life awaited _me_ in some off-world colony.






^ permalink raw reply	[flat|nested] 12+ messages in thread

* bug#27012: 26.0.50; eww does not generate Referer headers
  2018-04-13 13:33                 ` Peder O. Klingenberg
@ 2018-04-13 13:39                   ` Lars Ingebrigtsen
  0 siblings, 0 replies; 12+ messages in thread
From: Lars Ingebrigtsen @ 2018-04-13 13:39 UTC (permalink / raw)
  To: Peder O. Klingenberg; +Cc: Andreas Schwab, 27012

peder@klingenberg.no (Peder O. Klingenberg) writes:

> Thanks.  But the patch has been sitting for a long time, so I think
> maybe the :version of the defcustom needs to be bumped to 27?  I don't
> think this is likely to be backported to emacs 26.

Yup; I'll update the defcustom...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-04-13 13:39 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-21 21:41 bug#27012: 26.0.50; eww does not generate Referer headers Lars Ingebrigtsen
2017-05-22 12:18 ` Richard Stallman
2017-05-22 12:28   ` Lars Ingebrigtsen
2017-05-22 16:37     ` Glenn Morris
2017-05-22 16:53       ` Lars Ingebrigtsen
2017-07-12 23:03         ` Peder O. Klingenberg
2017-07-13  7:16           ` Andreas Schwab
2017-07-13 14:02             ` Peder O. Klingenberg
2018-04-13 13:09               ` Lars Ingebrigtsen
2018-04-13 13:33                 ` Peder O. Klingenberg
2018-04-13 13:39                   ` Lars Ingebrigtsen
2017-05-23  0:54     ` Richard Stallman

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).