unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Your last change to browse-url is bogus.
@ 2007-09-12  9:13 Michaël Cadilhac
  2007-09-12 10:20 ` Johannes Weiner
  0 siblings, 1 reply; 18+ messages in thread
From: Michaël Cadilhac @ 2007-09-12  9:13 UTC (permalink / raw)
  To: emacs-devel; +Cc: Johannes Weiner


[-- Attachment #1.1.1: Type: text/plain, Size: 1113 bytes --]

This change:

+2007-09-07  Johannes Weiner  <hannes@saeurebad.de>
+
+	* net/browse-url.el (browse-url-browser-function): Add elinks.
+	(browse-url-elinks-wrapper): New option.
+	(browse-url-encode-url, browse-url-elinks)
+	(browse-url-elinks-sentinel): New functions.
+	(browse-url-file-url, browse-url-netscape, browse-url-mozilla)
+	(browse-url-firefox, browse-url-galeon, browse-url-epiphany): Use
+	new function browse-url-encode-url.
+

(the addition of browse-url-encode-url) creates some bugs :

1. This:

    (while (string-match "%" encoded-url)
      (setq encoded-url (replace-match "%25" t t encoded-url)))

is an infinite loop.

2. The URL 

"http://www.benzedrine.cx/lookandsay?SAME%20AS%201069"

for example, is translated to

"http://www.benzedrine.cx/lookandsay%3fSAME%2520AS%25201069"

which is utterly bogus, because the original %20 are no longer replaced
by ` ' by the server and the `?' will not be parsed.

I'd suggest to revert this change or, if it's for the sake of code
factoring, (what was the first purpose, by the way?) use something like
this:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.1.2: browse-url.patch --]
[-- Type: text/x-patch, Size: 4020 bytes --]

--- browse-url.el	12 Sep 2007 10:49:04 +0200	1.61
+++ browse-url.el	12 Sep 2007 11:09:27 +0200	
@@ -619,12 +619,16 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;; URL encoding
 
-(defun browse-url-encode-url (url)
-  "Encode all `confusing' characters in URL."
-  (let ((encoded-url (copy-sequence url)))
-    (while (string-match "%" encoded-url)
-      (setq encoded-url (replace-match "%25" t t encoded-url)))
-    (while (string-match "[*\"()',=;? ]" encoded-url)
+(defun browse-url-encode-url (url chars &optional encode-percent)
+  "Encode all `confusing' characters denoted by the regexp CHARS in URL.
+If ENCODE-PERCENT is non-nil, consider `%' as a confusing char."
+  (let ((s 0)
+	(encoded-url (copy-sequence url)))
+    (when encode-percent
+      (while (setq s (string-match "%" encoded-url s))
+	(setq encoded-url (replace-match "%25" t t encoded-url)
+	      s (1+ s))))
+    (while (string-match chars encoded-url)
       (setq encoded-url
 	    (replace-match (format "%%%x"
 				   (string-to-char (match-string 0 encoded-url)))
@@ -703,7 +707,7 @@
 		     (or file-name-coding-system
 			 default-file-name-coding-system))))
     (if coding (setq file (encode-coding-string file coding))))
-  (setq file (browse-url-encode-url file))
+  (setq file (browse-url-encode-url file "[*\"()',=;? ]" 'encode-percent))
   (dolist (map browse-url-filename-alist)
     (when (and map (string-match (car map) file))
       (setq file (replace-match (cdr map) t nil file))))
@@ -909,7 +913,7 @@
 When called non-interactively, optional second argument NEW-WINDOW is
 used instead of `browse-url-new-window-flag'."
   (interactive (browse-url-interactive-arg "URL: "))
-  (setq url (browse-url-encode-url url))
+  (setq url (browse-url-encode-url url "[,)$]"))
   (let* ((process-environment (browse-url-process-environment))
 	 (process
 	  (apply 'start-process
@@ -979,7 +983,7 @@
 When called non-interactively, optional second argument NEW-WINDOW is
 used instead of `browse-url-new-window-flag'."
   (interactive (browse-url-interactive-arg "URL: "))
-  (setq url (browse-url-encode-url url))
+  (setq url (browse-url-encode-url url "[,)$]"))
   (let* ((process-environment (browse-url-process-environment))
          (process
 	  (apply 'start-process
@@ -1037,7 +1041,7 @@
 are ignored as well.  Firefox on Windows will always open the requested
 URL in a new window."
   (interactive (browse-url-interactive-arg "URL: "))
-  (setq url (browse-url-encode-url url))
+  (setq url (browse-url-encode-url url "[,)$]"))
   (let* ((process-environment (browse-url-process-environment))
 	 (process
 	  (apply 'start-process
@@ -1089,7 +1093,7 @@
 When called non-interactively, optional second argument NEW-WINDOW is
 used instead of `browse-url-new-window-flag'."
   (interactive (browse-url-interactive-arg "URL: "))
-  (setq url (browse-url-encode-url url))
+  (setq url (browse-url-encode-url url "[,)$]"))
   (let* ((process-environment (browse-url-process-environment))
          (process (apply 'start-process
 			 (concat "galeon " url)
@@ -1134,7 +1138,7 @@
 When called non-interactively, optional second argument NEW-WINDOW is
 used instead of `browse-url-new-window-flag'."
   (interactive (browse-url-interactive-arg "URL: "))
-  (setq url (browse-url-encode-url url))
+  (setq url (browse-url-encode-url url "[,)$]"))
   (let* ((process-environment (browse-url-process-environment))
          (process (apply 'start-process
 			 (concat "epiphany " url)
@@ -1520,7 +1524,7 @@
 The Elinks command will be prepended by the program+arguments
 from `elinks-browse-url-wrapper'."
   (interactive (browse-url-interactive-arg "URL: "))
-  (setq url (browse-url-encode-url url))
+  (setq url (browse-url-encode-url url "[,)$]"))
   (let ((process-environment (browse-url-process-environment))
 	(elinks-ping-process (start-process "elinks-ping" nil
 					     "elinks" "-remote" "ping()")))

[-- Attachment #1.1.3: Type: text/plain, Size: 329 bytes --]



-- 
 |   Michaël `Micha' Cadilhac       |  And please suggest to him that        |
 |   http://michael.cadilhac.name   |    he not refer to Microsoft Windows   |
 |   JID/MSN:                       |                  as "win".             |
 `----  michael.cadilhac@gmail.com  |          -- RMS                   -  --'

[-- Attachment #1.2: Type: application/pgp-signature, Size: 188 bytes --]

[-- Attachment #2: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-12  9:13 Your last change to browse-url is bogus Michaël Cadilhac
@ 2007-09-12 10:20 ` Johannes Weiner
  2007-09-12 10:38   ` Michaël Cadilhac
  0 siblings, 1 reply; 18+ messages in thread
From: Johannes Weiner @ 2007-09-12 10:20 UTC (permalink / raw)
  To: Michaël Cadilhac; +Cc: emacs-devel


[-- Attachment #1.1: Type: text/plain, Size: 1273 bytes --]

Hi Micha,

On Wed, Sep 12, 2007 at 11:13:42AM +0200, Michaël Cadilhac wrote:
> I'd suggest to revert this change or, if it's for the sake of code
> factoring, (what was the first purpose, by the way?)

Yes, code factoring.

> use something like this:

> --- browse-url.el	12 Sep 2007 10:49:04 +0200	1.61
> +++ browse-url.el	12 Sep 2007 11:09:27 +0200	
[...]
> -  (setq file (browse-url-encode-url file))
> +  (setq file (browse-url-encode-url file "[*\"()',=;? ]" 'encode-percent))
[...]
> -  (setq url (browse-url-encode-url url))
> +  (setq url (browse-url-encode-url url "[,)$]"))
[...]
> -  (setq url (browse-url-encode-url url))
> +  (setq url (browse-url-encode-url url "[,)$]"))
[...]
> -  (setq url (browse-url-encode-url url))
> +  (setq url (browse-url-encode-url url "[,)$]"))
[...]
> -  (setq url (browse-url-encode-url url))
> +  (setq url (browse-url-encode-url url "[,)$]"))
[...]
> -  (setq url (browse-url-encode-url url))
> +  (setq url (browse-url-encode-url url "[,)$]"))
[...]
> -  (setq url (browse-url-encode-url url))
> +  (setq url (browse-url-encode-url url "[,)$]"))

These use mostly the same argument.  Can't we generalize this?  Would it hurt
the callsites if they all would use "[*\"()',=;? ]"?

	Hannes

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

[-- Attachment #2: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-12 10:20 ` Johannes Weiner
@ 2007-09-12 10:38   ` Michaël Cadilhac
  2007-09-12 10:49     ` Johannes Weiner
  2007-09-12 12:54     ` YAMAMOTO Mitsuharu
  0 siblings, 2 replies; 18+ messages in thread
From: Michaël Cadilhac @ 2007-09-12 10:38 UTC (permalink / raw)
  To: emacs-devel


[-- Attachment #1.1: Type: text/plain, Size: 1982 bytes --]

Johannes Weiner <hannes@saeurebad.de> writes:

> On Wed, Sep 12, 2007 at 11:13:42AM +0200, Michaël Cadilhac wrote:

>> use something like this:
>
>> --- browse-url.el	12 Sep 2007 10:49:04 +0200	1.61
>> +++ browse-url.el	12 Sep 2007 11:09:27 +0200	
> [...]
>> -  (setq file (browse-url-encode-url file))
>> +  (setq file (browse-url-encode-url file "[*\"()',=;? ]" 'encode-percent))
> [...]
>> -  (setq url (browse-url-encode-url url))
>> +  (setq url (browse-url-encode-url url "[,)$]"))
> [...]
>> -  (setq url (browse-url-encode-url url))
>> +  (setq url (browse-url-encode-url url "[,)$]"))
> [...]
>> -  (setq url (browse-url-encode-url url))
>> +  (setq url (browse-url-encode-url url "[,)$]"))
> [...]
>> -  (setq url (browse-url-encode-url url))
>> +  (setq url (browse-url-encode-url url "[,)$]"))
> [...]
>> -  (setq url (browse-url-encode-url url))
>> +  (setq url (browse-url-encode-url url "[,)$]"))
> [...]
>> -  (setq url (browse-url-encode-url url))
>> +  (setq url (browse-url-encode-url url "[,)$]"))
>
> These use mostly the same argument.  Can't we generalize this?  Would it hurt
> the callsites if they all would use "[*\"()',=;? ]"?

Yes, it will.  A ``confusing char'' is just something Firefox or others
can consider as a URL separator or as a variable or something when the
website is passed as an argument to the executable, AFAIU.

`?=*' for example are not usually ``confusing''.  The only place those
chars are to be converted is when we browse for a file (thus when `?='
don't have their special meanings).

In the other cases, removing those chars destroys the meaning of the
URL.

-- 
 |   Michaël `Micha' Cadilhac       |  And please suggest to him that        |
 |   http://michael.cadilhac.name   |    he not refer to Microsoft Windows   |
 |   JID/MSN:                       |                  as "win".             |
 `----  michael.cadilhac@gmail.com  |          -- RMS                   -  --'

[-- Attachment #1.2: Type: application/pgp-signature, Size: 188 bytes --]

[-- Attachment #2: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-12 10:38   ` Michaël Cadilhac
@ 2007-09-12 10:49     ` Johannes Weiner
  2007-09-12 11:09       ` Michaël Cadilhac
  2007-09-12 12:54     ` YAMAMOTO Mitsuharu
  1 sibling, 1 reply; 18+ messages in thread
From: Johannes Weiner @ 2007-09-12 10:49 UTC (permalink / raw)
  To: Michaël Cadilhac; +Cc: emacs-devel


[-- Attachment #1.1: Type: text/plain, Size: 1834 bytes --]

Hi Micha,

On Wed, Sep 12, 2007 at 12:38:51PM +0200, Michaël Cadilhac wrote:
> >> --- browse-url.el	12 Sep 2007 10:49:04 +0200	1.61
> >> +++ browse-url.el	12 Sep 2007 11:09:27 +0200	
> > [...]
> >> -  (setq file (browse-url-encode-url file))
> >> +  (setq file (browse-url-encode-url file "[*\"()',=;? ]" 'encode-percent))
> > [...]
> >> -  (setq url (browse-url-encode-url url))
> >> +  (setq url (browse-url-encode-url url "[,)$]"))
> > [...]
> >> -  (setq url (browse-url-encode-url url))
> >> +  (setq url (browse-url-encode-url url "[,)$]"))
> > [...]
> >> -  (setq url (browse-url-encode-url url))
> >> +  (setq url (browse-url-encode-url url "[,)$]"))
> > [...]
> >> -  (setq url (browse-url-encode-url url))
> >> +  (setq url (browse-url-encode-url url "[,)$]"))
> > [...]
> >> -  (setq url (browse-url-encode-url url))
> >> +  (setq url (browse-url-encode-url url "[,)$]"))
> > [...]
> >> -  (setq url (browse-url-encode-url url))
> >> +  (setq url (browse-url-encode-url url "[,)$]"))
> >
> > These use mostly the same argument.  Can't we generalize this?  Would it hurt
> > the callsites if they all would use "[*\"()',=;? ]"?
> 
> Yes, it will.  A ``confusing char'' is just something Firefox or others
> can consider as a URL separator or as a variable or something when the
> website is passed as an argument to the executable, AFAIU.
> 
> `?=*' for example are not usually ``confusing''.  The only place those
> chars are to be converted is when we browse for a file (thus when `?='
> don't have their special meanings).

Ah, okay.  So what about an (&optional filename) for this function?
And if it's true, the character set to be translated is "[*\"()',=;? ]" and
percent is also encoded.  If ommited (nil), just "[,)$]" will be translated.

How does that sound?

	Hannes

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

[-- Attachment #2: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-12 10:49     ` Johannes Weiner
@ 2007-09-12 11:09       ` Michaël Cadilhac
  2007-09-12 11:25         ` Johannes Weiner
  0 siblings, 1 reply; 18+ messages in thread
From: Michaël Cadilhac @ 2007-09-12 11:09 UTC (permalink / raw)
  To: emacs-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 1210 bytes --]

Hi Johannes!

Johannes Weiner <hannes@saeurebad.de> writes:

>> > These use mostly the same argument.  Can't we generalize this?  Would it hurt
>> > the callsites if they all would use "[*\"()',=;? ]"?
>> 
>> Yes, it will.  A ``confusing char'' is just something Firefox or others
>> can consider as a URL separator or as a variable or something when the
>> website is passed as an argument to the executable, AFAIU.
>> 
>> `?=*' for example are not usually ``confusing''.  The only place those
>> chars are to be converted is when we browse for a file (thus when `?='
>> don't have their special meanings).
>
> Ah, okay.  So what about an (&optional filename) for this function?
> And if it's true, the character set to be translated is "[*\"()',=;? ]" and
> percent is also encoded.  If ommited (nil), just "[,)$]" will be translated.
>
> How does that sound?

Yeah, it seems like a good idea : I already added encode-percent, which
had this role but didn't integrate the regexps.  But it's true that if
in 2013 a new web-browser adds a special meaning for `*' in its
executable parameter, it'll not hurt to encode it for all the other
web-browsers.

Great, so we're now here :


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.1.2: browse-url.patch --]
[-- Type: text/x-patch, Size: 1611 bytes --]

--- browse-url.el	12 Sep 2007 10:49:04 +0200	1.61
+++ browse-url.el	12 Sep 2007 13:04:52 +0200	
@@ -619,16 +619,19 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;; URL encoding
 
-(defun browse-url-encode-url (url)
-  "Encode all `confusing' characters in URL."
-  (let ((encoded-url (copy-sequence url)))
-    (while (string-match "%" encoded-url)
-      (setq encoded-url (replace-match "%25" t t encoded-url)))
-    (while (string-match "[*\"()',=;? ]" encoded-url)
+(defun browse-url-encode-url (url &optional filename-p)
+  "Encode all `confusing' characters in URL.
+If FILENAME-P is nil, the confusing characters are [,)$].
+Otherwise, the confusing characters are [*\"()',=;?% ]."
+  (let ((conf-char (if filename-p "[*\"()',=;?% ]" "[,)$]"))
+	(encoded-url (copy-sequence url))
+	(s 0))
+    (while (setq s (string-match conf-char encoded-url s))
       (setq encoded-url
 	    (replace-match (format "%%%x"
 				   (string-to-char (match-string 0 encoded-url)))
-			   t t encoded-url)))
+			   t t encoded-url)
+	    s (1+ s)))
     encoded-url))
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
@@ -703,7 +706,7 @@
 		     (or file-name-coding-system
 			 default-file-name-coding-system))))
     (if coding (setq file (encode-coding-string file coding))))
-  (setq file (browse-url-encode-url file))
+  (setq file (browse-url-encode-url file 'url-is-filename))
   (dolist (map browse-url-filename-alist)
     (when (and map (string-match (car map) file))
       (setq file (replace-match (cdr map) t nil file))))

[-- Attachment #1.1.3: Type: text/plain, Size: 329 bytes --]



-- 
 |   Michaël `Micha' Cadilhac       |   Je veut dire que la loi francaise    |
 |   http://michael.cadilhac.name   |           est overwritable par le      |
 |   JID/MSN:                       |    reglement interieur il me semble.   |
 `----  michael.cadilhac@gmail.com  |          -- ElBarto               -  --'

[-- Attachment #1.2: Type: application/pgp-signature, Size: 188 bytes --]

[-- Attachment #2: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-12 11:09       ` Michaël Cadilhac
@ 2007-09-12 11:25         ` Johannes Weiner
  2007-09-12 11:46           ` Michaël Cadilhac
  0 siblings, 1 reply; 18+ messages in thread
From: Johannes Weiner @ 2007-09-12 11:25 UTC (permalink / raw)
  To: Michaël Cadilhac; +Cc: emacs-devel


[-- Attachment #1.1: Type: text/plain, Size: 2235 bytes --]

Hi Micha,

On Wed, Sep 12, 2007 at 01:09:57PM +0200, Michaël Cadilhac wrote:
> > Ah, okay.  So what about an (&optional filename) for this function?
> > And if it's true, the character set to be translated is "[*\"()',=;? ]" and
> > percent is also encoded.  If ommited (nil), just "[,)$]" will be translated.
> >
> > How does that sound?
> 
> Yeah, it seems like a good idea : I already added encode-percent, which
> had this role but didn't integrate the regexps.

Yep.

> Great, so we're now here :
> --- browse-url.el	12 Sep 2007 10:49:04 +0200	1.61
> +++ browse-url.el	12 Sep 2007 13:04:52 +0200	
> @@ -619,16 +619,19 @@
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>  ;; URL encoding
>  
> -(defun browse-url-encode-url (url)
> -  "Encode all `confusing' characters in URL."
> -  (let ((encoded-url (copy-sequence url)))
> -    (while (string-match "%" encoded-url)
> -      (setq encoded-url (replace-match "%25" t t encoded-url)))
> -    (while (string-match "[*\"()',=;? ]" encoded-url)
> +(defun browse-url-encode-url (url &optional filename-p)
> +  "Encode all `confusing' characters in URL.
> +If FILENAME-P is nil, the confusing characters are [,)$].
> +Otherwise, the confusing characters are [*\"()',=;?% ]."
> +  (let ((conf-char (if filename-p "[*\"()',=;?% ]" "[,)$]"))
> +	(encoded-url (copy-sequence url))
> +	(s 0))
> +    (while (setq s (string-match conf-char encoded-url s))
>        (setq encoded-url
>  	    (replace-match (format "%%%x"
>  				   (string-to-char (match-string 0 encoded-url)))
> -			   t t encoded-url)))
> +			   t t encoded-url)
> +	    s (1+ s)))
>      encoded-url))
>  
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> @@ -703,7 +706,7 @@
>  		     (or file-name-coding-system
>  			 default-file-name-coding-system))))
>      (if coding (setq file (encode-coding-string file coding))))
> -  (setq file (browse-url-encode-url file))
> +  (setq file (browse-url-encode-url file 'url-is-filename))
>    (dolist (map browse-url-filename-alist)
>      (when (and map (string-match (car map) file))
>        (setq file (replace-match (cdr map) t nil file))))

Ack.  Thank you!

	Hannes

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

[-- Attachment #2: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-12 11:25         ` Johannes Weiner
@ 2007-09-12 11:46           ` Michaël Cadilhac
  0 siblings, 0 replies; 18+ messages in thread
From: Michaël Cadilhac @ 2007-09-12 11:46 UTC (permalink / raw)
  To: emacs-devel


[-- Attachment #1.1: Type: text/plain, Size: 625 bytes --]

Johannes Weiner <hannes@saeurebad.de> writes:

>> Great, so we're now here :

>> --- browse-url.el	12 Sep 2007 10:49:04 +0200	1.61
>> +++ browse-url.el	12 Sep 2007 13:04:52 +0200	
>> @@ -619,16 +619,19 @@
   [...]

>
> Ack.  Thank you!

This patch installed.  Thank you for your help!

-- 
 |   Michaël `Micha' Cadilhac       |  C'est véritablement un scandale       |
 |   http://michael.cadilhac.name   |    et probablement une contrepèterie.  |
 |   JID/MSN:                       |          -- P. Desproges               |
 `----  michael.cadilhac@gmail.com  |                                   -  --'

[-- Attachment #1.2: Type: application/pgp-signature, Size: 188 bytes --]

[-- Attachment #2: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-12 10:38   ` Michaël Cadilhac
  2007-09-12 10:49     ` Johannes Weiner
@ 2007-09-12 12:54     ` YAMAMOTO Mitsuharu
  2007-09-12 13:13       ` Michaël Cadilhac
  1 sibling, 1 reply; 18+ messages in thread
From: YAMAMOTO Mitsuharu @ 2007-09-12 12:54 UTC (permalink / raw)
  To: Michaël Cadilhac; +Cc: Johannes Weiner, emacs-devel

>>>>> On Wed, 12 Sep 2007 12:38:51 +0200, michael@cadilhac.name (Michaël Cadilhac) said:

>> These use mostly the same argument.  Can't we generalize this?
>> Would it hurt the callsites if they all would use "[*\"()',=;? ]"?

> Yes, it will.  A ``confusing char'' is just something Firefox or
> others can consider as a URL separator or as a variable or something
> when the website is passed as an argument to the executable, AFAIU.

> `?=*' for example are not usually ``confusing''.  The only place
> those chars are to be converted is when we browse for a file (thus
> when `?=' don't have their special meanings).

> In the other cases, removing those chars destroys the meaning of the
> URL.

%-escaping in browse-url-file-url (filename -> url) and those in other
places such as browse-url-netscape (url -> url) are inherently
different operations.
(see http://lists.gnu.org/archive/html/emacs-devel/2006-05/msg01060.html)

I think consolidating these two operations into one function only
because they look similar is over-refactoring and shouldn't be done in
order to avoid re-escaping or re-unescaping by mistake.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-12 12:54     ` YAMAMOTO Mitsuharu
@ 2007-09-12 13:13       ` Michaël Cadilhac
  2007-09-12 14:18         ` YAMAMOTO Mitsuharu
  2007-09-12 23:29         ` Stephen J. Turnbull
  0 siblings, 2 replies; 18+ messages in thread
From: Michaël Cadilhac @ 2007-09-12 13:13 UTC (permalink / raw)
  To: YAMAMOTO Mitsuharu; +Cc: Johannes Weiner, emacs-devel


[-- Attachment #1.1: Type: text/plain, Size: 1315 bytes --]

YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp> writes:

> %-escaping in browse-url-file-url (filename -> url) and those in other
> places such as browse-url-netscape (url -> url) are inherently
> different operations.

Not quite.  I may have a too low-level point of view, but they are the
same operations, they are just not used for the same purpose.

However, this patch has been somewhat reviewed and approved (though
containing an infinite loop ;-)), so it may be something made on
purpose.

> I think consolidating these two operations into one function only
> because they look similar is over-refactoring and shouldn't be done in
> order to avoid re-escaping or re-unescaping by mistake.

Well, I'm not so sure.  Their purpose is to escape characters in a way
we don't want to duplicate too much.  The «problem» you're pointing out
is the reason why I made the escaping function take the set of
characters to escape in a first place.

What would you do?

-- 
 |   Michaël `Micha' Cadilhac       |  To be portable,                       |
 |   http://michael.cadilhac.name   |    Just stay on Windows.               |
 |   JID/MSN:                       |                                        |
 `----  michael.cadilhac@gmail.com  |          -- A Microsoft Guy       -  --'

[-- Attachment #1.2: Type: application/pgp-signature, Size: 188 bytes --]

[-- Attachment #2: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-12 13:13       ` Michaël Cadilhac
@ 2007-09-12 14:18         ` YAMAMOTO Mitsuharu
  2007-09-12 22:50           ` Michaël Cadilhac
  2007-09-12 23:29         ` Stephen J. Turnbull
  1 sibling, 1 reply; 18+ messages in thread
From: YAMAMOTO Mitsuharu @ 2007-09-12 14:18 UTC (permalink / raw)
  To: michael; +Cc: hannes, emacs-devel

>>>>> On Wed, 12 Sep 2007 15:13:06 +0200, michael@cadilhac.name (Michaël Cadilhac) said:

>> %-escaping in browse-url-file-url (filename -> url) and those in
>> other places such as browse-url-netscape (url -> url) are
>> inherently different operations.

> Not quite.  I may have a too low-level point of view, but they are
> the same operations, they are just not used for the same purpose.

"Inherently different" might have been an exaggeration.  Do you agree
that they are semantically different operations?

>> I think consolidating these two operations into one function only
>> because they look similar is over-refactoring and shouldn't be done
>> in order to avoid re-escaping or re-unescaping by mistake.

> Well, I'm not so sure.  Their purpose is to escape characters in a
> way we don't want to duplicate too much.  The «problem» you're
> pointing out is the reason why I made the escaping function take the
> set of characters to escape in a first place.

IMO, differentiating them just by an argument makes the semantical
difference more or less implicit.  What is worse, the function name
`browse-url-encode-url' looks as if it takes a URL as an argument.  I
think operations on filename and URL should be deliberately separated
(except for purely basic string operations) so as to avoid mixture of
them and resulting re-escaping/unescaping.

> What would you do?

Maybe I would revert browse-url-file-url to the one that doesn't use
browse-url-encode-url in order to clarify that 1) encoding with
file-name-coding-system, 2) %-escaping, and 3) adding scheme are
unseparable operations in conversion from a filename to a URL.  Also I
would use more explicit and specific name in place of
browse-url-encode-url (e.g., browse-url-escape-confusing-characters).

An alternative way would be, as you suggested, to give characters to
be escaped as an argument to browse-url-encode-url, but rename the
function so it looks like a low-level string operation.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-12 14:18         ` YAMAMOTO Mitsuharu
@ 2007-09-12 22:50           ` Michaël Cadilhac
  2007-09-17  4:40             ` YAMAMOTO Mitsuharu
  0 siblings, 1 reply; 18+ messages in thread
From: Michaël Cadilhac @ 2007-09-12 22:50 UTC (permalink / raw)
  To: YAMAMOTO Mitsuharu; +Cc: hannes, emacs-devel


[-- Attachment #1.1: Type: text/plain, Size: 3173 bytes --]

YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp> writes:

>>>>>> On Wed, 12 Sep 2007 15:13:06 +0200, michael@cadilhac.name (Michaël Cadilhac) said:
>
>>> %-escaping in browse-url-file-url (filename -> url) and those in
>>> other places such as browse-url-netscape (url -> url) are
>>> inherently different operations.
>
>> Not quite.  I may have a too low-level point of view, but they are
>> the same operations, they are just not used for the same purpose.
>
> "Inherently different" might have been an exaggeration.  Do you agree
> that they are semantically different operations?

I'm not sure we will agree on the terms, but we're probably saying the
same thing :-)

>>> I think consolidating these two operations into one function only
>>> because they look similar is over-refactoring and shouldn't be done
>>> in order to avoid re-escaping or re-unescaping by mistake.
>
>> Well, I'm not so sure.  Their purpose is to escape characters in a
>> way we don't want to duplicate too much.  The «problem» you're
>> pointing out is the reason why I made the escaping function take the
>> set of characters to escape in a first place.
>
> IMO, differentiating them just by an argument makes the semantical
> difference more or less implicit.

IMO, the semantical difference is to be taken care of by the caller, not
this tool function.  All the callers need a function to url-encode a set
of chars in a string, so there's a point in creating such a function,
nop ?

> What is worse, the function name `browse-url-encode-url' looks as if
> it takes a URL as an argument.

I do agree that the function name doesn't tell the right thing.  At
first, I'd have preferred something like
`browse-url-encode-chars-in-string' (when the change consisted in
calling this function with the set of chars to be escaped), but if
factoring the regexp is not considered as over-factoring (which I'm not
sure), this name is too generic.

> I think operations on filename and URL should be deliberately
> separated (except for purely basic string operations).

This is IMO a basic string operation, but the current function is not
generic enough.

>> What would you do?
>
> Maybe I would revert browse-url-file-url to the one that doesn't use
> browse-url-encode-url [...].  Also I would use more explicit and
> specific name in place of browse-url-encode-url (e.g.,
> browse-url-escape-confusing-characters).

> An alternative way would be, as you suggested, to give characters to
> be escaped as an argument to browse-url-encode-url, but rename the
> function so it looks like a low-level string operation.

Well, for now on I'm not sure which one of the possibilities is the good
one (and I'm getting sleepy :)).  If someone has a strong feeling about
that around here, it'd be nice to hear him!

-- 
 |   Michaël `Micha' Cadilhac       |    Le second degré,                    |
 |   http://michael.cadilhac.name   |       c'est un peu                     |
 |   JID/MSN:                       |   le verlan sémantique.                |
 `----  michael.cadilhac@gmail.com  |                                   -  --'

[-- Attachment #1.2: Type: application/pgp-signature, Size: 188 bytes --]

[-- Attachment #2: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-12 13:13       ` Michaël Cadilhac
  2007-09-12 14:18         ` YAMAMOTO Mitsuharu
@ 2007-09-12 23:29         ` Stephen J. Turnbull
  2007-09-13  2:30           ` Stefan Monnier
  2007-09-13  2:39           ` Davis Herring
  1 sibling, 2 replies; 18+ messages in thread
From: Stephen J. Turnbull @ 2007-09-12 23:29 UTC (permalink / raw)
  To: Michaël Cadilhac; +Cc: Johannes Weiner, YAMAMOTO Mitsuharu, emacs-devel

Michaël Cadilhac writes:

 > YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp> writes:
 > 
 > > %-escaping in browse-url-file-url (filename -> url) and those in other
 > > places such as browse-url-netscape (url -> url) are inherently
 > > different operations.
 > 
 > Not quite.  I may have a too low-level point of view, but they are the
 > same operations, they are just not used for the same purpose.

That depends on how you look at it.  If you think of them as
operations on characters, they're the same.  If you think of them as
operations on strings, they're different (ie, it's possible to find
strings which are not transformed tthe same way by both operations).

I tend to agree with Yamamoto-san, because even if the string->string
maps were identical, having the names specify the intended domain
helps with deciding when to apply them at all.

 > Well, I'm not so sure.  Their purpose is to escape characters in a way
 > we don't want to duplicate too much.  The «problem» you're pointing out
 > is the reason why I made the escaping function take the set of
 > characters to escape in a first place.

In that case it would make sense to give the escaping function that
takes the set of characters an "-internal" or "-helper" name (excuse
me, I don't know the GNU conventions for this), and use nice efficient
defsubsts for the semantically mnemonic public entry points.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-12 23:29         ` Stephen J. Turnbull
@ 2007-09-13  2:30           ` Stefan Monnier
  2007-09-13 18:15             ` Stephen J. Turnbull
  2007-09-13  2:39           ` Davis Herring
  1 sibling, 1 reply; 18+ messages in thread
From: Stefan Monnier @ 2007-09-13  2:30 UTC (permalink / raw)
  To: Stephen J. Turnbull
  Cc: YAMAMOTO Mitsuharu, Johannes Weiner, Michaël Cadilhac,
	emacs-devel

> I tend to agree with Yamamoto-san, because even if the string->string
> maps were identical, having the names specify the intended domain
> helps with deciding when to apply them at all.

I beg to disagree: a function's name and docstring should reflect what the
function *does* rather than how it is used.


        Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-12 23:29         ` Stephen J. Turnbull
  2007-09-13  2:30           ` Stefan Monnier
@ 2007-09-13  2:39           ` Davis Herring
  1 sibling, 0 replies; 18+ messages in thread
From: Davis Herring @ 2007-09-13  2:39 UTC (permalink / raw)
  To: Stephen J. Turnbull
  Cc: YAMAMOTO Mitsuharu, Johannes Weiner, Michaël Cadilhac,
	emacs-devel

> That depends on how you look at it.  If you think of them as
> operations on characters, they're the same.  If you think of them as
> operations on strings, they're different (ie, it's possible to find
> strings which are not transformed tthe same way by both operations).
> [...]
> In that case it would make sense to give the escaping function that
> takes the set of characters an "-internal" or "-helper" name (excuse
> me, I don't know the GNU conventions for this), and use nice efficient
> defsubsts for the semantically mnemonic public entry points.

I think I agree, and that the first paragraph above (perhaps
unintentionally) supports the second.  It is an implementation detail that
the filename->URL and other->URL (I don't remember the other item)
operations, which are semantically different, can use the same
string-substitution subroutine.  So I'd rename the function in question to
"browse-url-escape-chars" and then have "browse-url-escape-filename" and
"browse-url-escape-<other>" that called it.

Davis

-- 
This product is sold by volume, not by mass.  If it appears too dense or
too sparse, it is because mass-energy conversion has occurred during
shipping.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-13  2:30           ` Stefan Monnier
@ 2007-09-13 18:15             ` Stephen J. Turnbull
  2007-09-13 21:44               ` Stefan Monnier
  0 siblings, 1 reply; 18+ messages in thread
From: Stephen J. Turnbull @ 2007-09-13 18:15 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: YAMAMOTO Mitsuharu, Johannes Weiner, Michaël Cadilhac,
	emacs-devel

Stefan Monnier writes:

 > > I tend to agree with Yamamoto-san, because even if the string->string
 > > maps were identical, having the names specify the intended domain
 > > helps with deciding when to apply them at all.
 > 
 > I beg to disagree: a function's name and docstring should reflect what the
 > function *does* rather than how it is used.

That's what I said.  I'm a categorist: functions with different
domains are different functions, they do different things even if they
happen to agree on the intersection of their domains.

As for the docstring, we'll have to agree to disagree.  There is no
real cost to having a docstring that warns about improper usage any
more, it's a matter of style.  You prefer a style that assumes that
people only need to be reminded of the arguments; others should refer
to the manual.  I think that's what skeletons etc are for, and prefer
more verbose docstrings, more like Unix man pages in sections 2 and 3.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-13 18:15             ` Stephen J. Turnbull
@ 2007-09-13 21:44               ` Stefan Monnier
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Monnier @ 2007-09-13 21:44 UTC (permalink / raw)
  To: Stephen J. Turnbull
  Cc: Michaël Cadilhac, Johannes Weiner, YAMAMOTO Mitsuharu,
	emacs-devel

>> > I tend to agree with Yamamoto-san, because even if the string->string
>> > maps were identical, having the names specify the intended domain
>> > helps with deciding when to apply them at all.
>> 
>> I beg to disagree: a function's name and docstring should reflect what the
>> function *does* rather than how it is used.

> That's what I said.  I'm a categorist: functions with different
> domains are different functions, they do different things even if they
> happen to agree on the intersection of their domains.

> As for the docstring, we'll have to agree to disagree.  There is no
> real cost to having a docstring that warns about improper usage any
> more, it's a matter of style.  You prefer a style that assumes that
> people only need to be reminded of the arguments; others should refer
> to the manual.  I think that's what skeletons etc are for, and prefer
> more verbose docstrings, more like Unix man pages in sections 2 and 3.

You lost me.  I guess we were just not talking about the same thing.


        Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-12 22:50           ` Michaël Cadilhac
@ 2007-09-17  4:40             ` YAMAMOTO Mitsuharu
  2007-09-17 11:52               ` Michaël Cadilhac
  0 siblings, 1 reply; 18+ messages in thread
From: YAMAMOTO Mitsuharu @ 2007-09-17  4:40 UTC (permalink / raw)
  To: Michaël Cadilhac; +Cc: hannes, emacs-devel

Your point of view is low-level (string operation), and mine is
high-level (URL operation).  The problems is that the current
implementation of browse-url-encode-url is neither of them: it looks
like a result of superficial code factoring.

(defun browse-url-encode-url (url &optional filename-p)
  "Encode all `confusing' characters in URL.
If FILENAME-P is nil, the confusing characters are [,)$].
Otherwise, the confusing characters are [*\"()',=;?% ]."
  (let ((conf-char (if filename-p "[*\"()',=;?% ]" "[,)$]"))
	(encoded-url (copy-sequence url))
	(s 0))
    (while (setq s (string-match conf-char encoded-url s))
      (setq encoded-url
	    (replace-match (format "%%%x"
				   (string-to-char (match-string 0 encoded-url)))
			   t t encoded-url)
	    s (1+ s)))
    encoded-url))

It is inappropriate from a high-level view because it is a mixture of
two semantically different operations.  Also inappropriate from a
low-level view because it is not general enough and the function name
doesn't represent what it does.

>>>>> On Thu, 13 Sep 2007 00:50:59 +0200, michael@cadilhac.name (Michaël Cadilhac) said:

>>> What would you do?
>> 
>> Maybe I would revert browse-url-file-url to the one that doesn't
>> use browse-url-encode-url [...].  Also I would use more explicit
>> and specific name in place of browse-url-encode-url (e.g.,
>> browse-url-escape-confusing-characters).

>> An alternative way would be, as you suggested, to give characters
>> to be escaped as an argument to browse-url-encode-url, but rename
>> the function so it looks like a low-level string operation.

> Well, for now on I'm not sure which one of the possibilities is the
> good one (and I'm getting sleepy :)).  If someone has a strong
> feeling about that around here, it'd be nice to hear him!

I propose the combination of above two, which is similar to what Davis
suggested:

  1. Add a low-level string replacing function (say,
     browse-url-encode-chars-in-string as you mentioned) that takes an
     argument representing the characters to be escaped.

  2. In browse-url-file-url, directly call the above low-level
     function instead of calling browse-url-encode-url.

  3. Change browse-url-encode-url so it calls the the low-level
     function introduced in 1, and just do one task (URL -> URL).

The difference between direct and indirect calls to the low-level
function comes from whether it is meaningful as a self-contained task
or not.  Escaping confusing characters in a URL deserves a specialized
function, but escaping characters in a filename string is just a part
of the conversion from the filename to a URL.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Your last change to browse-url is bogus.
  2007-09-17  4:40             ` YAMAMOTO Mitsuharu
@ 2007-09-17 11:52               ` Michaël Cadilhac
  0 siblings, 0 replies; 18+ messages in thread
From: Michaël Cadilhac @ 2007-09-17 11:52 UTC (permalink / raw)
  To: YAMAMOTO Mitsuharu; +Cc: hannes, emacs-devel


[-- Attachment #1.1: Type: text/plain, Size: 1377 bytes --]

YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp> writes:

> Your point of view is low-level (string operation), and mine is
> high-level (URL operation).  The problems is that the current
> implementation of browse-url-encode-url is neither of them: it looks
> like a result of superficial code factoring.

I do agree.

> I propose the combination of above two, which is similar to what Davis
> suggested:
>
>   1. Add a low-level string replacing function (say,
>      browse-url-encode-chars-in-string as you mentioned) that takes an
>      argument representing the characters to be escaped.
>
>   2. In browse-url-file-url, directly call the above low-level
>      function instead of calling browse-url-encode-url.
>
>   3. Change browse-url-encode-url so it calls the the low-level
>      function introduced in 1, and just do one task (URL -> URL).

Yeah, I did that on the weekend.  If nobody complains, it seems that we
agree that this is TRT, so I'll install the change later on.

Thank you all for this debate.

-- 
 |   Michaël `Micha' Cadilhac       |   Je veut dire que la loi francaise    |
 |   http://michael.cadilhac.name   |           est overwritable par le      |
 |   JID/MSN:                       |    reglement interieur il me semble.   |
 `----  michael.cadilhac@gmail.com  |          -- ElBarto               -  --'

[-- Attachment #1.2: Type: application/pgp-signature, Size: 188 bytes --]

[-- Attachment #2: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2007-09-17 11:52 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-09-12  9:13 Your last change to browse-url is bogus Michaël Cadilhac
2007-09-12 10:20 ` Johannes Weiner
2007-09-12 10:38   ` Michaël Cadilhac
2007-09-12 10:49     ` Johannes Weiner
2007-09-12 11:09       ` Michaël Cadilhac
2007-09-12 11:25         ` Johannes Weiner
2007-09-12 11:46           ` Michaël Cadilhac
2007-09-12 12:54     ` YAMAMOTO Mitsuharu
2007-09-12 13:13       ` Michaël Cadilhac
2007-09-12 14:18         ` YAMAMOTO Mitsuharu
2007-09-12 22:50           ` Michaël Cadilhac
2007-09-17  4:40             ` YAMAMOTO Mitsuharu
2007-09-17 11:52               ` Michaël Cadilhac
2007-09-12 23:29         ` Stephen J. Turnbull
2007-09-13  2:30           ` Stefan Monnier
2007-09-13 18:15             ` Stephen J. Turnbull
2007-09-13 21:44               ` Stefan Monnier
2007-09-13  2:39           ` Davis Herring

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).