emacs-orgmode@gnu.org archives
* Re: %20 in file://... URL
@ 2010-11-23  5:25 Vincent Belaïche
  2010-11-24 20:57 ` David Maus
       [not found] ` <BLU104-W15A3F7F6097ED8F6D95CEB84210@phx.gbl>
  0 siblings, 2 replies; 21+ messages in thread
From: Vincent Belaïche @ 2010-11-23  5:25 UTC (permalink / raw)
  To: Org mode; +Cc: Vincent Belaïche

>From: 	David Maus
>Subject: 	Re: [Orgmode] %20 in file://... URL
>Date: 	Mon, 22 Nov 2010 19:16:09 +0100
>User-agent: 	Wanderlust/2.15.9 (Almost Unreal) SEMI/1.14.6 (Maruoka) FLIM/1.14.9 (Gojō) APEL/10.8 Emacs/23.2 (i486-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
>
>At Mon, 22 Nov 2010 16:46:44 +0100,
>Vincent Belaïche wrote:
>> I see, so I understand that you will someday modify a function creating
>> links in order to implement character escaping. I can give a hand if
>> you tell me the function name.
>
>To be exact: Org already escapes some characters (C-h v
>org-link-escape-chars RET) and the colon is a candidate for being on
>the list.

What exactly does "already" mean? I pushed the colon '(?: . "%3A")
onto this org-link-escape-chars list, and I tried a link
like this:

[[file://localhost/c%3A/msys/1.0/temp/jay.html][link]]

I get this message: "if: No such file:
//localhost/c%3A/msys/1.0/temp/jay.html". Evaluating the full org.el from
the link you gave does not help either, because I get a message that
org-complete cannot be loaded.

>The functions responsible for escaping/unescaping are `org-link-escape'
>and `org-link-unescape' and the new implementations of these functions
>can be found in
>
>https://github.com/dmj/dmj-org-mode/tree/feature/org-percent-escaping
>

OK, you mean that some version of Org already does the job, but not the
Org that is in the official Git repository?


>The task at hand: Anticipate the consequences of the new implementation.
>I.e. what will happen to links created with the old algorithm.
>

I have no idea of the consequences. I can be a beta tester of it, but
for the time being this code does not work with the kind of link that I
use.

>Patches, ideas, and comments on the modifications are welcome.
>

The following are just comments on the code; most of them are a matter
of taste, which you may well disagree with.

1. In the org.el file at the link you provided I also found the
   functions org-entry-protect-space & org-entry-restore-space, which
   also do some escaping. Why not use a single function?

2. In the function org-link-escape, there is a lambda expression  

   (lambda (sequence)
   			(format "%%%.2X" sequence))

   The argument name should be sequence-element rather than sequence.

3. In org-link-unescape, there are 3 substring-or-concat operations, but
   you could make it simpler with a single replace-match and a start
   position in the string-match. That would look like this (*not tested*):

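(defun org-link-unescape (str)
  "Unhex hexified unicode strings as returned from the JavaScript function
encodeURIComponent. E.g. `%C3%B6' is the german Umlaut `ö'."
  (setq str (or str ""))
  (let ((case-fold-search t)
        (pos 0))
    (while (string-match "\\(%[0-9a-f][0-9a-f]\\)+" str pos)
      ;; The text after the match is left untouched by the replacement.
      (let ((tail-length (- (length str) (match-end 0))))
        (setq str (replace-match
                   ;; Protect the match data in case the helper
                   ;; performs matching of its own (e.g. split-string).
                   (save-match-data
                     (org-link-unescape-compound
                      (upcase (match-string 0 str)))) ; hex
                   t t str)
              ;; Resume right after the decoded text, so that decoded
              ;; output is never unescaped a second time.
              pos (- (length str) tail-length))))
    str))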

My feeling is that the kind of code above is slightly simpler in
execution, as there is only one string manipulation at each
iteration instead of two, and also easier to maintain, as it has
fewer cases to handle (i.e. it does not really matter whether the
escaped sequence is at the end of the string or not). You also avoid
some intermediate variables like `replacement', as the use of
replace-match makes it self-explanatory that the result of
org-link-unescape-compound is a replacement.

4. In org-link-unescape-compound,

    (remove "" (split-string hex "%"))


   can be replaced by (cdr  (split-string hex "%")) because there is
   always only one empty string in the sequence and it is in the 1st
   place.

5. In org-link-unescape-compound, you could make fewer comparisons
   by replacing this code

	     (shift
	      (if (= 0 eat) ;; new byte
		  (if (>= val 252) 6
		    (if (>= val 248) 5
		      (if (>= val 240) 4
			(if (>= val 224) 3
			  (if (>= val 192) 2 0)))))
		6))
	     (xor
	      (if (= 0 eat) ;; new byte
		  (if (>= val 252) 252
		    (if (>= val 248) 248
		      (if (>= val 240) 240
			(if (>= val 224) 224
			  (if (>= val 192) 192 0)))))
		128)))

by (*not tested*):

	     (shift-xor
	      (if (= 0 eat) ;; new byte
		  (if (>= val 252) '(6 . 252)
		    (if (>= val 248) '(5 . 248)
		      (if (>= val 240) '(4 . 240)
			(if (>= val 224) '(3 . 224)
			  (if (>= val 192) '(2 . 192) '(0 . 0))))))
		'(6 . 128)))
	     ;; shift and xor refer to shift-xor, so the enclosing
	     ;; binding form has to be `let*'.
	     (shift (car shift-xor))
	     (xor (cdr shift-xor))


The code above looks more concise to me; depending on val it may also
run faster.
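
The two remarks on org-link-unescape-compound above can be tried out
with a small self-contained sketch. It is not David's implementation:
every `my/' name is made up for illustration, and the SHIFT value is
only returned, not used, because this message does not show how the
original code uses `shift' and `eat'.

```elisp
;; Illustrative sketch only; every `my/' name is hypothetical.

(defun my/utf8-lead-byte-info (val)
  "Return (SHIFT . XOR) for VAL, the first byte of a UTF-8 sequence.
The thresholds are the ones from the table discussed above."
  (cond ((>= val 252) '(6 . 252))
        ((>= val 248) '(5 . 248))
        ((>= val 240) '(4 . 240))
        ((>= val 224) '(3 . 224))
        ((>= val 192) '(2 . 192))
        (t '(0 . 0))))

(defun my/utf8-decode-bytes (bytes)
  "Decode BYTES, a list of integers encoding one character, to a code point."
  (let ((code (logxor (car bytes)
                      (cdr (my/utf8-lead-byte-info (car bytes))))))
    ;; Each continuation byte contributes its low 6 bits.
    (dolist (byte (cdr bytes) code)
      (setq code (+ (* code 64) (logxor byte 128))))))

(defun my/unescape-compound (hex)
  "Decode HEX, a string such as \"%C3%B6\", to a one-character string."
  (string
   (my/utf8-decode-bytes
    (mapcar (lambda (digits) (string-to-number digits 16))
            ;; The first element of the split is always empty, so drop it.
            (cdr (split-string hex "%"))))))

(my/unescape-compound "%C3%B6")   ; evaluates to "ö"
```

Here #xC3 selects the pair (2 . 192), leaving 3 after the xor, and the
continuation byte #xB6 contributes 54, which gives 3 * 64 + 54 = 246,
the code point of `ö'.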

hoping that the above helps.

>
>Best,
>  -- David
>-- 

BR,
   Vincent.

[...]

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: %20 in file://... URL
@ 2010-12-30  5:29 Vincent Belaïche
  0 siblings, 0 replies; 21+ messages in thread
From: Vincent Belaïche @ 2010-12-30  5:29 UTC (permalink / raw)
  To: Org mode, David Maus


[...]

>> 
>> hoping that the above helps.
>
>Definitely.
>
>Last but not least: On this mailing list you should normally Cc: answers
>to the original poster -- some are not subscribed to the list at all,
>some (like me) read the list in a different account than their main
>mail account and miss answers etc.
>
>Best and thanks,
>  -- David

By the way, I realized that Emacs ships with a "URL" package that already
has a URL parsing function, url-generic-parse-url.

Wouldn't it be better if Org just relied on this function and/or
extended it, or at least if Org offered the same API as URL and tried to
align on the same conventions for non-standard URLs, so that Org could
be a replacement for the URL package?

I noticed that the URL package does not seem to do any %XX decoding,
for instance on my machine:

(url-generic-parse-url "file:c%3A/toto.html")

evaluates to 

[cl-struct-url "file" nil nil nil 21 "c%3A/toto.html" nil nil nil]
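
If Org were to build on the URL package, the decoding could be layered
on top of the parser. A minimal sketch, assuming the file name part
parses as shown above: `my/file-url-to-filename' is a hypothetical
name, while url-generic-parse-url, url-type, url-filename and
url-unhex-string are existing functions of the URL package.

```elisp
(require 'url-parse)
(require 'url-util)

(defun my/file-url-to-filename (url)
  "Return the %XX-decoded file name part of the file: URL, or nil.
Hypothetical helper, not part of Org or of the URL package."
  (let ((parsed (url-generic-parse-url url)))
    (when (equal (url-type parsed) "file")
      (url-unhex-string (url-filename parsed)))))

(my/file-url-to-filename "file:c%3A/toto.html")   ; => "c:/toto.html"
```

For file names with non-ASCII characters the result of
url-unhex-string would presumably still have to be decoded, e.g. with
decode-coding-string; the sketch leaves that out.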

I also noticed that the info:FILE#NODE does not seem to be supported by
Org, but it is by URL. 

Actually it would be even more useful to have also info:FILE#NODE::NNN
with NNN being the line number within the info NODE, but url does not
support the ::NNN extension which seems to be defined only in Org.
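
To make the proposed syntax concrete, here is a sketch of how such a
link could be split. The ::NNN suffix is only the proposal made above,
not an existing feature of URL, and `my/parse-info-link' is a
hypothetical name.

```elisp
(defun my/parse-info-link (link)
  "Split LINK of the form info:FILE#NODE or info:FILE#NODE::NNN.
Return (FILE NODE LINE), LINE being nil when there is no ::NNN suffix.
Return nil if LINK does not have that form.  Hypothetical helper."
  (when (string-match
         "\\`info:\\([^#]+\\)#\\(.*?\\)\\(?:::\\([0-9]+\\)\\)?\\'" link)
    (let ((line (match-string 3 link)))
      (list (match-string 1 link)
            (match-string 2 link)
            (and line (string-to-number line))))))

(my/parse-info-link "info:emacs#Sending Patches::12")
;; => ("emacs" "Sending Patches" 12)
```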

VBR, 
   Vincent.

^ permalink raw reply	[flat|nested] 21+ messages in thread
* RE: %20 in file://... URL
@ 2010-11-22 15:46 Vincent Belaïche
  2010-11-22 18:16 ` David Maus
  0 siblings, 1 reply; 21+ messages in thread
From: Vincent Belaïche @ 2010-11-22 15:46 UTC (permalink / raw)
  To: Org mode; +Cc: Vincent Belaïche

[-- Attachment #1: Type: text/plain, Size: 1500 bytes --]

> Date: Wed, 17 Nov 2010 21:43:59 +0100
> From: dmaus@ictsoc.de
> To: vincent.b.1@hotmail.fr
> Subject: Re: [Orgmode] %20 in file://... URL
> CC: emacs-orgmode@gnu.org; carsten.dominik@gmail.com
> 

[...]

Hello,

Sorry for the delay, I was on a business trip.
>  
> Thanks for sending the patch, but it won't provide a clean solution to
> the problem: The function modified by your patch works under the
> assumption, that for example the sequence %3A represents a percent
> escaped colon.  But the function that creates the link in the first
> place does not percent-escape chars 

Er, in my situation I create the link with another package, and I *did*
escape the colon.

> -- If we use just this patch, opening a link to a file literally
> called "%3A.org" will fail.
>  
> So we need to modify all functions that create links to properly
> percent-escape the part of a link that follows the link type in order
> to make all functions unescape the link.
>  
> Good news: Reworking the percent-escaping is a work in progress on my
> list[1] and if it is finished and accepted, the problem should be
> solved.
>  
> Best,
>   -- David

I see, so I understand that you will someday modify a function creating
links in order to implement character escaping. I can give a hand if
you tell me the function name.
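
David's example of a file literally called "%3A.org" shows why the
creating side matters. The sketch below uses url-hexify-string and
url-unhex-string from the URL package as stand-ins; Org's own
org-link-escape may well escape a different set of characters.

```elisp
(require 'url-util)

;; A file literally named "%3A.org": escaping turns its "%" into "%25" ...
(url-hexify-string "%3A.org")     ; => "%253A.org"
;; ... so that unescaping once yields the original name again.
(url-unhex-string "%253A.org")    ; => "%3A.org"
;; Without the escaping step, the literal name is mistaken for ":.org".
(url-unhex-string "%3A.org")      ; => ":.org"
```

Unescaping is only safe when every link creator escapes, which is why
both sides have to change together.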

I am also sending you my patch as a git diff, just in case (with the same
changelog attached again). Sorry for using `diff -c', I just followed
the info node `(emacs) Sending Patches'.

   Vincent.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Patch --]
[-- Type: text/x-patch, Size: 1448 bytes --]

diff --git a/lisp/org.el b/lisp/org.el
index 201dd87..4e2e2c4 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -9639,9 +9639,28 @@ to search for.  If LINE or SEARCH is given, the file will be
 opened in Emacs, unless an entry from org-file-apps that makes
 use of groups in a regexp matches.
 If the file does not exist, an error is thrown."
-  (let* ((file (if (equal path "")
+  (let* ((%xx-decoded-path 
+	  (let ((pos 0) (%xx-decoded-path path))
+	    (setq %xx-decoded-path path)
+	    (while (setq pos (string-match "%\\([0-9A-F]\\)\\([0-9A-F]\\)" %xx-decoded-path pos))
+	      (setq pos (1+ pos)
+		    %xx-decoded-path (replace-match 
+				      (string (let ((code 0) digit)
+						(dotimes (i 2)
+						  (setq 
+						   digit (aref (match-string (1+ i) %xx-decoded-path) 0)
+						   code (+ (if (<= digit ?9)
+							       (- digit ?0)
+							     (- digit 55))
+							   (* 16 code)))) code))
+				      t t %xx-decoded-path)))
+	    ;; remove //localhost/ prefix if any
+	    (and (string-match "\\`//localhost/" %xx-decoded-path)
+		 (setq %xx-decoded-path (substring %xx-decoded-path 12)))
+	    %xx-decoded-path))
+	 (file (if (equal path "")
 		   buffer-file-name
-		 (substitute-in-file-name (expand-file-name path))))
+		 (substitute-in-file-name (expand-file-name %xx-decoded-path))))
 	 (file-apps (append org-file-apps (org-default-apps)))
 	 (apps (org-remove-if
 		'org-file-apps-entry-match-against-dlink-p file-apps))

[-- Attachment #3: ChangeLog --]
[-- Type: text/plain, Size: 218 bytes --]

2010-11-13  Vincent Belaïche  <vincentb1@users.sourceforge.net>

	* org.el (org-open-file): Decode %XX escapes in URL with file
	type, so that applications other than browsers are not confused with the filename.


[-- Attachment #4: Type: text/plain, Size: 201 bytes --]

_______________________________________________
Emacs-orgmode mailing list
Please use `Reply All' to send replies to the list.
Emacs-orgmode@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-orgmode

^ permalink raw reply related	[flat|nested] 21+ messages in thread
* Re: %20 in file://... URL
@ 2010-11-13  6:18 Vincent Belaïche
  2010-11-13  6:28 ` Vincent Belaïche
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Vincent Belaïche @ 2010-11-13  6:18 UTC (permalink / raw)
  To: Org mode, giovanni.ridolfi; +Cc: Vincent Belaïche

[-- Attachment #1: Type: text/plain, Size: 124 bytes --]


[...]

>
>Please, do! :-)
>
>Giovanni
>

My patch is attached. Brickbats are welcome...

   Vincent.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: %xx decode --]
[-- Type: text/x-patch, Size: 1738 bytes --]

*** org.el.old	Fri Nov  5 19:16:29 2010
--- org.el	Sat Nov 13 05:50:54 2010
***************
*** 9639,9647 ****
  opened in Emacs, unless an entry from org-file-apps that makes
  use of groups in a regexp matches.
  If the file does not exist, an error is thrown."
!   (let* ((file (if (equal path "")
  		   buffer-file-name
! 		 (substitute-in-file-name (expand-file-name path))))
  	 (file-apps (append org-file-apps (org-default-apps)))
  	 (apps (org-remove-if
  		'org-file-apps-entry-match-against-dlink-p file-apps))
--- 9639,9666 ----
  opened in Emacs, unless an entry from org-file-apps that makes
  use of groups in a regexp matches.
  If the file does not exist, an error is thrown."
!   (let* ((%xx-decoded-path 
! 	  (let ((pos 0) (%xx-decoded-path path))
! 	    (setq %xx-decoded-path path)
! 	    (while (setq pos (string-match "%\\([0-9A-F]\\)\\([0-9A-F]\\)" %xx-decoded-path pos))
! 	      (setq pos (1+ pos)
! 		    %xx-decoded-path (replace-match 
! 				      (string (let ((code 0) digit)
! 						(dotimes (i 2)
! 						  (setq 
! 						   digit (aref (match-string (1+ i) %xx-decoded-path) 0)
! 						   code (+ (if (<= digit ?9)
! 							       (- digit ?0)
! 							     (- digit 55))
! 							   (* 16 code)))) code))
! 				      t t %xx-decoded-path)))
! 	    ;; remove //localhost/ prefix if any
! 	    (and (string-match "\\`//localhost/" %xx-decoded-path)
! 		 (setq %xx-decoded-path (substring %xx-decoded-path 12)))
! 	    %xx-decoded-path))
! 	 (file (if (equal path "")
  		   buffer-file-name
! 		 (substitute-in-file-name (expand-file-name %xx-decoded-path))))
  	 (file-apps (append org-file-apps (org-default-apps)))
  	 (apps (org-remove-if
  		'org-file-apps-entry-match-against-dlink-p file-apps))

[-- Attachment #3: Change log --]
[-- Type: text/plain, Size: 218 bytes --]

2010-11-13  Vincent Belaïche  <vincentb1@users.sourceforge.net>

	* org.el (org-open-file): Decode %XX escapes in URL with file
	type, so that applications other than browsers are not confused with the filename.



^ permalink raw reply	[flat|nested] 21+ messages in thread
* RE: %20 in file://... URL
@ 2010-11-05  6:42 Vincent Belaïche
  2010-11-05  8:39 ` Giovanni Ridolfi
  0 siblings, 1 reply; 21+ messages in thread
From: Vincent Belaïche @ 2010-11-05  6:42 UTC (permalink / raw)
  To: Org mode; +Cc: Vincent Belaïche

Hello,

Sorry to dwell on it: I am just wondering, is there any reason why %20
and the like are not supported with the file: protocol?

If not, I can submit a patch to correct this and have the % constructs
decoded.

BR,
   Vincent.

^ permalink raw reply	[flat|nested] 21+ messages in thread
* RE: %20 in file://... URL
@ 2010-10-27 21:19 Vincent Belaïche
  0 siblings, 0 replies; 21+ messages in thread
From: Vincent Belaïche @ 2010-10-27 21:19 UTC (permalink / raw)
  To: Org mode; +Cc: Vincent Belaïche



> From: giovanni.ridolfi@yahoo.it
> To: vincent.b.1@hotmail.fr
> Subject: Re: [Orgmode] %20 in file://... URL
> Date: Tue, 26 Oct 2010 17:39:55 +0200
> CC: emacs-orgmode@gnu.org
>

[...]

>
> *But*, Vincent, why do you use "%3A" when the colon ":" works? ?-/
>

The reason is quite simple: I wrote a package called w32utils.el which
does several things useful for MSWindows users, among which converting
the paths of marked files in Dired mode to various formats, like URLs for
a browser, for LaTeX hyperref, or for Org mode, and backslashed MSWindows
paths (that was the primary purpose), amongst others.

This package also makes it easier to open bash shell buffers (using MSYS
bash) under Emacs in MSWindows, and also makes it easier to update
the default-directory variable when you cd to some path (like
changing the drive letter, or using the MSYS fstab links).

If you are interested in that I can put w32utils.el on my page and send you a
link. This is still very experimental, and the manual is not up to date.

Well, this package makes a strict and complete conversion of paths to
URL, and this is the reason for the %3A.
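
To illustrate what such a strict conversion produces, here is a
minimal sketch. It is not the code of w32utils.el:
`my/w32-path-to-file-url' is a hypothetical name, and only
url-hexify-string is an existing function (from the URL package).

```elisp
(require 'url-util)

(defun my/w32-path-to-file-url (path)
  "Convert the backslashed MSWindows PATH to a file://localhost/ URL.
Every path component is percent-escaped, so a colon becomes %3A.
Hypothetical helper; this is not the code of w32utils.el."
  (concat "file://localhost/"
          (mapconcat #'url-hexify-string
                     ;; The regexp matches a single backslash.
                     (split-string path "\\\\")
                     "/")))

(my/w32-path-to-file-url "c:\\msys\\1.0\\temp\\foo.html")
;; => "file://localhost/c%3A/msys/1.0/temp/foo.html"
```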

> cheers,
>
> Giovanni
>

BR,
  Vincent.


^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: %20 in file://... URL
@ 2010-10-26  5:15 Vincent Belaïche
  2010-10-26 15:39 ` Giovanni Ridolfi
  0 siblings, 1 reply; 21+ messages in thread
From: Vincent Belaïche @ 2010-10-26  5:15 UTC (permalink / raw)
  To: Org mode; +Cc: Vincent Belaïche


>Which Org mode version are you using?
> 
>M-x org-version RET
> 
>And can you give an example of a link that does not work as expected?
> 
>Best,
>  -- David
>
Hello,

Thanks for the feedback. Here is an example of failing link:

[[file://localhost/c%3A/msys/1.0/temp/foo.html][link]]

the file exists on my PC as 

c:\msys\1.0\temp\foo.html
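
For reference, the mapping I would expect from the link to the file
name can be sketched as follows. This is not what Org does:
`my/file-link-path-to-filename' is a hypothetical name, and
url-unhex-string comes from the URL package.

```elisp
(require 'url-util)

(defun my/file-link-path-to-filename (path)
  "Decode PATH, the part of a file: link after the link type.
Percent escapes are decoded and a leading //localhost/ is dropped.
Hypothetical helper, not Org's implementation."
  (let ((decoded (url-unhex-string path)))
    (if (string-prefix-p "//localhost/" decoded)
        (substring decoded (length "//localhost/"))
      decoded)))

(my/file-link-path-to-filename "//localhost/c%3A/msys/1.0/temp/foo.html")
;; => "c:/msys/1.0/temp/foo.html"
```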


I am under MSWindows XP.

M-x org-version
=> Org-mode version 7.01

This is more or less the latest version on emacs trunk.

BR,
   Vincent.

^ permalink raw reply	[flat|nested] 21+ messages in thread
* %20 in file://... URL
@ 2010-10-24 20:49 Vincent Belaïche
  2010-10-24 21:02 ` David Maus
  0 siblings, 1 reply; 21+ messages in thread
From: Vincent Belaïche @ 2010-10-24 20:49 UTC (permalink / raw)
  To: Org mode; +Cc: Vincent Belaïche

Hello,

My Org mode version is not able to interpret any `%20' or similar
escape codes in file://... URLs. Is that normal?

      Vincent.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2011-02-12 15:02 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-11-23  5:25 %20 in file://... URL Vincent Belaïche
2010-11-24 20:57 ` David Maus
2011-02-12 14:36   ` Bastien
     [not found] ` <BLU104-W15A3F7F6097ED8F6D95CEB84210@phx.gbl>
2010-11-29 20:03   ` David Maus
  -- strict thread matches above, loose matches on Subject: below --
2010-12-30  5:29 Vincent Belaïche
2010-11-22 15:46 Vincent Belaïche
2010-11-22 18:16 ` David Maus
2011-02-12 15:02   ` Bastien
2010-11-22 15:46 Vincent Belaïche
2010-11-13  6:18 Vincent Belaïche
2010-11-13  6:28 ` Vincent Belaïche
2010-11-14 17:30 ` David Maus
2010-11-17 20:43 ` David Maus
2010-11-17 20:43 ` David Maus
2010-11-05  6:42 Vincent Belaïche
2010-11-05  8:39 ` Giovanni Ridolfi
2010-10-27 21:19 Vincent Belaïche
2010-10-26  5:15 Vincent Belaïche
2010-10-26 15:39 ` Giovanni Ridolfi
2010-10-24 20:49 Vincent Belaïche
2010-10-24 21:02 ` David Maus

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git
