emacs-orgmode@gnu.org archives
* Escaping links
@ 2017-08-11 13:26 Fabrice Popineau
  2017-08-11 15:13 ` Nicolas Goaziou
  0 siblings, 1 reply; 9+ messages in thread
From: Fabrice Popineau @ 2017-08-11 13:26 UTC (permalink / raw)
  To: emacs-orgmode@gnu.org

Hi,

Are links to files whose names already contain (URL-)escaped characters supported?

If I have a directory named "c:/temp/foo bar/"
and files in this directory named
foo.txt
foo bar.txt
foo%2Fbar.txt

I can create links in an Org buffer by using `insert' but I find the
situation a bit confusing.

#+LINK: temp file:c:/temp/%s

1. [[temp:foo bar/foo bar.txt]]
2. [[temp:foo%20bar/foo bar.txt]]
3. [[temp:foo bar/foo%20bar.txt]]
4. [[temp:foo%20bar/foo%20bar.txt]]


All of these links seem to work the same way.

5. [[temp:foo bar/foo%2Fbar.txt]]
6. [[temp:foo bar/foo%252Fbar.txt]]
7. [[temp:foo%20bar/foo%252Fbar.txt]]

Link 5 does not work.

Links 6 and 7 do work: when I press Enter on the link, I visit the
file.

Unfortunately, if I edit these links with 'C-c C-l' and change nothing
(just press Return), Org unescapes the escaped characters.
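
To make the seven cases concrete, here is a minimal Python sketch. It is not
Org's actual code: it only assumes that the link opener percent-decodes the
path exactly once before looking up the file. The name `resolve` is
hypothetical, and `urllib.parse.unquote` stands in for whatever Org does
internally.

```python
from urllib.parse import unquote

# Hypothetical model: the opener percent-decodes the link path exactly once.
def resolve(link_path: str) -> str:
    return unquote(link_path)

# Links 1-4: "%20" and a literal space decode to the same path.
paths = [
    "foo bar/foo bar.txt",
    "foo%20bar/foo bar.txt",
    "foo bar/foo%20bar.txt",
    "foo%20bar/foo%20bar.txt",
]
same = {resolve(p) for p in paths}

# Link 5: "%2F" decodes to "/", so the literal file name is lost.
link5 = resolve("foo bar/foo%2Fbar.txt")

# Links 6-7: "%25" decodes to "%", which restores the literal "%2F".
link6 = resolve("foo bar/foo%252Fbar.txt")
link7 = resolve("foo%20bar/foo%252Fbar.txt")
```

Under this model, writing the decoded form back into the buffer turns link 6
into link 5, which would match what I see after 'C-c C-l'.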

I have grabbed files whose names contain escaped characters such as %2F and %3A.
Is there any way to make a link point to them, or should I rename them?

Regards,

Fabrice


* Re: Escaping links
  2017-08-11 13:26 Escaping links Fabrice Popineau
@ 2017-08-11 15:13 ` Nicolas Goaziou
  2017-08-11 17:31   ` John Kitchin
  0 siblings, 1 reply; 9+ messages in thread
From: Nicolas Goaziou @ 2017-08-11 15:13 UTC (permalink / raw)
  To: Fabrice Popineau; +Cc: emacs-orgmode@gnu.org

Hello,

Fabrice Popineau <fabrice.popineau@gmail.com> writes:

> Are links to a file whose name already holds (url-)escaped chars supported?
>
> If I have a directory named "c:/temp/foo bar/"
> and files in this directory named
> foo.txt
> foo bar.txt
> foo%2Fbar.txt
>
> I can create links in an Org buffer by using `insert' but I find the
> situation a bit confusing.
>
> #+LINK: temp file:c:/temp/%s
>
> 1. [[temp:foo bar/foo bar.txt]]
> 2. [[temp:foo%20bar/foo bar.txt]]
> 3. [[temp:foo bar/foo%20bar.txt]]
> 4. [[temp:foo%20bar/foo%20bar.txt]]
>
>
> All of these links seem to work the same way.
>
> 5. [[temp:foo bar/foo%2Fbar.txt]]
> 6. [[temp:foo bar/foo%252Fbar.txt]]
> 7. [[temp:foo%20bar/foo%252Fbar.txt]]
>
> Link 5 does not work.
>
> Link 6 and 7 do work: as long as I press enter on the link, I visit the
> file.
>
> Unfortunately, if I edit these links with 'C-c C-l' and change nothing
> (just press Return), Org unescapes the escaped characters.
>
> I have grabbed files whose names contain escaped characters such as %2F and %3A.
> Is there any way to make a link point to them, or should I rename
> them?

You might get around it by not using link abbreviation.

Anyway, the core problem here is that:

  1. Org uses percent escaping to get around its own limitations (e.g., no
     square brackets allowed in a link);
  2. it's not possible to know if a string is percent-encoded or not;
  3. percent-encoding is not idempotent.

Using a different escaping mechanism to solve 1, and never
percent-decoding a URL, could put an end to the link mess.

An escaping mechanism that also solves 2 has yet to be found.
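
Points 2 and 3 can be illustrated with a short Python sketch. It uses
`urllib.parse.quote` and `unquote` as a stand-in for percent-encoding in
general; Org's own escaping functions may treat a different set of
characters, so take this as an illustration only.

```python
from urllib.parse import quote, unquote

name = "foo%2Fbar.txt"          # a literal file name that contains "%2F"

# Point 3: encoding twice differs from encoding once.
once = quote(name, safe="")
twice = quote(once, safe="")

# Point 2: the same stored string has two plausible readings.
stored = "foo%2Fbar.txt"
as_raw = stored                  # reading A: already a literal file name
as_encoded = unquote(stored)     # reading B: an encoding of "foo/bar.txt"
```

Nothing in `stored` itself says which reading is the intended one, so any
code that decodes "just in case" can corrupt a literal name.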


Regards,

-- 
Nicolas Goaziou


* Re: Escaping links
  2017-08-11 15:13 ` Nicolas Goaziou
@ 2017-08-11 17:31   ` John Kitchin
  2017-08-11 19:54     ` Fabrice Popineau
  2017-08-12 10:44     ` Nicolas Goaziou
  0 siblings, 2 replies; 9+ messages in thread
From: John Kitchin @ 2017-08-11 17:31 UTC (permalink / raw)
  To: Fabrice Popineau, Nicolas Goaziou; +Cc: emacs-orgmode@gnu.org

Could you put some magic at the beginning of the string that indicates it
is encoded?

--
John

-----------------------------------
Professor John Kitchin
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
@johnkitchin
http://kitchingroup.cheme.cmu.edu


* Re: Escaping links
  2017-08-11 17:31   ` John Kitchin
@ 2017-08-11 19:54     ` Fabrice Popineau
  2017-08-12 10:44     ` Nicolas Goaziou
  1 sibling, 0 replies; 9+ messages in thread
From: Fabrice Popineau @ 2017-08-11 19:54 UTC (permalink / raw)
  To: John Kitchin; +Cc: emacs-orgmode@gnu.org, Nicolas Goaziou

2017-08-11 19:31 GMT+02:00 John Kitchin <jkitchin@andrew.cmu.edu>:

> Could you put some magic at the beginning of the string that indicates it
> is encoded?
>
>
For my own needs, yes, I could probably define a handler for my special kind
of abbreviation.

I have only a small number of such files, so I renamed them.

I mainly wanted an update (thanks, Nicolas) on the current status of
link escaping, in case I had missed something.

Regards,

Fabrice


* Re: Escaping links
  2017-08-11 17:31   ` John Kitchin
  2017-08-11 19:54     ` Fabrice Popineau
@ 2017-08-12 10:44     ` Nicolas Goaziou
  2017-08-12 14:01       ` John Kitchin
  1 sibling, 1 reply; 9+ messages in thread
From: Nicolas Goaziou @ 2017-08-12 10:44 UTC (permalink / raw)
  To: John Kitchin; +Cc: emacs-orgmode@gnu.org, Fabrice Popineau

Hello,

John Kitchin <jkitchin@andrew.cmu.edu> writes:

> Could you put some magic at the beginning of the string that indicates it
> is encoded?

I don't know. Could you elaborate a bit?

Regards,

-- 
Nicolas Goaziou                                                0x80A93738


* Re: Escaping links
  2017-08-12 10:44     ` Nicolas Goaziou
@ 2017-08-12 14:01       ` John Kitchin
  2017-08-14 16:26         ` Neil Jerram
  0 siblings, 1 reply; 9+ messages in thread
From: John Kitchin @ 2017-08-12 14:01 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode@gnu.org, Fabrice Popineau

I was thinking of something like how all PDF files start with something
like %PDF-1.3. Any string that starts with %org-9.0, for example,
would be known to be encoded, whereas a string with any other beginning
could be either encoded or not.

Nicolas Goaziou writes:

> Hello,
>
> John Kitchin <jkitchin@andrew.cmu.edu> writes:
>
>> Could you put some magic at the beginning of the string that indicates it
>> is encoded?
>
> I don't know. Could you elaborate a bit?
>
> Regards,


-- 
Professor John Kitchin
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
@johnkitchin
http://kitchingroup.cheme.cmu.edu


* Re: Escaping links
  2017-08-12 14:01       ` John Kitchin
@ 2017-08-14 16:26         ` Neil Jerram
  2017-08-14 16:35           ` Fabrice Popineau
  0 siblings, 1 reply; 9+ messages in thread
From: Neil Jerram @ 2017-08-14 16:26 UTC (permalink / raw)
  To: emacs-orgmode

Except if your original string was "%org-9.0"...

For this kind of approach to work, you generally need to prefix
everything, specifically including the cases that are _not_ encoded.
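
A rough Python sketch of the difference, under stated assumptions: the
function names and the one-character tags "e" and "r" are hypothetical, and
`urllib.parse.quote`/`unquote` stand in for the encoding step. Scheme 1 marks
only encoded strings; scheme 2 tags every string.

```python
from urllib.parse import quote, unquote

MAGIC = "%org-9.0"  # marker taken from the example in this thread

# Scheme 1: mark only the encoded strings.
def store_marked(raw: str) -> str:
    return MAGIC + quote(raw, safe="")

def load_scheme1(stored: str) -> str:
    if stored.startswith(MAGIC):
        return unquote(stored[len(MAGIC):])
    return stored  # no marker: assumed to be raw

# Scheme 2: tag every string, "e" for encoded and "r" for raw.
def store_tagged(raw: str, encode: bool) -> str:
    if encode:
        return "e" + quote(raw, safe="")
    return "r" + raw

def load_tagged(stored: str) -> str:
    tag, body = stored[0], stored[1:]
    return unquote(body) if tag == "e" else body

# A raw string that happens to start with the marker.
tricky = "%org-9.0%41"
misread = load_scheme1(tricky)                       # treated as encoded
roundtrip = load_tagged(store_tagged(tricky, False))  # survives unchanged
```

With scheme 1 the raw string is mistaken for an encoded one; with scheme 2
the tag is always present, so the reader never has to guess.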

Regards - Neil



* Re: Escaping links
  2017-08-14 16:26         ` Neil Jerram
@ 2017-08-14 16:35           ` Fabrice Popineau
  2017-08-19  9:15             ` Nicolas Goaziou
  0 siblings, 1 reply; 9+ messages in thread
From: Fabrice Popineau @ 2017-08-14 16:35 UTC (permalink / raw)
  To: Neil Jerram; +Cc: emacs-orgmode@gnu.org

You could also prefix the link with a string holding (in ASCII) the number of
bytes in the unencoded link.

But that makes raw/manual editing of an Org file much harder.
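
A minimal Python sketch of what I mean, assuming a hypothetical
"<count>:<payload>" layout and using `urllib.parse.quote`/`unquote` as the
encoding step. The reader compares the payload's byte length with the stored
count to decide whether decoding is needed.

```python
from urllib.parse import quote, unquote

# Hypothetical format: "<byte count of the unencoded link>:<payload>".
def store(raw: str, encode: bool) -> str:
    payload = quote(raw, safe="") if encode else raw
    return f"{len(raw.encode('utf-8'))}:{payload}"

def load(stored: str) -> str:
    count, payload = stored.split(":", 1)
    if len(payload.encode("utf-8")) == int(count):
        return payload          # lengths agree: payload is the unencoded link
    return unquote(payload)     # payload is longer: it was percent-encoded

raw = "foo%2Fbar.txt"
plain = store(raw, encode=False)
encoded = store(raw, encode=True)
```

Both stored forms load back to the same file name. The drawback is visible
too: anyone who edits the payload by hand must also update the count.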



-- 
Fabrice Popineau
-----------------------------
SUPELEC
Département Informatique
3, rue Joliot Curie
91192 Gif/Yvette Cedex
Tel direct : +33 (0) 169851950
Standard : +33 (0) 169851212
------------------------------


* Re: Escaping links
  2017-08-14 16:35           ` Fabrice Popineau
@ 2017-08-19  9:15             ` Nicolas Goaziou
  0 siblings, 0 replies; 9+ messages in thread
From: Nicolas Goaziou @ 2017-08-19  9:15 UTC (permalink / raw)
  To: Fabrice Popineau; +Cc: Neil Jerram, emacs-orgmode@gnu.org, fabrice.popineau

Hello,

Fabrice Popineau <fabrice.popineau@supelec.fr> writes:

> You could also prefix the link with a string holding (in ASCII) the number of
> bytes in the unencoded link.
>
> But that makes raw/manual editing of an org file much harder.

I'd rather have something simple.

Here are some suggestions.

1. Replace "\\[\\[\\([^][]+\\)\\]\\(\\[\\([^][]+\\)\\]\\)?\\]"
   (`org-bracket-link-regexp') with
   "\\[\\[\\([^\000]+?\\)\\]\\(\\[\\([^\000]+?\\)\\]\\)?\\]". This allows
   square brackets inside the path and the description. We will just live
   with the unsupported cases (e.g. square brackets at the end of the path
   or the description).

2. Use good ole backslash character to escape ambiguous characters (even
   though any character can be escaped). `org-link-unescape' would take
   care of them instead of url-encoded characters.

3. A mix of both. `org-bracket-link-regexp' could become more
   complicated though.
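
To show the effect of suggestion 1 outside Emacs, here is a small Python
sketch. The two patterns are my transcriptions of the Emacs regexps into
Python syntax, which I assume to be equivalent; the sample links are made up.

```python
import re

# Python transcriptions of the two Emacs regexps from suggestion 1.
OLD = re.compile(r"\[\[([^\]\[]+)\](\[([^\]\[]+)\])?\]")
NEW = re.compile(r"\[\[([^\x00]+?)\](\[([^\x00]+?)\])?\]")

# Brackets in the middle of the path: only the new regexp matches.
text = "see [[file:a[1].txt][notes]] here"
old_match = OLD.search(text)
new_match = NEW.search(text)

# Unsupported case: a bracket at the very end of the path.
# The non-greedy match stops too early and truncates the path.
bad = NEW.search("[[file:a[1]][notes]]")
```

Here `new_match` captures the full path and the description, while `bad`
captures a path that is cut off before the closing bracket.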

I'm open to other suggestions, as long as they do not massively impede
manual editing.
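
For suggestion 2, a rough Python sketch of the idea follows. It is not a
proposed implementation of `org-link-unescape`: the helper names are
hypothetical, and it assumes that only "[", "]" and the backslash itself
need escaping.

```python
import re

# Assumption: only "[", "]" and the backslash itself need escaping.
def escape(path: str) -> str:
    return "".join("\\" + c if c in "[]\\" else c for c in path)

# A backslash followed by any character yields that character.
def unescape(escaped: str) -> str:
    return re.sub(r"\\(.)", r"\1", escaped, flags=re.DOTALL)

name = "foo%2Fbar[1].txt"
escaped = escape(name)
restored = unescape(escaped)
```

Percent signs pass through untouched in both directions, so a file name
that already contains "%2F" is never decoded behind the user's back.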

Regards,

-- 
Nicolas Goaziou
