From: David Maus <dmaus@ictsoc.de>
To: org-mode <emacs-orgmode@gnu.org>, bastien.guerry@wikimedia.fr
Subject: Improve percent escaping links in Org mode (pull request / OK to push)
Date: Sun, 02 Jan 2011 20:37:24 +0100
Message-ID: <87lj33f66j.wl%dmaus@ictsoc.de>


This is a pull request or push announcement for the first set of
patches to improve Org mode's percent escaping functions.  This set of
changes solves the problems with percent escaping non-ascii
characters.

git@github.com:dmj/dmj-org-mode.git feature/org-percent-escaping

I do have commit access but because this set of changes might break
things seriously I'd like to get an "OK to push" or someone who pulls
and reviews the changeset.

The problem:

The current implementation of percent escaping URIs uses a whitelist
approach, i.e. it percent escapes only those characters that are in
`org-link-escape-chars' or in a user-supplied list.  This is a problem
because using the function requires knowledge of every character
that could occur in a URI.  Since URIs are limited to plain ASCII, a
call to the function must list literally all possible characters and
their escapings to get a properly percent-escaped string.

The changes:

- `org-link-escape' percent escapes every character that matches one
  of the following conditions:

  * equal 37 (percent sign)
  * equal 127 (DEL, control character)
  * below 32 (control character)
  * above 127 (non-ASCII character)
  * a character in the escaping table (e.g. `org-link-escape-chars')

  The character in question is first encoded in UTF-8, then all bytes
  of the resulting character are percent escaped.  If converting to
  UTF-8 fails, Org throws an error indicating this problem.

  The function got an optional third argument which can be set to merge
  the user-defined table with the default escaping table.

- `org-link-unescape' unescapes every percent-escape sequence.  It is
  no longer possible to supply a list of characters that should be
  unescaped.  No function in core used `org-link-unescape' with an
  unescaping table.

  Internally the `org-protocol-unhex-*' functions were renamed to
  `org-link-unescape-*', moved to org.el and refactored (thanks to
  Vincent Belaïche for suggesting some of the changes).  The old
  `org-protocol-unhex-*' names are declared obsolete and aliased to
  the new functions as of 2010-11-21.

  The unescaping function is backward compatible and unescapes the old
  percent escape format for non-ASCII characters (thanks to Sebastian
  Rose).

  It is possible that the new implementation will break links in at
  least this (known) case: the user stored a link to a file or
  directory whose name contains a percent sign.  Currently Org mode
  does not percent escape the percent sign, so the new variant of
  `org-link-unescape' will try to unescape the alleged percent escape
  sequence.[1]

- `org-link-escape-chars' format changed.  It's just a list of
  characters to escape, the percent escape sequence is implied by the
  character.

  Functions in core that used a custom escaping table are changed
  accordingly to use the new table format.
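
For readers who want to check the escaping rules without Emacs at
hand, here is a minimal sketch of them in Python.  It is not the code
from the branch: the names `link_escape', `needs_escape' and
`DEFAULT_ESCAPE_CHARS', the contents of the default table and the
upper-case hex digits are all assumptions made for illustration.

```python
# Illustrative sketch only; not the Emacs Lisp implementation.
# The table contents below are hypothetical, not `org-link-escape-chars'.
DEFAULT_ESCAPE_CHARS = [" ", "[", "]"]


def needs_escape(ch, table):
    """Return True if CH matches one of the conditions listed above."""
    code = ord(ch)
    return (
        code == 37        # percent sign
        or code == 127    # DEL
        or code < 32      # control character
        or code > 127     # non-ASCII character
        or ch in table    # character in the escaping table
    )


def link_escape(text, table=None, merge=False):
    """Percent escape TEXT.

    TABLE is a plain list of characters (new table format).  If MERGE
    is true, TABLE is merged with the default table instead of
    replacing it.
    """
    if table is None:
        table = DEFAULT_ESCAPE_CHARS
    elif merge:
        table = list(table) + DEFAULT_ESCAPE_CHARS
    out = []
    for ch in text:
        if needs_escape(ch, table):
            # Encode to UTF-8 first, then escape every resulting byte.
            # encode() raises UnicodeEncodeError if the character
            # cannot be represented in UTF-8.
            out.append("".join("%%%02X" % b for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)
```

With this sketch `link_escape("é")' gives "%C3%A9" and
`link_escape("100%")' gives "100%25".  Passing a table such as [";"]
without the merge flag escapes only ";" plus the always-escaped
classes, while the merge flag additionally applies the default table.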

What is next:

  - check if we can fall back to using `url-hexify-string' and
    `url-unhex-string' instead of our own functions
  - check if the recent problems with percent escaping are solved

Best,
  -- David

[1] Not escaping the percent sign is actually a glitch: Try to store
and open a link to a file literally called "foo%20baz.org".
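
The glitch can also be seen outside Emacs.  The sketch below uses
Python's `urllib.parse' as a stand-in for the Org functions (an
assumption for illustration; Org's own functions are not involved) to
show why a file name containing a literal percent sign only survives a
round trip if the percent sign itself is escaped.

```python
from urllib.parse import quote, unquote

filename = "foo%20baz.org"

# Stored without escaping the percent sign: unescaping treats "%20"
# as an escape sequence and yields a different file name.
broken = unquote(filename)

# Stored with the percent sign escaped: unescaping restores the
# original file name.
stored = quote(filename)
restored = unquote(stored)

print(broken)    # foo baz.org
print(stored)    # foo%2520baz.org
print(restored)  # foo%20baz.org
```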


Thread overview: 22+ messages
2011-01-02 19:37 David Maus [this message]
2011-02-12 22:17 ` Improve percent escaping links in Org mode (pull request / OK to push) Bastien
2011-02-13 12:01   ` David Maus
2011-02-13 13:41     ` Bastien
2011-02-14  6:38       ` David Maus
2011-02-14 10:09         ` Bastien
2011-02-13 12:01   ` [PATCH 01/16] Decode single byte sequence if decoding unicode failed David Maus
2011-02-13 12:01   ` [PATCH 02/16] New unicode aware percent encoding algorithm David Maus
2011-02-13 12:01   ` [PATCH 03/16] New format of percent escape table David Maus
2011-02-13 12:01   ` [PATCH 04/16] Fixup doc string David Maus
2011-02-13 12:01   ` [PATCH 05/16] New optional argument: Merge user table with default table David Maus
2011-02-13 12:01   ` [PATCH 06/16] Inline function to properly decode utf8 characters in Emacs 22 David Maus
2011-02-13 12:01   ` [PATCH 07/16] Unescape functions moved and renamed from org-protocol.el David Maus
2011-02-13 12:01   ` [PATCH 08/16] Declare obsolete & alias to respective org-link-unescape-* functions David Maus
2011-02-13 12:01   ` [PATCH 09/16] Remove obsolete argument in call to org-link-unescape David Maus
2011-02-13 12:01   ` [PATCH 10/16] Use new percent escape character table format David Maus
2011-02-13 12:01   ` [PATCH 11/16] Add percent sign to list of escape chars David Maus
2011-02-13 12:01   ` [PATCH 12/16] Rename lambda argument David Maus
2011-02-13 12:01   ` [PATCH 13/16] Refactor unescaping functions David Maus
2011-02-13 12:01   ` [PATCH 14/16] Always percent escape the percent sign David Maus
2011-02-13 12:01   ` [PATCH 15/16] Use `org-link-unescape' instead of obsolete unhex string function David Maus
2011-02-13 12:01   ` [PATCH 16/16] Throw error if encoding character in utf8 fails David Maus
