* url-get-url-at-point & url-get-url-filename-chars | comma-p
From: S+*n_Pe*rm*n @ 2009-03-22  1:01 UTC (permalink / raw)
  To: emacs-devel

The variable `url-get-url-filename-chars' is bound globally to
"-%.?@a-zA-Z0-9()_/:~=&"
and is used by `url-get-url-at-point' in ../lisp/url/url-util.el.

Why isn't the comma (",") defined as a valid URL character?
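
To make the question concrete, here is a rough Python sketch (not the
Emacs Lisp code in url-util.el) that scans a string using the character
class quoted above. The helper name `scan_url` and the example URL are
hypothetical, and the real `url-get-url-at-point' does more than this;
the sketch only shows where a scan based on this class would stop.

```python
import re

# Character class copied from the value of `url-get-url-filename-chars'
# quoted above. Everything else in this block is an illustrative assumption.
FILENAME_CHARS = r"-%.?@a-zA-Z0-9()_/:~=&"


def scan_url(text, start=0, chars=FILENAME_CHARS):
    """Return the longest run of characters from `chars` beginning at `start`."""
    match = re.compile("[" + chars + "]+").match(text, start)
    return match.group(0) if match else ""


text = "http://example.com/name,1.1 rest of the sentence"

# With the class as quoted, the scan stops at the comma.
print(scan_url(text))

# With "," appended to the class, the scan continues up to the space.
print(scan_url(text, chars=FILENAME_CHARS + ","))
```

With the class as quoted, the first call stops at "name"; only the
second call picks up the ",1.1" part of the segment.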

RFC 3986 classifies the comma (",") as a reserved character (a sub-delim).

Per section 3.3:
|   Aside from dot-segments in hierarchical paths, a path segment is
|   considered opaque by the generic syntax.  URI producing applications
|   often use the reserved characters allowed in a segment to delimit
|   scheme-specific or dereference-handler-specific subcomponents.  For
|   example, the semicolon (";") and equals ("=") reserved characters are
|   often used to delimit parameters and parameter values applicable to
|   that segment.  The comma (",") reserved character is often used for
|   similar purposes.  For example, one URI producer might use a segment
|   such as "name;v=1.1" to indicate a reference to version 1.1 of
|   "name", whereas another might use a segment such as "name,1.1" to
|   indicate the same.  Parameter types may be defined by scheme-specific
|   semantics, but in most cases the syntax of a parameter is specific to
|   the implementation of the URI's dereferencing algorithm.
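
As a small illustration of the two segment styles the RFC mentions, the
following Python sketch splits a path segment at the first ";" or ",".
The function name `split_segment` is hypothetical, and treating both
delimiters the same way is an assumption made for the example; the RFC
leaves the meaning of such parameters to the scheme or implementation.

```python
def split_segment(segment):
    """Split a path segment into (name, parameters) at the first ';' or ','.

    Illustrative only: RFC 3986 treats the segment as opaque, so this
    interpretation is an assumption, not something the RFC mandates.
    """
    for index, char in enumerate(segment):
        if char in ";,":
            return segment[:index], segment[index + 1:]
    # No delimiter found: the whole segment is the name.
    return segment, ""


print(split_segment("name;v=1.1"))
print(split_segment("name,1.1"))
```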


See: RFC 3986, Uniform Resource Identifier (URI): Generic Syntax. T.
       Berners-Lee, R. Fielding, L. Masinter. January 2005. (Format:
       TXT=141811 bytes) (Obsoletes RFC2732, RFC2396, RFC1808) (Updates
       RFC1738) (Also STD0066) (Status: STANDARD)

(URL `http://www.ietf.org/rfc/rfc3986.txt?number=3986')
-
I'm not necessarily advocating adding the comma (",") to the
variable, and I understand that doing so could cause syntax
conflicts elsewhere. However, I would like to know whether I should
expect `url-get-url-filename-chars' to change at some point. It
would be nice to fully leverage the functions in ./url, but I can see
how excluding the comma (","), as the variable is currently defined,
might cause problems in the future.

(FWIW, I seem to be encountering commas in web URLs with increasing frequency.)


s_P





* Re: url-get-url-at-point & url-get-url-filename-chars | comma-p
From: Stefan Monnier @ 2009-03-22  1:18 UTC (permalink / raw)
  To: S+*n_Pe*rm*n; +Cc: emacs-devel

> The var `url-get-url-filename-chars' is bound globally to
> "-%.?@a-zA-Z0-9()_/:~=&"
> and called by `url-get-url-at-point' in ../lisp/url/url-util.el

> Why isn't comma (",")  defined as a valid URL character?

url-get-url-filename-chars doesn't actually define valid URL chars
(contrary to what its docstring claims), only chars that are likely to be
included in URLs that appear in the middle of other chunks of text, and
that are unlikely to be part of the surrounding text.

> (FWIW I seem to be encountering comma's in web URLs with increasing
> frequency).

It's likely that url-get-url-at-point will need to be updated if that's
indeed becoming the norm.  E.g. we already allow "." and then filter it
out if it appears as the last element.  We could do the same for ",".


        Stefan



