unofficial mirror of emacs-devel@gnu.org 
* Why is &apos; commented out in mm-url-html-entities?
@ 2010-11-03  1:18 Lennart Borgman
  2010-11-03 18:21 ` Deniz Dogan
  0 siblings, 1 reply; 4+ messages in thread
From: Lennart Borgman @ 2010-11-03  1:18 UTC (permalink / raw)
  To: Emacs-Devel devel

It seems it is used quite often, at least on web pages.




* Re: Why is &apos; commented out in mm-url-html-entities?
  2010-11-03  1:18 Why is &apos; commented out in mm-url-html-entities? Lennart Borgman
@ 2010-11-03 18:21 ` Deniz Dogan
  2010-11-03 18:35   ` Lennart Borgman
  0 siblings, 1 reply; 4+ messages in thread
From: Deniz Dogan @ 2010-11-03 18:21 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: Emacs-Devel devel

2010/11/3 Lennart Borgman <lennart.borgman@gmail.com>:
> It seems it is used quite often, at least on web pages.
>
>

&apos; is special.

The following is quoted from the current version of:
http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Entities_representing_special_characters_in_XHTML

----8<----

The XHTML DTDs explicitly declare 253 entities (including the 5
predefined entities of XML 1.0) whose expansion is a single character,
which can therefore be informally referred to as "character entities".
These (with the exception of the &apos; entity) have the same names
and represent the same characters as the 252 character entities in
HTML. Also, by virtue of being XML, XHTML documents may reference the
predefined &apos; entity, which is not one of the 252 character
entities in HTML. Additional entities of any size may be defined on a
per-document basis. However, the usability of entity references in
XHTML is affected by how the document is being processed:

    * If the document is read by a conforming HTML processor, then
only the 252 HTML character entities can safely be used. The use of
&apos; or custom entity references may not be supported and may
produce unpredictable results.
    * If the document is read by an XML parser that does not or cannot
read external entities, then only the five built-in XML character
entities (see above) can safely be used, although other entities may
be used if they are declared in the internal DTD subset.
    * If the document is read by an XML parser that does read external
entities, then the five built-in XML character entities can safely be
used. The other 248 HTML character entities can be used as long as the
XHTML DTD is accessible to the parser at the time the document is
read. Other entities may also be used if they are declared in the
internal DTD subset.

Because of the special &apos; case mentioned above, only &quot;,
&amp;, &lt;, and &gt; will work in all processing situations.

----8<----

That said, I guess the reason for leaving it out of the alist is that
its value would be ambiguous.
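
For illustration only (this is not mm-url.el): Python's standard
html.entities module keeps an HTML 4 table and an HTML5 table, and the
gap described in the quote shows up there as well. A minimal sketch,
assuming Python 3; XML_PREDEFINED is just a helper name made up for
this example.

```python
from html.entities import html5, name2codepoint

# The five entities predefined by XML 1.0 (helper name is illustrative).
XML_PREDEFINED = ("quot", "amp", "lt", "gt", "apos")

# Names from the XML set that the HTML 4 table does not define.
missing_in_html4 = [n for n in XML_PREDEFINED if n not in name2codepoint]

# The HTML5 table does know the name; its keys include the trailing ';'.
apos_in_html5 = html5.get("apos;")

print(missing_in_html4)
print(repr(apos_in_html5))
```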

-- 
Deniz Dogan




* Re: Why is &apos; commented out in mm-url-html-entities?
  2010-11-03 18:21 ` Deniz Dogan
@ 2010-11-03 18:35   ` Lennart Borgman
  2010-11-03 23:06     ` Deniz Dogan
  0 siblings, 1 reply; 4+ messages in thread
From: Lennart Borgman @ 2010-11-03 18:35 UTC (permalink / raw)
  To: Deniz Dogan; +Cc: Emacs-Devel devel

On Wed, Nov 3, 2010 at 7:21 PM, Deniz Dogan <deniz.a.m.dogan@gmail.com> wrote:
>
> That said, I guess the reason for leaving it out of the alist is that
> its value would be ambiguous.

Thanks Deniz. However I see no reason to leave it out when decoding
(only when encoding). How can we handle this? Or is it already
handled?
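
To make the decode-only idea concrete, here is a rough sketch in Python
rather than Emacs Lisp. It is not how mm-url.el works, and every name
in it (DECODE, ENCODE, decode_entities, encode_entities) is
hypothetical. The decoder accepts &apos;, while the encoder never
produces it and uses the numeric reference &#39; instead.

```python
import re

# Hypothetical decode table: accepts &apos; in addition to the four
# entities that work in every processing situation.
DECODE = {"quot": '"', "amp": "&", "lt": "<", "gt": ">", "apos": "'"}

# Hypothetical encode table: deliberately has no named entity for "'".
ENCODE = {'"': "&quot;", "&": "&amp;", "<": "&lt;", ">": "&gt;", "'": "&#39;"}

_ENTITY = re.compile(r"&(\w+);")


def decode_entities(text):
    """Replace known named entities in one pass; leave unknown ones as-is."""
    return _ENTITY.sub(lambda m: DECODE.get(m.group(1), m.group(0)), text)


def encode_entities(text):
    """Escape special characters, emitting &#39; rather than &apos;."""
    return "".join(ENCODE.get(ch, ch) for ch in text)


print(decode_entities("It&apos;s &lt;b&gt;"))
print(encode_entities("It's <b>"))
```

With separate tables like these, accepting &apos; on input does not
force anything to emit it on output.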




* Re: Why is &apos; commented out in mm-url-html-entities?
  2010-11-03 18:35   ` Lennart Borgman
@ 2010-11-03 23:06     ` Deniz Dogan
  0 siblings, 0 replies; 4+ messages in thread
From: Deniz Dogan @ 2010-11-03 23:06 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: Emacs-Devel devel

2010/11/3 Lennart Borgman <lennart.borgman@gmail.com>:
> On Wed, Nov 3, 2010 at 7:21 PM, Deniz Dogan <deniz.a.m.dogan@gmail.com> wrote:
>>
>> That said, I guess the reason for leaving it out of the alist is that
>> its value would be ambiguous.
>
> Thanks Deniz. However I see no reason to leave it out when decoding
> (only when encoding). How can we handle this? Or is it already
> handled?
>

I am not too familiar with mm-url.el (I don't use Gnus) so I don't know.

However, I noticed that the value of rsquo is the same as apos (39),
with a comment next to it saying it really should be U+8217. So it
seems that not only does it not encode apos, it also encodes rsquo
incorrectly.
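
As a quick cross-check outside Emacs (a sketch, assuming Python 3 and
its standard html.entities table, which is unrelated to mm-url.el): the
8217 in that comment reads as a decimal code point, which is U+2019 in
hexadecimal, whereas 39 is the plain ASCII apostrophe.

```python
from html.entities import name2codepoint

# Decimal code point that the HTML 4 table assigns to &rsquo;.
rsquo = name2codepoint["rsquo"]

# Show decimal value, U+hex notation, and the character itself.
print(rsquo, f"U+{rsquo:04X}", chr(rsquo))
print(39, f"U+{39:04X}", chr(39))
```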

Does anyone know what's going on here?

-- 
Deniz Dogan


