* Re: [juri@jurta.org: Why 10 lines?]
From: Kenichi Handa @ 2006-06-02 3:54 UTC
Cc: juri, walters, emacs-devel
In article <E1Fm05r-0002bd-6s@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
> Please DTRT.
Ok.
> ------- Start of forwarded message -------
> From: Juri Linkov <juri@jurta.org>
> To: emacs-devel@gnu.org
> Subject: Why 10 lines?
[...]
> sgml-html-meta-auto-coding-function has a hard-coded limit of 10 lines
> when searching for the HTML meta tag. But HTML files can have the meta tag
> beyond that limit. For example, HTML files generated by livejournal
> contain this tag on the 11th line, and Emacs fails to recognize the coding
> system of such files.
> I propose to limit the search for the HTML meta tag to the end of the
> existing HTML header (by looking for </head>). The limit of 10 lines
> (or perhaps a slightly larger number) could still be applied, but only
> when there is no HTML header.
I think the change is good and has no problems, so I
installed it.
---
Kenichi Handa
handa@m17n.org
> Index: lisp/international/mule.el
> ===================================================================
> RCS file: /sources/emacs/emacs/lisp/international/mule.el,v
> retrieving revision 1.236
> diff -c -r1.236 mule.el
> *** lisp/international/mule.el 24 May 2006 13:22:12 -0000 1.236
> --- lisp/international/mule.el 1 Jun 2006 00:55:47 -0000
> ***************
> *** 2253,2261 ****
> "If the buffer has an HTML meta tag, use it to determine encoding.
> This function is intended to be added to `auto-coding-functions'."
> (setq size (min (+ (point) size)
> - ;; Only search forward 10 lines
> (save-excursion
> ! (forward-line 10)
> (point))))
> (when (and (search-forward "<html" size t)
> (re-search-forward "<meta\\s-+http-equiv=\"content-type\"\\s-+content=\"text/\\sw+;\\s-*charset=\\(.+?\\)\"" size t))
> --- 2257,2267 ----
> "If the buffer has an HTML meta tag, use it to determine encoding.
> This function is intended to be added to `auto-coding-functions'."
> (setq size (min (+ (point) size)
> (save-excursion
> ! ;; Limit the search by the end of the HTML header
> ! (or (search-forward "</head>" size t)
> ! ;; In case of no header, search only 10 lines
> ! (forward-line 10))
> (point))))
> (when (and (search-forward "<html" size t)
> (re-search-forward "<meta\\s-+http-equiv=\"content-type\"\\s-+content=\"text/\\sw+;\\s-*charset=\\(.+?\\)\"" size t))
> --
> Juri Linkov
> http://www.jurta.org/emacs/
> _______________________________________________
> Emacs-devel mailing list
> Emacs-devel@gnu.org
> http://lists.gnu.org/mailman/listinfo/emacs-devel
> ------- End of forwarded message -------
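The limit computation in the patch can be sketched as a standalone helper.
This is only an illustration: `my-html-meta-search-limit' is a hypothetical
name, not a function in mule.el. The sketch also assumes that the bound
handed to `search-forward' should be a buffer position, so it computes
(+ (point) size) first, whereas the diff above passes `size' directly.

```elisp
;; Hypothetical helper for illustration; not part of mule.el.
(defun my-html-meta-search-limit (size)
  "Return the position at which a search for the HTML meta tag should stop.
SIZE is the number of characters after point that may be examined."
  (let ((bound (min (point-max) (+ (point) size)))
        (case-fold-search t))
    (save-excursion
      ;; Prefer the end of the HTML header when one is present.
      (or (search-forward "</head>" bound t)
          ;; Without a header, fall back to a 10-line window.
          (forward-line 10))
      ;; Never return a position beyond the SIZE-based bound.
      (min bound (point)))))

;; Usage: the meta tag sits well past line 10, yet the limit still
;; covers it because the header end is found first.
(with-temp-buffer
  (insert "<html>\n<head>\n" (make-string 12 ?\n)
          "<meta http-equiv=\"content-type\" content=\"text/html; charset=utf-8\">\n"
          "</head>\n<body></body></html>\n")
  (goto-char (point-min))
  (my-html-meta-search-limit (buffer-size)))
```

With a header present, the returned position is the one just after
"</head>"; with no header, it is the start of the 11th line or the
SIZE-based bound, whichever comes first.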
* Re: [juri@jurta.org: Why 10 lines?]
From: Juri Linkov @ 2006-06-02 9:28 UTC
Cc: walters, rms, emacs-devel
> I think the change is good and has no problems, so I installed it.
Thanks. What do you think about a related uninstalled change in
http://lists.gnu.org/archive/html/emacs-devel/2005-10/msg00916.html.
Do you see any problems with it?
--
Juri Linkov
http://www.jurta.org/emacs/
* Re: [juri@jurta.org: Why 10 lines?]
From: Kenichi Handa @ 2006-06-02 11:19 UTC
Cc: walters, rms, emacs-devel
In article <87k67zlxqi.fsf@jurta.org>, Juri Linkov <juri@jurta.org> writes:
>> I think the change is good and has no problems, so I installed it.
> Thanks. What do you think about a related uninstalled change in
> http://lists.gnu.org/archive/html/emacs-devel/2005-10/msg00916.html.
> Do you see any problems with it?
As I don't know the details of the HTML/SGML/XHTML
specifications, I can't tell whether the change is good or
not. If there's no objection, please install it. I remember
that Richard wrote the same thing.
---
Kenichi Handa
handa@m17n.org
* Re: [juri@jurta.org: Why 10 lines?]
From: Kevin Rodgers @ 2006-06-02 22:21 UTC
Juri Linkov wrote:
> Thanks. What do you think about a related uninstalled change in
> http://lists.gnu.org/archive/html/emacs-devel/2005-10/msg00916.html.
> Do you see any problems with it?
It also allows mismatched beginning and ending quotes (both present vs.
absent and double vs. single). Under a "Be liberal in what you accept"
policy that may be acceptable, but it goes a lot further than just
allowing properly matched single quotes or no quotes at all as intended.
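To illustrate the concern, here is a minimal sketch. Both regexps are
hypothetical and written only for this example; neither is the regexp from
the linked 2005-10 message. The loose one makes each quote optional on its
own, so it also accepts mismatched quotes. The strict one is one possible
way to require the closing quote to equal the opening one, using a
backreference.

```elisp
;; Hypothetical patterns for illustration; not taken from the linked patch.
(defconst my-loose-charset-re
  "charset=[\"']?\\([-a-zA-Z0-9_]+\\)[\"']?"
  "Each quote is optional independently, so mismatches are accepted.")

(defconst my-strict-charset-re
  "charset=\\([\"']?\\)\\([-a-zA-Z0-9_]+\\)\\1"
  "The closing quote must repeat whatever group 1 matched (possibly nothing).")

(defun my-charset-match (regexp group string)
  "Return the text of GROUP if REGEXP matches STRING, else nil."
  (when (string-match regexp string)
    (match-string group string)))

;; Double quote opened, single quote closed: the loose pattern accepts it.
(my-charset-match my-loose-charset-re 1 "charset=\"utf-8'")
;; The strict pattern rejects the same input ...
(my-charset-match my-strict-charset-re 2 "charset=\"utf-8'")
;; ... but still accepts matched single quotes and no quotes at all.
(my-charset-match my-strict-charset-re 2 "charset='utf-8'")
(my-charset-match my-strict-charset-re 2 "charset=utf-8")
```

Whether the extra strictness is worth it depends on how liberal the
function is meant to be.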
Thanks,
--
Kevin
* Re: [juri@jurta.org: Why 10 lines?]
From: Juri Linkov @ 2006-06-03 6:43 UTC
Cc: emacs-devel
>> Thanks. What do you think about a related uninstalled change in
>> http://lists.gnu.org/archive/html/emacs-devel/2005-10/msg00916.html.
>> Do you see any problems with it?
>
> It also allows mismatched beginning and ending quotes (both present vs.
> absent and double vs. single). Under a "Be liberal in what you accept"
> policy that may be acceptable, but it goes a lot further than just
> allowing properly matched single quotes or no quotes at all as intended.
This function is not an HTML validator, so we shouldn't be too pedantic in
this area. Some HTML files downloaded from the web might be invalid, but
I definitely don't want to correct them all or try to find their authors
and convince them to fix their HTML syntax. I just need to read downloaded
HTML files in Emacs with auto-detection of the coding system and without
much hassle.
--
Juri Linkov
http://www.jurta.org/emacs/