* Re: [juri@jurta.org: Why 10 lines?]
       [not found] <E1Fm05r-0002bd-6s@fencepost.gnu.org>
@ 2006-06-02  3:54 ` Kenichi Handa
  2006-06-02  9:28   ` Juri Linkov
  0 siblings, 1 reply; 5+ messages in thread
From: Kenichi Handa @ 2006-06-02  3:54 UTC
  Cc: juri, walters, emacs-devel

In article <E1Fm05r-0002bd-6s@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

> Please DTRT.

Ok.

> ------- Start of forwarded message -------
> From: Juri Linkov <juri@jurta.org>
> To: emacs-devel@gnu.org
> Subject: Why 10 lines?
[...]
> sgml-html-meta-auto-coding-function has the hard-coded limit of 10 lines
> to search for the HTML meta tag.  But HTML files can have the HTML meta tag
> outside the 10-line limit.  For example, HTML files generated by livejournal
> contain this tag on 11-th line, and Emacs fails to recognize the coding
> of such HTML files.

> I propose to limit the search for the HTML meta tag by the end of the
> existing HTML header (by looking for </head>).  The limit of 10 lines
> (or perhaps any slightly increased number) could be still applied only
> for the case if there are no HTML header.

I think the change is good and has no problem.  So, I installed it.

---
Kenichi Handa
handa@m17n.org

> Index: lisp/international/mule.el
> ===================================================================
> RCS file: /sources/emacs/emacs/lisp/international/mule.el,v
> retrieving revision 1.236
> diff -c -r1.236 mule.el
> *** lisp/international/mule.el	24 May 2006 13:22:12 -0000	1.236
> - --- lisp/international/mule.el	1 Jun 2006 00:55:47 -0000
> ***************
> *** 2253,2261 ****
>     "If the buffer has an HTML meta tag, use it to determine encoding.
>   This function is intended to be added to `auto-coding-functions'."
>     (setq size (min (+ (point) size)
> - -                   ;; Only search forward 10 lines
>                     (save-excursion
> !                     (forward-line 10)
>                       (point))))
>     (when (and (search-forward "<html" size t)
>                (re-search-forward "<meta\\s-+http-equiv=\"content-type\"\\s-+content=\"text/\\sw+;\\s-*charset=\\(.+?\\)\"" size t))
> - --- 2257,2267 ----
>     "If the buffer has an HTML meta tag, use it to determine encoding.
>   This function is intended to be added to `auto-coding-functions'."
>     (setq size (min (+ (point) size)
>                     (save-excursion
> !                     ;; Limit the search by the end of the HTML header
> !                     (or (search-forward "</head>" size t)
> !                         ;; In case of no header, search only 10 lines
> !                         (forward-line 10))
>                       (point))))
>     (when (and (search-forward "<html" size t)
>                (re-search-forward "<meta\\s-+http-equiv=\"content-type\"\\s-+content=\"text/\\sw+;\\s-*charset=\\(.+?\\)\"" size t))

> - -- 
> Juri Linkov
> http://www.jurta.org/emacs/

> _______________________________________________
> Emacs-devel mailing list
> Emacs-devel@gnu.org
> http://lists.gnu.org/mailman/listinfo/emacs-devel
> ------- End of forwarded message -------
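The bound computed by the installed change can be tried in isolation. What follows is only a minimal sketch, not the code in mule.el: the helper name `my-html-meta-search-limit` is hypothetical, and, as an assumption of this sketch, it passes the absolute position `limit` to `search-forward` where the patch passes `size`.

```elisp
;; Hypothetical helper, not part of mule.el.
(defun my-html-meta-search-limit (size)
  "Return the position at which a search for the HTML meta tag should stop.
SIZE is the number of characters after point that may be examined."
  (let ((limit (+ (point) size)))
    (min limit
         (save-excursion
           ;; Prefer the end of the HTML header when one exists.
           (or (search-forward "</head>" limit t)
               ;; No header found: fall back to the first 10 lines.
               (forward-line 10))
           (point)))))

;; Usage: the meta tag sits on line 11, past the old 10-line limit,
;; but before </head>, so the search below still reaches it.
(with-temp-buffer
  (insert "<html>\n<head>\n"
          (make-string 8 ?\n)
          "<meta http-equiv=\"content-type\" content=\"text/html; charset=utf-8\">\n"
          "</head>\n<body></body>\n</html>\n")
  (goto-char (point-min))
  (let ((limit (my-html-meta-search-limit (buffer-size))))
    (re-search-forward "charset=\\([-a-z0-9]+\\)" limit t)))
```

`forward-line` always returns an integer, which is non-nil in Emacs Lisp, so the `or` form only falls through to it when `</head>` is not found within the limit.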
* Re: [juri@jurta.org: Why 10 lines?]
  2006-06-02  3:54 ` [juri@jurta.org: Why 10 lines?] Kenichi Handa
@ 2006-06-02  9:28   ` Juri Linkov
  2006-06-02 11:19     ` Kenichi Handa
  2006-06-02 22:21     ` Kevin Rodgers
  0 siblings, 2 replies; 5+ messages in thread
From: Juri Linkov @ 2006-06-02  9:28 UTC
  Cc: walters, rms, emacs-devel

> I think the change is good and has no problem.  So, I installed it.

Thanks.  What do you think about a related uninstalled change in
http://lists.gnu.org/archive/html/emacs-devel/2005-10/msg00916.html.
Do you see any problems with it?

-- 
Juri Linkov
http://www.jurta.org/emacs/
* Re: [juri@jurta.org: Why 10 lines?]
  2006-06-02  9:28   ` Juri Linkov
@ 2006-06-02 11:19     ` Kenichi Handa
  2006-06-02 22:21     ` Kevin Rodgers
  1 sibling, 0 replies; 5+ messages in thread
From: Kenichi Handa @ 2006-06-02 11:19 UTC
  Cc: walters, rms, emacs-devel

In article <87k67zlxqi.fsf@jurta.org>, Juri Linkov <juri@jurta.org> writes:

>> I think the change is good and has no problem.  So, I installed it.

> Thanks.  What do you think about a related uninstalled change in
> http://lists.gnu.org/archive/html/emacs-devel/2005-10/msg00916.html.
> Do you see any problems with it?

As I don't know about the detail of HTML/SGML/XHTML
specification, I can't tell the change is good or not.  If
there's no objection, please install it.  I remember that
Richard wrote the same thing.

---
Kenichi Handa
handa@m17n.org
* Re: [juri@jurta.org: Why 10 lines?]
  2006-06-02  9:28   ` Juri Linkov
  2006-06-02 11:19     ` Kenichi Handa
@ 2006-06-02 22:21     ` Kevin Rodgers
  2006-06-03  6:43       ` Juri Linkov
  1 sibling, 1 reply; 5+ messages in thread
From: Kevin Rodgers @ 2006-06-02 22:21 UTC

Juri Linkov wrote:
> Thanks.  What do you think about a related uninstalled change in
> http://lists.gnu.org/archive/html/emacs-devel/2005-10/msg00916.html.
> Do you see any problems with it?

It also allows mismatched beginning and ending quotes (both present vs.
absent and double vs. single).  Under a "Be liberal in what you accept"
policy that may be acceptable, but it goes a lot further than just
allowing properly matched single quotes or no quotes at all as intended.

Thanks,
-- 
Kevin
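The distinction between accepting any combination of quotes and accepting only matched quotes can be illustrated with two regexps. This is a sketch under stated assumptions: both patterns and the helper `my-meta-quote-check` are hypothetical and are not taken from the uninstalled change under discussion, whose text does not appear in this thread.

```elisp
;; Hypothetical patterns for illustration only.

;; Liberal: an optional quote on each side, checked independently,
;; so mismatched or half-missing quotes are accepted too.
(defconst my-liberal-http-equiv-re
  "http-equiv=[\"']?content-type[\"']?")

;; Strict: the closing quote must repeat whatever the opening group
;; matched (double, single, or nothing), via the backreference \\1.
;; The trailing class stops a stray quote from being ignored.
(defconst my-strict-http-equiv-re
  "http-equiv=\\([\"']?\\)content-type\\1[ \t\n>]")

(defun my-meta-quote-check (text)
  "Return a list (LIBERAL STRICT) telling which pattern matches TEXT."
  (list (and (string-match my-liberal-http-equiv-re text) t)
        (and (string-match my-strict-http-equiv-re text) t)))

(my-meta-quote-check "http-equiv=\"content-type\" ") ; => (t t)
(my-meta-quote-check "http-equiv='content-type' ")   ; => (t t)
(my-meta-quote-check "http-equiv=content-type ")     ; => (t t)
(my-meta-quote-check "http-equiv=\"content-type' ")  ; => (t nil)
(my-meta-quote-check "http-equiv=content-type\" ")   ; => (t nil)
```

The last two calls show the mismatched cases described above: double vs. single, and present vs. absent. Whether rejecting them is worth the extra complexity is the design question raised in this message.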
* Re: [juri@jurta.org: Why 10 lines?]
  2006-06-02 22:21     ` Kevin Rodgers
@ 2006-06-03  6:43       ` Juri Linkov
  0 siblings, 0 replies; 5+ messages in thread
From: Juri Linkov @ 2006-06-03  6:43 UTC
  Cc: emacs-devel

>> Thanks.  What do you think about a related uninstalled change in
>> http://lists.gnu.org/archive/html/emacs-devel/2005-10/msg00916.html.
>> Do you see any problems with it?
>
> It also allows mismatched beginning and ending quotes (both present vs.
> absent and double vs. single).  Under a "Be liberal in what you accept"
> policy that may be acceptable, but it goes a lot further than just
> allowing properly matched single quotes or no quotes at all as intended.

This function is not an HTML validator, so we shouldn't be too pedantic
in this area.  Some HTML files downloaded from the web might be invalid,
but I definitely don't want to correct all them nor try to find their
authors and convince them to fix HTML syntax.  I just need to read
downloaded HTML files in Emacs with the auto-detection of the coding
system and without much hassle.

-- 
Juri Linkov
http://www.jurta.org/emacs/