* bug#6345: css-mode `css-extract-keyword-list' does not actually [PATCH]
@ 2010-06-03 18:01 MON KEY
2012-04-10 11:11 ` Lars Magne Ingebrigtsen
2015-03-19 22:42 ` bug#6345: Status: " Simen Heggestøyl
0 siblings, 2 replies; 4+ messages in thread
From: MON KEY @ 2010-06-03 18:01 UTC (permalink / raw)
To: 6345
[-- Attachment #1: Type: text/plain, Size: 4583 bytes --]
`css-extract-keyword-list' does not actually [PATCH]
In function `css-extract-keyword-list' the search for "Appendix
H. Index" fails, i.e. this form:
(search-backward "Appendix H. Index")
when used to search the contents of this URL:
"http://www.w3.org/TR/REC-CSS2/css2.txt"
which is dated: W3C Candidate Recommendation 08 September 2009
signals this error:
css-extract-keyword-list: Search failed: "Appendix H. Index"
It appears this function was originally supplied to scrape CSS
keywords as per the commented code in: lisp/textmodes/css-mode.el
,----
| (css-extract-keyword-list
| '((pseudo . "^ +\\* :\\([^ \n,]+\\)")
| (at . "^ +\\* @\\([^ \n,]+\\)")
| (descriptor . "^ +\\* '\\([^ '\n]+\\)' (descriptor)")
| (media . "^ +\\* '\\([^ '\n]+\\)' media group")
| (property . "^ +\\* '\\([^ '\n]+\\)',")))
`----
However, W3C has gone behind Stefan's back and changed the Appendix
enumeration without asking his permission first :)
"Appendix H" is now "Appendix I".
Compare the version scraped (presumably):
(URL `http://www.w3.org/TR/2008/REC-CSS2-20080411/indexlist.html')
(URL `http://www.w3.org/TR/2008/REC-CSS2-20080411/css2.txt')
with the current version:
(URL `http://www.w3.org/TR/CSS2/indexlist.html')
(URL `http://www.w3.org/TR/CSS2/css2.txt')
The following regexp may be more robust; it appears to work for
either the older version or the latest version, and leaves room for W3C
to continue adding appendices J-M:
(search-backward-regexp "[_━]\\{60,79\\}\xa[[:space:]]+Appendix [A-M]\\. Index")
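A minimal sketch of how a regexp of this shape could be checked offline, without fetching the W3C file. The helper name `demo-css-index-heading-position' and the sample text in the usage note are made up for illustration and are not part of css-mode.el. The sketch spells the heavy rule character as "\u2501" and the newline as "\n", and escapes the period as "\\." so that it matches only a literal dot.

```elisp
;; Hypothetical helper for illustration only; not part of css-mode.el.
(defun demo-css-index-heading-position (text)
  "Return the position where the index-appendix heading matches in TEXT.
Return nil if no such heading is found."
  (with-temp-buffer
    (insert text)
    (goto-char (point-max))
    ;; A rule of 60-79 underscores or U+2501 characters, a newline,
    ;; some leading whitespace, then "Appendix <A-M>. Index".
    ;; The third argument t makes a failed search return nil
    ;; instead of signaling an error.
    (re-search-backward
     "[_\u2501]\\{60,79\\}\n[[:space:]]+Appendix [A-M]\\. Index"
     nil t)))
```

With a made-up rule of 70 underscores, both "Appendix H. Index" and "Appendix I. Index" should be found, while a heading lettered outside A-M should not:

(demo-css-index-heading-position
 (concat "body\n" (make-string 70 ?_) "\n   Appendix I. Index\n"))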
This said, `css-extract-keyword-list' is now borking on regexps in
these conses:
(css-extract-keyword-list
'((pseudo . "^ +\\* :\\([^ \n,]+\\)")
(at . "^ +\\* @\\([^ \n,]+\\)")
(descriptor . "^ +\\* '\\([^ '\n]+\\)' (descriptor)")
(media . "^ +\\* '\\([^ '\n]+\\)' media group")
(property . "^ +\\* '\\([^ '\n]+\\)',")))
and seems to be failing because `url-insert-file-contents' relies on
`decode-coding-inserted-region', which frobs the asterisk `*' (char
#x2a) into a bullet `•' (char #x2022) -- at least on my system.
If we substitute occurrences of "\\*" with "[*•]" (i.e. "[\x2a\x2022]"),
the following regexps now seem to work correctly:
(pp (css-extract-keyword-list
'((pseudo . "^ +[\x2a\x2022] :\\([^ \n,]+\\)")
(at . "^ +[\x2a\x2022] @\\([^ \n,]+\\)")
(descriptor . "^ +[\x2a\x2022] '\\([^ '\n]+\\)' (descriptor)")
(media . "^ +[\x2a\x2022] '\\([^ '\n]+\\)' media group")
(property . "^ +[\x2a\x2022] '\\([^ '\n]+\\)',")))
(current-buffer))
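As a rough illustration of why the character class helps, the sketch below runs the `pseudo' regexp over a small hand-written sample rather than the real css2.txt. The helper name `demo-css-collect-matches' and the sample index lines are assumptions made for this example; the real index layout may differ.

```elisp
;; Hypothetical helper for illustration only; not part of css-mode.el.
(defun demo-css-collect-matches (regexp text)
  "Return the group-1 matches of REGEXP in TEXT, in buffer order."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((result nil))
      (while (re-search-forward regexp nil t)
        (push (match-string 1) result))
      (nreverse result))))

;; One sample line uses an ASCII asterisk, one uses the bullet
;; U+2022, and the last is an at-rule that the `pseudo' regexp
;; is expected to skip.
(demo-css-collect-matches
 "^ +[*\u2022] :\\([^ \n,]+\\)"
 "   * :active, 1\n   \u2022 :hover, 2\n   * @media, 3\n")
```

With the original "\\*" in place of the character class, only the line that still carries an ASCII asterisk would be picked up.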
The following was diffed against Bazaar revision 100231:
;;; ==============================
*** ediff3753M5g 2010-06-03 09:43:04.000000000 -0400
--- lisp/textmodes/css-mode.el 2010-06-03 09:42:43.000000000 -0400
***************
*** 41,49 ****
(defun css-extract-keyword-list (res)
(with-temp-buffer
! (url-insert-file-contents "http://www.w3.org/TR/REC-CSS2/css2.txt")
(goto-char (point-max))
! (search-backward "Appendix H. Index")
(forward-line)
(delete-region (point-min) (point))
(let ((result nil)
--- 41,49 ----
(defun css-extract-keyword-list (res)
(with-temp-buffer
! (url-insert-file-contents "http://www.w3.org/TR/2008/REC-CSS2-20080411/css2.txt")
(goto-char (point-max))
! (search-backward-regexp "[_━]\\{60,79\\}\xa[[:space:]]+Appendix [A-M]\\. Index")
(forward-line)
(delete-region (point-min) (point))
(let ((result nil)
***************
*** 115,125 ****
;; Extraction was done with:
;; (css-extract-keyword-list
! ;; '((pseudo . "^ +\\* :\\([^ \n,]+\\)")
! ;; (at . "^ +\\* @\\([^ \n,]+\\)")
! ;; (descriptor . "^ +\\* '\\([^ '\n]+\\)' (descriptor)")
! ;; (media . "^ +\\* '\\([^ '\n]+\\)' media group")
! ;; (property . "^ +\\* '\\([^ '\n]+\\)',")))
(defconst css-pseudo-ids
'("active" "after" "before" "first" "first-child" "first-letter" "first-line"
--- 115,125 ----
;; Extraction was done with:
;; (css-extract-keyword-list
! ;; '((pseudo . "^ +[\x2a\x2022] :\\([^ \n,]+\\)")
! ;; (at . "^ +[\x2a\x2022] @\\([^ \n,]+\\)")
! ;; (descriptor . "^ +[\x2a\x2022] '\\([^ '\n]+\\)' (descriptor)")
! ;; (media . "^ +[\x2a\x2022] '\\([^ '\n]+\\)' media group")
! ;; (property . "^ +[\x2a\x2022] '\\([^ '\n]+\\)',")))
(defconst css-pseudo-ids
'("active" "after" "before" "first" "first-child" "first-letter" "first-line"
* bug#6345: css-mode `css-extract-keyword-list' does not actually [PATCH]
2010-06-03 18:01 bug#6345: css-mode `css-extract-keyword-list' does not actually [PATCH] MON KEY
@ 2012-04-10 11:11 ` Lars Magne Ingebrigtsen
2012-04-10 12:06 ` Stefan Monnier
2015-03-19 22:42 ` bug#6345: Status: " Simen Heggestøyl
1 sibling, 1 reply; 4+ messages in thread
From: Lars Magne Ingebrigtsen @ 2012-04-10 11:11 UTC (permalink / raw)
To: MON KEY; +Cc: 6345
MON KEY <monkey@sandpframing.com> writes:
> `css-extract-keyword-list' does not actually [PATCH]
[...]
> ! (search-backward-regexp "[_━]\\{60,79\\}\xa[[:space:]]+Appendix [A-M]\\. Index")
The rest of the patch seems reasonable (I think), but is there a way to
rework this? Having characters like that in the source code isn't
ideal, if it can be avoided.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog http://lars.ingebrigtsen.no/
* bug#6345: css-mode `css-extract-keyword-list' does not actually [PATCH]
2012-04-10 11:11 ` Lars Magne Ingebrigtsen
@ 2012-04-10 12:06 ` Stefan Monnier
0 siblings, 0 replies; 4+ messages in thread
From: Stefan Monnier @ 2012-04-10 12:06 UTC (permalink / raw)
To: Lars Magne Ingebrigtsen; +Cc: MON KEY, 6345
>> `css-extract-keyword-list' does not actually [PATCH]
> [...]
>> ! (search-backward-regexp "[_━]\\{60,79\\}\xa[[:space:]]+Appendix [A-M]\\. Index")
> The rest of the patch seems reasonable (I think), but is there a way to
> rework this? Having characters like that in the source code isn't
> ideal, if it can be avoided.
Indeed: the code is only run occasionally to update the keyword-list, so
it's not super important for it to be terribly robust. In a sense, the
code is only kept as documentation to have a good starting point for the
next time I need such a thing.
Stefan
* bug#6345: Status: css-mode `css-extract-keyword-list' does not actually [PATCH]
2010-06-03 18:01 bug#6345: css-mode `css-extract-keyword-list' does not actually [PATCH] MON KEY
2012-04-10 11:11 ` Lars Magne Ingebrigtsen
@ 2015-03-19 22:42 ` Simen Heggestøyl
1 sibling, 0 replies; 4+ messages in thread
From: Simen Heggestøyl @ 2015-03-19 22:42 UTC (permalink / raw)
To: 6345-done
[-- Attachment #1: Type: text/plain, Size: 146 bytes --]
Version: 25.1
As of commit 7ec63a3afa52213b7b3cd3ecc0717c6e6504dc43, that code is no
longer part of css-mode.
Thanks for your report!
-- Simen