* Re: elisp optimization question
From: harven @ 2008-05-09 0:00 UTC
To: help-gnu-emacs

Hi,

You can save some typing by using an alist. Here is what I use to convert accented letters into HTML entities and back.

(defun accent-html (prefix)
  "Accented letter translation é -> &eacute;.
With an argument, reverse é <- &eacute;.
Works on the whole buffer."
  (interactive "P")
  (save-excursion
    (let ((association
           '(("É" . "&Eacute;") ("á" . "&aacute;") ("à" . "&agrave;")
             ("â" . "&acirc;") ("ä" . "&auml;") ("ã" . "&atilde;")
             ("é" . "&eacute;") ("è" . "&egrave;") ("ê" . "&ecirc;")
             ("ë" . "&euml;") ("í" . "&iacute;") ("ì" . "&igrave;")
             ("î" . "&icirc;") ("ï" . "&iuml;") ("ñ" . "&ntilde;")
             ("ó" . "&oacute;") ("ò" . "&ograve;") ("ô" . "&ocirc;")
             ("ö" . "&ouml;") ("õ" . "&otilde;") ("ú" . "&uacute;")
             ("ù" . "&ugrave;") ("û" . "&ucirc;") ("ü" . "&uuml;")
             ("ç" . "&ccedil;")))
          (case-fold-search nil))
      (dolist (paire association)
        ;; With a prefix argument, swap each pair to translate back.
        (when prefix
          (setq paire (cons (cdr paire) (car paire))))
        (goto-char (point-min))
        (while (search-forward (car paire) nil t)
          (replace-match (cdr paire) nil t))))))

This is not more efficient than your own defun. If you only want to translate single characters, the function subst-char-in-region is a primitive that saves a while loop and is probably faster.
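A minimal sketch of the subst-char-in-region idea mentioned above. The helper name `my-subst-chars` and the sample pair are hypothetical and not from the thread. One caveat, stated as far as I know rather than as a guarantee: in a multibyte buffer the primitive signals an error when the two characters have different byte lengths in Emacs's internal encoding, so it suits same-width swaps but probably cannot turn "ö" into "o".

```elisp
;; Sketch only: `my-subst-chars' is a hypothetical helper name.
(defun my-subst-chars (start end pairs)
  "Between START and END, replace characters according to PAIRS.
PAIRS is a list of (FROM-CHAR . TO-CHAR) cells.  Each pair costs one
call to the primitive, which scans the region without a Lisp loop."
  (dolist (pair pairs)
    (subst-char-in-region start end (car pair) (cdr pair))))

;; Usage: turn backticks into apostrophes (both are one-byte characters).
(with-temp-buffer
  (insert "``quoted`` text")
  (my-subst-chars (point-min) (point-max) '((?\` . ?\')))
  (buffer-string))
;; => "''quoted'' text"
```

For replacements that change the length of the text, such as an entity string to a single character, a search-and-replace loop like the one in the post is still needed.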
* Re: elisp optimization question 2008-05-09 0:00 ` elisp optimization question harven @ 2008-05-09 1:45 ` Kevin Rodgers [not found] ` <mailman.11354.1210297808.18990.help-gnu-emacs@gnu.org> 1 sibling, 0 replies; 8+ messages in thread From: Kevin Rodgers @ 2008-05-09 1:45 UTC (permalink / raw) To: help-gnu-emacs harven wrote: > hi, > you can save some typing by using an alist. Here is what i use to > convert > accented-letters into html and back. > > (defun accent-html (prefix) > "Accented letter translation é -> é. > With an argument, reverse é <- é. > Works on the whole buffer" > (interactive "P") > (save-excursion > (let ((association > '(("É" . "É") ("á" . "á") ("à" . "à") > ("â" . "â") ("ä" . "ä") (""" . "ã") > ("é" . "é") ("è" . "è") ("ê" . "ê") > ("ë" . "ë") ("í" . "í") ("ì" . "ì") > ("î" . "î") ("ï" . "ï") ("ñ" . "ñ") > ("ó" . "ó") ("ò" . "ò") ("ô" . "ô") > ("ö" . "ö") ("ı" . "õ") ("ú" . > "ú") > ("ù" . "ù") ("û" . "û") ("ü" . "ü") > ("ç" . "ç"))) > (case-fold-search nil)) > (dolist (paire association) > (when prefix > (setq paire (cons (cdr paire) (car paire)))) > (goto-char (point-min)) > (while (search-forward (car paire) nil t) > (replace-match (cdr paire) nil t)))))) Even faster than an alist is a hash table. -- Kevin Rodgers Denver, Colorado, USA ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: elisp optimization question [not found] ` <mailman.11354.1210297808.18990.help-gnu-emacs@gnu.org> @ 2008-05-09 9:59 ` Rupert Swarbrick 2008-05-09 21:43 ` harven 0 siblings, 1 reply; 8+ messages in thread From: Rupert Swarbrick @ 2008-05-09 9:59 UTC (permalink / raw) To: help-gnu-emacs Kevin Rodgers <kevin.d.rodgers@gmail.com> writes: > harven wrote: >> hi, >> you can save some typing by using an alist. Here is what i use to >> convert >> accented-letters into html and back. >> >> (defun accent-html (prefix) >> "Accented letter translation é -> é. >> With an argument, reverse é <- é. >> Works on the whole buffer" >> (interactive "P") >> (save-excursion >> (let ((association >> '(("É" . "É") ("á" . "á") ("à" . "à") >> ("â" . "â") ("ä" . "ä") (""" . "ã") >> ("é" . "é") ("è" . "è") ("ê" . "ê") >> ("ë" . "ë") ("í" . "í") ("ì" . "ì") >> ("î" . "î") ("ï" . "ï") ("ñ" . "ñ") >> ("ó" . "ó") ("ò" . "ò") ("ô" . "ô") >> ("ö" . "ö") ("ı" . "õ") ("ú" . >> "ú") >> ("ù" . "ù") ("û" . "û") ("ü" . "ü") >> ("ç" . "ç"))) >> (case-fold-search nil)) >> (dolist (paire association) >> (when prefix >> (setq paire (cons (cdr paire) (car paire)))) >> (goto-char (point-min)) >> (while (search-forward (car paire) nil t) >> (replace-match (cdr paire) nil t)))))) > > Even faster than an alist is a hash table. > Huh? In this code, he's iterating over the alist (which is pretty fast - there's only a small, fixed number of items). For each element of this alist, he's doing a search/replace. Each of those is expensive. The data structure he uses for association is thus completely irrelevant. Not sure that this is the best approach, but your criticism definitely doesn't hold. I wonder whether one could use that alist to "build" a regexp which you could use with regexp-replace: you could use the \, syntax to add lisp code to the stuff run. Rupert ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: elisp optimization question
From: harven @ 2008-05-09 21:43 UTC
To: help-gnu-emacs

> I wonder whether one could use that alist to "build" a regexp which
> you could use with regexp-replace: you could use the \, syntax to add
> lisp code to the stuff run.
>
> Rupert

Here is a short command that takes advantage of the advice in the previous posts.

(setq my-alist '(("»" . ">>") ("ö" . "o") ("—" . "-")))
(setq html-regexp (regexp-opt (mapcar 'car my-alist)))

(defun w3m-filter ()
  (interactive)
  (goto-char (point-min))
  (while (re-search-forward html-regexp nil t)
    (replace-match (cdr (assoc (match-string 0) my-alist)) nil t)))

I don't know how to pass the value of the html-regexp variable interactively to the M-% command, though.

It's a bit strange to use a regexp here. The tree structure given by a keymap would be better, I think. If a keymap inserted an invalid prefix key sequence instead of reporting an error, we could simply read the HTML file through a keymap that binds the "—" key sequence to "-", etc.
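A hedged variant of the single-pass command above, written as a sketch rather than a drop-in replacement. The names `my-entity-alist`, `my-entity-regexp` and `my-entity-filter` are hypothetical, and the keys are assumed to be entity strings such as "&raquo;" (the archive may have rendered the originals as plain characters). It binds `case-fold-search` to nil so that a case-insensitive match can never produce a string missing from the alist, and it wraps the work in `save-excursion` so point is restored afterwards.

```elisp
;; Sketch only: all three names below are hypothetical.
(defvar my-entity-alist
  '(("&raquo;" . ">>") ("&ouml;" . "o") ("&mdash;" . "-"))
  "Alist of (ENTITY . REPLACEMENT) strings; an illustrative sample.")

(defvar my-entity-regexp
  (regexp-opt (mapcar #'car my-entity-alist))
  "Regexp matching any key of `my-entity-alist'.")

(defun my-entity-filter ()
  "Replace every entity listed in `my-entity-alist' in the current buffer.
The buffer is scanned once instead of once per entity."
  (interactive)
  (save-excursion
    (let ((case-fold-search nil))   ; match exactly what the alist holds
      (goto-char (point-min))
      (while (re-search-forward my-entity-regexp nil t)
        ;; Look up the matched text; FIXEDCASE and LITERAL are both t so
        ;; the replacement is inserted exactly as written.
        (replace-match (cdr (assoc (match-string 0) my-entity-alist))
                       t t)))))
```

With case folding left on, a buffer containing "&Ouml;" would match the regexp, `assoc` would return nil, and `replace-match` would be handed nil instead of a string; the explicit binding avoids that.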
* Re: elisp optimization question [not found] <mailman.11345.1210283700.18990.help-gnu-emacs@gnu.org> 2008-05-09 0:00 ` elisp optimization question harven @ 2008-05-09 0:36 ` Xah 1 sibling, 0 replies; 8+ messages in thread From: Xah @ 2008-05-09 0:36 UTC (permalink / raw) To: help-gnu-emacs On May 8, 11:03 am, brad clawsie <claw...@fastmail.fm> wrote: > hi, i use the following function to translate unicode and other > entities found on the web into ascii that i can view in emacs-w3m. i > am concerned that each search and replace as done in my example is > inefficient, is there a better way to do this? i.e., is there a better > way to group search/replace pairs? thanks in advance! > > (defun w3m-filter-brad (url) > (goto-char (point-min)) > (while (re-search-forward "»" nil t) > (replace-match ">>")) > (goto-char (point-min)) > (while (re-search-forward "’" nil t) > (replace-match "'")) ... > ) I had similar problem and also thought about the efficiency or different implementation issues. Here's a alternative implementation. The idea is that instead of working on buffer, you grab them into a string, and do replacement on the string, then put them back in buffer. I haven't tested whether it is faster, but i think David Kastrup mentioned in the past that working on string is slower. (defun fold (f x li) "Recursively apply (f x i), where i is the ith element in the list li.\n For example, (fold f x '(1 2)) returns (f (f x 1) 2)" (let ((li2 li) (ele) (x2 x)) (while (setq ele (pop li2)) (setq x2 (funcall f x2 ele)) ) x2 ) ) (defun replace-string-pairs (str pairs) "Replace the string str repeatedy by the list pairs.\n Example: (replace-string-pairs \"yes or no\" '( (\"yes\" \"no\") (\"no\" \"n\") ) ) ⇒ \"n or n\"" (fold (lambda (x y) "" (replace-regexp-in-string (nth 0 y) (nth 1 y) x) ) str pairs) ) you might use replace-string instead of replace-regexp-in-string. -------------------- Also, the following are 3 different implementations. 
The first is same as yours except in works on region, by first narrow- to-region. The second is avoided the narrow-to-region by grabing the region as string and work on the string. Since i heard that working on string is slower, and since i want to avoid narrow-to-region, i thougth of using a temp buffer instead. That's the third solution, which i believe to be the best. However, at the time either the 2nd or the 3rd solution had a bug, so i switched back to the first. I haven't had time to investigate what was the problem. (defun replace-string-pairs-region (start end mylist) "Replace string pairs in region. Example syntax: (replace-string-pairs-region start end '((\"alpha\" \"α\") (\"beta\" \"β\"))) The search string and replace string are all literal." (save-restriction (narrow-to-region start end) (mapc (lambda (arg) (goto-char (point-min)) (while (search-forward (car arg) nil t) (replace-match (cadr arg) t t) )) mylist))) (defun replace-string-pairs-region2 (start end mylist) "Replace string pairs in region. Same as replace-string-pairs-region but with different implementation. This implementation does not use narrow-to-region or save-restriction. Is cleaner in a sense." (let (mystr) (setq mystr (buffer-substring start end)) (mapc (lambda (x) (setq mystr (replace-regexp-in-string (car x) (cadr x) mystr))) mylist) (delete-region start end) (insert mystr) ) ) (defun replace-string-pairs-region3 (start end mylist) "Replace string pairs in region. Same as replace-string-pairs-region but with different implementation." 
(let (mystr tempbuff) (setq mystr (buffer-substring start end)) (setq tempbuff (concat " " (random))) (save-current-buffer (set-buffer (get-buffer-create tempbuff)) (insert mystr) (mapc (lambda (arg) (goto-char (point-min)) (while (search-forward (car arg) nil t) (replace-match (cadr arg) t t) )) mylist) (kill-buffer tempbuff) ) (delete-region start end) (insert mystr) ) ) Xah xah@xahlee.org ∑ http://xahlee.org/ ☄ ^ permalink raw reply [flat|nested] 8+ messages in thread
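A possible explanation for the bug mentioned above, offered as a reading of the posted code rather than a confirmed diagnosis: in the third version `mystr` is never refreshed from the temporary buffer, so the original text is re-inserted, and `(concat " " (random))` passes a number to `concat`, which current Emacs rejects as far as I can tell. The second version also treats the search strings as regexps. The sketch below addresses those points; `my-replace-pairs-in-string` and `my-replace-pairs-region` are hypothetical names.

```elisp
;; Sketch only: both function names are hypothetical.

(defun my-replace-pairs-in-string (string pairs)
  "Return STRING with each (FROM TO) element of PAIRS applied in order.
FROM and TO are treated as literal text."
  (let ((case-fold-search nil))
    (dolist (pair pairs string)
      (setq string (replace-regexp-in-string
                    (regexp-quote (car pair)) ; FROM is literal, not a regexp
                    (cadr pair)
                    string
                    t      ; FIXEDCASE: do not adapt the case of TO
                    t))))) ; LITERAL: no backslash escapes in TO

(defun my-replace-pairs-region (start end pairs)
  "Apply PAIRS, a list of (FROM TO) literals, between START and END.
The work happens in a temporary buffer, so no narrowing is needed."
  (let* ((text (buffer-substring start end))
         (new (with-temp-buffer
                (insert text)
                ;; Bind inside the temp buffer: `case-fold-search' may be
                ;; buffer-local in the caller's buffer.
                (let ((case-fold-search nil))
                  (dolist (pair pairs)
                    (goto-char (point-min))
                    (while (search-forward (car pair) nil t)
                      (replace-match (cadr pair) t t))))
                ;; Read the result back before the temporary buffer
                ;; is killed on exit from `with-temp-buffer'.
                (buffer-string))))
    (save-excursion
      (goto-char start)
      (delete-region start end)
      (insert new))))
```

`with-temp-buffer` creates and kills the scratch buffer itself, which removes the need for a hand-made buffer name. Whether this is faster than narrowing in place has not been measured here.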
* elisp optimization question @ 2008-05-08 18:03 brad clawsie 2008-05-08 22:14 ` Lennart Borgman (gmail) 0 siblings, 1 reply; 8+ messages in thread From: brad clawsie @ 2008-05-08 18:03 UTC (permalink / raw) To: help-gnu-emacs -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 hi, i use the following function to translate unicode and other entities found on the web into ascii that i can view in emacs-w3m. i am concerned that each search and replace as done in my example is inefficient, is there a better way to do this? i.e., is there a better way to group search/replace pairs? thanks in advance! (defun w3m-filter-brad (url) (goto-char (point-min)) (while (re-search-forward "»" nil t) (replace-match ">>")) (goto-char (point-min)) (while (re-search-forward "’" nil t) (replace-match "'")) (goto-char (point-min)) (while (re-search-forward "“" nil t) (replace-match "\"")) (goto-char (point-min)) (while (re-search-forward "”" nil t) (replace-match "\"")) (goto-char (point-min)) (while (re-search-forward "—" nil t) (replace-match "-")) (goto-char (point-min)) (while (re-search-forward "«" nil t) (replace-match "<")) (goto-char (point-min)) (while (re-search-forward "»" nil t) (replace-match ">")) (goto-char (point-min)) (while (re-search-forward "ö" nil t) (replace-match "o")) ) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (FreeBSD) iEYEARECAAYFAkgjQIwACgkQxRg3RkRK91MO8gCgqJHsYhE/3bUERIeVztOkABUI xy0An3rk59o/OCHfaOlSVmM3zBdTgUXQ =lwIH -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: elisp optimization question 2008-05-08 18:03 brad clawsie @ 2008-05-08 22:14 ` Lennart Borgman (gmail) 2008-05-09 1:42 ` Kevin Rodgers 0 siblings, 1 reply; 8+ messages in thread From: Lennart Borgman (gmail) @ 2008-05-08 22:14 UTC (permalink / raw) To: brad clawsie; +Cc: help-gnu-emacs brad clawsie wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > hi, i use the following function to translate unicode and other > entities found on the web into ascii that i can view in emacs-w3m. i > am concerned that each search and replace as done in my example is > inefficient, is there a better way to do this? i.e., is there a better > way to group search/replace pairs? thanks in advance! > > (defun w3m-filter-brad (url) > (goto-char (point-min)) > (while (re-search-forward "»" nil t) > (replace-match ">>")) > (goto-char (point-min)) > (while (re-search-forward "’" nil t) > (replace-match "'")) > (goto-char (point-min)) > (while (re-search-forward "“" nil t) > (replace-match "\"")) > (goto-char (point-min)) > (while (re-search-forward "”" nil t) > (replace-match "\"")) > (goto-char (point-min)) > (while (re-search-forward "—" nil t) > (replace-match "-")) > (goto-char (point-min)) > (while (re-search-forward "«" nil t) > (replace-match "<")) > (goto-char (point-min)) > (while (re-search-forward "»" nil t) > (replace-match ">")) > (goto-char (point-min)) > (while (re-search-forward "ö" nil t) > (replace-match "o")) > ) When you write it the way you do you do not need re-search-forward, just search-forward since you search for strings, not regular expressions. Another way to make it faster would perhaps be to make one regular expression with regexp-opt and then check the match. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: elisp optimization question
From: Kevin Rodgers @ 2008-05-09 1:42 UTC
To: help-gnu-emacs

Lennart Borgman (gmail) wrote:
> Another way to make it faster would perhaps be to make one regular
> expression with regexp-opt and then check the match.

That's a good suggestion, and it led me to look into regexp-opt for the first time. But how do I get it to capture just the variant part of the matched strings in "\\( ... \\)", i.e. excluding any common prefix or suffix?

E.g.

(regexp-opt '("&#187;" "&#8217;"))
=> "&#\\(?:\\(?:18\\|821\\)7;\\)"

(regexp-opt '("&#187;" "&#8217;") t)
=> "\\(&#\\(?:\\(?:18\\|821\\)7;\\)\\)"

But what I'd like it to return is "&#\\(\\(?:18\\|821\\)7;\\)", so that (match-string 1) would return just "187" or "8217".

--
Kevin Rodgers
Denver, Colorado, USA
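One way to get a group around only the variant part, sketched under the assumption that the common prefix "&#" and suffix ";" are known in advance: factor them out by hand and let `regexp-opt` optimize just the digits, asking it for a capturing group with its PAREN argument. The names `my-numeric-entity-regexp` and `my-numeric-entity-code` are hypothetical.

```elisp
;; Sketch only: both names below are hypothetical.
(defvar my-numeric-entity-regexp
  ;; PAREN = t makes `regexp-opt' wrap its result in a capturing group,
  ;; so group 1 covers the digits and nothing else.
  (concat "&#" (regexp-opt '("187" "8217") t) ";")
  "Regexp matching the listed numeric entities; group 1 is the code.")

(defun my-numeric-entity-code (string)
  "Return the digits of the first listed numeric entity in STRING, or nil."
  (let ((case-fold-search nil))
    (when (string-match my-numeric-entity-regexp string)
      (match-string 1 string))))

;; Usage:
(my-numeric-entity-code "it&#8217;s")   ; => "8217"
```

This does not make `regexp-opt` discover the shared prefix and suffix on its own; it only shows that the grouping can be controlled when those parts are supplied separately.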