* bug#22147: Obsolete search-forward-lax-whitespace
@ 2015-12-11 23:52 Juri Linkov
  2015-12-12  0:44 ` Artur Malabarba
  0 siblings, 1 reply; 33+ messages in thread
From: Juri Linkov @ 2015-12-11 23:52 UTC
To: 22147

After commit e5ece322 that removed a layer of indirection
for lax-whitespace it's not possible anymore to override
search-forward-lax-whitespace with own implementation
to ignore all possible whitespace instead of just spaces
in the search string.  For example, such customization as this one:

(setq search-whitespace-regexp "\\(\\s-\\|\n\\)+")

(defun search-whitespace-regexp (string)
  "Return a regexp which ignores all possible whitespace in search string.
Uses the value of the variable `search-whitespace-regexp'."
  (if (or (not (stringp search-whitespace-regexp))
          (null (if isearch-regexp
                    isearch-regexp-lax-whitespace
                  isearch-lax-whitespace)))
      string
    (replace-regexp-in-string
     search-whitespace-regexp
     search-whitespace-regexp
     string nil t)))

(defun search-forward-lax-whitespace (string &optional bound noerror count)
  (re-search-forward (search-whitespace-regexp (regexp-quote string)) bound noerror count))

(defun search-backward-lax-whitespace (string &optional bound noerror count)
  (re-search-backward (search-whitespace-regexp (regexp-quote string)) bound noerror count))

(defun re-search-forward-lax-whitespace (regexp &optional bound noerror count)
  (re-search-forward (search-whitespace-regexp regexp) bound noerror count))

(defun re-search-backward-lax-whitespace (regexp &optional bound noerror count)
  (re-search-backward (search-whitespace-regexp regexp) bound noerror count))

allowed to search for a string with a newline like ‘C-s abc C-q C-j def’
and match the text “abc def”.

It's not clear what to do with this customization now using
a replacement recommended in (make-obsolete old "instead, use (let
((search-spaces-regexp search-whitespace-regexp)) (re-search-... ...))"
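For readers following along, the replacement recommended by the obsoletion message can be sketched as below. This is only an illustration under stated assumptions: the helper name `my-search-forward-lax` is hypothetical and not part of Emacs. Note that `search-spaces-regexp` only affects spaces in the search regexp, so a newline typed into the search string is still matched literally, which is the gap this report describes.

```elisp
;; Hypothetical helper (the name is an assumption, not an Emacs function).
;; It follows the form quoted in the obsoletion message: bind
;; `search-spaces-regexp' around an ordinary regexp search.
(defun my-search-forward-lax (string &optional bound noerror count)
  "Search forward for STRING, letting each run of spaces match whitespace.
What counts as whitespace is taken from `search-whitespace-regexp'."
  (let ((search-spaces-regexp search-whitespace-regexp))
    ;; `regexp-quote' leaves spaces untouched, so they remain
    ;; eligible for substitution by `search-spaces-regexp'.
    (re-search-forward (regexp-quote string) bound noerror count)))
```

Binding the variable with `let` keeps the lax behaviour local to this one search rather than changing it globally.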
* bug#22147: Obsolete search-forward-lax-whitespace
From: Artur Malabarba @ 2015-12-12 0:44 UTC
To: Juri Linkov; +Cc: 22147

On 11 Dec 2015 11:52 pm, "Juri Linkov" <juri@linkov.net> wrote:
>
> It's not clear what to do with this customization now using
> a replacement recommended in (make-obsolete old "instead, use (let
> ((search-spaces-regexp search-whitespace-regexp)) (re-search-... ...))"

The obsoletion message tells you what to use instead of
search-forward-lax-whitespace, but that doesn't help you because you
weren't using this function, you were overriding it (IIUC).

Fortunately, I think you don't need to override anything at all. You can
just set search-default-regexp-function to your #'search-whitespace-regexp.
IIUC, that should have the same effect.
* bug#22147: Obsolete search-forward-lax-whitespace
From: Juri Linkov @ 2015-12-12 23:31 UTC
To: Artur Malabarba; +Cc: 22147

> Fortunately, I think you don't need to override anything at all. You can
> just set search-default-regexp-function to your #'search-whitespace-regexp.
> IIUC, that should have the same effect.

Thanks, setting search-default-regexp-mode to #'search-whitespace-regexp
gives the same effect.

One drawback is that then it removes char-fold search.

Do you have a plan to combine lax-whitespace search
with char-fold search?
* bug#22147: Obsolete search-forward-lax-whitespace
From: Artur Malabarba @ 2015-12-13 0:29 UTC
To: Juri Linkov; +Cc: 22147

On 12 Dec 2015 11:31 pm, "Juri Linkov" <juri@linkov.net> wrote:
>
> Thanks, setting search-default-regexp-mode to #'search-whitespace-regexp
> gives the same effect.
>
> One drawback is that then it removes char-fold search.

True. I think it might also be possible to get what you want by just
setting the search-whitespace-regexp variable to "[ \t\r\n]+". That would
have the advantage of not removing char folding (and would reduce
everything to one line).

> Do you have a plan to combine lax-whitespace search
> with char-fold search?

Char-folding is perfectly compatible with the regular lax-whitespace.
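As an init-file fragment, the one-line setting suggested above would look like the following sketch. It only widens what a space in the search string may match; it does not change how other whitespace characters in the search string are treated.

```elisp
;; Let a space typed in the search string match any run of
;; spaces, tabs, carriage returns and newlines in the buffer.
(setq search-whitespace-regexp "[ \t\r\n]+")
```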
* bug#22147: Obsolete search-forward-lax-whitespace
From: Juri Linkov @ 2015-12-14 0:23 UTC
To: Artur Malabarba; +Cc: 22147

>> Thanks, setting search-default-regexp-mode to #'search-whitespace-regexp
>> gives the same effect.
>>
>> One drawback is that then it removes char-fold search.
>
> True. I think it might also be possible to get what you want by just
> setting the search-whitespace-regexp variable to "[ \t\r\n]+". That would
> have the advantage of not removing char folding (and would reduce
> everything to one line).

This still doesn't allow ^J in the search string to match a newline.
I often paste multi-line texts into the search string and need to ignore
differences in newlines usually caused by text re-filling.

What the mentioned regexp function does is replacing all whitespace
in the search string with the regexp that matches whitespace
(also it's possible to replace whitespace with a space character
and then use search-spaces-regexp to match this space character
using the regexp in search-whitespace-regexp).

By analogy with char-folding, this means symmetric whitespace folding.
When char-fold-symmetric causes all members of a folding equivalence class
to be treated equivalently, lax-whitespace-symmetric could treat only
whitespace character equivalently.

>> Do you have a plan to combine lax-whitespace search with char-fold search?
>
> Char-folding is perfectly compatible with the regular lax-whitespace.

Could char-folding already do the described above (maybe simpler
would be to normalize the search string by turning all whitespace
into space characters), or better first implement char-fold-symmetric
and then use it for whitespace characters?
* bug#22147: Obsolete search-forward-lax-whitespace
From: Artur Malabarba @ 2015-12-14 1:11 UTC
To: Juri Linkov; +Cc: 22147

On 14 Dec 2015 12:23 am, "Juri Linkov" <juri@linkov.net> wrote:
> >
> > True. I think it might also be possible to get what you want by just
> > setting the search-whitespace-regexp variable to "[ \t\r\n]+". That would
> > have the advantage of not removing char folding (and would reduce
> > everything to one line).
>
> This still doesn't allow ^J in the search string to match a newline.

Right. I always get confused about that variable.

> (maybe simpler
> would be to normalize the search string by turning all whitespace
> into space characters),

Yes, I think this should give you the behaviour you're looking for.
Try setting search-default-regexp-function to #'my-lax-with-char-fold,
where

(defun my-lax-with-char-fold (s &optional l)
  (character-fold-to-regexp (replace-regexp-in-string "\t\n\r\s+" " " s) l))

And then also set search-whitespace-regexp like above.
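A hedged variant of the function above, for anyone trying it: as written, the pattern "\t\n\r\s+" matches only the literal sequence tab, newline, carriage return, then spaces (in an Emacs Lisp string, `\s` is a space), so a character class is probably what was intended. The sketch below makes that assumption. The helper name `my-normalize-search-whitespace` is hypothetical, and the `fboundp` checks are there because later Emacs releases name the folding function `char-fold-to-regexp`.

```elisp
;; Hypothetical helper: collapse every run of whitespace in the search
;; string to one space, so the lax-whitespace machinery can then match
;; that space against any whitespace in the buffer.
(defun my-normalize-search-whitespace (string)
  "Return STRING with each run of whitespace replaced by a single space."
  (replace-regexp-in-string "[ \t\n\r]+" " " string))

(defun my-lax-with-char-fold (string &optional lax)
  "Fold characters in STRING after normalizing its whitespace.
LAX is passed through to the character-folding function."
  (let ((fold (cond
               ;; Name used by later Emacs releases.
               ((fboundp 'char-fold-to-regexp) #'char-fold-to-regexp)
               ;; Name used in this thread.
               ((fboundp 'character-fold-to-regexp) #'character-fold-to-regexp)
               ;; Fallback for this sketch only: no folding at all.
               (t (lambda (s &optional _lax) (regexp-quote s))))))
    (funcall fold (my-normalize-search-whitespace string) lax)))
```

For example, `(my-normalize-search-whitespace "abc\n  def")` returns "abc def".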
* bug#22147: Obsolete search-forward-lax-whitespace
From: Juri Linkov @ 2015-12-14 23:58 UTC
To: Artur Malabarba; +Cc: 22147

>> (maybe simpler
>> would be to normalize the search string by turning all whitespace
>> into space characters),
>
> Yes, I think this should give you the behaviour you're looking for.
> Try setting search-default-regexp-function to #'my-lax-with-char-fold,

search-default-regexp-mode definitely needs to be renamed to
search-default-regexp-function ;-)

> where
>
> (defun my-lax-with-char-fold (s &optional l)
>   (character-fold-to-regexp (replace-regexp-in-string "\t\n\r\s+" " " s)
> l))
>
> And then also set search-whitespace-regexp like above.

Thanks for the suggestion.  Since it works, then maybe better
generalize it to allow a mode that supports normalization of
the search string, that also will do symmetric char-folding,
where e.g. searching for “ä” will match “a”, etc.
* bug#22147: Obsolete search-forward-lax-whitespace
From: Artur Malabarba @ 2015-12-15 10:15 UTC
To: Juri Linkov; +Cc: 22147

2015-12-14 23:58 GMT+00:00 Juri Linkov <juri@linkov.net>:
>>> (maybe simpler
>>> would be to normalize the search string by turning all whitespace
>>> into space characters),
>>
>> Yes, I think this should give you the behaviour you're looking for.
>> Try setting search-default-regexp-function to #'my-lax-with-char-fold,
>
> search-default-regexp-mode definitely needs to be renamed to
> search-default-regexp-function ;-)

Like I said on the devel thread, not sure about this. The only reason
I got this wrong on all the above messages is that I wrote every
single one of them from my phone. :-)

>> where
>>
>> (defun my-lax-with-char-fold (s &optional l)
>>   (character-fold-to-regexp (replace-regexp-in-string "\t\n\r\s+" " " s)
>> l))
>>
>> And then also set search-whitespace-regexp like above.
>
> Thanks for the suggestion.  Since it works, then maybe better
> generalize it to allow a mode that supports normalization of
> the search string, that also will do symmetric char-folding,
> where e.g. searching for “ä” will match “a”, etc.

I don't know what you mean. IIUC, the current framework already
supports a "normalizing mode", which is what we just did here, isn't
it?
* bug#22147: Obsolete search-forward-lax-whitespace
From: Juri Linkov @ 2015-12-16 0:57 UTC
To: Artur Malabarba; +Cc: 22147

>>> (defun my-lax-with-char-fold (s &optional l)
>>>   (character-fold-to-regexp (replace-regexp-in-string "\t\n\r\s+" " " s)
>>> l))
>>>
>>> And then also set search-whitespace-regexp like above.
>>
>> Thanks for the suggestion.  Since it works, then maybe better
>> generalize it to allow a mode that supports normalization of
>> the search string, that also will do symmetric char-folding,
>> where e.g. searching for “ä” will match “a”, etc.
>
> I don't know what you mean. IIUC, the current framework already
> supports a "normalizing mode", which is what we just did here, isn't
> it?

I mean a char-folding customization that allows a search
for “ä” match “a”.  Is this already possible?

If yes, then it should be easy to customize it in such a way that
“\n” will match space “\s” to avoid the need to write own
functions that define an intersection of the existing functions
char-folding and lax-whitespace.  IOW, to customize a char-folding
option instead of search-default-regexp-mode?
* bug#22147: Obsolete search-forward-lax-whitespace
From: Drew Adams @ 2015-12-16 1:47 UTC
To: Juri Linkov, Artur Malabarba; +Cc: 22147

> I mean a char-folding customization that allows a search
> for “ä” match “a”.  Is this already possible?

It sounds like you are asking for symmetric char folding: being
able to use any of the various A's that make up the A-characters
equivalence class as a search pattern and find any of those
characters.

If so, I implemented that (one way, at least), and in emacs-devel
I proposed such behavior as a togglable option.

It is trivial to try it, if you like: character-fold+.el.
http://www.emacswiki.org/emacs/download/character-fold%2b.el

(A toggle command for it, `isearchp-toggle-symmetric-char-fold',
is defined in isearch+.el:
http://www.emacswiki.org/emacs/download/isearch%2b.el.)

> If yes, then it should be easy to customize it in such a way that
> “\n” will match space “\s” to avoid the need to write own
> functions that define an intersection of the existing functions
> char-folding and lax-whitespace.  IOW, to customize a char-folding
> option instead of search-default-regexp-mode?

Not sure if it answers the need you just described, but the
same library has an option, `char-fold-ad-hoc', that lets users
add their own equivalence classes.

(Caveat: I think that Artur made some changes to character-fold.el
recently.  It's possible that character-fold+.el is not up-to-date
wrt those changes, in which case it might not work with the most
recent versions of character-fold.el.  Maybe check the dates, if
you are interested.)
* bug#22147: Obsolete search-forward-lax-whitespace
From: Juri Linkov @ 2016-05-14 20:45 UTC
To: Drew Adams; +Cc: 22147, Artur Malabarba

>> I mean a char-folding customization that allows a search
>> for “ä” match “a”.  Is this already possible?
>
> It sounds like you are asking for symmetric char folding: being
> able to use any of the various A's that make up the A-characters
> equivalence class as a search pattern and find any of those
> characters.
>
> If so, I implemented that (one way, at least), and in emacs-devel
> I proposed such behavior as a togglable option.
>
> It is trivial to try it, if you like: character-fold+.el.
> http://www.emacswiki.org/emacs/download/character-fold%2b.el
>
> (A toggle command for it, `isearchp-toggle-symmetric-char-fold',
> is defined in isearch+.el:
> http://www.emacswiki.org/emacs/download/isearch%2b.el.)

I'm starting to recollect all the remaining pieces to finish this
release blocking issue, but I can't download this library,
because the link is broken and it seems the whole site is down.

Drew, could you please send the latest version as an attachment?
* bug#22147: Obsolete search-forward-lax-whitespace
From: Artur Malabarba @ 2016-05-14 22:20 UTC
To: Juri Linkov, Drew Adams; +Cc: 22147

IIUC, Drew was offering an implementation of symmetric char folding,
whereas the release blocking aspect of this bug is to add a
char-folding-ad-hoc variable.

On Sat, 14 May 2016 5:45 pm Juri Linkov, <juri@linkov.net> wrote:

> >> I mean a char-folding customization that allows a search
> >> for “ä” match “a”.  Is this already possible?
> >
> > It sounds like you are asking for symmetric char folding: being
> > able to use any of the various A's that make up the A-characters
> > equivalence class as a search pattern and find any of those
> > characters.
> >
> > If so, I implemented that (one way, at least), and in emacs-devel
> > I proposed such behavior as a togglable option.
> >
> > It is trivial to try it, if you like: character-fold+.el.
> > http://www.emacswiki.org/emacs/download/character-fold%2b.el
> >
> > (A toggle command for it, `isearchp-toggle-symmetric-char-fold',
> > is defined in isearch+.el:
> > http://www.emacswiki.org/emacs/download/isearch%2b.el.)
>
> I'm starting to recollect all the remaining pieces to finish this
> release blocking issue, but I can't download this library,
> because the link is broken and it seems the whole site is down.
>
> Drew, could you please send the latest version as an attachment?
>
* bug#22147: Obsolete search-forward-lax-whitespace
From: Drew Adams @ 2016-05-14 22:27 UTC
To: Artur Malabarba, Juri Linkov; +Cc: 22147

> IIUC, Drew was offering an implementation of symmetric char folding,
> whereas the release blocking aspect of this bug is to add a
> char-folding-ad-hoc variable.

That makes sense.  That too is in `character-fold+.el', which I
attached to my previous message.  Dunno whether what I have there
is exactly what you want/need.  This is it:

(defcustom char-fold-ad-hoc
  '((?\" "＂" "“" "”" "”" "„" "⹂" "〞" "‟" "‟" "❞" "❝" "❠" "“" "„" "〝" "〟" "🙷" "🙶" "🙸" "«" "»")
    (?' "❟" "❛" "❜" "‘" "’" "‚" "‛" "‚" "" "❮" "❯" "‹" "›")
    (?` "❛" "‘" "‛" "" "❮" "‹"))
  "Ad hoc character foldings.
Each entry is a list of a character and the strings that fold into it.

The default value includes those ad hoc foldings provided by vanilla
Emacs."
  :set (lambda (sym defs)
         (custom-set-default sym defs)
         (update-char-fold-table))
  :type '(repeat (cons
                  (character :tag "Fold to character")
                  (repeat (string :tag "Fold from string"))))
  :group 'isearch)

And this is where it is used:

;; Add some manual entries.
(dolist (it char-fold-ad-hoc)
  (let ((idx (car it))
        (chr-strgs (cdr it)))
    (aset equiv idx (append chr-strgs (aref equiv idx)))))
* bug#22147: Obsolete search-forward-lax-whitespace
From: Juri Linkov @ 2016-05-15 20:45 UTC
To: Artur Malabarba; +Cc: 22147

> IIUC, Drew was offering an implementation of symmetric char folding,
> whereas the release blocking aspect of this bug is to add a
> char-folding-ad-hoc variable.

My initial request was to restore an ability to fold whitespace.
One way to do this is to implement symmetric char folding.
However, I believe that the same could be achieved with just
char-fold-ad-hoc providing a suitable set of mappings.
I'll confirm whether this is achievable after adding char-fold-ad-hoc.
* bug#22147: Obsolete search-forward-lax-whitespace
From: Drew Adams @ 2016-05-14 22:22 UTC
To: Juri Linkov; +Cc: 22147, Artur Malabarba

> >> I mean a char-folding customization that allows a search
> >> for “ä” match “a”.  Is this already possible?
> >
> > It sounds like you are asking for symmetric char folding: being
> > able to use any of the various A's that make up the A-characters
> > equivalence class as a search pattern and find any of those
> > characters.
> >
> > If so, I implemented that (one way, at least), and in emacs-devel
> > I proposed such behavior as a togglable option.
> >
> > It is trivial to try it, if you like: character-fold+.el.
> > http://www.emacswiki.org/emacs/download/character-fold%2b.el
> >
> > (A toggle command for it, `isearchp-toggle-symmetric-char-fold',
> > is defined in isearch+.el:
> > http://www.emacswiki.org/emacs/download/isearch%2b.el.)
>
> I'm starting to recollect all the remaining pieces to finish this
> release blocking issue, but I can't download this library,
> because the link is broken and it seems the whole site is down.
>
> Drew, could you please send the latest version as an attachment?

1. EmacsWiki seems to be up now.  Also, you should be able to get to
what is on EmacsWiki at the EmacsMirror: https://github.com/emacsmirror.
And you should also be able to get my libraries from MELPA.  I've
attached `character-fold+.el' anyway.  Let me know if you also want
to look at `isearch+.el' and you cannot get to it for some reason.

2. More importantly, what I wrote in `character-fold+.el' worked
only at the time I wrote it and for a while thereafter, unfortunately.
Not too long after that, Artur Malabarba rewrote `character-fold.el',
so the code I wrote is no longer appropriate.
I have not had time to look at the (fairly deep) changes he made, or
to imagine what I might do with it to obtain the symmetric behavior
I implemented for the earlier version.

4. Dunno whether what I wrote is needed or helpful for dealing with
this bug.  Perhaps you or Artur can tell.  IIUC, the part of this
bug report that I replied to seemed to be a request for an extension
of what `character-fold.el' does: symmetric folding.  But perhaps I
was misunderstanding, because I don't see how that could be a blocking
bug - it was never Artur's intention to provide symmetric folding,
AFAIK.

[-- Attachment #2: character-fold+.el --]
[-- Type: application/octet-stream, Size: 12414 bytes --]

;;; character-fold+.el --- Extensions to `character-fold.el'
;;
;; Filename: character-fold+.el
;; Description: Extensions to `character-fold.el'
;; Author: Drew Adams
;; Maintainer: Drew Adams
;; Copyright (C) 2015-2016, Drew Adams, all rights reserved.
;; Created: Fri Nov 27 09:12:01 2015 (-0800)
;; Version: 0
;; Package-Requires: ()
;; Last-Updated: Sat Feb 27 15:05:20 2016 (-0800)
;;           By: dradams
;;     Update #: 93
;; URL: http://www.emacswiki.org/character-fold+.el
;; Doc URL: http://emacswiki.org/CharacterFoldPlus
;; Keywords: isearch, search, unicode
;; Compatibility: GNU Emacs: 25.x builds ON OR BEFORE 2015-12-10
;;
;; Features that might be required by this library:
;;
;;   `backquote', `button', `bytecomp', `cconv', `character-fold',
;;   `cl-extra', `cl-lib', `help-mode', `macroexp'.
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;;; Commentary:
;;
;;    Extensions to Isearch character folding.
;;
;;
;;  NOTE: This library is NOT UP-TO-DATE WRT EMACS 25.  The vanilla
;;        Emacs library `character-fold.el', which this library
;;        extends, was changed in incompatible ways after this library
;;        was written.  I have not yet had a chance to update this
;;        (and am waiting for Emacs 25 to be released to do so).
;;        Sorry about that.
;;
;;
;;  Choose One-Way or Symmetric Character Folding
;;  ---------------------------------------------
;;
;;  Non-nil option `char-fold-symmetric' means that char folding is
;;  symmetric: When you search for any of an equivalence class of
;;  characters you find all of them.  This behavior applies to
;;  query-replacing also - see option `replace-character-fold'.
;;
;;  The default value of `char-fold-symmetric' is `nil', which gives
;;  the same behavior as vanilla Emacs: you find all members of the
;;  equivalence class only when you search for the base character.
;;
;;  For example, with a `nil' value you can search for "e" (a base
;;  character) to find "é", but not vice versa.  With a non-`nil'
;;  value you can search for either, to find itself and the other
;;  members of the equivalence class - the base char is not treated
;;  specially.
;;
;;  Example non-`nil' behavior:
;;
;;    Searching for any of these characters and character compositions
;;    in the search string finds all of them.  (Use `C-u C-x =' with
;;    point before a character to see complete information about it.)
;;
;;      e 𝚎 𝙚 𝘦 𝗲 𝖾 𝖊 𝕖 𝔢 𝓮 𝒆 𝑒 𝐞 e ㋎ ㋍ ⓔ ⒠
;;      ⅇ ℯ ₑ ẽ ẽ ẻ ẻ ẹ ẹ ḛ ḛ ḙ ḙ ᵉ ȩ ȩ ȇ ȇ
;;      ȅ ȅ ě ě ę ę ė ė ĕ ĕ ē ē ë ë ê ê é é è è
;;
;;    An example of a composition is "é".  Searching for that finds
;;    the same matches as searching for "é" or searching for "e".
;;
;;  If you also use library `isearch+.el' then you can toggle option
;;  `char-fold-symmetric' anytime during Isearch, using `M-s ='
;;  (command `isearchp-toggle-symmetric-char-fold').
;;
;;
;;  NOTE:
;;
;;    To customize option `char-fold-symmetric', use either Customize
;;    or a Lisp function designed for customizing options, such as
;;    `customize-set-variable', that invokes the necessary `:set'
;;    function.
;;
;;
;;  CAVEAT:
;;
;;    Be aware that character-fold searching can be much slower when
;;    symmetric - there are many more possibilities to search for.
;;    If, for example, you search only for a single "e"-family
;;    character then every "e" in the buffer is a search hit (which
;;    means lazy-highlighting them all, by default).  Searching with a
;;    longer search string is much faster.
;;
;;    If you also use library `isearch+.el' then you can turn off lazy
;;    highlighting using the toggle key `M-s h L'.  This can vastly
;;    improve performance when character folding is symmetric.
;;
;;
;;  Customize the Ad Hoc Character Foldings
;;  ---------------------------------------
;;
;;  In addition to the standard equivalence classes of a base
;;  character and its family of diacriticals, vanilla Emacs includes a
;;  number of ad hoc character foldings, e.g., for different quote
;;  marks.
;;
;;  Option `char-fold-ad-hoc' lets you customize this set of ad hoc
;;  foldings.  The default value is the same set provided by vanilla
;;  Emacs.
;;
;;
;;
;;  Options defined here:
;;
;;    `char-fold-ad-hoc', `char-fold-symmetric'.
;;
;;  Non-interactive functions defined here:
;;
;;    `update-char-fold-table'.
;;
;;  Internal variables defined here:
;;
;;    `char-fold-decomps'.
;;
;;
;;  ***** NOTE: The following function defined in `mouse.el' has
;;              been ADVISED HERE:
;;
;;    `character-fold-to-regexp'.
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;;; Change Log:
;;
;; 2015/12/01 dadams
;;     char-fold-ad-hoc: Added :set.
;; 2015/11/28 dadams
;;     Added: char-fold-ad-hoc.
;;     update-char-fold-table: Use char-fold-ad-hoc.
;; 2015/11/27 dadams
;;     Created.
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or (at
;; your option) any later version.
;;
;; This program is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
;; General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;;; Code:

(require 'character-fold)

;;;;;;;;;;;;;;;;;;;;;;;

(defvar char-fold-decomps ()
  "List of conses of a decomposition and its base char.")

(defun update-char-fold-table ()
  "Update the value of variable `character-fold-table'.
The new value reflects the current value of `char-fold-symmetric'."
  (setq char-fold-decomps ())
  (setq character-fold-table
        (let* ((equiv (make-char-table 'character-fold-table))
               (table (unicode-property-table-internal 'decomposition))
               (func (char-table-extra-slot table 1)))
          ;; Ensure that the table is populated.
          (map-char-table (lambda (ch val) (when (consp ch) (funcall func (car ch) val table))) table)
          ;; Compile a list of all complex chars that each simple char should match.
          (map-char-table
           (lambda (ch dec)
             (when (consp dec)
               (when (symbolp (car dec)) (setq dec (cdr dec))) ; Discard a possible formatting tag.
               ;; Skip trivial cases like ?a decomposing to (?a).
               (unless (and (null (cdr dec)) (eq ch (car dec)))
                 (let ((dd dec)
                       (fold-decomp t)
                       kk found)
                   (while (and dd (not found))
                     (setq kk (pop dd))
                     ;; Is KK a number or letter, per unicode standard?
                     (setq found (memq (get-char-code-property kk 'general-category)
                                       '(Lu Ll Lt Lm Lo Nd Nl No))))
                   (if found
                       ;; Check if the decomposition has more than one letter, because then
                       ;; we don't want the first letter to match the decomposition.
                       (dolist (kk dd)
                         (when (and fold-decomp
                                    (memq (get-char-code-property kk 'general-category)
                                          '(Lu Ll Lt Lm Lo Nd Nl No)))
                           (setq fold-decomp nil)))
                     ;; No number or letter on decomposition.  Take its first char.
                     (setq found (car-safe dec)))
                   ;; Fold a multi-char decomposition only if at least one of the chars is
                   ;; non-spacing (combining).
                   (when fold-decomp
                     (setq fold-decomp nil)
                     (dolist (kk dec)
                       (when (and (not fold-decomp)
                                  (> (get-char-code-property kk 'canonical-combining-class) 0))
                         (setq fold-decomp t))))
                   ;; Add II to the list of chars that KK can represent.  Maybe add its decomposition
                   ;; too, so we can match multi-char representations like (format "a%c" 769).
                   (when (and found (not (eq ch kk)))
                     (let ((chr-strgs (cons (char-to-string ch) (aref equiv kk))))
                       (aset equiv kk (if fold-decomp
                                          (cons (apply #'string dec) chr-strgs)
                                        chr-strgs))))))))
           table)
          ;; Add some manual entries.
          (dolist (it char-fold-ad-hoc)
            (let ((idx (car it))
                  (chr-strgs (cdr it)))
              (aset equiv idx (append chr-strgs (aref equiv idx)))))

          ;; This is the essential bit added by `character-fold+.el'.
          (when (and (boundp 'char-fold-symmetric) char-fold-symmetric)
            ;; Add an entry for each equivalent char.
            (let ((others ()))
              (map-char-table
               (lambda (base val)
                 (let ((chr-strgs (aref equiv base)))
                   (when (consp chr-strgs)
                     (dolist (strg (cdr chr-strgs))
                       (when (< (length strg) 2)
                         (push (cons (string-to-char strg) (remove strg chr-strgs)) others))
                       ;; Add it and its base char to `char-fold-decomps'.
                       (push (cons strg (char-to-string base)) char-fold-decomps)))))
               equiv)
              (dolist (it others)
                (let ((base (car it))
                      (chr-strgs (cdr it)))
                  (aset equiv base (append chr-strgs (aref equiv base)))))))

          (map-char-table ; Convert the lists of characters we compiled into regexps.
           (lambda (ch val)
             (let ((re (regexp-opt (cons (char-to-string ch) val))))
               (if (consp ch) (set-char-table-range equiv ch re) (aset equiv ch re))))
           equiv)
          equiv)))

(defcustom char-fold-ad-hoc
  '((?\" "＂" "“" "”" "”" "„" "⹂" "〞" "‟" "‟" "❞" "❝" "❠" "“" "„" "〝" "〟" "🙷" "🙶" "🙸" "«" "»")
    (?' "❟" "❛" "❜" "‘" "’" "‚" "‛" "‚" "" "❮" "❯" "‹" "›")
    (?` "❛" "‘" "‛" "" "❮" "‹"))
  "Ad hoc character foldings.
Each entry is a list of a character and the strings that fold into it.

The default value includes those ad hoc foldings provided by vanilla
Emacs."
  :set (lambda (sym defs)
         (custom-set-default sym defs)
         (update-char-fold-table))
  :type '(repeat (cons
                  (character :tag "Fold to character")
                  (repeat (string :tag "Fold from string"))))
  :group 'isearch)

(defcustom char-fold-symmetric nil
  "Non-nil means char-fold searching treats equivalent chars the same.
That is, use of any of a set of char-fold equivalent chars in a search
string finds any of them in the text being searched.

If nil then only the \"base\" or \"canonical\" char of the set matches
any of them.  The others match only themselves, even when char-folding
is turned on."
  :set (lambda (sym defs)
         (custom-set-default sym defs)
         (update-char-fold-table))
  :type 'boolean
  :group 'isearch)

(defadvice character-fold-to-regexp (before replace-decompositions activate)
  "Replace any decompositions in `character-fold-table' by their base chars.
This allows search to match all equivalents."
  (when char-fold-decomps
    (dolist (decomp char-fold-decomps)
      (ad-set-arg 0 (replace-regexp-in-string (regexp-quote (car decomp)) (cdr decomp)
                                              (ad-get-arg 0) 'FIXED-CASE 'LITERAL)))))

;;;;;;;;;;;;;;;;;;;;;;;

(provide 'character-fold+)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; character-fold+.el ends here
* bug#22147: Obsolete search-forward-lax-whitespace 2016-05-14 22:22 ` Drew Adams @ 2016-05-15 20:56 ` Juri Linkov 2016-05-15 21:51 ` Drew Adams 0 siblings, 1 reply; 33+ messages in thread From: Juri Linkov @ 2016-05-15 20:56 UTC (permalink / raw) To: Drew Adams; +Cc: 22147, Artur Malabarba >> I'm starting to recollect all the remaining pieces to finish this >> release blocking issue, but I can't download this library, >> because the link is broken and it seems the whole site is down. >> >> Drew, could you please send the latest version as an attachment? > > 1. EmacsWiki seems to be up now. Also, you should be able to get to > what is on EmacsWiki at the EmacsMirror: https://github.com/emacsmirror. > And you should also be able to get my libraries from MELPA. I've > attached `character-fold+.el' anyway. Let me know if you also want > to look at `isearch+.el' and you cannot get to it for some reason. EmacsWiki is inaccessible to me due to its invalid server certificate. But thanks for pointing to EmacsMirror - I found your code at https://github.com/emacsmirror/character-fold-plus https://github.com/emacsmirror/isearch-plus which I hope is at the latest version. > 2. More importantly, what I wrote in `character-fold+.el' worked > only at the time I wrote it and for a while thereafter, unfortunately. > Not too long after that, Artur Malabarba rewrote `character-fold.el', > so the code I wrote is no longer appropriate. I see that you just moved the hard-coded alist to defcustom char-fold-ad-hoc. I think that char-fold-ad-hoc is too ad-hoc naming. Using more wide-spread naming convention with a data type suffix -alist (like in display-buffer-alist, etc.) would provide a defcustom name char-fold-alist. Another thing we need to do is to allow customization to remove default mappings. Maybe this is possible by using the same defcustom with a rule like: remove default mappings when a char is mapped to an empty list, e.g. 
- adding more mappings for ‘`’: (defcustom char-fold-ad-hoc '((?` "❛" "‘" "‛" "" "❮" "‹")) - removing default mappings for ‘`’: (defcustom char-fold-ad-hoc '((?`)) ^ permalink raw reply [flat|nested] 33+ messages in thread
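Editorial sketch: one way to read the rule proposed above is that an entry with strings adds them to the defaults, while an entry consisting of the character alone removes the defaults for that character. The function below illustrates that reading. The name `my-char-fold-merge` is hypothetical, and the merge semantics are an assumption, not something the thread settled on.

```elisp
;; Sketch only: `my-char-fold-merge' is a hypothetical name, and the
;; "empty entry removes the defaults" rule is one reading of the proposal.
(defun my-char-fold-merge (defaults custom)
  "Return a new alist: DEFAULTS with the entries of CUSTOM applied.
An entry in CUSTOM of the form (CHAR) drops the default mappings for
CHAR; an entry (CHAR STRING...) adds its strings to the defaults."
  (let ((result (copy-tree defaults)))   ; never modify DEFAULTS itself
    (dolist (entry custom)
      (let ((char (car entry))
            (strings (cdr entry)))
        (if (null strings)
            ;; (CHAR) alone: remove every default mapping for CHAR.
            (setq result (assq-delete-all char result))
          (let ((existing (assq char result)))
            (if existing
                ;; Extend the defaults; the trailing nil makes `append'
                ;; copy STRINGS so that CUSTOM is left untouched.
                (setcdr existing
                        (delete-dups (append (cdr existing) strings nil)))
              (push (cons char (copy-sequence strings)) result))))))
    result))

(my-char-fold-merge '((?\" "“" "”") (?' "‘" "’"))
                    '((?\")          ; drop the defaults for the double quote
                      (?' "‚")))     ; add one more fold for the apostrophe
```

A single option handling both cases keeps the customization interface small, at the cost of making "no strings" a special value.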
* bug#22147: Obsolete search-forward-lax-whitespace 2016-05-15 20:56 ` Juri Linkov @ 2016-05-15 21:51 ` Drew Adams 2016-05-17 20:55 ` Juri Linkov 0 siblings, 1 reply; 33+ messages in thread From: Drew Adams @ 2016-05-15 21:51 UTC (permalink / raw) To: Juri Linkov; +Cc: 22147, Artur Malabarba > EmacsWiki is inaccessible to me due to its invalid server certificate. I see. I don't know anything about that. > But thanks for pointing to EmacsMirror - I found your code at > https://github.com/emacsmirror/character-fold-plus > https://github.com/emacsmirror/isearch-plus > which I hope is at the latest version. Yes, I just checked, and those are the latest versions. I don't know how often EmacsMirror is updated. For a while (a year or two ago, I think) I think it was not mirroring. You can always get my code from MELPA, which refreshes from EmacsWiki daily. > > 2. More importantly, what I wrote in `character-fold+.el' worked > > only at the time I wrote it and for a while thereafter, unfortunately. > > Not too long after that, Artur Malabarba rewrote `character-fold.el', > > so the code I wrote is no longer appropriate. > > I see that you just moved the hard-coded alist to defcustom > char-fold-ad-hoc. Correct. You can see how I use it. I broke up some of the character-fold.el code (at the time), in order to use parts of it (a bit more modular). Mainly, I broke out `update-char-fold-table' so that it could be called in the :set functions of the two defcustoms. So as soon as a user made changes, they were reflected in the behavior. > I think that char-fold-ad-hoc is too ad-hoc naming. > Using more wide-spread naming convention with a data type suffix -alist > (like in display-buffer-alist, etc.) would provide a defcustom name > char-fold-alist. OK. FWIW, I'm not a fan of putting the type ("alist") in the option name, but I don't speak for what vanilla Emacs does. If all we can say about some value is that it takes the _form_ of an alist, that's too bad. 
Normally, we should be able to describe that value (content, not just form). It's better, IMO, if the name talks about what the value is (content, purpose - something specific about it), and not just say form it takes. Another consideration (for me, at least): I think (and hope) that eventually users will be able to have multiple such lists (sets) of char mappings that they can choose (and mix and match - sets of such sets, for different purposes/contexts). IOW, I don't see just a single set of ad-hoc char mappings. But this is anyway for the future. > Another thing we need to do is to allow customization to remove > default mappings. Maybe this is possible by using the same > defcustom with a rule like: remove default mappings when a char > is mapped to an empty list, e.g. > > - adding more mappings for ‘`’: > (defcustom char-fold-ad-hoc '((?` "❛" "‘" "‛" "" "❮" "‹")) > > - removing default mappings for ‘`’: > (defcustom char-fold-ad-hoc '((?`)) Yes, I would think that would work (already). But I could be wrong. Thanks for taking a look at this. ^ permalink raw reply [flat|nested] 33+ messages in thread
* bug#22147: Obsolete search-forward-lax-whitespace 2016-05-15 21:51 ` Drew Adams @ 2016-05-17 20:55 ` Juri Linkov 2016-05-17 21:55 ` Drew Adams 0 siblings, 1 reply; 33+ messages in thread From: Juri Linkov @ 2016-05-17 20:55 UTC (permalink / raw) To: Drew Adams; +Cc: 22147, Artur Malabarba > Another consideration (for me, at least): I think (and hope) that > eventually users will be able to have multiple such lists (sets) > of char mappings that they can choose (and mix and match - sets of > such sets, for different purposes/contexts). IOW, I don't see just > a single set of ad-hoc char mappings. But this is anyway for the > future. Yes, we have to take into consideration that in addition to the plain customizable list we are adding to the next release, in later versions we might also add more customizable lists, e.g. with categories and other character groups. >> Another thing we need to do is to allow customization to remove >> default mappings. Maybe this is possible by using the same >> defcustom with a rule like: remove default mappings when a char >> is mapped to an empty list, e.g. >> >> - adding more mappings for ‘`’: >> (defcustom char-fold-ad-hoc '((?` "❛" "‘" "‛" "" "❮" "‹")) >> >> - removing default mappings for ‘`’: >> (defcustom char-fold-ad-hoc '((?`)) > > Yes, I would think that would work (already). But I could be wrong. > > Thanks for taking a look at this. After long-planned terminology improvements, I'd wait for sync between branches to avoid merge conflicts, and then I'll submit a patch taking into account all opinions about the default value for users who will enable this feature in the next release. ^ permalink raw reply [flat|nested] 33+ messages in thread
* bug#22147: Obsolete search-forward-lax-whitespace 2016-05-17 20:55 ` Juri Linkov @ 2016-05-17 21:55 ` Drew Adams 2016-05-18 3:00 ` Artur Malabarba 0 siblings, 1 reply; 33+ messages in thread From: Drew Adams @ 2016-05-17 21:55 UTC (permalink / raw) To: Juri Linkov; +Cc: 22147, Artur Malabarba > > Another consideration (for me, at least): I think (and hope) that > > eventually users will be able to have multiple such lists (sets) > > of char mappings that they can choose (and mix and match - sets of > > such sets, for different purposes/contexts). IOW, I don't see just > > a single set of ad-hoc char mappings. But this is anyway for the > > future. > > Yes, we have to take into consideration that in addition to the > plain customizable list we are adding to the next release, > in later versions we might also add more customizable lists, > e.g. with categories and other character groups. One possibility is to (now), instead of having an option with a single list of ad-hoc mappings as value, have an option with an alist of such lists as its value, where the car of an alist entry names the particular ad-hoc mapping. See my suggestion in an earlier mail in this thread. > >> Another thing we need to do is to allow customization to remove > >> default mappings. Maybe this is possible by using the same > >> defcustom with a rule like: remove default mappings when a char > >> is mapped to an empty list, e.g. > >> > >> - adding more mappings for ‘`’: > >> (defcustom char-fold-ad-hoc '((?` "❛" "‘" "‛" "" "❮" "‹")) > >> > >> - removing default mappings for ‘`’: > >> (defcustom char-fold-ad-hoc '((?`)) > > > > Yes, I would think that would work (already). But I could be wrong. > > > > Thanks for taking a look at this. > > After long-planned terminology improvements, I'd wait for sync between > branches to avoid merge conflicts, and then I'll submit a patch taking > into account all opinions about the default value for users who will > enable this feature in the next release. 
Sounds good. Thx. ^ permalink raw reply [flat|nested] 33+ messages in thread
* bug#22147: Obsolete search-forward-lax-whitespace 2016-05-17 21:55 ` Drew Adams @ 2016-05-18 3:00 ` Artur Malabarba 2016-05-18 19:34 ` Juri Linkov 0 siblings, 1 reply; 33+ messages in thread From: Artur Malabarba @ 2016-05-18 3:00 UTC (permalink / raw) To: Drew Adams, Juri Linkov; +Cc: 22147 [-- Attachment #1: Type: text/plain, Size: 2567 bytes --] First of all, thanks for taking the time to help with this, Juri. I'm of the opinion that we should avoid over thinking this feature for the first release. And I'm also of the opinion that complicated custom-vars (like alists of alists) are less helpful than simple custom vars. So I'd strongly prefer if we don't turn this variable into something more complicated. Also (although I'm perfectly in favor of a variable for ad-hoc foldings), I wouldn't mind it if this bug was removed from the release blocking list. It's a feature request, not a proper bug, so the release wouldn't be flawed without it. On Tue, 17 May 2016 6:55 pm Drew Adams, <drew.adams@oracle.com> wrote: > > > Another consideration (for me, at least): I think (and hope) that > > > eventually users will be able to have multiple such lists (sets) > > > of char mappings that they can choose (and mix and match - sets of > > > such sets, for different purposes/contexts). IOW, I don't see just > > > a single set of ad-hoc char mappings. But this is anyway for the > > > future. > > > > Yes, we have to take into consideration that in addition to the > > plain customizable list we are adding to the next release, > > in later versions we might also add more customizable lists, > > e.g. with categories and other character groups. > > One possibility is to (now), instead of having an option with a single > list of ad-hoc mappings as value, have an option with an alist of such > lists as its value, where the car of an alist entry names the particular > ad-hoc mapping. > > See my suggestion in an earlier mail in this thread. 
> > > >> Another thing we need to do is to allow customization to remove > > >> default mappings. Maybe this is possible by using the same > > >> defcustom with a rule like: remove default mappings when a char > > >> is mapped to an empty list, e.g. > > >> > > >> - adding more mappings for ‘`’: > > >> (defcustom char-fold-ad-hoc '((?` "❛" "‘" "‛" "" "❮" "‹")) > > >> > > >> - removing default mappings for ‘`’: > > >> (defcustom char-fold-ad-hoc '((?`)) > > > > > > Yes, I would think that would work (already). But I could be wrong. > > > > > > Thanks for taking a look at this. > > > > After long-planned terminology improvements, I'd wait for sync between > > branches to avoid merge conflicts, and then I'll submit a patch taking > > into account all opinions about the default value for users who will > > enable this feature in the next release. > > Sounds good. Thx. > [-- Attachment #2: Type: text/html, Size: 3196 bytes --] ^ permalink raw reply [flat|nested] 33+ messages in thread
* bug#22147: Obsolete search-forward-lax-whitespace 2016-05-18 3:00 ` Artur Malabarba @ 2016-05-18 19:34 ` Juri Linkov 2016-05-18 20:40 ` Artur Malabarba 0 siblings, 1 reply; 33+ messages in thread From: Juri Linkov @ 2016-05-18 19:34 UTC (permalink / raw) To: Artur Malabarba; +Cc: 22147 > I'm of the opinion that we should avoid over thinking this feature for the > first release. And I'm also of the opinion that complicated custom-vars > (like alists of alists) are less helpful than simple custom vars. So I'd > strongly prefer if we don't turn this variable into something more > complicated. I agree that we are better off starting with simpler customization, and then gradually adding more layers later when needed. I wonder why you removed defvar mappings from your initial patches with ‘isearch-groups-alist’ and ‘isearch--character-fold-extras’ (a similar variable is also presented in Drew's ‘char-fold-ad-hoc’). Now I tried to reintroduce these lists with different names: ‘char-fold-include-alist’ with a list to add to default mappings and ‘char-fold-exclude-alist’ with a list to remove from default mappings taking into account all opinions expressed on emacs-devel for the default values: diff --git a/lisp/char-fold.el b/lisp/char-fold.el index 68bea29..68d1eb0 100644 --- a/lisp/char-fold.el +++ b/lisp/char-fold.el @@ -22,10 +22,68 @@ ;;; Code: -(eval-and-compile (put 'char-fold-table 'char-table-extra-slots 1)) +(put 'char-fold-table 'char-table-extra-slots 1) \f -(defconst char-fold-table - (eval-when-compile +(defcustom char-fold-include-alist + (append + '((?\" """ "“" "”" "”" "„" "⹂" "〞" "‟" "‟" "❞" "❝" "❠" "“" "„" "〝" "〟" "🙷" "🙶" "🙸" "«" "»") + (?' 
"❟" "❛" "❜" "‘" "’" "‚" "‛" "‚" "" "❮" "❯" "‹" "›") + (?` "❛" "‘" "‛" "" "❮" "‹") + (?→ "->") (?⇒ "=>") + (?1 "⒈") (?2 "⒉") (?3 "⒊") (?4 "⒋") (?5 "⒌") (?6 "⒍") (?7 "⒎") (?8 "⒏") (?9 "⒐") (?0 "🄀") + ) + (unless (string-match-p "^\\(?:da\\|n[obn]\\)" (getenv "LANG")) + '((?o "ø") + (?O "Ø"))) + (unless (string-match-p "^pl" (getenv "LANG")) + '((?l "ł") + (?L "Ł"))) + (unless (string-match-p "^de" (getenv "LANG")) + '((?ß "ss"))) + ) + "Ad hoc character foldings. +Each entry is a list of a character and the strings that fold into it." + :set (lambda (symbol value) + (custom-set-default symbol value) + (with-no-warnings + (setq char-fold-table (make-char-fold-table)))) + :initialize 'custom-initialize-default + :type '(repeat (cons + (character :tag "Fold to character") + (repeat (string :tag "Fold from string")))) + :version "25.1" + :group 'isearch) + +(defcustom char-fold-exclude-alist + (append + (when (string-match-p "^es" (getenv "LANG")) + '((?n "ñ") + (?N "Ñ"))) + (when (string-match-p "^\\(?:sv\\|fi\\|et\\)" (getenv "LANG")) + '((?a "ä") + (?A "Ä") + (?o "ö") + (?O "Ö"))) + (when (string-match-p "^\\(?:sv\\|da\\|n[obn]\\)" (getenv "LANG")) + '((?a "å") + (?A "Å"))) + (when (string-match-p "^ru" (getenv "LANG")) + '((?и "й") + (?И "Й")))) + "Character foldings to remove from default mappings. +Each entry is a list of a character and the strings that unfold from it." 
+ :set (lambda (symbol value) + (custom-set-default symbol value) + (with-no-warnings + (setq char-fold-table (make-char-fold-table)))) + :initialize 'custom-initialize-default + :type '(repeat (cons + (character :tag "Unfold to character") + (repeat (string :tag "Unfold from string")))) + :version "25.1" + :group 'isearch) + +(defun make-char-fold-table () (let ((equiv (make-char-table 'char-fold-table)) (equiv-multi (make-char-table 'char-fold-table)) (table (unicode-property-table-internal 'decomposition))) @@ -58,9 +116,11 @@ char-fold-table ;; If there's no formatting tag, ensure that char matches ;; its decomp exactly. This is because we want 'ä' to ;; match 'ä', but we don't want '¹' to match '1'. + (unless (and (assq char char-fold-exclude-alist) + (member (apply #'string decomp) (assq char char-fold-exclude-alist))) (aset equiv char (cons (apply #'string decomp) - (aref equiv char)))) + (aref equiv char))))) ;; Allow the entire decomp to match char. If decomp has ;; multiple characters, this is done by adding an entry @@ -74,9 +134,11 @@ char-fold-table (cons (cons (apply #'string (cdr decomp)) (regexp-quote (string char))) (aref equiv-multi (car decomp)))) + (unless (and (assq (car decomp) char-fold-exclude-alist) + (member (char-to-string char) (assq (car decomp) char-fold-exclude-alist))) (aset equiv (car decomp) (cons (char-to-string char) - (aref equiv (car decomp)))))))) + (aref equiv (car decomp))))))))) (funcall make-decomp-match-char decomp char) ;; Do it again, without the non-spacing characters. ;; This allows 'a' to match 'ä'. @@ -98,9 +160,7 @@ char-fold-table table) ;; Add some manual entries. - (dolist (it '((?\" """ "“" "”" "”" "„" "⹂" "〞" "‟" "‟" "❞" "❝" "❠" "“" "„" "〝" "〟" "🙷" "🙶" "🙸" "«" "»") - (?' 
"❟" "❛" "❜" "‘" "’" "‚" "‛" "‚" "" "❮" "❯" "‹" "›") - (?` "❛" "‘" "‛" "" "❮" "‹"))) + (dolist (it char-fold-include-alist) (let ((idx (car it)) (chars (cdr it))) (aset equiv idx (append chars (aref equiv idx))))) @@ -114,6 +174,9 @@ char-fold-table (aset equiv char re)))) equiv) equiv)) + +(defvar char-fold-table + (make-char-fold-table) "Used for folding characters of the same group during search. This is a char-table with the `char-fold-table' subtype. ^ permalink raw reply related [flat|nested] 33+ messages in thread
* bug#22147: Obsolete search-forward-lax-whitespace 2016-05-18 19:34 ` Juri Linkov @ 2016-05-18 20:40 ` Artur Malabarba 2016-05-30 20:57 ` Juri Linkov 0 siblings, 1 reply; 33+ messages in thread From: Artur Malabarba @ 2016-05-18 20:40 UTC (permalink / raw) To: Juri Linkov; +Cc: 22147 Juri Linkov <juri@linkov.net> writes: > Now I tried to reintroduce these lists with different names: > ‘char-fold-include-alist’ with a list to add to default mappings and > ‘char-fold-exclude-alist’ with a list to remove from default mappings > taking into account all opinions expressed on emacs-devel for the > default values: Sounds good! Some minor comments: > +(defun make-char-fold-table () Call this `char-fold--make-table' > + (unless (and (assq char char-fold-exclude-alist) > + (member (apply #'string decomp) (assq char char-fold-exclude-alist))) This call to `member' will run dozens of times for each entry in `char-fold-exclude-alist'. Maybe we should optimize those two repeated forms: `(apply #'string decomp)' and `(assq char char-fold-exclude-alist)'. > - (dolist (it '((?\" """ "“" "”" "”" "„" "⹂" "〞" "‟" "‟" "❞" "❝" "❠" "“" "„" "〝" "〟" "🙷" "🙶" "🙸" "«" "»") > - (?' "❟" "❛" "❜" "‘" "’" "‚" "‛" "‚" "" "❮" "❯" "‹" "›") > - (?` "❛" "‘" "‛" "" "❮" "‹"))) > + (dolist (it char-fold-include-alist) > (let ((idx (car it)) The indentation looks wrong here. ^ permalink raw reply [flat|nested] 33+ messages in thread
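Editorial sketch: the optimization suggested in the review above amounts to looking up the exclusion entry once and reusing it, instead of evaluating `(assq char char-fold-exclude-alist)` twice per decomposition. A minimal helper along those lines follows. The name `my-char-fold-excluded-p` is hypothetical, and its EXCLUDE-ALIST argument stands in for the proposed `char-fold-exclude-alist`.

```elisp
;; Sketch only: `my-char-fold-excluded-p' is a hypothetical helper and
;; EXCLUDE-ALIST stands in for the proposed `char-fold-exclude-alist'.
(defun my-char-fold-excluded-p (char string exclude-alist)
  "Return non-nil if STRING is listed for CHAR in EXCLUDE-ALIST."
  (let ((entry (assq char exclude-alist)))   ; look the entry up once
    (and entry (member string (cdr entry)) t)))

;; In the table-building loop, the result of (apply #'string decomp)
;; would likewise be bound once with `let' and passed in as STRING.
(let ((exclude-alist '((?n "\u00f1"))))      ; n must not match n-tilde
  (list (my-char-fold-excluded-p ?n "\u00f1" exclude-alist)
        (my-char-fold-excluded-p ?o "\u00f1" exclude-alist)))
```

The helper searches only the strings in the entry (its cdr), whereas the patch passes the whole entry to `member`. The two give the same answer because the leading character never compares equal to a string.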
* bug#22147: Obsolete search-forward-lax-whitespace 2016-05-18 20:40 ` Artur Malabarba @ 2016-05-30 20:57 ` Juri Linkov 2016-06-01 15:03 ` Artur Malabarba 0 siblings, 1 reply; 33+ messages in thread From: Juri Linkov @ 2016-05-30 20:57 UTC (permalink / raw) To: Artur Malabarba; +Cc: 22147 >> Now I tried to reintroduce these lists with different names: >> ‘char-fold-include-alist’ with a list to add to default mappings and >> ‘char-fold-exclude-alist’ with a list to remove from default mappings >> taking into account all opinions expressed on emacs-devel for the >> default values: > > Sounds good! Some minor comments: > >> +(defun make-char-fold-table () > > Call this `char-fold--make-table' > >> + (unless (and (assq char char-fold-exclude-alist) >> + (member (apply #'string decomp) (assq char char-fold-exclude-alist))) > > This call to `member' will run dozens of times for each entry in > `char-fold-exclude-alist'. Maybe we should optimize those two repeated > forms: `(apply #'string decomp)' and `(assq char char-fold-exclude-alist)'. This definitely needs to be optimized, but now it's clear there is no hurry since this is not going to be released in 25. Moreover, I get occasional crashes in char-tables with the latest patch, so it was a good thing not to push it to the release branch at the last minute. ^ permalink raw reply [flat|nested] 33+ messages in thread
* bug#22147: Obsolete search-forward-lax-whitespace 2016-05-30 20:57 ` Juri Linkov @ 2016-06-01 15:03 ` Artur Malabarba 2020-09-05 14:54 ` Lars Ingebrigtsen 0 siblings, 1 reply; 33+ messages in thread From: Artur Malabarba @ 2016-06-01 15:03 UTC (permalink / raw) To: Juri Linkov; +Cc: 22147 Juri Linkov <juri@linkov.net> writes: > This definitely needs to be optimized, but now it's clear there is no hurry > since this is not going to be released in 25. Moreover, I get occasional crashes > in char-tables with the latest patch, so it was a good thing not to push > it to the release branch at the last minute. Indeed. It wasn't a must-have anyway. At least now we have more time to play around with this. ^ permalink raw reply [flat|nested] 33+ messages in thread
* bug#22147: Obsolete search-forward-lax-whitespace 2016-06-01 15:03 ` Artur Malabarba @ 2020-09-05 14:54 ` Lars Ingebrigtsen 2020-09-07 18:34 ` Juri Linkov 0 siblings, 1 reply; 33+ messages in thread From: Lars Ingebrigtsen @ 2020-09-05 14:54 UTC (permalink / raw) To: Artur Malabarba; +Cc: 22147, Juri Linkov Artur Malabarba <bruce.connor.am@gmail.com> writes: > Juri Linkov <juri@linkov.net> writes: > >> This definitely needs to be optimized, but now it's clear there is no hurry >> since this is not going to be released in 25. Moreover, I get >> occasional crashes >> in char-tables with the latest patch, so it was a good thing not to push >> it to the release branch at the last minute. > > Indeed. It wasn't a must-have anyway. At least now we have more time to > play around with this. I've just skimmed this bug report, but it seems like a different version of the proposed patch was applied three years later: commit 376f5df3cca0dbf186823e5b329d76b52019473d Author: Juri Linkov <juri@linkov.net> AuthorDate: Tue Jul 23 23:27:28 2019 +0300 Customizable char-fold with char-fold-symmetric, char-fold-include (bug#35689) search-forward-lax-whitespace still isn't obsolete, but I'm unsure whether there's anything more to do in this bug report? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 33+ messages in thread
* bug#22147: Obsolete search-forward-lax-whitespace 2020-09-05 14:54 ` Lars Ingebrigtsen @ 2020-09-07 18:34 ` Juri Linkov 0 siblings, 0 replies; 33+ messages in thread From: Juri Linkov @ 2020-09-07 18:34 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: 22147 tags 22147 fixed close 22147 28.0.50 quit > I've just skimmed this bug report, but it seems like a different version > of the proposed patch was applied three years later: > > commit 376f5df3cca0dbf186823e5b329d76b52019473d > Author: Juri Linkov <juri@linkov.net> > AuthorDate: Tue Jul 23 23:27:28 2019 +0300 > > Customizable char-fold with char-fold-symmetric, char-fold-include (bug#35689) > > search-forward-lax-whitespace still isn't obsolete, but I'm unsure > whether there's anything more to do in this bug report? Let's see if the requested feature works now: 0. emacs -Q 1. eval (setq search-whitespace-regexp "\\(?:\\s-\\|\n\\)+") (require 'char-fold) (setq-default search-default-mode 'char-fold-to-regexp) (setq char-fold-symmetric t) 2. then ‘C-s 1 C-q C-j 2 C-s’ finds both occurrences: 1 2 1 2 Oh, wait! This works because I have an uninstalled patch from bug#38539. Now pushed to master, and closed both reports. ^ permalink raw reply [flat|nested] 33+ messages in thread
* bug#22147: Obsolete search-forward-lax-whitespace 2015-12-16 0:57 ` Juri Linkov 2015-12-16 1:47 ` Drew Adams @ 2015-12-16 10:59 ` Artur Malabarba 2015-12-17 0:57 ` Juri Linkov 1 sibling, 1 reply; 33+ messages in thread From: Artur Malabarba @ 2015-12-16 10:59 UTC (permalink / raw) To: Juri Linkov; +Cc: 22147 [-- Attachment #1: Type: text/plain, Size: 346 bytes --] On 16 Dec 2015 12:57 am, "Juri Linkov" <juri@linkov.net> wrote: > > I mean a char-folding customization that allows a search > for “ä” match “a”. Is this already possible? Not yet. I do want to expose more char folding options, but I want to wait for emacs-25 to come out first, to see if and how people will use this feature. [-- Attachment #2: Type: text/html, Size: 460 bytes --] ^ permalink raw reply [flat|nested] 33+ messages in thread
* bug#22147: Obsolete search-forward-lax-whitespace 2015-12-16 10:59 ` Artur Malabarba @ 2015-12-17 0:57 ` Juri Linkov 2015-12-17 16:33 ` Artur Malabarba 0 siblings, 1 reply; 33+ messages in thread From: Juri Linkov @ 2015-12-17 0:57 UTC (permalink / raw) To: Artur Malabarba; +Cc: 22147 >> I mean a char-folding customization that allows a search >> for “ä” match “a”. Is this already possible? > > Not yet. > I do want to expose more char folding options, but I want to wait for > emacs-25 to come out first, to see if and how people will use this feature. char-fold-symmetric could wait for later, but we definitely need char-fold-ad-hoc now before the release because the users should be able to customize the default rules. ^ permalink raw reply [flat|nested] 33+ messages in thread
* bug#22147: Obsolete search-forward-lax-whitespace 2015-12-17 0:57 ` Juri Linkov @ 2015-12-17 16:33 ` Artur Malabarba 2015-12-17 17:21 ` Drew Adams 0 siblings, 1 reply; 33+ messages in thread From: Artur Malabarba @ 2015-12-17 16:33 UTC (permalink / raw) To: Juri Linkov; +Cc: 22147 [-- Attachment #1: Type: text/plain, Size: 782 bytes --] On 17 Dec 2015 12:57 am, "Juri Linkov" <juri@linkov.net> wrote: > > >> I mean a char-folding customization that allows a search > >> for “ä” match “a”. Is this already possible? > > > > Not yet. > > I do want to expose more char folding options, but I want to wait for > > emacs-25 to come out first, to see if and how people will use this feature. > > char-fold-symmetric could wait for later, but we definitely need > char-fold-ad-hoc now before the release because the users should be > able to customize the default rules. Indeed. 👍 Once we do that, we also need a variable to determine whether we should derive the default table from the unicode standard (like we currently do) or just use an empty default with the ad-hoc rules slapped on top. [-- Attachment #2: Type: text/html, Size: 971 bytes --] ^ permalink raw reply [flat|nested] 33+ messages in thread
* bug#22147: Obsolete search-forward-lax-whitespace 2015-12-17 16:33 ` Artur Malabarba @ 2015-12-17 17:21 ` Drew Adams 2015-12-17 18:47 ` Artur Malabarba 0 siblings, 1 reply; 33+ messages in thread From: Drew Adams @ 2015-12-17 17:21 UTC (permalink / raw) To: bruce.connor.am, Juri Linkov; +Cc: 22147 >> char-fold-symmetric could wait for later, but we definitely need >> char-fold-ad-hoc now before the release because the users should be >> able to customize the default rules. > > Indeed. Once we do that, we also need a variable to determine > whether we should derive the default table from the unicode > standard (like we currently do) or just use an empty default with > the ad-hoc rules slapped on top. Users should be able to define their own equivalence classes (groups), not just one class. Each class should be the value of a user option. Here is one simple and flexible way to do this: 1. Define a user option, `char-folding-classes', which is a list of any number of (OPTION-NAME DOC-STRING) pairs, where OPTION-NAME is a symbol that will name a user option and DOC-STRING is its doc string. Each symbol would automatically be used to define an option (a defcustom) that the user can then use to define a given equivalence class. 2. The generated defcustom for each user option specified in option `char-folding-classes' would allow for any number of entries, each of which could be a `choice' of either of these defcustom types: a. An alist, such as used currently in my `char-fold-ad-hoc' option: Each entry is a list of a char and the strings that fold into it. b. A function that populates such an alist. 
The default value of `char-folding-classes' would be something like this: ((char-fold-diacriticals "Classes of chars equivalent because they have the same base char.") (char-fold-quotations "Classes of equivalent quotation-mark characters.")) Option `char-fold-diacriticals' would have as its default value a function that returns the alist of diacritical-equivalent classes that we provide today. Its code would be derived from what we use today. (If needed, a user can replace the function with another that defines some of the classes differently or that provides only a subset of the classes we provide today. But most users would probably not customize this option.) Option `char-fold-quotations' would have as its default value what I use as the default value of my `char-fold-ad-hoc', which is an alist of the quotation-mark equivalences provided today by character-fold.el: ((?\" """ "“" "”" "”" "„" "⹂" "〞" "‟" "‟" "❞" "❝" "❠" "“" "„" "〝" "〟" "🙷" "🙶" "🙸" "«" "»") (?' "❟" "❛" "❜" "‘" "’" "‚" "‛" "‚" "" "❮" "❯" "‹" "›") (?` "❛" "‘" "‛" "" "❮" "‹")) Having an option that lets users define any number of classes, and having each class be defined by a user option, is flexible. Having multiple classes, each associated with a variable (option), lets users and libraries easily enable/disable different equivalence classes in different contexts. ^ permalink raw reply [flat|nested] 33+ messages in thread
* bug#22147: Obsolete search-forward-lax-whitespace 2015-12-17 17:21 ` Drew Adams @ 2015-12-17 18:47 ` Artur Malabarba 2015-12-17 22:16 ` Drew Adams 0 siblings, 1 reply; 33+ messages in thread From: Artur Malabarba @ 2015-12-17 18:47 UTC (permalink / raw) To: Drew Adams; +Cc: 22147, Juri Linkov 2015-12-17 17:21 GMT+00:00 Drew Adams <drew.adams@oracle.com>: >>> char-fold-symmetric could wait for later, but we definitely need >>> char-fold-ad-hoc now before the release because the users should be >>> able to customize the default rules. >> >> Indeed. Once we do that, we also need a variable to determine >> whether we should derive the default table from the unicode >> standard (like we currently do) or just use an empty default with >> the ad-hoc rules slapped on top. > > Users should be able to define their own equivalence classes (groups), > not just one class. Each class should be the value of a user option. > > Here is one simple and flexible way to do this: > > 1. Define a user option, `char-folding-classes', which is a list of > any number of (OPTION-NAME DOC-STRING) pairs, where OPTION-NAME > is a symbol that will name a user option and DOC-STRING is its doc > string. > > Each symbol would automatically be used to define an option (a > defcustom) that the user can then use to define a given equivalence > class. > > 2. The generated defcustom for each user option specified in option > `char-folding-classes' would allow for any number of entries, each > of which could be a `choice' of either of these defcustom types: > > a. An alist, such as used currently in my `char-fold-ad-hoc' option: > Each entry is a list of a char and the strings that fold into it. > > b. A function that populates such an alist. I appreciate you probably put quite a bit of thought into this, but IMO this would be over-engineering. 
I think we should define two simple defcustoms that determine how the character-fold-table is generated: character-fold-ad-hoc (an alist) and character-fold-derive-from-unicode-decomposition (a boolean). This should be immediately configurable by anyone, without requiring a big initial investment. Then we also make character-fold-table into a defvar, and document it as a proper exposed API, so advanced users can change it however they want with hooks and local vars to however many different values/equiv-classes they want. This would offer a dead-simple defcustom that covers most cases, while still allowing the versatility of having multiple options for those who need it. ^ permalink raw reply [flat|nested] 33+ messages in thread
* bug#22147: Obsolete search-forward-lax-whitespace 2015-12-17 18:47 ` Artur Malabarba @ 2015-12-17 22:16 ` Drew Adams 2015-12-18 0:55 ` Artur Malabarba 0 siblings, 1 reply; 33+ messages in thread From: Drew Adams @ 2015-12-17 22:16 UTC (permalink / raw) To: bruce.connor.am; +Cc: 22147, Juri Linkov > > Users should be able to define their own equivalence classes (groups), > > not just one class. Each class should be the value of a user option. > > > > Here is one simple and flexible way to do this: > > > > 1. Define a user option, `char-folding-classes', which is a list of > > any number of (OPTION-NAME DOC-STRING) pairs, where OPTION-NAME > > is a symbol that will name a user option and DOC-STRING is its doc > > string. > > > > Each symbol would automatically be used to define an option (a > > defcustom) that the user can then use to define a given equivalence > > class. > > > > 2. The generated defcustom for each user option specified in option > > `char-folding-classes' would allow for any number of entries, each > > of which could be a `choice' of either of these defcustom types: > > > > a. An alist, such as used currently in my `char-fold-ad-hoc' option: > > Each entry is a list of a char and the strings that fold into it. > > > > b. A function that populates such an alist. > > I appreciate you probably put quite a bit of thought into this, Only a few minutes of thought, as I imagine you can guess. It just extends what I already have in character-fold.el. > but IMO this would be over-engineering. How so? I've done zero "engineering" on it. And I don't really care how it gets done, as long as it does. My point, as I said, is only this: Users should be able to define their own equivalence classes (groups), not just one class. Each class should be the value of a user option. Anything less than that is not serving users as well as they deserve, IMO. As to how that is done, I really don't care. 
I offered one simple approach, but you are welcome to over-, under- or just-right-engineer it your own way. > I think we should define two simple defcustoms that determine how the > character-fold-table is generated: character-fold-ad-hoc (an alist) > and character-fold-derive-from-unicode-decomposition (a boolean). > This should be immediately configurable by anyone, That's far too restrictive, IMO. It does not let users or libraries easily apply different equivalence classes for different uses (e.g. modes). And there is no reason for such restriction - nothing is gained by it. > without requiring a big initial investment. There is no "big initial investment" to what I described. I can code it up quickly, as I'm sure you can too. And what it provides out of the box is exactly the same. It is just as "immediately configurable by anyone" - and immediately configurable in exactly the same way. Your Boolean with a default value of t is equivalent to the default presence of the function that does what your Boolean t does: "derive-from-unicode-decomposition". You can do more with what I described, and more easily. But you can also do just as little with it. > Then we also make character-fold-table into a defvar, and document it > as a proper exposed API, so advanced users Anything that can be a defvar, for "advanced users", can be a defcustom, for all users. If you are inviting users to fiddle with a char-fold table, it is far better to give them the ability to do so in a modular way, and to make your default derive-from-unicode-decomposition into a default function instead of just hard-coding the behavior. Nothing lost, modularity and flexibility gained. > can change it however they want with hooks and local vars to however > many different values/equiv-classes they want. Ugly, and complicated. And unnecessary. No need to be an "advanced user" and fiddle with such stuff. 
> This would offer a dead-simple defcustom that covers most cases, while > still allowing the versatility of having multiple options for those > who need it. What I proposed is just as "dead-simple", but cleaner (IMO) and open to all users. Just as importantly, it lets users (easily) define multiple classes that they can (easily) use in different contexts. Again, I don't care about the implementation, but I would like users to be able to define their own equivalence classes (groups), and to enable/disable them easily au choix. ^ permalink raw reply [flat|nested] 33+ messages in thread
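The multi-class design described in this message (an option listing (OPTION-NAME DOC-STRING) pairs, each of which generates its own user option) might look something like the sketch below. It is a guess at one possible shape, not the code referred to in the message: all `my-` names and the two sample classes are hypothetical.

```elisp
;; Hypothetical sketch of the generated-options idea.
;; All my- names and the sample classes are invented for illustration.

(defcustom my-char-folding-classes
  '((my-char-fold-quotes "Folding rules for quotation marks.")
    (my-char-fold-accents "Folding rules for accented letters."))
  "List of (OPTION-NAME DOC-STRING) pairs.
Each OPTION-NAME becomes a user option holding one equivalence class."
  :type '(repeat (list (symbol :tag "Option name")
                       (string :tag "Doc string")))
  :group 'matching)

(defun my-char-folding-define-classes ()
  "Define one user option per entry of `my-char-folding-classes'."
  (dolist (class my-char-folding-classes)
    (let ((name (nth 0 class))
          (doc (nth 1 class)))
      (unless (custom-variable-p name)
        ;; `custom-declare-variable' is the function underlying `defcustom'.
        (custom-declare-variable
         name nil doc
         :type '(repeat
                 (choice
                  (alist :key-type character :value-type (repeat string))
                  (function :tag "Function returning such an alist")))
         :group 'matching)))))

(defun my-char-folding-class-rules (option)
  "Return the combined alist of rules stored in user option OPTION.
Function entries are called with no arguments to produce their alist."
  (let (rules)
    (dolist (entry (symbol-value option))
      (setq rules (append rules (if (functionp entry)
                                    (funcall entry)
                                  entry))))
    rules))

(my-char-folding-define-classes)
```

In this sketch the default Unicode-derived behaviour would simply be a function entry in one of the generated options, which is the equivalence with the boolean that the message argues for.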
* bug#22147: Obsolete search-forward-lax-whitespace 2015-12-17 22:16 ` Drew Adams @ 2015-12-18 0:55 ` Artur Malabarba 0 siblings, 0 replies; 33+ messages in thread From: Artur Malabarba @ 2015-12-18 0:55 UTC (permalink / raw) To: Drew Adams; +Cc: 22147, Juri Linkov [out of order quotes below] 2015-12-17 22:16 GMT+00:00 Drew Adams <drew.adams@oracle.com>: >> This would offer a dead-simple defcustom that covers most cases, while >> still allowing the versatility of having multiple options for those >> who need it. > > What I proposed is just as "dead-simple", but cleaner (IMO) and open > to all users. Just as importantly, it lets users (easily) define > multiple classes that they can (easily) use in different contexts. And this is the source of our impasse. IMO (and I say this with all due respect) your proposal is not as simple as the two defcustoms I suggested, and it is not cleaner than just using hooks/local-vars to set the value of character-fold-table to whatever is relevant for the current situation. Since we're both just stating opinions, it's unlikely this discussion will go anywhere. > My point, as I said, is only this: > > Users should be able to define their own equivalence classes (groups), > not just one class. Each class should be the value of a user option. > > Anything less than that is not serving users as well as they deserve, IMO. And my point is that this is too complex for user options. Most people won't need this much generality, and the amount of time these people will waste trying to understand this multi-option configuration will be significant. The few who want this behavior will be glad that we offered it, but the time it will save them (compared to if they wrote something in elisp) will be (IMO) small compared to the total accumulated wasted time for everyone else. ^ permalink raw reply [flat|nested] 33+ messages in thread
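The hooks/local-vars approach mentioned in this message could be sketched as below. This assumes a table variable that is a plain defvar holding an alist; `my-character-fold-table` and the setup function are hypothetical names, used here only to show a per-mode, buffer-local value.

```elisp
;; Hypothetical sketch: a per-mode, buffer-local folding table.
;; `my-character-fold-table' stands in for the table discussed above.

(defvar my-character-fold-table nil
  "Folding rules as an alist of (CHAR STRING...) entries.")

(defun my-text-mode-fold-setup ()
  "Use quotation-mark folding only in buffers running this hook."
  (setq-local my-character-fold-table
              '((?\" "“" "”"))))

;; Buffers in `text-mode' (and modes derived from it) get their own value;
;; the global default is left untouched.
(add-hook 'text-mode-hook #'my-text-mode-fold-setup)
```

The trade-off debated in the thread is visible here: this takes a few lines of Lisp per context, whereas the multi-option design would expose the same choice through Customize.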
Thread overview: 33+ messages
2015-12-11 23:52 bug#22147: Obsolete search-forward-lax-whitespace Juri Linkov
2015-12-12 0:44 ` Artur Malabarba
2015-12-12 23:31 ` Juri Linkov
2015-12-13 0:29 ` Artur Malabarba
2015-12-14 0:23 ` Juri Linkov
2015-12-14 1:11 ` Artur Malabarba
2015-12-14 23:58 ` Juri Linkov
2015-12-15 10:15 ` Artur Malabarba
2015-12-16 0:57 ` Juri Linkov
2015-12-16 1:47 ` Drew Adams
2016-05-14 20:45 ` Juri Linkov
2016-05-14 22:20 ` Artur Malabarba
2016-05-14 22:27 ` Drew Adams
2016-05-15 20:45 ` Juri Linkov
2016-05-14 22:22 ` Drew Adams
2016-05-15 20:56 ` Juri Linkov
2016-05-15 21:51 ` Drew Adams
2016-05-17 20:55 ` Juri Linkov
2016-05-17 21:55 ` Drew Adams
2016-05-18 3:00 ` Artur Malabarba
2016-05-18 19:34 ` Juri Linkov
2016-05-18 20:40 ` Artur Malabarba
2016-05-30 20:57 ` Juri Linkov
2016-06-01 15:03 ` Artur Malabarba
2020-09-05 14:54 ` Lars Ingebrigtsen
2020-09-07 18:34 ` Juri Linkov
2015-12-16 10:59 ` Artur Malabarba
2015-12-17 0:57 ` Juri Linkov
2015-12-17 16:33 ` Artur Malabarba
2015-12-17 17:21 ` Drew Adams
2015-12-17 18:47 ` Artur Malabarba
2015-12-17 22:16 ` Drew Adams
2015-12-18 0:55 ` Artur Malabarba