* bug#52769: 29.0.50; [FEATURE REQUEST] repunctuate-sentences in region
From: Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2021-12-24 10:13 UTC
To: 52769
I often copy some text from elsewhere to Emacs, and I would like to
re-punctuate it. As a user, I would expect it to work as follows:
(1) mark a position to form a region and
(2) call 'repunctuate-sentences'.
'repunctuate-sentences' documentation says:
> Put two spaces at the end of sentences from point to the end of
> buffer. It works using query-replace-regexp.
… and 'query-replace-regexp' documentation says:
> In Transient Mark mode, if the mark is active, operate on the contents
> of the region. Otherwise, operate from point to the end of the
> buffer's accessible portion.
Both functions work as documented, but as a user, I often need to
'repunctuate-sentences' in a region.
Could we improve 'repunctuate-sentences' to work such that in Transient
Mark mode and with mark active, it re-punctuates the contents of the
region?
Thank you.
Rudy
Thank you!
In GNU Emacs 29.0.50 (build 5, x86_64-apple-darwin21.2.0, NS appkit-2113.20 Version 12.1 (Build 21C52))
of 2021-12-23 built on Workstation.local
Repository revision: 2fa7feca336dd16c57ffef072e0f0da6fffe4c5f
Repository branch: master
Windowing system distributor 'Apple', version 10.3.2113
System Description: macOS 12.1
Configured using:
'configure --with-json --with-xwidgets --with-native-compilation'
Configured features:
ACL DBUS GIF GLIB GMP GNUTLS JPEG JSON LCMS2 LIBXML2 MODULES NATIVE_COMP
NOTIFY KQUEUE NS PDUMPER PNG RSVG SQLITE3 THREADS TIFF
TOOLKIT_SCROLL_BARS WEBP XIM XWIDGETS ZLIB
Important settings:
value of $LC_ALL: en_US.UTF-8
locale-coding-system: utf-8-unix
Major mode: Helpful
Minor modes in effect:
emms-mode-line-mode: t
telega-root-auto-fill-mode: t
telega-active-locations-mode: t
telega-patrons-mode: t
telega-mode-line-mode: t
TeX-PDF-mode: t
global-git-commit-mode: t
magit-auto-revert-mode: t
shell-dirtrack-mode: t
corfu-global-mode: t
corfu-mode: t
vertico-mode: t
marginalia-mode: t
global-diff-hl-mode: t
yas-global-mode: t
yas-minor-mode: t
global-hl-todo-mode: t
global-subword-mode: t
subword-mode: t
save-place-mode: t
global-auto-revert-mode: t
delete-selection-mode: t
savehist-mode: t
tooltip-mode: t
global-eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
buffer-read-only: t
size-indication-mode: t
column-number-mode: t
line-number-mode: t
transient-mark-mode: t
Load-path shadows:
/Users/salutis/.emacs.d/elpa/transient-20211208.1819/transient hides /Users/salutis/src/emacs/nextstep/Emacs.app/Contents/Resources/lisp/transient
/Users/salutis/src/emacs/nextstep/Emacs.app/Contents/Resources/lisp/emacs-lisp/eieio-compat hides /Users/salutis/src/emacs/nextstep/Emacs.app/Contents/Resources/lisp/obsolete/eieio-compat
Features:
(shadow bbdb-message mail-extr view helpful trace edebug info-look
help-fns radix-tree elisp-refs tramp-cmds vterm tramp tramp-loaddefs
trampver tramp-integration files-x tramp-compat term ehelp vterm-module
term/xterm xterm eglot array jsonrpc ert debug backtrace xref pcase
mhtml-mode css-mode smie js sgml-mode facemenu org-duration org-pomodoro
alert log4e gntp org-timer hl-line emms-mode-line network-stream nsm
emms-player-mpd emms-url tq emms-player-simple emms-browser sort
emms-playlist-sort emms-last-played emms-volume emms-volume-sndioctl
emms-volume-mixerctl emms-volume-pulse emms-volume-amixer
emms-playlist-mode emms-source-playlist emms-source-file locate
emms-cache emms-info emms-later-do emms emms-compat ox-md ox-odt rng-loc
rng-uri rng-parse rng-match rng-dt rng-util rng-pttrn nxml-parse nxml-ns
nxml-enc xmltok nxml-util ox-latex ox-icalendar org-agenda ox-html table
ox-ascii ox-publish ox citar-org oc-csl citeproc citeproc-itemgetters
citeproc-biblatex citeproc-bibtex citeproc-cite citeproc-subbibs
citeproc-sort citeproc-name citeproc-formatters citeproc-number rst
citeproc-proc citeproc-disamb citeproc-itemdata
citeproc-generic-elements citeproc-macro citeproc-choose citeproc-date
citeproc-context citeproc-prange citeproc-style citeproc-locale
citeproc-term f citeproc-rt citeproc-lib citeproc-s let-alist queue
org-id org-refile citar s parsebib citar-file misearch multi-isearch
telega-obsolete telega telega-tdlib-events telega-webpage
visual-fill-column telega-root telega-info telega-chat telega-modes
telega-company telega-user telega-notifications notifications
telega-voip telega-msg telega-tme telega-sticker telega-i18n
telega-vvnote bindat telega-ffplay telega-media telega-sort
telega-filter telega-ins telega-folders telega-inline telega-tdlib
telega-util rainbow-identifiers dired-aux color telega-server
telega-core telega-customize cus-edit cus-start cus-load emacsbug
sendmail goto-addr bug-reference preview tex-buf font-latex latex
latex-flymake tex-ispell tex-style tex texmathp tex-mode flymake-proc
flymake compile image-file image-converter disp-table magit-extras
char-fold face-remap magit-bookmark magit-submodule magit-obsolete
magit-blame magit-stash magit-reflog magit-bisect magit-push magit-pull
magit-fetch magit-clone magit-remote magit-commit magit-sequence
magit-notes magit-worktree magit-tag magit-merge magit-branch
magit-reset magit-files magit-refs magit-status magit magit-repos
magit-apply magit-wip magit-log which-func imenu magit-diff smerge-mode
diff git-commit log-edit add-log magit-core magit-autorevert
magit-margin magit-transient magit-process with-editor shell server
magit-mode transient magit-git magit-section magit-utils crm dash
orderless cursor-sensor vc-mtn vc-hg vc-bzr vc-src vc-sccs vc-svn vc-cvs
vc-rcs project consult-vertico consult recentf tree-widget paredit
edmacro kmacro bbdb bbdb-site timezone modus-vivendi-theme
modus-operandi-theme modus-themes corfu vertico marginalia pdf-loader
diff-hl log-view pcvs-util vc-dir ewoc vc diminish yasnippet hl-todo
finder-inf fortune display-fill-column-indicator ob-sqlite ob-sql ob-C
cc-mode cc-fonts cc-guess cc-menus cc-cmds cc-styles cc-align cc-engine
cc-vars cc-defs ob-R org-clock cl ls-lisp cap-words superword subword
saveplace autorevert filenotify comp comp-cstr warnings delsel savehist
elfeed-link elfeed-show elfeed-search elfeed-csv elfeed elfeed-curl
elfeed-log xml-query bookmark pp elfeed-db elfeed-lib vc-git diff-mode
vc-dispatcher org-element avl-tree generator ol-eww eww xdg url-queue
thingatpt mm-url ol-rmail ol-mhe ol-irc ol-info ol-gnus nnselect
gnus-search eieio-opt speedbar ezimage dframe gnus-art mm-uu mml2015
mm-view mml-smime smime dig gnus-sum shr pixel-fill kinsoku svg dom
gnus-group gnus-undo gnus-start gnus-dbus dbus xml gnus-cloud nnimap
nnmail mail-source utf7 netrc nnoo parse-time gnus-spec gnus-int
gnus-range message yank-media rmc puny rfc822 mml mml-sec epa derived
epg rfc6068 epg-config mm-decode mm-bodies mm-encode mail-parse rfc2231
rfc2047 rfc2045 ietf-drums mailabbrev gmm-utils mailheader gnus-win gnus
nnheader gnus-util text-property-search mail-utils mm-util mail-prsvr
wid-edit ol-docview doc-view jka-compr image-mode exif dired
dired-loaddefs ol-bibtex ol-bbdb ol-w3m ol-doi org-link-doi cl-extra
help-mode org ob ob-tangle ob-ref ob-lob ob-table ob-exp org-macro
org-footnote org-src ob-comint org-pcomplete pcomplete comint ansi-color
ring org-list org-faces org-entities noutline outline easy-mmode
org-version ob-emacs-lisp ob-core ob-eval org-table oc-basic bibtex
iso8601 time-date ol rx org-keys oc org-compat advice org-macs
org-loaddefs format-spec find-func cal-menu calendar cal-loaddefs
tex-site info package browse-url url url-proxy url-privacy url-expand
url-methods url-history url-cookie url-domsuf url-util mailcap
url-handlers url-parse auth-source cl-seq eieio eieio-core cl-macs
eieio-loaddefs password-cache json map url-vars seq gv subr-x byte-opt
bytecomp byte-compile cconv cl-loaddefs cl-lib iso-transl tooltip eldoc
paren electric uniquify ediff-hook vc-hooks lisp-float-type elisp-mode
mwheel term/ns-win ns-win ucs-normalize mule-util term/common-win
tool-bar dnd fontset image regexp-opt fringe tabulated-list replace
newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar
rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock
font-lock syntax font-core term/tty-colors frame minibuffer cl-generic
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite emoji-zwj charscript
charprop case-table epa-hook jka-cmpr-hook help simple abbrev obarray
cl-preloaded nadvice button loaddefs faces cus-face macroexp files
window text-properties overlay sha1 md5 base64 format env code-pages
mule custom widget keymap hashtable-print-readable backquote threads
xwidget-internal dbusbind kqueue cocoa ns lcms2 multi-tty
make-network-process native-compile emacs)
Memory information:
((conses 16 1273795 147391)
(symbols 48 61371 3)
(strings 32 344437 34383)
(string-bytes 1 11812916)
(vectors 16 121413)
(vector-slots 8 2818587 91957)
(floats 8 11575 522)
(intervals 56 13645 8634)
(buffers 992 47))
--
"Programming reliably --- must be an activity of an undeniably mathematical nature […] You see, mathematics is about thinking, and doing mathematics is always trying to think as well as possible." -- Edsger W. Dijkstra (1981)
Rudolf Adamkovič <salutis@me.com> [he/him]
Studenohorská 25
84103 Bratislava
Slovakia
* bug#52769: 29.0.50; [FEATURE REQUEST] repunctuate-sentences in region
From: Juri Linkov @ 2021-12-25 19:04 UTC
To: Rudolf Adamkovič; +Cc: 52769
> Could we improve 'repunctuate-sentences' to work such that in Transient
> Mark mode and with mark active, it re-punctuates the contents of the
> region?
Thanks for the request. Until now, I used a custom command
'canonically-double-space-region' attached below, activated
by advice when the command 'fill-paragraph' (M-q) is called
on the region.
But its heuristics are too unreliable to detect the places
where two spaces are needed. It often misidentifies
an abbreviation as the end of a sentence. So using
'query-replace' would be a more reliable way to make the decision
for every punctuation mark.
When I tried 'repunctuate-sentences', I was stunned by its inefficiency:
it requires a confirmation even when there are already two spaces
at the end of the sentence! Why does it do this?
PS:
#+begin_src emacs-lisp
(defun canonically-double-space-region (beg end)
  (interactive "*r")
  (canonically-space-region beg end)
  (unless (markerp end) (setq end (copy-marker end t)))
  (let* ((sentence-end-double-space nil) ; to get right regexp below
         (end-spc-re (rx (>= 5 (not (in ".?!"))) (regexp (sentence-end)))))
    (save-excursion
      (goto-char beg)
      (while (and (< (point) end)
                  (re-search-forward end-spc-re end t))
        (unless (or (>= (point) end)
                    (looking-back "[[:space:]]\\{2\\}\\|\n" 3))
          (insert " "))))))

(advice-add 'fill-paragraph :before
            (lambda (&rest _args)
              (when (use-region-p)
                (canonically-double-space-region
                 (region-beginning)
                 (region-end))))
            '((name . fill-paragraph-double-space)))
#+end_src
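On the question above of why 'repunctuate-sentences' asks for confirmation even after two spaces: the regexp the command used at this point (visible in the removed lines of the second patch later in this thread) ends in " +", which matches a run of spaces of any length. The following is only a minimal sketch to illustrate that; 'my-demo-count-matches' is a hypothetical helper, not part of Emacs.

```emacs-lisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-demo-count-matches (text)
  "Return how many times the sentence-end regexp matches in TEXT."
  ;; Regexp as quoted in this thread: optional closing bracket or quote,
  ;; end punctuation, optional closing bracket or quote, then one or
  ;; more spaces.
  (let ((regexp "\\([]\"')]?\\)\\([.?!]\\)\\([]\"')]?\\) +")
        (count 0))
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      (while (re-search-forward regexp nil t)
        (setq count (1+ count))))
    count))

;; The first sentence already ends in two spaces, yet it still matches,
;; so a query replacement would stop there as well.
(my-demo-count-matches "One.  Two. Three.")
```

Both "One." (already followed by two spaces) and "Two." (followed by one space) are counted, which is consistent with the prompt appearing for sentences that need no change.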
* bug#52769: 29.0.50; [FEATURE REQUEST] repunctuate-sentences in region
From: Juri Linkov @ 2021-12-28 19:20 UTC
To: Rudolf Adamkovič; +Cc: 52769
close 52769 29.0.50
thanks
> Could we improve 'repunctuate-sentences' to work such that in Transient
> Mark mode and with mark active, it re-punctuates the contents of the
> region?
Now this is implemented in master.
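For readers curious how a command can support both modes, here is a minimal sketch that follows the pattern of the 'interactive' form shown in the patch context elsewhere in this thread: pass the region bounds when 'use-region-p' is non-nil, and nil otherwise. The command name 'my-demo-count-sentence-ends' and its simplified regexp are made up for illustration; this is not the code that was committed.

```emacs-lisp
;; Hypothetical command for illustration; not part of Emacs.
(defun my-demo-count-sentence-ends (&optional start end)
  "Count end punctuation followed by spaces between START and END.
Interactively, use the region when it is active.  Otherwise count
from point to the end of the accessible portion of the buffer."
  (interactive (list (if (use-region-p) (region-beginning))
                     (if (use-region-p) (region-end))))
  (let ((count 0))
    (save-excursion
      ;; With no START, searching begins at point.
      (when start (goto-char start))
      ;; END is the search bound; nil means no bound.
      (while (re-search-forward "[.?!] +" end t)
        (setq count (1+ count))))
    (when (called-interactively-p 'interactive)
      (message "%d sentence end(s)" count))
    count))
```

Called from Lisp with explicit positions, the same function works on any span, which is why the bounds are ordinary optional arguments rather than being read from the region inside the body.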
* bug#52769: 29.0.50; [FEATURE REQUEST] repunctuate-sentences in region
From: Juri Linkov @ 2021-12-28 19:28 UTC
To: Rudolf Adamkovič; +Cc: 52769
> When I tried 'repunctuate-sentences', I was stunned by its inefficiency:
> it requires a confirmation even when there are already two spaces
> at the end of the sentence! Why does it do this?
If no one has a better idea for a simpler implementation,
then this patch fixes the problem by skipping the sentences
that already have two spaces at the end:
[-- Attachment #2: repunctuate-sentences-filter.patch --]
[-- Type: text/x-diff, Size: 893 bytes --]
diff --git a/lisp/textmodes/paragraphs.el b/lisp/textmodes/paragraphs.el
index 98362b8579..0b09895339 100644
--- a/lisp/textmodes/paragraphs.el
+++ b/lisp/textmodes/paragraphs.el
@@ -494,7 +494,14 @@ repunctuate-sentences
     (if no-query
         (while (re-search-forward regexp nil t)
           (replace-match to-string))
-      (query-replace-regexp regexp to-string nil start end))))
+      (let ((regexp "\\([]\"')]?\\)\\([.?!]\\)\\([]\"')]?\\)\\( +\\)")
+            (space-filter (lambda (_start _end)
+                            (not (length= (match-string 4) 2)))))
+        (unwind-protect
+            (progn
+              (add-function :after-while isearch-filter-predicate space-filter)
+              (query-replace-regexp regexp to-string nil start end))
+          (remove-function isearch-filter-predicate space-filter))))))
(defun backward-sentence (&optional arg)
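The patch above composes its filter with the existing predicate through 'add-function' with ':after-while', and wraps the call in 'unwind-protect' so that the filter is removed even if the query is aborted. As a rough sketch of those semantics on a stand-alone variable ('my-demo-predicate', 'my-demo-even-p' and 'my-demo-check' are hypothetical names, and the numeric predicates only stand in for the real search filter):

```emacs-lisp
(require 'nadvice)

;; Hypothetical stand-in for a predicate variable such as the one
;; advised in the patch.
(defvar my-demo-predicate (lambda (n) (> n 0))
  "Function that accepts positive numbers.")

(defun my-demo-even-p (n)
  "Return non-nil if N is even."
  (= 0 (% n 2)))

(defun my-demo-check (numbers)
  "Return the combined predicate's verdict for each of NUMBERS.
The extra predicate is installed only for the duration of the call."
  (unwind-protect
      (progn
        ;; :after-while calls the added function only when the original
        ;; returned non-nil, so both must accept the value.
        (add-function :after-while my-demo-predicate #'my-demo-even-p)
        (mapcar (lambda (n) (and (funcall my-demo-predicate n) t))
                numbers))
    ;; Runs even on error or quit, restoring the original predicate.
    (remove-function my-demo-predicate #'my-demo-even-p)))

(my-demo-check '(4 3 -2))
```

Here 4 passes both checks, 3 passes only the original one, and -2 is rejected by the original before the added function is consulted. After the call returns, the variable holds the original predicate again.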
* bug#52769: 29.0.50; [FEATURE REQUEST] repunctuate-sentences in region
From: Juri Linkov @ 2021-12-28 20:18 UTC
To: Rudolf Adamkovič; +Cc: 52769
> If no one has a better idea for a simpler implementation,
> then this patch fixes the problem by skipping the sentences
> that already have two spaces at the end:
The filter will also allow users to redefine it with their own logic,
such as skipping known abbreviations that don't require
two spaces, for example "i.e." and "e.g.":
  (defun repunctuate-sentences-filter (_start _end)
    (not (or (length= (match-string 4) 2)
             (looking-back (rx (or "i.e." "e.g.") " ") 5))))
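To make the effect of such a filter concrete, the sketch below applies the same two tests non-interactively to a string: skip matches whose captured spaces already number exactly two, and skip matches that follow "i.e." or "e.g.". The helper 'my-demo-repunctuate' is hypothetical, it uses a plain substring comparison rather than 'looking-back', and it does not go through 'query-replace-regexp' or 'isearch-filter-predicate' at all.

```emacs-lisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-demo-repunctuate (text)
  "Return a copy of TEXT with two spaces after each sentence end.
Matches that already have exactly two spaces, or that follow the
abbreviations \"i.e.\" or \"e.g.\", are left alone."
  (let ((regexp "\\([]\"')]?\\)\\([.?!]\\)\\([]\"')]?\\)\\( +\\)"))
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      (while (re-search-forward regexp nil t)
        (let* ((punct-end (match-end 2))
               ;; Up to four characters ending at the punctuation mark.
               (tail (buffer-substring-no-properties
                      (max (point-min) (- punct-end 4)) punct-end)))
          (unless (or (= (length (match-string 4)) 2)
                      (member tail '("i.e." "e.g.")))
            ;; Keep groups 1-3 and normalize the spaces to exactly two.
            (replace-match "\\1\\2\\3  "))))
      (buffer-string))))

(my-demo-repunctuate "See e.g. this. Then stop. Done.  End.")
```

In this input only the spaces after "this." and "stop." are widened; the space after "e.g." and the existing double space after "Done." are left untouched.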
[-- Attachment #2: repunctuate-sentences-filter-2.patch --]
[-- Type: text/x-diff, Size: 1849 bytes --]
diff --git a/lisp/textmodes/paragraphs.el b/lisp/textmodes/paragraphs.el
index acb26fd1c1..580f3617d0 100644
--- a/lisp/textmodes/paragraphs.el
+++ b/lisp/textmodes/paragraphs.el
@@ -479,6 +479,9 @@ forward-sentence
       (setq arg (1- arg)))
     (constrain-to-field nil opoint t)))

+(defun repunctuate-sentences-filter (_start _end)
+  (not (length= (match-string 4) 2)))
+
 (defun repunctuate-sentences (&optional no-query start end)
   "Put two spaces at the end of sentences from point to the end of buffer.
 It works using `query-replace-regexp'.  In Transient Mark mode,
@@ -489,14 +492,21 @@ repunctuate-sentences
   (interactive (list nil
                      (if (use-region-p) (region-beginning))
                      (if (use-region-p) (region-end))))
-  (let ((regexp "\\([]\"')]?\\)\\([.?!]\\)\\([]\"')]?\\) +")
-        (to-string "\\1\\2\\3  "))
-    (if no-query
-        (progn
-          (when start (goto-char start))
-          (while (re-search-forward regexp end t)
-            (replace-match to-string)))
-      (query-replace-regexp regexp to-string nil start end))))
+  (if no-query
+      (let ((regexp "\\([]\"')]?\\)\\([.?!]\\)\\([]\"')]?\\) +")
+            (to-string "\\1\\2\\3  "))
+        (when start (goto-char start))
+        (while (re-search-forward regexp end t)
+          (replace-match to-string)))
+    (let ((regexp "\\([]\"')]?\\)\\([.?!]\\)\\([]\"')]?\\)\\( +\\)")
+          (to-string "\\1\\2\\3  "))
+      (unwind-protect
+          (progn
+            (add-function :after-while isearch-filter-predicate
+                          #'repunctuate-sentences-filter)
+            (query-replace-regexp regexp to-string nil start end))
+        (remove-function isearch-filter-predicate
+                         #'repunctuate-sentences-filter)))))
(defun backward-sentence (&optional arg)
* bug#52769: 29.0.50; [FEATURE REQUEST] repunctuate-sentences in region
From: Rudolf Adamkovič via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2021-12-28 21:31 UTC
To: Juri Linkov; +Cc: 52769
Juri Linkov <juri@linkov.net> writes:
> Now this is implemented in master.
I have just recompiled Emacs, and everything works as expected. This
patch will make my life easier. Thank you!
Rudy