unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#32462: 26.1; Can `count-lines' be rewritten to use the newline cache?
From: Phil Sainty @ 2018-08-17  1:01 UTC
  To: 32462

I saw this story the other day:

https://fuco1.github.io/2018-08-12-WAR-STORY:-When-turning-to-the-profiler-turns-out-to-be-a-good-call.html

The summary is that some very slow code turned out to be spending the
vast bulk of its time inside `line-number-at-pos' (which it called
frequently, and which is implemented on top of `count-lines').  Once
the author discovered what that function actually entailed, they were
able to reduce their processing time from 42 seconds to 5 seconds
(processing a file of ~10,000 lines) by finding an alternative
approach which did not involve calling `count-lines'.
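
To make the cost concrete, here is a minimal sketch of the general
pattern.  It is not the code from the linked article; both function
names are hypothetical.  The first version asks for the line number at
every match, which recounts from the start of the buffer each time.  The
second only counts the lines between the previous match and the current
one.  Both assume a regexp that never matches the empty string.

```elisp
;; Hypothetical helpers for illustration; neither is from the linked article.

(defun my-match-lines-slow (regexp)
  "Return the line numbers of all matches of REGEXP in the current buffer.
Calls `line-number-at-pos' per match, which recounts from `point-min'."
  (save-excursion
    (goto-char (point-min))
    (let (lines)
      (while (re-search-forward regexp nil t)
        (push (line-number-at-pos) lines))
      (nreverse lines))))

(defun my-match-lines-fast (regexp)
  "Like `my-match-lines-slow', but count only the lines since the last match."
  (save-excursion
    (goto-char (point-min))
    (let ((line 1)
          (last (point-min))
          lines)
      (while (re-search-forward regexp nil t)
        ;; Add the number of lines between the previous match's line
        ;; start and the current match's line start.
        (setq line (+ line (count-lines last (line-beginning-position)))
              last (line-beginning-position))
        (push line lines))
      (nreverse lines))))
```

With many matches in a large buffer, the first version does work roughly
proportional to the buffer size per match, while the second scans each
line only once overall.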

`count-lines' uses a regexp search to find all the newlines (and/or
carriage returns -- I don't know if that's a problem) and I recall
that internally Emacs uses a newline cache to make certain
line-oriented functionality performant.  I know nothing about the
cache other than that it exists, but I wondered whether `count-lines'
might be able to use it to avoid most of the work that it currently
does?


-Phil






In GNU Emacs 26.1 (build 1, x86_64-pc-linux-gnu, X toolkit, Xaw scroll bars)
  of 2018-04-10 built on shodan
Windowing system distributor 'The X.Org Foundation', version 11.0.11501000
System Description:	Ubuntu 14.04.5 LTS

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.

Configured using:
  'configure --prefix=/home/phil/emacs/26/26.1rc1/usr/local
  --with-x-toolkit=lucid --without-sound'

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK GPM DBUS GSETTINGS NOTIFY
LIBSELINUX GNUTLS LIBXML2 FREETYPE XFT ZLIB TOOLKIT_SCROLL_BARS LUCID
X11 THREADS LCMS2

Important settings:
   value of $LANG: en_NZ.UTF-8
   value of $XMODIFIERS: @im=ibus
   locale-coding-system: utf-8-unix

Major mode: Dired by name

Minor modes in effect:
   tooltip-mode: t
   global-eldoc-mode: t
   electric-indent-mode: t
   mouse-wheel-mode: t
   tool-bar-mode: t
   menu-bar-mode: t
   file-name-shadow-mode: t
   global-font-lock-mode: t
   font-lock-mode: t
   blink-cursor-mode: t
   auto-composition-mode: t
   auto-encryption-mode: t
   auto-compression-mode: t
   buffer-read-only: t
   line-number-mode: t
   transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message rmc puny seq byte-opt gv
bytecomp byte-compile cconv cl-loaddefs cl-lib format-spec rfc822 mml
easymenu mml-sec password-cache epa derived epg epg-config gnus-util
rmail rmail-loaddefs mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums
mm-util mail-prsvr mail-utils dired dired-loaddefs advice elec-pair
time-date mule-util tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel term/x-win x-win term/common-win x-dnd tool-bar
dnd fontset image regexp-opt fringe tabulated-list replace newcomment
text-mode elisp-mode lisp-mode prog-mode register page menu-bar
rfn-eshadow isearch timer select scroll-bar mouse jit-lock font-lock
syntax facemenu font-core term/tty-colors frame cl-generic cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european
ethiopic indian cyrillic chinese composite charscript charprop
case-table epa-hook jka-cmpr-hook help simple abbrev obarray minibuffer
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote dbusbind inotify lcms2
dynamic-setting system-font-setting font-render-setting x-toolkit x
multi-tty make-network-process emacs)

Memory information:
((conses 16 99206 10028)
  (symbols 48 20474 0)
  (miscs 40 101 169)
  (strings 32 29605 992)
  (string-bytes 1 777027)
  (vectors 16 14293)
  (vector-slots 8 495220 11440)
  (floats 8 55 100)
  (intervals 56 315 0)
  (buffers 992 14)
  (heap 1024 30317 1420))







* bug#32462: 26.1; Can `count-lines' be rewritten to use the newline cache?
From: Eli Zaretskii @ 2018-08-17 14:40 UTC
  To: Phil Sainty; +Cc: 32462

> Date: Fri, 17 Aug 2018 13:01:38 +1200
> From: Phil Sainty <psainty@orcon.net.nz>
> 
> `count-lines' uses a regexp search to find all the newlines (and/or
> carriage returns

It uses regexp search only when selective-display is in effect, which
means almost never.  Otherwise, it uses forward-line, which uses
scan_newline_from_point, which already uses the newline cache (unless
the cache is disabled).
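
The common path described above can be sketched roughly as follows.
This is a simplified paraphrase for illustration, not the verbatim
simple.el source: the name `my-count-lines' is hypothetical, and the
selective-display branch is left out entirely.

```elisp
;; Simplified paraphrase for illustration, not the verbatim simple.el source.
;; `my-count-lines' is a hypothetical name.
(defun my-count-lines (start end)
  "Return the number of lines between START and END.
Covers only the common case where `selective-display' is not t."
  (save-excursion
    (save-restriction
      (narrow-to-region start end)
      (goto-char (point-min))
      ;; (buffer-size) serves only as an upper bound on the number of
      ;; lines.  `forward-line' returns how many of the requested lines
      ;; it could not move over, so the difference is the line count.
      (- (buffer-size) (forward-line (buffer-size))))))
```

No regexp search is involved on this path; the scanning happens in C
inside `forward-line'.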

(And based on my experience, the newline cache doesn't actually speed
up things all that much, except in buffers with unusually long lines.)
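
One way to check this on a given machine is to time `count-lines' with
the cache enabled and disabled.  The sketch below assumes the cache is
controlled by the buffer-local variable `cache-long-scans'; the helper
name, the line contents, and the default sizes are made up for
illustration, and the timings will vary between systems.

```elisp
(require 'benchmark)

;; Hypothetical helper for illustration.
(defun my-time-count-lines (use-cache &optional lines repetitions)
  "Return the seconds taken to run `count-lines' REPETITIONS times.
The scan covers a temporary buffer of LINES short lines, with
`cache-long-scans' set to USE-CACHE in that buffer."
  (with-temp-buffer
    (dotimes (_ (or lines 10000))
      (insert "some short line of text\n"))
    ;; `cache-long-scans' is per-buffer, so this affects only the
    ;; temporary buffer.
    (setq cache-long-scans use-cache)
    ;; `benchmark-run' needs a literal number or a symbol as its
    ;; repetition count, hence the explicit binding.
    (let ((n (or repetitions 100)))
      (car (benchmark-run n
             (count-lines (point-min) (point-max)))))))
```

Comparing `(my-time-count-lines t)' with `(my-time-count-lines nil)'
gives a rough idea of how much the cache contributes for short lines.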






* bug#32462: 26.1; Can `count-lines' be rewritten to use the newline cache?
From: Phil Sainty @ 2018-08-18 12:10 UTC
  To: Eli Zaretskii; +Cc: 32462-done

On 18/08/18 02:40, Eli Zaretskii wrote:
>> From: Phil Sainty <psainty@orcon.net.nz>
>> `count-lines' uses a regexp search to find all the newlines (and/or
>> carriage returns
> 
> It uses regexp search only when selective-display is in effect, which
> means almost never.  Otherwise, it uses forward-line, which uses
> scan_newline_from_point, which already uses the newline cache (unless
> the cache is disabled).

Ah, thanks Eli; I see that now.  A look at the C code suggests that the
newline cache is also a more complicated arrangement than I'd imagined,
so I don't think my original thoughts about this were actually viable.

I'm closing this bug.






