From: "Kévin Le Gouguec" <kevin.legouguec@gmail.com>
To: Andrea Corallo <akrl@sdf.org>
Cc: edouard.debry@gmail.com, 45705@debbugs.gnu.org
Subject: bug#45705: [feature/native-comp] Excessive memory consumption on windows 10
Date: Sat, 09 Jan 2021 18:26:46 +0100
Message-ID: <87im86c97t.fsf@gmail.com>
In-Reply-To: <xjfy2h29tgd.fsf@sdf.org> (Andrea Corallo's message of "Sat, 09 Jan 2021 12:37:54 +0000")
[-- Attachment #1: Type: text/plain, Size: 2626 bytes --]
Andrea Corallo <akrl@sdf.org> writes:
> Eli Zaretskii <eliz@gnu.org> writes:
>
>>> From: Andrea Corallo <akrl@sdf.org>
>>>
>>> In June we changed the way we store immediate objects in the shared and
>>> this makes the compilation way lighter on the GCC side (both in time and
>>> memory). I've no precise data on this other than the experimental
>>> observation that compiling all Elisp files in Emacs on 32bit systems is
>>> not anymore an issue. This IIUC implies that the memory footprint for
>>> each compilation is always < 2GB.
>>
>> You assume that the compilations are all done serially? AFAIK, most
>> people build Emacs with "make -jN", so parallel compilation is an
>> important use case.
>
>> I guess we will have to collect the information about that, if you say
>> we don't have it now.
>
> I'm adding in CC Kevin, IIRC for bug#41077 he used a nice setup to
> produce quite accurate results on memory footprint during the
> compilation process. Perhaps he has time and he's so kind to gather
> some data on the current state, that would be extremely helpful.

See also bug#41194#20 and bug#41194#28 where I outlined how the
improvements reduced compilation time and memory usage.

I've dusted off my 32-bit laptop; unfortunately the fan sounds like it's
in need of… something (probably exorcism, given the noise).

Until I figure that out, here are the (very hacky) scripts I used to
measure and plot the RAM usage, in case someone else wants to take some
measurements:

- ./monitor.sh $PID finds the most RAM-consuming process among $PID and
  its children, and logs its memory usage (VSZ and RSS) and its
  command line.

  (Logs are collected every 10 seconds; this interval probably needs to
  be reduced for faster machines.)

- ./plot.py uses matplotlib to make graphs out of these measurements; it
  attempts to replace the command line with the less-verbose diagnostics
  from "make".

- My workflow was to start an Emacs session, run M-x compile RET make,
  then ./monitor.sh $PID_OF_EMACS_SESSION.

  (PARENT_RE in plot.py should match the command line of this parent
  session; its RAM consumption is then labeled as "noise floor" on the
  graph.

  This serves no real purpose and should be removed; monitor.sh should
  be amended to filter the parent session out of monitored PIDs, with
  some error control to handle the lack of child processes when
  compilation is finished.)

- There are some hardcoded things to tweak at the bottom of plot.py,
  e.g. how long a child process should last for it to have a label on
  the graph.
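
The monitor.sh attachment is not reproduced inline, so what follows is
only a rough Python sketch of the behaviour described above, not the
actual script. All function names are hypothetical. The input columns
(pid, ppid, elapsed seconds, VSZ, RSS, command line) and the record
layout are assumptions inferred from MONITOR_RE in plot.py. The
skip_root flag illustrates the suggested amendment of filtering the
parent session out, returning None when no child process is left.

```python
from datetime import datetime


def parse_ps(output):
    """Map PID -> process info from lines of 'pid ppid etimes vsz rss args'.

    Assumes the listing has no header line.
    """
    procs = {}
    for line in output.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        pid, ppid, etimes, vsz, rss = (int(f) for f in fields[:5])
        procs[pid] = {'ppid': ppid, 'etimes': etimes, 'vsz': vsz,
                      'rss': rss, 'args': fields[5]}
    return procs


def family(root, procs):
    """Return the PIDs of root and all of its descendants found in procs."""
    children = {}
    for pid, info in procs.items():
        children.setdefault(info['ppid'], []).append(pid)
    seen, stack = set(), [root]
    while stack:
        pid = stack.pop()
        if pid in seen or pid not in procs:
            continue
        seen.add(pid)
        stack.extend(children.get(pid, []))
    return seen


def biggest(root, procs, skip_root=False):
    """Return the process with the largest RSS in root's family, or None."""
    pids = family(root, procs)
    if skip_root:
        pids.discard(root)
    if not pids:
        return None  # e.g. compilation finished, no child process left
    return procs[max(pids, key=lambda pid: procs[pid]['rss'])]


def format_record(now, proc, free_output):
    """Render one snapshot in the layout that MONITOR_RE appears to expect."""
    stamp = now.strftime('%Y-%m-%d-%H:%M:%S')
    ps_line = f"{proc['etimes']} {proc['vsz']} {proc['rss']} {proc['args']}"
    return f"{stamp}\n{ps_line}\n{free_output.rstrip()}\n"
```

With procps on GNU/Linux, something like
"ps -e -o pid= -o ppid= -o etimes= -o vsz= -o rss= -o args=" should
produce suitable input for parse_ps, and the output of "free" can be
passed as free_output; both commands are guesses at what monitor.sh
runs, not taken from it.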
[-- Attachment #2: monitor.sh --]
[-- Type: application/x-shellscript, Size: 350 bytes --]
[-- Attachment #3: plot.py --]
[-- Type: text/x-python, Size: 5200 bytes --]
#!/usr/bin/env python3
from datetime import datetime, timedelta
from pathlib import Path
import re
import matplotlib
from matplotlib import pyplot
from matplotlib.dates import DateFormatter, HourLocator, MinuteLocator
from matplotlib.ticker import EngFormatter

# One record of monitor.log: a timestamp line, a ps line, then the three
# lines printed by free (header, Mem:, Swap:).
MONITOR_RE = re.compile('\n'.join((
    '(?P<time>.+)',
    r' *(?P<seconds>\d+) +(?P<vsz>\d+) +(?P<rss>\d+) +(?P<args>.+)',
    ' *(?P<memheader>.+)',
    'Mem: *(?P<memvalues>.+)',
    'Swap: *(?P<swapvalues>.+)',
    ''
)), flags=re.MULTILINE)


def list_snapshots(monitor_log):
    """Parse the log text into a list of snapshot dicts (sizes in bytes)."""
    snapshots = []

    for match in MONITOR_RE.finditer(monitor_log):
        md = match.groupdict()

        memkeys = md['memheader'].split()
        memvalues = md['memvalues'].split()
        swapvalues = md['swapvalues'].split()

        snapshot = {
            'time': datetime.strptime(md['time'], '%Y-%m-%d-%H:%M:%S'),
            'uptime': int(md['seconds']),
            'vsz': int(md['vsz'])*1024,
            'rss': int(md['rss'])*1024,
            'process': md['args'],
            'mem': {memkeys[i]: int(val)*1024 for i, val in enumerate(memvalues)},
            # The Swap row has fewer columns than the Mem row; its values
            # line up with the first header names.
            'swap': {memkeys[i]: int(val)*1024 for i, val in enumerate(swapvalues)}
        }

        snapshots.append(snapshot)

    return snapshots


LOADDEFS_RE = re.compile(
    r'--eval \(setq generated-autoload-file'
    r' \(expand-file-name \(unmsys--file-name "([^"]+)"\)\)\)'
    r' -f batch-update-autoloads'
)

SEMANTIC_RE = re.compile(
    r'-l semantic/(?:wisent|bovine)/grammar -f (?:wisent|bovine)-batch-make-parser'
    r' -o (.+) .+\.[wb]y'
)

ELCELN_RE = re.compile(
    r'\.\./src/(?:bootstrap-)?emacs -batch --no-site-file --no-site-lisp'
    r' --eval \(setq load-prefer-newer t\) -l comp'
    r'(?: -f byte-compile-refresh-preloaded)?'
    r' -f batch-byte-native-compile-for-bootstrap'
    r' (.+\.el)'
)

SHORTENED_NAMES = {
    LOADDEFS_RE: 'GEN',
    SEMANTIC_RE: 'GEN',
    ELCELN_RE: 'ELC+ELN'
}

QUAIL_TIT_RE = re.compile(
    r'-l titdic-cnv -f batch-titdic-convert'
    r' -dir \./\.\./lisp/leim/quail CXTERM-DIC/(.+)\.tit'
)

QUAIL_MISC_RE = re.compile(
    r'-l titdic-cnv -f batch-miscdic-convert'
    r' -dir \./\.\./lisp/leim/quail MISC-DIC/(.+\.(html|map|cin|cns|b5))'
)

QUAIL_JA_RE = re.compile(
    r'-l ja-dic-cnv -f batch-skkdic-convert'
)

PARENT_RE = re.compile(
    r'$^'                       # Adjust to match parent process.
)

TRANSFORMED_NAMES = {
    QUAIL_TIT_RE: lambda m: f'GEN ../lisp/leim/quail/{m.group(1)}.el',
    QUAIL_MISC_RE: lambda m: f'GEN from {m.group(1)}',
    QUAIL_JA_RE: lambda m: 'GEN ../lisp/leim/ja-dic/ja-dic.el',
    PARENT_RE: lambda _: '(noise floor)'
}


def shorten(process):
    """Replace a verbose command line with a short make-style label."""
    for r, name in SHORTENED_NAMES.items():
        match = r.search(process)
        if match is not None:
            return f'{name} {match.group(1)}'

    for r, transform in TRANSFORMED_NAMES.items():
        match = r.search(process)
        if match is not None:
            return transform(match)

    if len(process) > 40:
        return f'{process[:20]}…{process[-20:]}'

    return process


def list_processes(snapshots):
    """Group consecutive snapshots of one process into (name, start, duration)."""
    t0 = snapshots[0]['time']

    current_process = snapshots[0]['process']
    current_process_start = t0

    processes = []

    for s in snapshots[1:]:
        if s['process'] == current_process:
            continue

        s_start = s['time']

        processes.append((
            current_process, current_process_start, s_start-current_process_start
        ))

        current_process = s['process']
        current_process_start = s_start

    # Close the process that was still running at the last snapshot.
    processes.append((
        current_process,
        current_process_start,
        snapshots[-1]['time']-current_process_start
    ))

    return processes


snapshots = list_snapshots(Path('monitor.log').read_text())

xs = tuple(s['time'] for s in snapshots)
vsz = tuple(s['vsz'] for s in snapshots)
rss = tuple(s['rss'] for s in snapshots)
memavail = tuple(s['mem']['available'] for s in snapshots)
swapused = tuple(s['swap']['used'] for s in snapshots)

matplotlib.use('TkAgg')

fig, axes = pyplot.subplots(figsize=(128, 9.6))

axes.plot(xs, vsz, label='VSZ (process)')
axes.plot(xs, rss, label='RSS (process)')
axes.plot(xs, memavail, label='available memory (system)', linewidth=0.5)
axes.plot(xs, swapused, label='used swap (system)')

axes.set_xlim(snapshots[0]['time'], snapshots[-1]['time'])
axes.xaxis.set_major_formatter(DateFormatter('%H:%M'))
axes.xaxis.set_major_locator(HourLocator())
axes.xaxis.set_minor_locator(MinuteLocator(tuple(5*i for i in range(1, 12))))
axes.xaxis.set_label_text('Hours')

axes.set_ylim(0)
axes.yaxis.set_major_formatter(EngFormatter(unit='B'))

axes.legend()

for p, start, duration in list_processes(snapshots):
    if duration < timedelta(minutes=2):
        continue

    pyplot.text(start, 1e9, shorten(p), rotation=45)
    pyplot.plot(
        (start, start+duration), (1e9, 1e9),
        marker='|', linewidth=0.5, linestyle='--',
        color='black', alpha=0.8
    )

pyplot.savefig('monitor.pdf')
pyplot.show()
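
For anyone writing their own collector, the layout plot.py expects in
monitor.log can be read off MONITOR_RE. The snippet below is a small
sketch that repeats that pattern under the hypothetical name RECORD_RE
and matches it against a made-up record; the numbers and the command
line are invented for illustration and are not real measurements. Sizes
are presumably in KiB, since plot.py multiplies them by 1024.

```python
import re

# Same pattern as MONITOR_RE in plot.py, repeated so the example runs alone.
RECORD_RE = re.compile('\n'.join((
    '(?P<time>.+)',
    r' *(?P<seconds>\d+) +(?P<vsz>\d+) +(?P<rss>\d+) +(?P<args>.+)',
    ' *(?P<memheader>.+)',
    'Mem: *(?P<memvalues>.+)',
    'Swap: *(?P<swapvalues>.+)',
    ''
)), flags=re.MULTILINE)

# Made-up record: timestamp, one ps-style line, then free-style output.
SAMPLE = (
    '2021-01-09-18:26:46\n'
    '   42 812345 654321 ../src/emacs -batch -l comp foo.el\n'
    '         total    used    free  shared  buff/cache  available\n'
    'Mem:   2048000 1024000  512000   10000      512000     900000\n'
    'Swap:  1024000   20000 1004000\n'
)

fields = RECORD_RE.search(SAMPLE).groupdict()
header = fields['memheader'].split()

# The Swap row has only three values, so zip() pairs them with the
# first three header names (total, used, free).
mem = dict(zip(header, map(int, fields['memvalues'].split())))
swap = dict(zip(header, map(int, fields['swapvalues'].split())))

print(fields['time'], fields['rss'], mem['available'], swap['used'])
```

Each record has to end with a newline after the Swap line, because the
joined pattern ends with one.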