From: Ihor Radchenko
Newsgroups: gmane.emacs.devel
Subject: Re: Lisp-level macro to avoid excessive GC in memory-allocating code (was: Larger GC thresholds for non-interactive Emacs)
Date: Fri, 01 Jul 2022 15:52:53 +0800
Message-ID: <87k08xjmlm.fsf@localhost>
In-Reply-To: <83wncxe4pr.fsf@gnu.org>
To: Eli Zaretskii
Cc: monnier@iro.umontreal.ca, mattiase@acm.org, theophilusx@gmail.com, rms@gnu.org, acm@muc.de, emacs-devel@gnu.org

Eli Zaretskii writes:

> Please don't forget that GC doesn't only collects unused Lisp objects,
> it also does other useful memory-management related tasks.  It
> compacts buffer text and strings, and it frees unused slots in various
> caches (font cache, image cache, etc.).  You can find in the archives
> discussions where innocently-looking code could cause Emacs run out of
> memory because it used too many fonts without flushing the font cache
> (any program that works on the list of fonts returned by the likes of
> x-list-fonts is in danger of bumping into that).

Then, if we decide to implement the macro I am suggesting, it should
not affect memory allocation for such sensitive objects (font cache,
image cache, etc.), only "safe" memory allocations.

>> As one idea, a lisp program may mark some of the variables to be skipped
>> by GC and to not contribute to GC threshold checks (that is, allocating
>> memory into the marked variables will not increase the memory counter
>> used by GC).
>>
>> WDYT?
>
> I'm not sure I understand how this idea can be implemented.  The
> counting of how much Lisp data since last GC was produced is done
> _before_ variables are bound to the produced data as values.  So by
> the time we know the data is bound to such "special" variables, it's
> already too late, and the only way to do what you suggest would be to
> increase consing_until_gc back after we realize this fact.  Which
> would mean computing how much consing was done for the value of these
> variables, and that would probably slow down the generation of Lisp
> data, wouldn't it?
>
> Or what am I missing?

We can do the following:

1. In addition to directly bumping the TOTAL counter of newly
   allocated memory, introduce a new LOCAL counter holding recent
   allocations within the current sexp.

2. Every time we return from a sexp/self-quoting object into an
   assignment, if we are inside the proposed macro and are assigning
   the value to one of the pre-defined symbols, increase the LOCAL
   counter of the parent sexp.  Otherwise, do not change the parent's
   LOCAL counter.

3. Perform GC according to the TOTAL-LOCAL threshold value.

4. When exiting the macro, set LOCAL to 0, unless inside another such
   macro.

Example (I intentionally avoid using dolist because macros using
temporary symbols complicate things):

    (defvar lst-value)
    (with-no-gc '(i return lst-value)
      (let (return (i 0))
        (while (< i 1000000)
          (setq return (cons i return))
          (make-string 1 ?a)
          (setq i (1+ i)))
        (setq lst-value return)))

Let TOTAL be the global counter and LOCAL be the local counter.

The above code will:

1. Eval "0" to bind the initial value of i.  LOCAL=size_of_int;
   TOTAL+=size_of_int.  GC code will not count this allocation when
   deciding whether to perform GC.

2. Return 0 into the (i 0) assignment and bump the parent LOCAL
   because i is declared by the macro:
   LOCAL(let)+=LOCAL == size_of_int

3. Eval "1000000" in (< i 1000000); LOCAL=size_of_int;
   TOTAL+=size_of_int.

4. Now, we are not inside an assignment, so LOCAL(<) == 0 (unchanged),
   TOTAL is increased, and TOTAL-LOCAL will be bumped by size_of_int.

5. Now, we are inside (while ...) and not inside an assignment;
   LOCAL(while)+=LOCAL(<) == size_of_int

6. Eval (cons i return); LOCAL(cons)=cons_size; TOTAL+=cons_size;

7. Return the new cons into the assignment to return.  Because return
   is declared by the macro, the outer value of LOCAL is bumped:
   LOCAL(while) == size_of_int+cons_size

8. Eval (make-string ...).  This will allocate a new string;
   LOCAL(make-string)=string_length; TOTAL+=string_length;

9. Return to the (while ...) sexp.  Because we are not inside an
   assignment, LOCAL(while) is unchanged and the string allocation
   will contribute to the GC threshold.

10. Eval (1+ i), which will allocate a new integer;
    LOCAL(1+)=size_of_int; TOTAL+=size_of_int;

11. The newly allocated integer is assigned to the symbol i, declared
    by the macro; thus
    LOCAL(while)+=LOCAL(1+) == size_of_int+cons_size+size_of_int;
    TOTAL-LOCAL is unchanged, so this allocation does not count
    towards the next GC.

[second iteration]

12. Eval "1000000" in (< i 1000000), which will not allocate any
    memory because integers are immutable (AFAIK).  TOTAL and LOCAL
    remain unchanged.

... (all other iterations)

    LOCAL(while) == size_of_int * 1000000 + cons_size * (1- 1000000)

13. Return from (while ...).  It is not an assignment, so
    LOCAL(let)+=LOCAL(while)

14. Assign lst-value, which will not allocate any extra memory.

15. Return from let: LOCAL(with-no-gc)+=LOCAL(let)

16. Return from with-no-gc.  LOCAL=0 and the whole allocated object
    will now be able to contribute to the GC threshold, unless the
    example snippet is itself wrapped in a parent with-no-gc call.

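To make the arithmetic of the walk-through easier to check, here is a
toy model of the bookkeeping in plain Emacs Lisp.  It is only a sketch
under stated assumptions: the my-gc-model-* functions and the sizes
are hypothetical, a single flat LOCAL counter stands in for the
per-sexp LOCAL(...) counters, and integers are treated as allocating
memory as in the walk-through (real Emacs fixnums are immediate values
and are not heap-allocated).  It does not touch the real allocator.

```elisp
;; Toy model only: all `my-gc-model-*' names and the sizes are hypothetical.
;; One flat LOCAL counter stands in for the per-sexp LOCAL(...) counters.

(defun my-gc-model-events (iterations)
  "Return the allocation events of the example loop, in order.
Each event is (KIND . EXEMPT).  EXEMPT is non-nil when the value is
assigned to a variable listed in the hypothetical `with-no-gc' form."
  (let ((events nil))
    (push '(int . t) events)        ; 0 bound to i
    (push '(int . nil) events)      ; literal 1000000, allocated once
    (dotimes (_ iterations)
      (push '(cons . t) events)     ; (setq return (cons i return))
      (push '(string . nil) events) ; (make-string 1 ?a), value dropped
      (push '(int . t) events))     ; (setq i (1+ i))
    (nreverse events)))

(defun my-gc-model-run (iterations sizes)
  "Replay the example loop for ITERATIONS iterations.
SIZES is an alist mapping each event kind to an abstract size.
Return (:total TOTAL :local LOCAL :counted TOTAL-LOCAL)."
  (let ((total 0)
        (local 0))
    (dolist (event (my-gc-model-events iterations))
      (let ((size (alist-get (car event) sizes)))
        (setq total (+ total size))          ; every allocation bumps TOTAL
        (when (cdr event)
          (setq local (+ local size)))))     ; exempt ones also bump LOCAL
    (list :total total :local local :counted (- total local))))

(my-gc-model-run 3 '((int . 8) (cons . 16) (string . 32)))
;; => (:total 184 :local 80 :counted 104)
```

With three iterations and these made-up sizes, :counted is 8 + 3*32:
the one integer literal plus the three strings, which is the only
part that could push toward a GC while inside the macro.
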
In any case, while inside with-no-gc, only the single 1000000 literal
plus all the string allocations can trigger GC.  Allocations assigned
to i, return, and lst-value will only be counted upon exiting the
macro.

Best,
Ihor