From: Yuan Fu
Newsgroups: gmane.emacs.devel
Subject: Re: How to add pseudo vector types
Date: Mon, 19 Jul 2021 11:16:01 -0400
To: Stefan Monnier
Cc: Clément Pit-Claudel, emacs-devel
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.60.0.2.21\))
Content-Type: multipart/mixed; boundary="Apple-Mail=_E78FCE23-77F6-4F70-943B-71EA5B29FFBD"

--Apple-Mail=_E78FCE23-77F6-4F70-943B-71EA5B29FFBD
Content-Type: text/plain; charset=utf-8

> On Jul 17, 2021, at 1:30 PM, Stefan Monnier wrote:
>
> In your benchmark, you give numbers for:
> - initial full-text parse (a bit above 1MB/s)
> - cost of update-without-reparse
>
> but I think it would be nice to see the cost of the reparse after
> those updates (should be much faster than the initial parse).

I have done some more benchmarking. Initially I thought tree-sitter doesn't scale, because re-parsing my JSON file was unexpectedly slow, but then I retried with xdisp.c and tree-sitter's C parser, and that is really fast and matches my expectation of tree-sitter. So from now on I'll use xdisp.c and the C parser for benchmarking. I guess the JSON parser is simply badly written?

I benchmarked with a simple C program.
The programs are in main-c.c and main-json.c, and the shell output of the measurements is in benchmark.3.txt.

JSON: Initial parse takes 1.2s, re-parse (with no change) takes 0.7s, uses 307MB memory
C: Initial parse takes 0.14s, re-parse (with no change) takes 0.009s, uses 20MB memory

Yuan

--Apple-Mail=_E78FCE23-77F6-4F70-943B-71EA5B29FFBD
Content-Disposition: attachment; filename=benchmark.3.txt
Content-Type: text/plain; x-unix-mode=0644; name="benchmark.3.txt"

On benchmark.2.json (1.6M)

One full parse: 1.2s

________________________________________________________
Executed in    1.30 secs      fish           external
   usr time 1210.81 millis  142.00 micros 1210.67 millis
   sys time   87.40 millis  756.00 micros   86.65 millis

One full parse and a re-parse:

________________________________________________________
Executed in    2.40 secs      fish           external
   usr time    1.95 secs    154.00 micros    1.95 secs
   sys time    0.15 secs    763.00 micros    0.15 secs

Re-parse takes 1.95 - 1.21 = 0.74s

Memory usage of full-parse + re-parse:

        2.17 real         2.00 user         0.16 sys
   307269632  maximum resident set size
           0  average shared memory size
           0  average unshared data size
           0  average unshared stack size
       75035  page reclaims
           0  page faults
           0  swaps
           0  block input operations
           0  block output operations
           0  messages sent
           0  messages received
           0  signals received
           0  voluntary context switches
         463  involuntary context switches
 14674957821  instructions retired
  7838514409  cycles elapsed
   306745344  peak memory footprint

307MB for two trees that "shares internal structure".

On xdisp.c (1.2M)

One full parse: 0.139s

________________________________________________________
Executed in  478.23 millis    fish           external
   usr time  139.69 millis  134.00 micros  139.55 millis
   sys time    8.05 millis  829.00 micros    7.22 millis

Full parse and re-parse:

________________________________________________________
Executed in  456.58 millis    fish           external
   usr time  148.23 millis  153.00 micros  148.08 millis
   sys time    9.08 millis  791.00 micros    8.29 millis

Re-parse takes 0.148 - 0.139 = 0.009s

Memory usage of full-parse + re-parse:

        0.16 real         0.15 user         0.00 sys
    20131840  maximum resident set size
           0  average shared memory size
           0  average unshared data size
           0  average unshared stack size
        4932  page reclaims
           0  page faults
           0  swaps
           0  block input operations
           0  block output operations
           0  messages sent
           0  messages received
           0  signals received
           0  voluntary context switches
          28  involuntary context switches
  1070525817  instructions retired
   581557699  cycles elapsed
    19271680  peak memory footprint

20MB

--Apple-Mail=_E78FCE23-77F6-4F70-943B-71EA5B29FFBD
Content-Disposition: attachment; filename=main-c.c
Content-Type: application/octet-stream; x-unix-mode=0644; name="main-c.c"
Content-Transfer-Encoding: 7bit

#include <stdio.h>
#include <stdlib.h>
#include <tree_sitter/api.h>

TSLanguage *tree_sitter_c();

struct buffer {
  char *buf;
  long len;
};

/* TSInput read callback: hand the parser everything from BYTE_INDEX to
   the end of the buffer, or an empty string once past the end.  */
const char *
read_file(void *payload, uint32_t byte_index,
          TSPoint position, uint32_t *bytes_read)
{
  long len = ((struct buffer *) payload)->len;
  if (byte_index >= len)
    {
      *bytes_read = 0;
      return (char *) "";
    }
  else
    {
      *bytes_read = len - byte_index;
      return (char *) (((struct buffer *) payload)->buf) + byte_index;
    }
}

int main() {
  TSParser *parser = ts_parser_new();
  ts_parser_set_language(parser, tree_sitter_c());

  /* Copy the file into BUFFER.  */
  FILE *file = fopen("xdisp.c", "rb");
  fseek(file, 0, SEEK_END);
  long length = ftell (file);
  fseek(file, 0, SEEK_SET);
  char *buffer = malloc (length);
  fread(buffer, 1, length, file);
  fclose (file);
  struct buffer buf = {buffer, length};

  TSInput input = {&buf, read_file, TSInputEncodingUTF8};
  /* Initial full parse, then a re-parse that reuses the old tree.  */
  TSTree *tree = ts_parser_parse(parser, NULL, input);
  TSTree *new_tree = ts_parser_parse(parser, tree, input);

  free(buffer);
  ts_tree_delete(tree);
  ts_tree_delete(new_tree);
  ts_parser_delete(parser);
  return 0;
}

--Apple-Mail=_E78FCE23-77F6-4F70-943B-71EA5B29FFBD
Content-Disposition: attachment; filename=main-json.c
Content-Type: application/octet-stream; x-unix-mode=0644; name="main-json.c"
Content-Transfer-Encoding: 7bit

#include <stdio.h>
#include <stdlib.h>
#include <tree_sitter/api.h>

TSLanguage *tree_sitter_json();

struct buffer {
  char *buf;
  long len;
};

/* TSInput read callback: hand the parser everything from BYTE_INDEX to
   the end of the buffer, or an empty string once past the end.  */
const char *
read_file(void *payload, uint32_t byte_index,
          TSPoint position, uint32_t *bytes_read)
{
  long len = ((struct buffer *) payload)->len;
  if (byte_index >= len)
    {
      *bytes_read = 0;
      return (char *) "";
    }
  else
    {
      *bytes_read = len - byte_index;
      return (char *) (((struct buffer *) payload)->buf) + byte_index;
    }
}

int main() {
  TSParser *parser = ts_parser_new();
  ts_parser_set_language(parser, tree_sitter_json());

  /* Copy the file into BUFFER.  */
  FILE *file = fopen("benchmark.3.json", "rb");
  fseek(file, 0, SEEK_END);
  long length = ftell (file);
  fseek(file, 0, SEEK_SET);
  char *buffer = malloc (length);
  fread(buffer, 1, length, file);
  fclose (file);
  struct buffer buf = {buffer, length};

  TSInput input = {&buf, read_file, TSInputEncodingUTF8};
  /* Initial full parse, then a re-parse that reuses the old tree.  */
  TSTree *tree = ts_parser_parse(parser, NULL, input);
  TSTree *new_tree = ts_parser_parse(parser, tree, input);

  free(buffer);
  ts_tree_delete(tree);
  ts_tree_delete(new_tree);
  ts_parser_delete(parser);
  return 0;
}

--Apple-Mail=_E78FCE23-77F6-4F70-943B-71EA5B29FFBD--
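
The re-parse figures in benchmark.3.txt are obtained by subtracting the user time of two separate process runs. As a rough alternative, the sketch below measures the CPU time of a single call from inside the process, using the standard clock() function. The names cpu_seconds, timed_fn and busy_sum are hypothetical and not part of the attached programs; busy_sum is only a stand-in workload so that the sketch builds without tree-sitter.

```c
#include <time.h>

/* Hypothetical helper type: any piece of work that takes one argument. */
typedef void (*timed_fn)(void *arg);

/* Run fn(arg) once and return the CPU time it consumed, in seconds.
   clock() reports processor time, which is closest to the "usr time"
   column in the measurements, not wall-clock time. */
static double cpu_seconds(timed_fn fn, void *arg)
{
  clock_t start = clock();
  fn(arg);
  clock_t end = clock();
  return (double) (end - start) / CLOCKS_PER_SEC;
}

/* Stand-in workload (an assumption for illustration only): sums the
   integers below *(long *) arg and stores the sum back into it. */
static void busy_sum(void *arg)
{
  long n = *(long *) arg;
  volatile long sum = 0;
  for (long i = 0; i < n; i++)
    sum += i;
  *(long *) arg = sum;
}
```

In main-c.c one could presumably wrap each ts_parser_parse call in a function of this shape and pass it to cpu_seconds, so that the initial parse and the re-parse are timed separately and process start-up and file reading stay out of both numbers.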