From: Helmut Eller
Newsgroups: gmane.emacs.devel
Subject: Re: Markers in a gap array
Date: Fri, 26 Jul 2024 21:48:46 +0200
Message-ID: <87jzh8djdt.fsf@gmail.com>
References: <87ikxlqwu6.fsf@localhost> <87le2hp6ug.fsf@localhost> <87v81455iw.fsf@gmail.com>
To: Stefan Monnier
Cc: Pip Cet, Ihor Radchenko, emacs-devel@gnu.org

On Thu, Jul 18 2024, Stefan Monnier wrote:

>> The current scratch/igc branch, configured with MPS and -O2
>> -fno-omit-frame-pointer:
>>
>> | test                   || tot avg (s) | tot avg err (s) |
>> |------------------------++-------------+-----------------|
>> | bytechar               ||       12.11 |            0.18 |
>> | bytechar-100k          ||       12.38 |            0.17 |
>> | bytechar-100k-nolookup ||        9.14 |            0.22 |
>> | bytechar-100k-random   ||      271.52 |           14.27 |
>> | bytechar-100k-rev      ||       12.38 |            0.24 |
>> | bytechar-10k-random    ||       38.08 |            1.43 |
>> | bytechar-1k-random     ||       14.95 |            0.48 |
>> | bytechar-nolookup      ||        8.97 |            0.12 |
>> |------------------------++-------------+-----------------|
>> | total                  ||      379.53 |           14.36 |
>>
>> and without MPS:
>>
>> | test                   || tot avg (s) | tot avg err (s) |
>> |------------------------++-------------+-----------------|
>> | bytechar               ||       11.42 |            0.03 |
>> | bytechar-100k          ||       11.48 |            0.02 |
>> | bytechar-100k-nolookup ||        9.15 |            0.00 |
>> | bytechar-100k-random   ||       16.39 |            0.02 |
>> | bytechar-100k-rev      ||       11.48 |            0.02 |
>> | bytechar-10k-random    ||       11.97 |            0.02 |
>> | bytechar-1k-random     ||       11.56 |            0.01 |
>> | bytechar-nolookup      ||        9.13 |            0.04 |
>> |------------------------++-------------+-----------------|
>> | total                  ||       92.58 |            0.06 |
>>
>> So the weak vector doesn't compare very well to the linked list.
>
> Hmm... I wonder why there is such a large difference for markers created
> in a random-order compared to the cases where they're created beg-to-end
> and end-to-beg.  My crystal ball is of no help but suggests that it
> might hint at the fact that it's probably a silly effect that could be
> fixed easily once diagnosed.

One problem with the benchmarks is that they all use the same buffer, so
the markers from the previous benchmark can still linger around.  The
benchmark driver calls garbage-collect before running a benchmark, and
for the old GC that may be enough to collect all the old markers; with
MPS, the old markers are definitely still there.

If I create a fresh buffer for each benchmark, the times of the MPS and
non-MPS versions are much closer.

>> Maybe because the vector only grows and never shrinks.
>
> But why would that only show up when the order is random?

To figure out what is going on, I ran bytechar-100k followed by
bytechar-10k-random; in GDB I interrupted the benchmark and printed the
marker array.  After index 100000, it contains suspicious duplicates:

  ...
  (99997 9899604)
  (99998 9899703)
  (99999 9899802)
  (100000 9899901)
  (100001 7272795)
  (100002 7272795)
  (100003 8017474)
  (100004 8017474)
  (100005 7087003)
  (100006 7087003)
  (100007 4076094)
  (100008 4076094)
  ...

The first element is the array index and the second is the charpos of
the marker.
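
Duplicates like these can also be picked out of a dump mechanically.
Below is a minimal Python sketch, assuming the (index charpos) pairs
have been pasted into a string.  The helper name
find_adjacent_duplicates is made up for illustration; it is not part of
Emacs or GDB, and the sample data is just an excerpt of the dump above.

```python
import re

def find_adjacent_duplicates(dump):
    """Return (index, next_index, charpos) for neighbouring entries
    that have the same charpos."""
    # Each entry looks like "(INDEX CHARPOS)".
    pairs = [(int(i), int(pos))
             for i, pos in re.findall(r"\((\d+)\s+(\d+)\)", dump)]
    # Compare every entry with its successor.
    return [(i, j, p)
            for (i, p), (j, q) in zip(pairs, pairs[1:])
            if p == q]

# Excerpt of the marker array printed in GDB.
dump = """
(99999 9899802) (100000 9899901)
(100001 7272795) (100002 7272795)
(100003 8017474) (100004 8017474)
"""

for i, j, pos in find_adjacent_duplicates(dump):
    print(f"entries {i} and {j} share charpos {pos}")
```

This only looks at neighbouring entries, which is enough for the pattern
shown here; it would miss duplicates that are further apart.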
Then I set a breakpoint in build_marker and got this:

  #0  build_marker (buf=0x7fffe46c9a10, charpos=6001308, bytepos=6003108)
      at alloc.c:4191
  #1  0x00005555557be1a7 in buf_charpos_to_bytepos (b=0x7fffe46c9a10,
      charpos=6001308) at marker.c:238
  #2  0x00005555557bf184 in set_marker_internal (marker=0x7fffe5a0f4dd,
      position=0x16e4a72, buffer=0x0, restricted=false) at marker.c:587
  #3  0x00005555557bf2a3 in Fset_marker (marker=0x7fffe5a0f4dd,
      position=0x16e4a72, buffer=0x0) at marker.c:630
  #4  0x00005555557bf640 in Fcopy_marker (marker=0x16e4a72, type=0x0)
      at marker.c:788

It looks like Fcopy_marker calls (indirectly) buf_charpos_to_bytepos and
that creates another marker at the same position (0x16e4a72 is the
fixnum for 6001308).  I doubt that this is intentional, but it may not
be a serious problem.

So why does the problem only show up for random positions?  Maybe
because the benchmark is spending most of the time not in
re-search-forward, but in copy-marker and for random positions the
caching is ineffective?
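
For anyone who wants to check the fixnum claim, here is a small Python
sketch.  It assumes the tagging scheme of a typical 64-bit build with
USE_LSB_TAG: fixnums use two tag bits, and both fixnum tags (Lisp_Int0 =
2, Lisp_Int1 = 6) end in binary 10.  The constants and helper names are
illustrative stand-ins, not copied from lisp.h, and the sketch only
handles non-negative values.

```python
INTTYPEBITS = 2    # assumption: number of tag bits used for fixnums
FIXNUM_TAG = 0b10  # assumption: low two bits of every fixnum word

def decode_fixnum(word):
    """Recover the integer from a tagged Lisp_Object word."""
    if word & 0b11 != FIXNUM_TAG:
        raise ValueError(f"{word:#x} does not carry a fixnum tag")
    return word >> INTTYPEBITS

def encode_fixnum(n):
    """Tag a non-negative integer as a fixnum word."""
    return (n << INTTYPEBITS) | FIXNUM_TAG

print(decode_fixnum(0x16e4a72))      # prints 6001308
print(hex(encode_fixnum(6001308)))   # prints 0x16e4a72
```

Under those assumptions the position argument in frames #2 to #4 decodes
to the same charpos that build_marker receives in frame #0.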