From: Stefan Monnier
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: string_char_to_byte and string_byte_to_char micro-optimisation
Date: Sat, 15 Jun 2019 03:48:04 -0400
References: <83wohovvon.fsf@gnu.org> <83r27vweor.fsf@gnu.org>

>> ... and uses `aref` on it extensively.
> Right.  And/or 'aset'.

Right, but `aset` is even more rare on multibyte strings.

> Other candidates are 'string-match' and 'replace-match'.

`replace-match` has to copy the string, so charpos<->bytepos conversion
doesn't slow it down significantly (I'd guess it's at most a factor of 2).

`string-match` is only affected by charpos<->bytepos conversion if you use
the `start` argument, and the time to perform the actual regexp search
will usually dwarf that conversion.  So I think it can only be noticeably
slowed down in "pathological" cases where we `start` in the middle of a
longish string and immediately find a short match.

In contrast, `aref` never does much more than the charpos<->bytepos
conversion itself.


        Stefan
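
To make the `aref` point concrete, here is a rough sketch in Python of why indexing a multibyte string by character position costs about as much as the conversion itself. It is a simplified model and not Emacs's implementation: the string is assumed to be stored as UTF-8 bytes, the conversion is a plain linear scan from the start, and the names `char_to_byte` and `aref` are only illustrative stand-ins.

```python
def char_to_byte(data: bytes, charpos: int) -> int:
    """Return the byte offset of character number `charpos` in UTF-8 `data`.

    Simplified model: always scans from the start of the string, so the
    cost grows with `charpos`.
    """
    bytepos = 0
    remaining = charpos
    while remaining > 0:
        if bytepos >= len(data):
            raise IndexError("charpos out of range")
        # Step over the lead byte of the current character...
        bytepos += 1
        # ...and over its continuation bytes (bit pattern 10xxxxxx).
        while bytepos < len(data) and (data[bytepos] & 0xC0) == 0x80:
            bytepos += 1
        remaining -= 1
    return bytepos


def aref(data: bytes, charpos: int) -> str:
    """Fetch one character: the conversion above plus a short decode."""
    start = char_to_byte(data, charpos)
    if start >= len(data):
        raise IndexError("charpos out of range")
    end = start + 1
    while end < len(data) and (data[end] & 0xC0) == 0x80:
        end += 1
    return data[start:end].decode("utf-8")


text = "naïve café 日本語"
data = text.encode("utf-8")
```

In this model, everything `aref` does after calling `char_to_byte` touches at most a handful of bytes, so the scan is essentially the whole cost. A regexp search starting at `start`, by comparison, pays for the same scan once and then usually does far more work on top of it.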