From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail
From: Daniel Lopez
Newsgroups: gmane.emacs.bugs
Subject: bug#34525: replace-regexp missing some matches
Date: Mon, 18 Feb 2019 08:28:35 +0000
Message-ID: <5a74a337-804e-2590-bffd-43a851f90240@gmail.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------4A0208EF172FFB4FBC0E5D25"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0
To: 34525@debbugs.gnu.org
X-GNU-PR-Message: report 34525
X-GNU-PR-Package: emacs
Content-Language: en-US-large
X-BeenThere: bug-gnu-emacs@gnu.org
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Precedence: list
Xref: news.gmane.org gmane.emacs.bugs:155488

This is a multi-part message in MIME format.
--------------4A0208EF172FFB4FBC0E5D25
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit

Reproduce:
- Start "emacs -Q" and open the file BitmapFontFace.h
- Evaluate the expression (replace-regexp "\\<Bitmap\\>" "SharedBitmap")
- The text "Replaced 8 occurrences" appears in the echo area.

Problem:
There were actually 12 occurrences (i.e. of the word "Bitmap" surrounded by word boundaries) in the file that should have been replaced. If I now move point back to the start of the buffer and evaluate the expression again, it says "Replaced 4 occurrences".

The exact number of replacements made seems to vary over time. That is, I can test it five times in a row and get 8 initial replacements each time, but after trying some other search terms, messing with the file, restarting Emacs etc., I try my initial test again and then it may consistently replace 10 the first time, for a while. So your exact numbers may vary.

I debugged the Lisp as far as I could, and the wrong answers appear to come out of the re-search-forward C call that is in isearch-search-fun-default.
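For anyone reproducing this, a baseline count that does not go through the replacement code can help separate "the search misses matches" from "the replacement loop skips matches". The following is only a sketch: the function name my-count-regexp-matches is made up for illustration, and binding case-fold-search to nil is an assumption meant to keep the search case-sensitive. It calls re-search-forward in a plain loop and leaves the buffer unmodified.

```elisp
;; Hypothetical helper, not part of Emacs: count matches of REGEXP
;; without replacing anything.
(defun my-count-regexp-matches (regexp)
  "Return the number of matches for REGEXP in the current buffer.
The buffer is not modified and point is restored afterwards.
Assumes REGEXP never matches the empty string."
  (save-excursion
    (goto-char (point-min))
    (let ((case-fold-search nil)   ; assumption: match case-sensitively
          (count 0))
      ;; Each successful search leaves point after the match, so the
      ;; loop visits every non-overlapping match once.
      (while (re-search-forward regexp nil t)
        (setq count (1+ count)))
      count)))
```

Evaluating (my-count-regexp-matches "\\<Bitmap\\>") in the freshly opened file, before any replacement, gives a number to compare against the "Replaced N occurrences" message. The built-in how-many command offers a similar count interactively.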

The bug surfaces in several user-level replacement commands. I first noticed it when doing this replacement interactively with query-replace on word boundaries (C-u M-%), entering "Bitmap" as the search string and "SharedBitmap" as the replacement string. Trying it now, pressing space about once a second to confirm each match, I see the pink highlight skip valid matches and ask me about one further down, even while the skipped match is still highlighted in blue a few lines above. In the end it may have replaced only 6-8 of the occurrences. However, if I press 'n' instead of space, skipping each match without making any replacement, it does visit all of the occurrences.

I see from the Lisp that plain (non-regexp) query-replace on word boundaries gets preprocessed into the equivalent regexp search, as in my initial example. I don't think there are any problems with plain string search and replacement.

Some more experimental observations:

- The replacement text can be any string instead of "SharedBitmap", e.g. "qwertyasdfgh", "qwer", etc., and the bug still happens. The number of replacements made seems to be related to the length of the replacement string. Currently, 12-character replacement strings cause replace-regexp to make 8 replacements on the first call for me, while 4-character strings cause 7 replacements. 6-character replacement strings, i.e. the same length as "Bitmap", always work, replacing all 12 occurrences.

- The bug doesn't happen in fundamental-mode, c-mode, js-mode, text-mode or any other major mode I tried.

- I've seen this happen in other C++ files of mine where I was making the same replacement, so the problem is not unique to this one. I've been trying to simplify this file but haven't found anything much more revealing so far. For example, if I delete all the comments and blank lines, the first replacement finds 9 occurrences out of 10. If I cut the file in half by deleting line 140 onwards, the first replacement finds 3 occurrences out of 6. But if I do something very simple, like just pasting "Bitmap" on 100 consecutive lines, it's not fooled and it replaces them all.

I've tried this in GNU Emacs 26.1 on Arch Linux and 25.2.1 on Windows 7 and see the same behaviour in both.

Thanks,
Daniel

--------------4A0208EF172FFB4FBC0E5D25
Content-Type: text/x-chdr; name="BitmapFontFace.h"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="BitmapFontFace.h"

// BitmapFontFace: A class to store a font in which the glyphs are stored in bitmaps.

#pragma once

// This namespace
#include "FontFaceMetrics.h"

// Dan std
#include
#include
#include
#include
using dan::Error;

// Boost
#include
using boost::scoped_array;
#include

// C++ std
#include
#include

namespace dan {

struct BitmapFontGlyph
// Where to find the graphic for a single character
//
// (This is only used by BitmapFontFace but isn't declared within it as it doesn't need to
// use BitmapFontFace's template parameters)
{
    unsigned int bitmapNo;
    // Which bitmap of the BitmapFontFace contains the glyph of this character
    dan::math::Rect2I rect;
    // The position and size on the bitmap of the character's glyph
    dan::math::Vector2I bearing;
    // Distance from the drawing pen position to where the top-left of the glyph should be
    dan::math::Vector2I advance;
    // The number of pixels that the drawing pen position should be moved
    // after printing this character, to be ready for the next one

    BitmapFontGlyph(unsigned int i_bitmapNo, const dan::math::Rect2I & i_rect, const dan::math::Vector2I & i_bearing, const dan::math::Vector2I & i_advance) :
        bitmapNo(i_bitmapNo), rect(i_rect), bearing(i_bearing), advance(i_advance)
    {}
};

typedef std::map BitmapFontCharmap;
// Represents a character map,
// ie. a full alphabet of mappings from character code numbers to glyphs.
// This collection of mappings is also known as a character encoding.
//
// (This is only used by BitmapFontFace but isn't declared within it as it doesn't need to
// use BitmapFontFace's template parameters)

template
class BitmapFontFace
{
    // + Shared part {{{

protected:
    struct Shared
    {
        std::vector< Bitmap > m_bitmaps;
        std::vector m_charmaps;
        float m_emHeight;
        FontFaceMetrics m_metrics;
        unsigned int m_replacementForMissingCharCode = 0;

        // + Construction {{{

        Shared(float i_emHeight = 1, const FontFaceMetrics & i_metrics = FontFaceMetrics()) :
            m_emHeight(i_emHeight)
            , m_metrics(i_metrics)
        // Construct with no bitmaps or charmaps
        {}

        Shared(const std::vector< Bitmap > & i_bitmaps, const std::vector & i_charmaps, float i_emHeight = 1, const FontFaceMetrics & i_metrics = FontFaceMetrics()) :
            m_emHeight(i_emHeight)
            , m_metrics(i_metrics)
        // Construct with multiple bitmaps and multiple charmaps
        {
            m_bitmaps.assign(i_bitmaps.begin(), i_bitmaps.end());
            m_charmaps.assign(i_charmaps.begin(), i_charmaps.end());
        }

        Shared(const Bitmap & i_bitmap, const std::vector & i_charmaps, float i_emHeight = 1, const FontFaceMetrics & i_metrics = FontFaceMetrics()) :
            m_emHeight(i_emHeight)
            , m_metrics(i_metrics)
        // Construct with single bitmap and multiple charmaps
        // (bitmap specified in Bitmap object)
        {
            m_bitmaps.push_back(i_bitmap);
            m_charmaps.assign(i_charmaps.begin(), i_charmaps.end());
        }

        Shared(PixelType * i_srcBitmap_pixels, unsigned int i_srcBitmap_width, unsigned int i_srcBitmap_height, const std::vector & i_charmaps, float i_emHeight = 1, const FontFaceMetrics & i_metrics = FontFaceMetrics()) :
            m_emHeight(i_emHeight)
            , m_metrics(i_metrics)
        // Construct with single bitmap and multiple charmaps
        // (bitmap specified as buffer, width and height)
        {
            m_bitmaps.push_back(Bitmap(i_srcBitmap_pixels, i_srcBitmap_width, i_srcBitmap_height));
            m_charmaps.assign(i_charmaps.begin(), i_charmaps.end());
        }

        Shared(const std::vector & i_charmaps, float i_emHeight = 1, const FontFaceMetrics & i_metrics = FontFaceMetrics()) :
            m_emHeight(i_emHeight)
            , m_metrics(i_metrics)
        // Construct without any bitmaps stored (class stores font metadata only)
        {
            m_charmaps.assign(i_charmaps.begin(), i_charmaps.end());
        }

        // + }}}
    };
    boost::shared_ptr m_shared;

    // + }}}

    // + Construction {{{

public:
    BitmapFontFace(float i_emHeight = 1, const FontFaceMetrics & i_metrics = FontFaceMetrics()) :
        m_shared(new Shared(i_emHeight, i_metrics))
    // Construct with no bitmaps or charmaps
    {}

    BitmapFontFace(const std::vector< Bitmap > & i_bitmaps, const std::vector & i_charmaps, float i_emHeight = 1, const FontFaceMetrics & i_metrics = FontFaceMetrics()) :
        m_shared(new Shared(i_bitmaps, i_charmaps, i_emHeight, i_metrics))
    // Construct with multiple bitmaps and multiple charmaps
    {}

    BitmapFontFace(const Bitmap & i_bitmap, const std::vector & i_charmaps, float i_emHeight = 1, const FontFaceMetrics & i_metrics = FontFaceMetrics()) :
        m_shared(new Shared(i_bitmap, i_charmaps, i_emHeight, i_metrics))
    // Construct with single bitmap and multiple charmaps
    // (bitmap specified in Bitmap object)
    {}

    BitmapFontFace(PixelType * i_srcBitmap_pixels, unsigned int i_srcBitmap_width, unsigned int i_srcBitmap_height, const std::vector & i_charmaps, float i_emHeight = 1, const FontFaceMetrics & i_metrics = FontFaceMetrics()) :
        m_shared(new Shared(i_srcBitmap_pixels, i_srcBitmap_width, i_srcBitmap_height, i_charmaps, i_emHeight, i_metrics))
    // Construct with single bitmap and multiple charmaps
    // (bitmap specified as buffer, width and height)
    {}

    BitmapFontFace(const std::vector & i_charmaps, float i_emHeight = 1, const FontFaceMetrics & i_metrics = FontFaceMetrics()) :
        m_shared(new Shared(i_charmaps, i_emHeight, i_metrics))
    // Construct without any bitmaps stored (class stores font metadata only)
    {}

    // + }}}

    // + Bitmaps {{{

    const std::vector< Bitmap > & bitmaps() const
    {
        return m_shared->m_bitmaps;
    }

    std::vector< Bitmap > & bitmaps()
    {
        return m_shared->m_bitmaps;
    }

    //void releaseBitmaps()
    //{
    //    m_bitmaps.clear();
    //}

    // + }}}

    // + Charmaps {{{

    const std::vector & charmaps() const
    {
        return m_shared->m_charmaps;
    }

    unsigned int charmap_count() const
    {
        return m_shared->m_charmaps.size();
    }

    unsigned int charmap_charCount() const
    {
        return m_shared->m_charmaps.front().size();
    }

    // + }}}

    // + Metrics {{{

    unsigned int emHeight() const
    {
        return m_shared->m_emHeight;
    }

    FontFaceMetrics & metrics() const
    {
        return m_shared->m_metrics;
    }

    // + }}}

    // + Get details of a single char {{{

    unsigned int replacementForMissingCharCode() const
    {
        return m_shared->m_replacementForMissingCharCode;
    }

    void replacementForMissingCharCode(unsigned int i_charCode) const
    {
        m_shared->m_replacementForMissingCharCode = i_charCode;
    }

    const BitmapFontGlyph & getGlyph(unsigned int i_charmapNo, unsigned int i_charCode) const
    {
        if (i_charmapNo >= m_shared->m_charmaps.size())
            throw( Error(0, "in BitmapFontFace::getGlyph, i_charmapNo out of range") );
        BitmapFontCharmap & charmap = m_shared->m_charmaps[i_charmapNo];

        BitmapFontCharmap::const_iterator glyphItr = charmap.find(i_charCode);
        if (glyphItr == charmap.end())
        {
            if (m_shared->m_replacementForMissingCharCode == 0)
                throw( Error(0, "in BitmapFontFace::getGlyph, no glyph in charmap for i_charCode") );

            glyphItr = charmap.find(m_shared->m_replacementForMissingCharCode);
            if (glyphItr == charmap.end())
                throw( Error(0, "in BitmapFontFace::getGlyph, no glyph in charmap for i_charCode or its replacement code") );
        }

        return glyphItr->second;
    }

    const Bitmap & getGlyphBitmap(unsigned int i_charmapNo, unsigned int i_charCode) const
    {
        return m_shared->m_bitmaps[getGlyph(i_charmapNo, i_charCode).bitmapNo];
    }

    const dan::math::Rect2I & getGlyphRect(unsigned int i_charmapNo, unsigned int i_charCode) const
    {
        return getGlyph(i_charmapNo, i_charCode).rect;
    }

    // + }}}

    dan::math::Vector2I charAdvanceMax() const
    {
        dan::math::Vector2I maxValue = 0;

        for (unsigned int charmapNo = 0; charmapNo < m_shared->m_charmaps.size(); ++charmapNo)
        {
            const BitmapFontCharmap & charmap = m_shared->m_charmaps[charmapNo];
            for (BitmapFontCharmap::const_iterator glyphItr = charmap.begin(); glyphItr != charmap.end(); ++glyphItr)
            {
                if (glyphItr->second.advance[0] > maxValue[0])
                    maxValue[0] = glyphItr->second.advance[0];
                if (glyphItr->second.advance[1] > maxValue[1])
                    maxValue[1] = glyphItr->second.advance[1];
            }
        }

        return maxValue;
    }

    dan::math::Vector2I charRectSizeMax() const
    {
        dan::math::Vector2I maxValue = 0;

        for (unsigned int charmapNo = 0; charmapNo < m_shared->m_charmaps.size(); ++charmapNo)
        {
            const BitmapFontCharmap & charmap = m_shared->m_charmaps[charmapNo];
            for (BitmapFontCharmap::const_iterator glyphItr = charmap.begin(); glyphItr != charmap.end(); ++glyphItr)
            {
                if (glyphItr->second.rect.width() > maxValue[0])
                    maxValue[0] = glyphItr->second.rect.width();
                if (glyphItr->second.rect.height() > maxValue[1])
                    maxValue[1] = glyphItr->second.rect.height();
            }
        }

        return maxValue;
    }
};

}

--------------4A0208EF172FFB4FBC0E5D25--