From: Michal Nazarewicz
To: 24603@debbugs.gnu.org, eliz@gnu.org
Subject: bug#24603: [PATCHv5 01/11] Split casify_object into multiple functions
Date: Thu, 9 Mar 2017 22:51:40 +0100
Message-ID: <20170309215150.9562-2-mina86@mina86.com>
In-Reply-To: <20170309215150.9562-1-mina86@mina86.com>
References: <20170309215150.9562-1-mina86@mina86.com>

casify_object had three major cases to cover and those were mostly
independent of each other. Move those branches to separate functions so
it’s easier to comprehend each individual case.

While at it, use somewhat more descriptive ch and cased variable names
rather than c and c1.

This commit introduces no functional changes.

* src/casefiddle.c (casify_object): Split into…
(do_casify_natnum, do_casify_multibyte_string,
do_casify_unibyte_string): …new functions.
---
 src/casefiddle.c | 196 +++++++++++++++++++++++++++++--------------------------
 1 file changed, 104 insertions(+), 92 deletions(-)

diff --git a/src/casefiddle.c b/src/casefiddle.c
index 11d59444916..e32910fa8aa 100644
--- a/src/casefiddle.c
+++ b/src/casefiddle.c
@@ -32,108 +32,120 @@ along with GNU Emacs. If not, see . */
 enum case_action {CASE_UP, CASE_DOWN, CASE_CAPITALIZE, CASE_CAPITALIZE_UP};
 
 static Lisp_Object
-casify_object (enum case_action flag, Lisp_Object obj)
+do_casify_natnum (enum case_action flag, Lisp_Object obj)
+{
+  int flagbits = (CHAR_ALT | CHAR_SUPER | CHAR_HYPER
+                  | CHAR_SHIFT | CHAR_CTL | CHAR_META);
+  int flags, ch = XFASTINT (obj), cased;
+  bool multibyte;
+
+  /* If the character has higher bits set above the flags, return it unchanged.
+     It is not a real character. */
+  if (UNSIGNED_CMP (ch, >, flagbits))
+    return obj;
+
+  flags = ch & flagbits;
+  ch = ch & ~flagbits;
+
+  /* FIXME: Even if enable-multibyte-characters is nil, we may manipulate
+     multibyte chars. This means we have a bug for latin-1 chars since when we
+     receive an int 128-255 we can't tell whether it's an eight-bit byte or
+     a latin-1 char. */
+  multibyte = (ch >= 256 ||
+               !NILP (BVAR (current_buffer, enable_multibyte_characters)));
+  if (! multibyte)
+    MAKE_CHAR_MULTIBYTE (ch);
+  cased = flag == CASE_DOWN ? downcase (ch) : upcase (ch);
+  if (cased == ch)
+    return obj;
+
+  if (! multibyte)
+    MAKE_CHAR_UNIBYTE (cased);
+  XSETFASTINT (obj, cased | flags);
+  return obj;
+}
+
+static Lisp_Object
+do_casify_multibyte_string (enum case_action flag, Lisp_Object obj)
+{
+  ptrdiff_t i, i_byte, size = SCHARS (obj);
+  bool inword = flag == CASE_DOWN;
+  int len, ch, cased;
+  USE_SAFE_ALLOCA;
+  ptrdiff_t o_size;
+  if (INT_MULTIPLY_WRAPV (size, MAX_MULTIBYTE_LENGTH, &o_size))
+    o_size = PTRDIFF_MAX;
+  unsigned char *dst = SAFE_ALLOCA (o_size);
+  unsigned char *o = dst;
+
+  for (i = i_byte = 0; i < size; i++, i_byte += len)
+    {
+      if (o_size - MAX_MULTIBYTE_LENGTH < o - dst)
+        string_overflow ();
+      ch = STRING_CHAR_AND_LENGTH (SDATA (obj) + i_byte, len);
+      if (inword && flag != CASE_CAPITALIZE_UP)
+        cased = downcase (ch);
+      else if (!inword || flag != CASE_CAPITALIZE_UP)
+        cased = upcase (ch);
+      else
+        cased = ch;
+      if ((int) flag >= (int) CASE_CAPITALIZE)
+        inword = (SYNTAX (ch) == Sword);
+      o += CHAR_STRING (cased, o);
+    }
+  eassert (o - dst <= o_size);
+  obj = make_multibyte_string ((char *) dst, size, o - dst);
+  SAFE_FREE ();
+  return obj;
+}
+
+static Lisp_Object
+do_casify_unibyte_string (enum case_action flag, Lisp_Object obj)
 {
-  int c, c1;
+  ptrdiff_t i, size = SCHARS (obj);
   bool inword = flag == CASE_DOWN;
+  int ch, cased;
+
+  obj = Fcopy_sequence (obj);
+  for (i = 0; i < size; i++)
+    {
+      ch = SREF (obj, i);
+      MAKE_CHAR_MULTIBYTE (ch);
+      cased = ch;
+      if (inword && flag != CASE_CAPITALIZE_UP)
+        ch = downcase (ch);
+      else if (!uppercasep (ch)
+               && (!inword || flag != CASE_CAPITALIZE_UP))
+        ch = upcase (cased);
+      if ((int) flag >= (int) CASE_CAPITALIZE)
+        inword = (SYNTAX (ch) == Sword);
+      if (ch == cased)
+        continue;
+      MAKE_CHAR_UNIBYTE (ch);
+      /* If the char can't be converted to a valid byte, just don't change it */
+      if (ch >= 0 && ch < 256)
+        SSET (obj, i, ch);
+    }
+  return obj;
+}
 
+static Lisp_Object
+casify_object (enum case_action flag, Lisp_Object obj)
+{
   /* If the case table is flagged as modified, rescan it. */
   if (NILP (XCHAR_TABLE (BVAR (current_buffer, downcase_table))->extras[1]))
     Fset_case_table (BVAR (current_buffer, downcase_table));
 
   if (NATNUMP (obj))
-    {
-      int flagbits = (CHAR_ALT | CHAR_SUPER | CHAR_HYPER
-                      | CHAR_SHIFT | CHAR_CTL | CHAR_META);
-      int flags = XINT (obj) & flagbits;
-      bool multibyte = ! NILP (BVAR (current_buffer,
-                                     enable_multibyte_characters));
-
-      /* If the character has higher bits set
-         above the flags, return it unchanged.
-         It is not a real character. */
-      if (UNSIGNED_CMP (XFASTINT (obj), >, flagbits))
-        return obj;
-
-      c1 = XFASTINT (obj) & ~flagbits;
-      /* FIXME: Even if enable-multibyte-characters is nil, we may
-         manipulate multibyte chars. This means we have a bug for latin-1
-         chars since when we receive an int 128-255 we can't tell whether
-         it's an eight-bit byte or a latin-1 char. */
-      if (c1 >= 256)
-        multibyte = 1;
-      if (! multibyte)
-        MAKE_CHAR_MULTIBYTE (c1);
-      c = flag == CASE_DOWN ? downcase (c1) : upcase (c1);
-      if (c != c1)
-        {
-          if (! multibyte)
-            MAKE_CHAR_UNIBYTE (c);
-          XSETFASTINT (obj, c | flags);
-        }
-      return obj;
-    }
-
-  if (!STRINGP (obj))
+    return do_casify_natnum (flag, obj);
+  else if (!STRINGP (obj))
     wrong_type_argument (Qchar_or_string_p, obj);
-  else if (!STRING_MULTIBYTE (obj))
-    {
-      ptrdiff_t i;
-      ptrdiff_t size = SCHARS (obj);
-
-      obj = Fcopy_sequence (obj);
-      for (i = 0; i < size; i++)
-        {
-          c = SREF (obj, i);
-          MAKE_CHAR_MULTIBYTE (c);
-          c1 = c;
-          if (inword && flag != CASE_CAPITALIZE_UP)
-            c = downcase (c);
-          else if (!uppercasep (c)
-                   && (!inword || flag != CASE_CAPITALIZE_UP))
-            c = upcase (c1);
-          if ((int) flag >= (int) CASE_CAPITALIZE)
-            inword = (SYNTAX (c) == Sword);
-          if (c != c1)
-            {
-              MAKE_CHAR_UNIBYTE (c);
-              /* If the char can't be converted to a valid byte, just don't
-                 change it. */
-              if (c >= 0 && c < 256)
-                SSET (obj, i, c);
-            }
-        }
-      return obj;
-    }
+  else if (!SCHARS (obj))
+    return obj;
+  else if (STRING_MULTIBYTE (obj))
+    return do_casify_multibyte_string (flag, obj);
   else
-    {
-      ptrdiff_t i, i_byte, size = SCHARS (obj);
-      int len;
-      USE_SAFE_ALLOCA;
-      ptrdiff_t o_size;
-      if (INT_MULTIPLY_WRAPV (size, MAX_MULTIBYTE_LENGTH, &o_size))
-        o_size = PTRDIFF_MAX;
-      unsigned char *dst = SAFE_ALLOCA (o_size);
-      unsigned char *o = dst;
-
-      for (i = i_byte = 0; i < size; i++, i_byte += len)
-        {
-          if (o_size - MAX_MULTIBYTE_LENGTH < o - dst)
-            string_overflow ();
-          c = STRING_CHAR_AND_LENGTH (SDATA (obj) + i_byte, len);
-          if (inword && flag != CASE_CAPITALIZE_UP)
-            c = downcase (c);
-          else if (!inword || flag != CASE_CAPITALIZE_UP)
-            c = upcase (c);
-          if ((int) flag >= (int) CASE_CAPITALIZE)
-            inword = (SYNTAX (c) == Sword);
-          o += CHAR_STRING (c, o);
-        }
-      eassert (o - dst <= o_size);
-      obj = make_multibyte_string ((char *) dst, size, o - dst);
-      SAFE_FREE ();
-      return obj;
-    }
+    return do_casify_unibyte_string (flag, obj);
 }
 
 DEFUN ("upcase", Fupcase, Supcase, 1, 1, 0,
-- 
2.12.0.246.ga2ecc84866-goog
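
The inword state machine used by the multibyte string helper in the patch above can be tried outside Emacs. What follows is a minimal ASCII-only sketch, not code from the patch: casify_ascii and word_char_p are hypothetical names, and tolower, toupper and isalnum from <ctype.h> are assumed stand-ins for Emacs's downcase, upcase and the SYNTAX (ch) == Sword test. Only the enum case_action values and the branch structure mirror the diff.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum case_action {CASE_UP, CASE_DOWN, CASE_CAPITALIZE, CASE_CAPITALIZE_UP};

/* Hypothetical stand-in for SYNTAX (ch) == Sword: treat ASCII letters
   and digits as word constituents.  */
static bool
word_char_p (int ch)
{
  return isalnum (ch) != 0;
}

/* Case-convert the NUL-terminated ASCII string S in place.  The branch
   structure follows the loop in do_casify_multibyte_string; the
   character primitives are assumptions made for this sketch.  */
void
casify_ascii (enum case_action flag, char *s)
{
  assert (s != NULL);

  /* CASE_DOWN starts (and stays) "in a word" so every character is
     downcased; CASE_UP starts (and stays) outside one so every
     character is upcased.  */
  bool inword = flag == CASE_DOWN;
  size_t size = strlen (s);

  for (size_t i = 0; i < size; i++)
    {
      int ch = (unsigned char) s[i];
      int cased;

      if (inword && flag != CASE_CAPITALIZE_UP)
        cased = tolower (ch);
      else if (!inword || flag != CASE_CAPITALIZE_UP)
        cased = toupper (ch);
      else
        cased = ch;   /* CASE_CAPITALIZE_UP leaves word interiors alone.  */

      /* Only the two capitalize actions track word boundaries.  */
      if ((int) flag >= (int) CASE_CAPITALIZE)
        inword = word_char_p (ch);

      s[i] = (char) cased;
    }
}
```

Under these assumptions, CASE_CAPITALIZE turns "hello wORLD" into "Hello World", while CASE_CAPITALIZE_UP upcases only word-initial letters and gives "Hello WORLD".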