From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!.POSTED!not-for-mail
From: Michal Nazarewicz
Newsgroups: gmane.emacs.bugs
Subject: bug#24603: [PATCHv6 1/6] Split casify_object into multiple functions
Date: Tue, 21 Mar 2017 02:27:04 +0100
Message-ID: <20170321012709.19402-2-mina86@mina86.com>
References: <20170309215150.9562-1-mina86@mina86.com> <20170321012709.19402-1-mina86@mina86.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
To: Eli Zaretskii, Andreas Schwab, 24603@debbugs.gnu.org
X-Mailer: git-send-email 2.12.1.500.gab5fba24ee-goog
In-Reply-To: <20170321012709.19402-1-mina86@mina86.com>
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:130769

casify_object had three major cases to cover and those were mostly
independent of each other.  Move those branches to separate functions so
it’s easier to comprehend each individual case.

While at it, use somewhat more descriptive ch and cased variable names
rather than c and c1.
This commit introduces no functional changes.

* src/casefiddle.c (casify_object): Split into…
(do_casify_natnum, do_casify_multibyte_string,
do_casify_unibyte_string): …new functions.
---
 src/casefiddle.c | 196 +++++++++++++++++++++++++++++--------------------------
 1 file changed, 104 insertions(+), 92 deletions(-)

diff --git a/src/casefiddle.c b/src/casefiddle.c
index 11d59444916..72661979b4d 100644
--- a/src/casefiddle.c
+++ b/src/casefiddle.c
@@ -32,108 +32,120 @@ along with GNU Emacs. If not, see . */
 enum case_action {CASE_UP, CASE_DOWN, CASE_CAPITALIZE, CASE_CAPITALIZE_UP};
 
 static Lisp_Object
-casify_object (enum case_action flag, Lisp_Object obj)
+do_casify_natnum (enum case_action flag, Lisp_Object obj)
+{
+  int flagbits = (CHAR_ALT | CHAR_SUPER | CHAR_HYPER
+                  | CHAR_SHIFT | CHAR_CTL | CHAR_META);
+  int flags, ch = XFASTINT (obj), cased;
+  bool multibyte;
+
+  /* If the character has higher bits set above the flags, return it unchanged.
+     It is not a real character.  */
+  if (UNSIGNED_CMP (ch, >, flagbits))
+    return obj;
+
+  flags = ch & flagbits;
+  ch = ch & ~flagbits;
+
+  /* FIXME: Even if enable-multibyte-characters is nil, we may manipulate
+     multibyte chars.  This means we have a bug for latin-1 chars since when we
+     receive an int 128-255 we can't tell whether it's an eight-bit byte or
+     a latin-1 char.  */
+  multibyte = ch >= 256
+    || !NILP (BVAR (current_buffer, enable_multibyte_characters));
+  if (! multibyte)
+    MAKE_CHAR_MULTIBYTE (ch);
+  cased = flag == CASE_DOWN ? downcase (ch) : upcase (ch);
+  if (cased == ch)
+    return obj;
+
+  if (! multibyte)
+    MAKE_CHAR_UNIBYTE (cased);
+  XSETFASTINT (obj, cased | flags);
+  return obj;
+}
+
+static Lisp_Object
+do_casify_multibyte_string (enum case_action flag, Lisp_Object obj)
+{
+  ptrdiff_t i, i_byte, size = SCHARS (obj);
+  bool inword = flag == CASE_DOWN;
+  int len, ch, cased;
+  USE_SAFE_ALLOCA;
+  ptrdiff_t o_size;
+  if (INT_MULTIPLY_WRAPV (size, MAX_MULTIBYTE_LENGTH, &o_size))
+    o_size = PTRDIFF_MAX;
+  unsigned char *dst = SAFE_ALLOCA (o_size);
+  unsigned char *o = dst;
+
+  for (i = i_byte = 0; i < size; i++, i_byte += len)
+    {
+      if (o_size - MAX_MULTIBYTE_LENGTH < o - dst)
+        string_overflow ();
+      ch = STRING_CHAR_AND_LENGTH (SDATA (obj) + i_byte, len);
+      if (inword && flag != CASE_CAPITALIZE_UP)
+        cased = downcase (ch);
+      else if (!inword || flag != CASE_CAPITALIZE_UP)
+        cased = upcase (ch);
+      else
+        cased = ch;
+      if ((int) flag >= (int) CASE_CAPITALIZE)
+        inword = (SYNTAX (ch) == Sword);
+      o += CHAR_STRING (cased, o);
+    }
+  eassert (o - dst <= o_size);
+  obj = make_multibyte_string ((char *) dst, size, o - dst);
+  SAFE_FREE ();
+  return obj;
+}
+
+static Lisp_Object
+do_casify_unibyte_string (enum case_action flag, Lisp_Object obj)
 {
-  int c, c1;
+  ptrdiff_t i, size = SCHARS (obj);
   bool inword = flag == CASE_DOWN;
+  int ch, cased;
+
+  obj = Fcopy_sequence (obj);
+  for (i = 0; i < size; i++)
+    {
+      ch = SREF (obj, i);
+      MAKE_CHAR_MULTIBYTE (ch);
+      cased = ch;
+      if (inword && flag != CASE_CAPITALIZE_UP)
+        ch = downcase (ch);
+      else if (!uppercasep (ch)
+               && (!inword || flag != CASE_CAPITALIZE_UP))
+        ch = upcase (cased);
+      if ((int) flag >= (int) CASE_CAPITALIZE)
+        inword = (SYNTAX (ch) == Sword);
+      if (ch == cased)
+        continue;
+      MAKE_CHAR_UNIBYTE (ch);
+      /* If the char can't be converted to a valid byte, just don't change it */
+      if (ch >= 0 && ch < 256)
+        SSET (obj, i, ch);
+    }
+  return obj;
+}
 
+static Lisp_Object
+casify_object (enum case_action flag, Lisp_Object obj)
+{
   /* If the case table is flagged as modified, rescan it.  */
   if (NILP (XCHAR_TABLE (BVAR (current_buffer, downcase_table))->extras[1]))
     Fset_case_table (BVAR (current_buffer, downcase_table));
 
   if (NATNUMP (obj))
-    {
-      int flagbits = (CHAR_ALT | CHAR_SUPER | CHAR_HYPER
-                      | CHAR_SHIFT | CHAR_CTL | CHAR_META);
-      int flags = XINT (obj) & flagbits;
-      bool multibyte = ! NILP (BVAR (current_buffer,
-                                     enable_multibyte_characters));
-
-      /* If the character has higher bits set
-         above the flags, return it unchanged.
-         It is not a real character.  */
-      if (UNSIGNED_CMP (XFASTINT (obj), >, flagbits))
-        return obj;
-
-      c1 = XFASTINT (obj) & ~flagbits;
-      /* FIXME: Even if enable-multibyte-characters is nil, we may
-         manipulate multibyte chars.  This means we have a bug for latin-1
-         chars since when we receive an int 128-255 we can't tell whether
-         it's an eight-bit byte or a latin-1 char.  */
-      if (c1 >= 256)
-        multibyte = 1;
-      if (! multibyte)
-        MAKE_CHAR_MULTIBYTE (c1);
-      c = flag == CASE_DOWN ? downcase (c1) : upcase (c1);
-      if (c != c1)
-        {
-          if (! multibyte)
-            MAKE_CHAR_UNIBYTE (c);
-          XSETFASTINT (obj, c | flags);
-        }
-      return obj;
-    }
-
-  if (!STRINGP (obj))
+    return do_casify_natnum (flag, obj);
+  else if (!STRINGP (obj))
     wrong_type_argument (Qchar_or_string_p, obj);
-  else if (!STRING_MULTIBYTE (obj))
-    {
-      ptrdiff_t i;
-      ptrdiff_t size = SCHARS (obj);
-
-      obj = Fcopy_sequence (obj);
-      for (i = 0; i < size; i++)
-        {
-          c = SREF (obj, i);
-          MAKE_CHAR_MULTIBYTE (c);
-          c1 = c;
-          if (inword && flag != CASE_CAPITALIZE_UP)
-            c = downcase (c);
-          else if (!uppercasep (c)
-                   && (!inword || flag != CASE_CAPITALIZE_UP))
-            c = upcase (c1);
-          if ((int) flag >= (int) CASE_CAPITALIZE)
-            inword = (SYNTAX (c) == Sword);
-          if (c != c1)
-            {
-              MAKE_CHAR_UNIBYTE (c);
-              /* If the char can't be converted to a valid byte, just don't
-                 change it.  */
-              if (c >= 0 && c < 256)
-                SSET (obj, i, c);
-            }
-        }
-      return obj;
-    }
+  else if (!SCHARS (obj))
+    return obj;
+  else if (STRING_MULTIBYTE (obj))
+    return do_casify_multibyte_string (flag, obj);
   else
-    {
-      ptrdiff_t i, i_byte, size = SCHARS (obj);
-      int len;
-      USE_SAFE_ALLOCA;
-      ptrdiff_t o_size;
-      if (INT_MULTIPLY_WRAPV (size, MAX_MULTIBYTE_LENGTH, &o_size))
-        o_size = PTRDIFF_MAX;
-      unsigned char *dst = SAFE_ALLOCA (o_size);
-      unsigned char *o = dst;
-
-      for (i = i_byte = 0; i < size; i++, i_byte += len)
-        {
-          if (o_size - MAX_MULTIBYTE_LENGTH < o - dst)
-            string_overflow ();
-          c = STRING_CHAR_AND_LENGTH (SDATA (obj) + i_byte, len);
-          if (inword && flag != CASE_CAPITALIZE_UP)
-            c = downcase (c);
-          else if (!inword || flag != CASE_CAPITALIZE_UP)
-            c = upcase (c);
-          if ((int) flag >= (int) CASE_CAPITALIZE)
-            inword = (SYNTAX (c) == Sword);
-          o += CHAR_STRING (c, o);
-        }
-      eassert (o - dst <= o_size);
-      obj = make_multibyte_string ((char *) dst, size, o - dst);
-      SAFE_FREE ();
-      return obj;
-    }
+    return do_casify_unibyte_string (flag, obj);
 }
 
 DEFUN ("upcase", Fupcase, Supcase, 1, 1, 0,
-- 
2.12.0.367.g23dc2f6d3c-goog
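
For readers following the inword logic shared by the two string helpers, here is a minimal standalone sketch of the same per-character state machine. It is not part of the patch and makes loud assumptions: it works on plain ASCII C strings, uses toupper/tolower from <ctype.h> in place of Emacs's upcase/downcase, and uses isalnum as a stand-in for the SYNTAX (ch) == Sword test. The function name sketch_casify_ascii is hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Same enumerators and order as in the patch.  */
enum case_action {CASE_UP, CASE_DOWN, CASE_CAPITALIZE, CASE_CAPITALIZE_UP};

/* Hypothetical helper: case-convert the ASCII string S in place.
   Assumption: isalnum approximates "word constituent"; real Emacs
   consults the buffer's syntax table and case table instead.  */
static void
sketch_casify_ascii (enum case_action flag, char *s)
{
  /* CASE_DOWN starts "inside a word" so every character is downcased;
     the other actions start outside a word.  */
  bool inword = flag == CASE_DOWN;

  for (size_t i = 0; s[i] != '\0'; i++)
    {
      unsigned char ch = (unsigned char) s[i];
      unsigned char cased;

      if (inword && flag != CASE_CAPITALIZE_UP)
        cased = (unsigned char) tolower (ch);
      else if (!inword || flag != CASE_CAPITALIZE_UP)
        cased = (unsigned char) toupper (ch);
      else
        cased = ch;   /* CASE_CAPITALIZE_UP leaves the rest of a word alone.  */

      /* Only the two capitalize actions track word boundaries.  */
      if (flag >= CASE_CAPITALIZE)
        inword = isalnum (ch) != 0;

      s[i] = (char) cased;
    }
}
```

Under these assumptions, applying CASE_CAPITALIZE to "hello WORLD" gives "Hello World", while CASE_CAPITALIZE_UP gives "Hello WORLD", because only the first character of each word is touched.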