From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michal Nazarewicz
Newsgroups: gmane.emacs.bugs
Subject: bug#24603: [PATCHv6 2/6] Introduce case_character function
Date: Tue, 21 Mar 2017 02:27:05 +0100
Message-ID: <20170321012709.19402-3-mina86@mina86.com>
References: <20170309215150.9562-1-mina86@mina86.com>
 <20170321012709.19402-1-mina86@mina86.com>
In-Reply-To: <20170321012709.19402-1-mina86@mina86.com>
To: Eli Zaretskii, Andreas Schwab, 24603@debbugs.gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Mailer: git-send-email 2.12.1.500.gab5fba24ee-goog
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:130767

Move single-character casing logic into a separate function so that
it is collected in a single place.  This will make future changes to
the logic easier.  This commit introduces no functionality changes.

* src/casefiddle.c (struct casing_context, prepare_casing_context): New
structure for saving casing context and function to initialise it.
(case_character): New function which cases character based on provided
context.
(do_casify_natnum, do_casify_multibyte_string,
do_casify_unibyte_string, casify_object, casify_region): Convert to use
casing_context and case_character.
---
 src/casefiddle.c | 134 +++++++++++++++++++++++++++++++------------------------
 1 file changed, 76 insertions(+), 58 deletions(-)

diff --git a/src/casefiddle.c b/src/casefiddle.c
index 72661979b4d..dfbb5c3e172 100644
--- a/src/casefiddle.c
+++ b/src/casefiddle.c
@@ -30,9 +30,55 @@ along with GNU Emacs. If not, see . */
 #include "keymap.h"
 
 enum case_action {CASE_UP, CASE_DOWN, CASE_CAPITALIZE, CASE_CAPITALIZE_UP};
+
+/* State for casing individual characters.  */
+struct casing_context {
+  /* User-requested action.  */
+  enum case_action flag;
+  /* If true, function operates on a buffer as opposed to a string or character.
+     When run on a buffer, syntax_prefix_flag_p is taken into account when
+     determining the inword flag.  */
+  bool inbuffer;
+  /* Conceptually, this denotes whether we are inside of a word except
+     that if flag is CASE_UP it’s always false and if flag is CASE_DOWN
+     this is always true.  */
+  bool inword;
+};
+
+/* Initialise CTX structure for casing characters.  */
+static void
+prepare_casing_context (struct casing_context *ctx,
+                        enum case_action flag, bool inbuffer)
+{
+  ctx->flag = flag;
+  ctx->inbuffer = inbuffer;
+  ctx->inword = flag == CASE_DOWN;
+
+  /* If the case table is flagged as modified, rescan it.  */
+  if (NILP (XCHAR_TABLE (BVAR (current_buffer, downcase_table))->extras[1]))
+    Fset_case_table (BVAR (current_buffer, downcase_table));
+
+  if (inbuffer && (int) flag >= (int) CASE_CAPITALIZE)
+    SETUP_BUFFER_SYNTAX_TABLE ();  /* For syntax_prefix_flag_p.  */
+}
+
+/* Based on CTX, case character CH accordingly.  Update CTX as necessary.
+   Return cased character.  */
+static int
+case_character (struct casing_context *ctx, int ch)
+{
+  if (ctx->inword)
+    ch = ctx->flag == CASE_CAPITALIZE_UP ? ch : downcase (ch);
+  else
+    ch = upcase(ch);
+  if ((int) ctx->flag >= (int) CASE_CAPITALIZE)
+    ctx->inword = SYNTAX (ch) == Sword &&
+      (!ctx->inbuffer || ctx->inword || !syntax_prefix_flag_p (ch));
+  return ch;
+}
 
 static Lisp_Object
-do_casify_natnum (enum case_action flag, Lisp_Object obj)
+do_casify_natnum (struct casing_context *ctx, Lisp_Object obj)
 {
   int flagbits = (CHAR_ALT | CHAR_SUPER | CHAR_HYPER
                   | CHAR_SHIFT | CHAR_CTL | CHAR_META);
@@ -55,7 +101,7 @@ do_casify_natnum (enum case_action flag, Lisp_Object obj)
     || !NILP (BVAR (current_buffer, enable_multibyte_characters));
   if (! multibyte)
     MAKE_CHAR_MULTIBYTE (ch);
-  cased = flag == CASE_DOWN ? downcase (ch) : upcase (ch);
+  cased = case_character (ctx, ch);
   if (cased == ch)
     return obj;
 
@@ -66,10 +112,9 @@ do_casify_natnum (enum case_action flag, Lisp_Object obj)
 }
 
 static Lisp_Object
-do_casify_multibyte_string (enum case_action flag, Lisp_Object obj)
+do_casify_multibyte_string (struct casing_context *ctx, Lisp_Object obj)
 {
   ptrdiff_t i, i_byte, size = SCHARS (obj);
-  bool inword = flag == CASE_DOWN;
   int len, ch, cased;
   USE_SAFE_ALLOCA;
   ptrdiff_t o_size;
@@ -83,14 +128,7 @@ do_casify_multibyte_string (enum case_action flag, Lisp_Object obj)
       if (o_size - MAX_MULTIBYTE_LENGTH < o - dst)
         string_overflow ();
       ch = STRING_CHAR_AND_LENGTH (SDATA (obj) + i_byte, len);
-      if (inword && flag != CASE_CAPITALIZE_UP)
-        cased = downcase (ch);
-      else if (!inword || flag != CASE_CAPITALIZE_UP)
-        cased = upcase (ch);
-      else
-        cased = ch;
-      if ((int) flag >= (int) CASE_CAPITALIZE)
-        inword = (SYNTAX (ch) == Sword);
+      cased = case_character (ctx, ch);
       o += CHAR_STRING (cased, o);
     }
   eassert (o - dst <= o_size);
@@ -100,10 +138,9 @@ do_casify_multibyte_string (enum case_action flag, Lisp_Object obj)
 }
 
 static Lisp_Object
-do_casify_unibyte_string (enum case_action flag, Lisp_Object obj)
+do_casify_unibyte_string (struct casing_context *ctx, Lisp_Object obj)
 {
   ptrdiff_t i, size = SCHARS (obj);
-  bool inword = flag == CASE_DOWN;
   int ch, cased;
 
   obj = Fcopy_sequence (obj);
@@ -111,20 +148,13 @@ do_casify_unibyte_string (enum case_action flag, Lisp_Object obj)
     {
       ch = SREF (obj, i);
       MAKE_CHAR_MULTIBYTE (ch);
-      cased = ch;
-      if (inword && flag != CASE_CAPITALIZE_UP)
-        ch = downcase (ch);
-      else if (!uppercasep (ch)
-               && (!inword || flag != CASE_CAPITALIZE_UP))
-        ch = upcase (cased);
-      if ((int) flag >= (int) CASE_CAPITALIZE)
-        inword = (SYNTAX (ch) == Sword);
+      cased = case_character (ctx, ch);
       if (ch == cased)
         continue;
-      MAKE_CHAR_UNIBYTE (ch);
+      MAKE_CHAR_UNIBYTE (cased);
       /* If the char can't be converted to a valid byte, just don't change it */
-      if (ch >= 0 && ch < 256)
-        SSET (obj, i, ch);
+      if (cased >= 0 && cased < 256)
+        SSET (obj, i, cased);
     }
   return obj;
 }
@@ -132,20 +162,19 @@ do_casify_unibyte_string (enum case_action flag, Lisp_Object obj)
 static Lisp_Object
 casify_object (enum case_action flag, Lisp_Object obj)
 {
-  /* If the case table is flagged as modified, rescan it.  */
-  if (NILP (XCHAR_TABLE (BVAR (current_buffer, downcase_table))->extras[1]))
-    Fset_case_table (BVAR (current_buffer, downcase_table));
+  struct casing_context ctx;
+  prepare_casing_context (&ctx, flag, false);
 
   if (NATNUMP (obj))
-    return do_casify_natnum (flag, obj);
+    return do_casify_natnum (&ctx, obj);
   else if (!STRINGP (obj))
     wrong_type_argument (Qchar_or_string_p, obj);
   else if (!SCHARS (obj))
     return obj;
   else if (STRING_MULTIBYTE (obj))
-    return do_casify_multibyte_string (flag, obj);
+    return do_casify_multibyte_string (&ctx, obj);
   else
-    return do_casify_unibyte_string (flag, obj);
+    return do_casify_unibyte_string (&ctx, obj);
 }
 
 DEFUN ("upcase", Fupcase, Supcase, 1, 1, 0,
@@ -196,8 +225,6 @@ The argument object is not altered--the value is a copy. */)
 static void
 casify_region (enum case_action flag, Lisp_Object b, Lisp_Object e)
 {
-  int c;
-  bool inword = flag == CASE_DOWN;
   bool multibyte = !NILP (BVAR (current_buffer, enable_multibyte_characters));
   ptrdiff_t start, end;
   ptrdiff_t start_byte;
@@ -208,14 +235,12 @@ casify_region (enum case_action flag, Lisp_Object b, Lisp_Object e)
   ptrdiff_t opoint = PT;
   ptrdiff_t opoint_byte = PT_BYTE;
 
+  struct casing_context ctx;
+
   if (EQ (b, e))
     /* Not modifying because nothing marked */
     return;
 
-  /* If the case table is flagged as modified, rescan it.  */
-  if (NILP (XCHAR_TABLE (BVAR (current_buffer, downcase_table))->extras[1]))
-    Fset_case_table (BVAR (current_buffer, downcase_table));
-
   validate_region (&b, &e);
   start = XFASTINT (b);
   end = XFASTINT (e);
@@ -223,32 +248,25 @@ casify_region (enum case_action flag, Lisp_Object b, Lisp_Object e)
   record_change (start, end - start);
   start_byte = CHAR_TO_BYTE (start);
 
-  SETUP_BUFFER_SYNTAX_TABLE ();  /* For syntax_prefix_flag_p.  */
+  prepare_casing_context (&ctx, flag, true);
 
   while (start < end)
     {
-      int c2, len;
+      int ch, cased, len;
 
       if (multibyte)
         {
-          c = FETCH_MULTIBYTE_CHAR (start_byte);
-          len = CHAR_BYTES (c);
+          ch = FETCH_MULTIBYTE_CHAR (start_byte);
+          len = CHAR_BYTES (ch);
         }
       else
         {
-          c = FETCH_BYTE (start_byte);
-          MAKE_CHAR_MULTIBYTE (c);
+          ch = FETCH_BYTE (start_byte);
+          MAKE_CHAR_MULTIBYTE (ch);
           len = 1;
         }
-      c2 = c;
-      if (inword && flag != CASE_CAPITALIZE_UP)
-        c = downcase (c);
-      else if (!inword || flag != CASE_CAPITALIZE_UP)
-        c = upcase (c);
-      if ((int) flag >= (int) CASE_CAPITALIZE)
-        inword = ((SYNTAX (c) == Sword)
-                  && (inword || !syntax_prefix_flag_p (c)));
-      if (c != c2)
+      cased = case_character (&ctx, ch);
+      if (ch != cased)
         {
           last = start;
           if (first < 0)
@@ -256,18 +274,18 @@ casify_region (enum case_action flag, Lisp_Object b, Lisp_Object e)
 
           if (! multibyte)
             {
-              MAKE_CHAR_UNIBYTE (c);
-              FETCH_BYTE (start_byte) = c;
+              MAKE_CHAR_UNIBYTE (cased);
+              FETCH_BYTE (start_byte) = cased;
             }
-          else if (ASCII_CHAR_P (c2) && ASCII_CHAR_P (c))
-            FETCH_BYTE (start_byte) = c;
+          else if (ASCII_CHAR_P (cased) && ASCII_CHAR_P (ch))
+            FETCH_BYTE (start_byte) = cased;
           else
             {
-              int tolen = CHAR_BYTES (c);
+              int tolen = CHAR_BYTES (cased);
               int j;
               unsigned char str[MAX_MULTIBYTE_LENGTH];
 
-              CHAR_STRING (c, str);
+              CHAR_STRING (cased, str);
               if (len == tolen)
                 {
                   /* Length is unchanged.  */
-- 
2.12.0.367.g23dc2f6d3c-goog
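
For readers who want to try the inword state machine outside of Emacs, here
is a minimal standalone sketch of the same idea.  It is not part of the
patch.  Every sk_* name is hypothetical, the C library's toupper, tolower
and isalnum stand in for upcase, downcase and the SYNTAX (ch) == Sword test,
and the inbuffer/syntax_prefix_flag_p handling is left out, so it only
approximates the string path in the C locale.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for enum case_action.  */
enum sk_action { SK_UP, SK_DOWN, SK_CAPITALIZE, SK_CAPITALIZE_UP };

/* Hypothetical stand-in for struct casing_context, without inbuffer.  */
struct sk_context
{
  enum sk_action flag;
  bool inword;
};

static void
sk_prepare (struct sk_context *ctx, enum sk_action flag)
{
  ctx->flag = flag;
  /* For SK_DOWN the state starts (and stays) "in a word", so every
     character is downcased; for SK_UP it stays false.  */
  ctx->inword = flag == SK_DOWN;
}

/* Case CH according to CTX and update CTX; return the cased character.  */
static int
sk_case_character (struct sk_context *ctx, int ch)
{
  if (ctx->inword)
    ch = ctx->flag == SK_CAPITALIZE_UP ? ch : tolower (ch);
  else
    ch = toupper (ch);
  /* Only the capitalising actions track word boundaries.  Assumption:
     isalnum approximates word syntax.  */
  if (ctx->flag >= SK_CAPITALIZE)
    ctx->inword = isalnum (ch) != 0;
  return ch;
}

/* Case the NUL-terminated ASCII string S in place.  */
static void
sk_casify (char *s, enum sk_action flag)
{
  struct sk_context ctx;
  size_t i, n;

  assert (s != NULL);
  sk_prepare (&ctx, flag);
  n = strlen (s);
  for (i = 0; i < n; i++)
    s[i] = (char) sk_case_character (&ctx, (unsigned char) s[i]);
}
```

With this sketch, calling sk_casify on "hello wORLD" with SK_CAPITALIZE
yields "Hello World", while SK_CAPITALIZE_UP only touches word-initial
characters and yields "Hello WORLD".  Keeping the per-character decision in
one function is what lets every caller share the same rules.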