From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michal Nazarewicz
Newsgroups: gmane.emacs.bugs
Subject: bug#24603: [RFC 05/18] Introduce case_character function
Date: Tue, 4 Oct 2016 03:10:28 +0200
Message-ID: <1475543441-10493-5-git-send-email-mina86@mina86.com>
References: <1475543441-10493-1-git-send-email-mina86@mina86.com>
In-Reply-To: <1475543441-10493-1-git-send-email-mina86@mina86.com>
To: 24603@debbugs.gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Mailer: git-send-email 2.8.0.rc3.226.g39d4020

Move single-character casing logic into a separate function so that
it is collected in a single place.  This will make future changes to
the logic easier.

* src/casefiddle.c (struct casing_context, prepare_casing_context): New
structure for saving casing context and function to initialise it.
(case_character): New function which cases a character based on the
provided context.
(do_casify_integer, do_casify_multibyte_string, do_casify_unibyte_string,
casify_object, casify_region): Convert to use casing_context and
case_character.
---
 src/casefiddle.c | 135 +++++++++++++++++++++++++++++++------------------------
 1 file changed, 77 insertions(+), 58 deletions(-)

diff --git a/src/casefiddle.c b/src/casefiddle.c
index 47ebdf0..2fbd23b 100644
--- a/src/casefiddle.c
+++ b/src/casefiddle.c
@@ -30,9 +30,56 @@ along with GNU Emacs. If not, see . */
 #include "keymap.h"

 enum case_action {CASE_UP, CASE_DOWN, CASE_CAPITALIZE, CASE_CAPITALIZE_UP};
+
+/* State for casing individual characters.  */
+struct casing_context {
+  /* User-requested action.  */
+  enum case_action flag;
+  /* If true, function operates on a buffer as opposed to a string or character.
+     When run on a buffer, syntax_prefix_flag_p is taken into account when
+     determining the inword flag.  */
+  bool inbuffer;
+  /* Conceptually, this denotes whether we are inside of a word except
+     that if flag is CASE_UP it’s always false and if flag is CASE_DOWN
+     this is always true.  */
+  bool inword;
+};
+
+/* Initialise CTX structure and prepare related global data for casing
+   characters.  */
+static void
+prepare_casing_context (struct casing_context *ctx,
+                        enum case_action flag, bool inbuffer)
+{
+  ctx->flag = flag;
+  ctx->inbuffer = inbuffer;
+  ctx->inword = flag == CASE_DOWN;
+
+  /* If the case table is flagged as modified, rescan it.  */
+  if (NILP (XCHAR_TABLE (BVAR (current_buffer, downcase_table))->extras[1]))
+    Fset_case_table (BVAR (current_buffer, downcase_table));
+
+  if (inbuffer && (int) flag >= (int) CASE_CAPITALIZE)
+    SETUP_BUFFER_SYNTAX_TABLE ();  /* For syntax_prefix_flag_p.  */
+}
+
+/* Based on CTX, case character CH accordingly.  Update CTX as necessary.
+   Return cased character.  */
+static int
+case_character (struct casing_context *ctx, int ch)
+{
+  if (ctx->inword)
+    ch = ctx->flag == CASE_CAPITALIZE_UP ? ch : downcase (ch);
+  else
+    ch = upcase (ch);
+  if ((int) ctx->flag >= (int) CASE_CAPITALIZE)
+    ctx->inword = SYNTAX (ch) == Sword &&
+      (!ctx->inbuffer || ctx->inword || !syntax_prefix_flag_p (ch));
+  return ch;
+}

 static Lisp_Object
-do_casify_integer (enum case_action flag, Lisp_Object obj)
+do_casify_integer (struct casing_context *ctx, Lisp_Object obj)
 {
   int flagbits = (CHAR_ALT | CHAR_SUPER | CHAR_HYPER
                   | CHAR_SHIFT | CHAR_CTL | CHAR_META);
@@ -55,7 +102,7 @@ do_casify_integer (enum case_action flag, Lisp_Object obj)
                     !NILP (BVAR (current_buffer, enable_multibyte_characters)));
   if (! multibyte)
     MAKE_CHAR_MULTIBYTE (ch);
-  cased = flag == CASE_DOWN ? downcase (ch) : upcase (ch);
+  cased = case_character (ctx, ch);
   if (cased == ch)
     return obj;

@@ -66,10 +113,9 @@ do_casify_integer (enum case_action flag, Lisp_Object obj)
 }

 static Lisp_Object
-do_casify_multibyte_string (enum case_action flag, Lisp_Object obj)
+do_casify_multibyte_string (struct casing_context *ctx, Lisp_Object obj)
 {
   ptrdiff_t i, i_byte, size = SCHARS (obj);
-  bool inword = flag == CASE_DOWN;
   int len, ch, cased;
   USE_SAFE_ALLOCA;
   ptrdiff_t o_size;
@@ -83,14 +129,7 @@ do_casify_multibyte_string (enum case_action flag, Lisp_Object obj)
       if (o_size - MAX_MULTIBYTE_LENGTH < o - dst)
         string_overflow ();
       ch = STRING_CHAR_AND_LENGTH (SDATA (obj) + i_byte, len);
-      if (inword && flag != CASE_CAPITALIZE_UP)
-        cased = downcase (ch);
-      else if (!inword || flag != CASE_CAPITALIZE_UP)
-        cased = upcase (ch);
-      else
-        cased = ch;
-      if ((int) flag >= (int) CASE_CAPITALIZE)
-        inword = (SYNTAX (ch) == Sword);
+      cased = case_character (ctx, ch);
       o += CHAR_STRING (cased, o);
     }
   eassert (o - dst <= o_size);
@@ -100,10 +139,9 @@ do_casify_multibyte_string (enum case_action flag, Lisp_Object obj)
 }

 static Lisp_Object
-do_casify_unibyte_string (enum case_action flag, Lisp_Object obj)
+do_casify_unibyte_string (struct casing_context *ctx, Lisp_Object obj)
 {
   ptrdiff_t i, size = SCHARS (obj);
-  bool inword = flag == CASE_DOWN;
   int ch, cased;

   obj = Fcopy_sequence (obj);
@@ -111,20 +149,13 @@ do_casify_unibyte_string (enum case_action flag, Lisp_Object obj)
     {
       ch = SREF (obj, i);
       MAKE_CHAR_MULTIBYTE (ch);
-      cased = ch;
-      if (inword && flag != CASE_CAPITALIZE_UP)
-        ch = downcase (ch);
-      else if (!uppercasep (ch)
-               && (!inword || flag != CASE_CAPITALIZE_UP))
-        ch = upcase (cased);
-      if ((int) flag >= (int) CASE_CAPITALIZE)
-        inword = (SYNTAX (ch) == Sword);
+      cased = case_character (ctx, ch);
       if (ch == cased)
         continue;
-      MAKE_CHAR_UNIBYTE (ch);
+      MAKE_CHAR_UNIBYTE (cased);
       /* If the char can't be converted to a valid byte, just don't change it */
-      if (ch >= 0 && ch < 256)
-        SSET (obj, i, ch);
+      if (cased >= 0 && cased < 256)
+        SSET (obj, i, cased);
     }
   return obj;
 }
@@ -132,20 +163,19 @@ do_casify_unibyte_string (enum case_action flag, Lisp_Object obj)
 static Lisp_Object
 casify_object (enum case_action flag, Lisp_Object obj)
 {
-  /* If the case table is flagged as modified, rescan it.  */
-  if (NILP (XCHAR_TABLE (BVAR (current_buffer, downcase_table))->extras[1]))
-    Fset_case_table (BVAR (current_buffer, downcase_table));
+  struct casing_context ctx;
+  prepare_casing_context (&ctx, flag, false);

   if (INTEGERP (obj))
-    return do_casify_integer (flag, obj);
+    return do_casify_integer (&ctx, obj);
   else if (!STRINGP (obj))
     wrong_type_argument (Qchar_or_string_p, obj);
   else if (!SCHARS (obj))
     return obj;
   else if (STRING_MULTIBYTE (obj))
-    return do_casify_multibyte_string (flag, obj);
+    return do_casify_multibyte_string (&ctx, obj);
   else
-    return do_casify_unibyte_string (flag, obj);
+    return do_casify_unibyte_string (&ctx, obj);
 }

 DEFUN ("upcase", Fupcase, Supcase, 1, 1, 0,
@@ -196,8 +226,6 @@ The argument object is not altered--the value is a copy. */)
 static void
 casify_region (enum case_action flag, Lisp_Object b, Lisp_Object e)
 {
-  int c;
-  bool inword = flag == CASE_DOWN;
   bool multibyte = !NILP (BVAR (current_buffer, enable_multibyte_characters));
   ptrdiff_t start, end;
   ptrdiff_t start_byte;
@@ -208,14 +236,12 @@ casify_region (enum case_action flag, Lisp_Object b, Lisp_Object e)
   ptrdiff_t opoint = PT;
   ptrdiff_t opoint_byte = PT_BYTE;

+  struct casing_context ctx;
+
   if (EQ (b, e))
     /* Not modifying because nothing marked */
     return;

-  /* If the case table is flagged as modified, rescan it.  */
-  if (NILP (XCHAR_TABLE (BVAR (current_buffer, downcase_table))->extras[1]))
-    Fset_case_table (BVAR (current_buffer, downcase_table));
-
   validate_region (&b, &e);
   start = XFASTINT (b);
   end = XFASTINT (e);
@@ -223,32 +249,25 @@ casify_region (enum case_action flag, Lisp_Object b, Lisp_Object e)
   record_change (start, end - start);
   start_byte = CHAR_TO_BYTE (start);

-  SETUP_BUFFER_SYNTAX_TABLE ();  /* For syntax_prefix_flag_p.  */
+  prepare_casing_context (&ctx, flag, true);

   while (start < end)
     {
-      int c2, len;
+      int ch, cased, len;

       if (multibyte)
         {
-          c = FETCH_MULTIBYTE_CHAR (start_byte);
-          len = CHAR_BYTES (c);
+          ch = FETCH_MULTIBYTE_CHAR (start_byte);
+          len = CHAR_BYTES (ch);
         }
       else
         {
-          c = FETCH_BYTE (start_byte);
-          MAKE_CHAR_MULTIBYTE (c);
+          ch = FETCH_BYTE (start_byte);
+          MAKE_CHAR_MULTIBYTE (ch);
           len = 1;
         }
-      c2 = c;
-      if (inword && flag != CASE_CAPITALIZE_UP)
-        c = downcase (c);
-      else if (!inword || flag != CASE_CAPITALIZE_UP)
-        c = upcase (c);
-      if ((int) flag >= (int) CASE_CAPITALIZE)
-        inword = ((SYNTAX (c) == Sword)
-                  && (inword || !syntax_prefix_flag_p (c)));
-      if (c != c2)
+      cased = case_character (&ctx, ch);
+      if (ch != cased)
         {
           last = start;
           if (first < 0)
@@ -256,18 +275,18 @@ casify_region (enum case_action flag, Lisp_Object b, Lisp_Object e)

           if (! multibyte)
             {
-              MAKE_CHAR_UNIBYTE (c);
-              FETCH_BYTE (start_byte) = c;
+              MAKE_CHAR_UNIBYTE (cased);
+              FETCH_BYTE (start_byte) = cased;
             }
-          else if (ASCII_CHAR_P (c2) && ASCII_CHAR_P (c))
-            FETCH_BYTE (start_byte) = c;
+          else if (ASCII_CHAR_P (cased) && ASCII_CHAR_P (ch))
+            FETCH_BYTE (start_byte) = cased;
           else
             {
-              int tolen = CHAR_BYTES (c);
+              int tolen = CHAR_BYTES (cased);
               int j;
               unsigned char str[MAX_MULTIBYTE_LENGTH];

-              CHAR_STRING (c, str);
+              CHAR_STRING (cased, str);
               if (len == tolen)
                 {
                   /* Length is unchanged.  */
-- 
2.8.0.rc3.226.g39d4020
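
For readers who want to try the inword state machine outside of Emacs, here is a minimal standalone sketch of it. It is not part of the patch and makes several simplifying assumptions: it handles ASCII only, it uses isalnum as a stand-in for SYNTAX (ch) == Sword, it uses toupper and tolower in place of the case-table lookups, and it leaves out the inbuffer / syntax_prefix_flag_p handling entirely. The is_word_char and casify_ascii helpers are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same order as in the patch; the code relies on flag >= CASE_CAPITALIZE.  */
enum case_action {CASE_UP, CASE_DOWN, CASE_CAPITALIZE, CASE_CAPITALIZE_UP};

/* Reduced context: the inbuffer field is omitted in this sketch.  */
struct casing_context {
  enum case_action flag;
  bool inword;
};

static void
prepare_casing_context (struct casing_context *ctx, enum case_action flag)
{
  ctx->flag = flag;
  /* CASE_DOWN starts (and stays) "in a word" so that everything is
     downcased; all other actions start outside of a word.  */
  ctx->inword = flag == CASE_DOWN;
}

/* Assumption: ASCII alphanumerics stand in for word syntax.  */
static bool
is_word_char (int ch)
{
  return isalnum ((unsigned char) ch) != 0;
}

static int
case_character (struct casing_context *ctx, int ch)
{
  if (ctx->inword)
    ch = ctx->flag == CASE_CAPITALIZE_UP ? ch : tolower ((unsigned char) ch);
  else
    ch = toupper ((unsigned char) ch);
  /* Only the capitalize variants track word boundaries.  */
  if (ctx->flag >= CASE_CAPITALIZE)
    ctx->inword = is_word_char (ch);
  return ch;
}

/* Hypothetical helper: case SRC into DST, which holds DST_SIZE bytes.  */
static void
casify_ascii (enum case_action flag, const char *src,
              char *dst, size_t dst_size)
{
  struct casing_context ctx;

  assert (src != NULL && dst != NULL);
  assert (strlen (src) < dst_size);
  prepare_casing_context (&ctx, flag);
  for (; *src; src++)
    *dst++ = (char) case_character (&ctx, (unsigned char) *src);
  *dst = '\0';
}
```

With these stand-ins, calling casify_ascii with CASE_CAPITALIZE on "foo BAR" fills the buffer with "Foo Bar", while CASE_CAPITALIZE_UP on "foo bAR" gives "Foo BAR", since that action upcases the first character of each word and leaves the rest untouched.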