unofficial mirror of bug-gnu-emacs@gnu.org 
From: Michal Nazarewicz <mina86@mina86.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 24603@debbugs.gnu.org
Subject: bug#24603: [PATCH 0/3] Case table updates
Date: Mon, 24 Oct 2016 17:11:22 +0200
Message-ID: <xa1tinsheq91.fsf@mina86.com>
In-Reply-To: <83bmyixixc.fsf@gnu.org>

On Tue, Oct 18 2016, Eli Zaretskii wrote:
>> From: Michal Nazarewicz <mina86@mina86.com>
>> Cc: eliz@gnu.org
>> Date: Tue, 18 Oct 2016 00:03:42 +0200
>> 
>> As I continue working on the patchset, it keeps on growing.  To
>> somewhat limit that, I’ll start applying the patches.
>
> This is up to you to some degree, but I'd like to point out that it is
> not necessary to apply patches piecemeal.  You can merge to, or rebase
> on, master everything in one go when you are done, there should be no
> technical difficulties with that whatsoever.

Correct.  It’s more of a personal issue than a technical one.  The
patchset keeps growing and it’s getting somewhat harder for me to keep
track of it.

> Another potential issue I'd like us to avoid is to modify the same
> parts of the code several times in related commits.  If that happens,
> I'd prefer a single commit that changes them only once.
>
> That said, separate pushes are justified if the parts you push provide
> self-contained significant features or improvements.

The second patch adds missing entries to the case table, which is
a self-contained change.

The first patch is somewhat trickier in that regard since it has a lot
of FIXME comments and its commit message mentions future patches.

>> The first two map to the first two from the original submission.  The
>> first gained tests for byte-8 characters and the second includes
>> changes requested by Eli.
>> 
>> The third is a new patch.
>> 
>> Michal Nazarewicz (3):
>>   Add tests for casefiddle.c
>>   Generate upcase and downcase tables from Unicode data
>>   Don’t generate ‘X maps to X’ entries in case tables
>
> Can't the 3rd patch break some code which assumes the current state of
> affairs, i.e. that the case-table entries for characters with no case
> variants are identical to the character itself?  IOW, this sounds like
> an incompatible change, so it should be mentioned as such in NEWS, and
> perhaps we should make sure we don't break too much code out there,
> not sure if that is possible.

I don’t think it can.  The only places where I could find case tables
being used directly (that is, where the values stored in them are read)
were functions in buffer.h, and those have an explicit path for missing
entries:

    /* Downcase a character C, or make no change if that cannot be done.  */
    INLINE int
    downcase (int c)
    {
      Lisp_Object downcase_table = BVAR (current_buffer, downcase_table);
      Lisp_Object down = CHAR_TABLE_REF (downcase_table, c);
      return NATNUMP (down) ? XFASTINT (down) : c;
    }
    
    /* Upcase a character C known to be not upper case.  */
    INLINE int
    upcase1 (int c)
    {
      Lisp_Object upcase_table = BVAR (current_buffer, upcase_table);
      Lisp_Object up = CHAR_TABLE_REF (upcase_table, c);
      return NATNUMP (up) ? XFASTINT (up) : c;
    }
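
To make the fallback concrete, here is a minimal standalone sketch of
the same pattern.  It is not Emacs code: the fixed-size int array,
TOY_NO_ENTRY (standing in for a nil entry) and the toy_* names are all
assumptions made up for illustration.

```c
#include <assert.h>

/* Toy stand-in for a case table covering ASCII only.  TOY_NO_ENTRY
   plays the role of a nil (missing) entry.  None of these names
   exist in Emacs; they are hypothetical.  */
enum { TOY_TABLE_SIZE = 128, TOY_NO_ENTRY = -1 };

/* Fill TABLE so that only 'A'..'Z' have entries; every other
   character, including the lower-case letters, is left missing.  */
void
toy_init_downcase (int table[TOY_TABLE_SIZE])
{
  for (int c = 0; c < TOY_TABLE_SIZE; c++)
    table[c] = TOY_NO_ENTRY;
  for (int c = 'A'; c <= 'Z'; c++)
    table[c] = c - 'A' + 'a';
}

/* Downcase C, or return C unchanged when TABLE has no entry for it.
   This mirrors the "NATNUMP (down) ? XFASTINT (down) : c" fallback.  */
int
toy_downcase (const int table[TOY_TABLE_SIZE], int c)
{
  assert (c >= 0 && c < TOY_TABLE_SIZE);
  int down = table[c];
  return down >= 0 ? down : c;
}
```

With such a table, toy_downcase (table, 'A') yields 'a', while '1',
which has no entry at all, comes back unchanged.  A reader of the
table that goes through this kind of accessor cannot tell a missing
entry from an ‘X maps to X’ entry.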

> What kind of memory savings does this produce, in terms of memory
> footprint of a running Emacs process?

Actually, the more I think about it, the less I’m sure the savings are
there.  Since char tables allocate memory in batches, the space may
still be allocated but simply unused.  I’ll drop the patch for now.
Maybe I’ll have time to investigate further at some future date.
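
The concern can be sketched with a toy two-level table.  This is an
illustration under loud assumptions, not Emacs’s actual char-table
layout, which is more involved: the chunk size, the struct and the
toy_* names are all hypothetical.  It only shows that once any entry
in a chunk is set, the whole chunk gets allocated, so leaving out the
remaining entries of that chunk saves nothing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout: 8 chunks of 16 entries each.  */
enum { CHUNK_BITS = 4, CHUNK_SIZE = 1 << CHUNK_BITS, NUM_CHUNKS = 8,
       CHUNK_EMPTY = -1 };

/* A null pointer means that no entry in that chunk was ever set.  */
struct toy_chunked_table { int *chunks[NUM_CHUNKS]; };

/* Store VALUE for C, allocating the whole chunk containing C on first
   use.  Return 0 on success, -1 if the allocation fails.  */
int
toy_set (struct toy_chunked_table *t, int c, int value)
{
  assert (c >= 0 && c < NUM_CHUNKS * CHUNK_SIZE);
  int **slot = &t->chunks[c >> CHUNK_BITS];
  if (!*slot)
    {
      *slot = malloc (CHUNK_SIZE * sizeof **slot);
      if (!*slot)
        return -1;
      for (int i = 0; i < CHUNK_SIZE; i++)
        (*slot)[i] = CHUNK_EMPTY;
    }
  (*slot)[c & (CHUNK_SIZE - 1)] = value;
  return 0;
}

/* Total bytes held by allocated chunks, regardless of how many
   entries inside them are actually in use.  */
size_t
toy_allocated_bytes (const struct toy_chunked_table *t)
{
  size_t n = 0;
  for (int i = 0; i < NUM_CHUNKS; i++)
    if (t->chunks[i])
      n += CHUNK_SIZE * sizeof (int);
  return n;
}
```

In this toy model, a table with a single entry in a chunk and a table
with all sixteen entries of that chunk set report the same allocated
size.  Whether the same holds for the real case tables is exactly what
would need investigating.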

> I have no objections or comments to the other 2 patches.

So yeah, I dunno how strongly you feel about it.  I certainly can wait
until the whole patchset is ready, but that may take a while.

-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»





