From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Eli Zaretskii <eliz@gnu.org>
Newsgroups: gmane.emacs.devel
Subject: Re: Can watermarking Unicode text using invisible differences sneak
 through Emacs, or can Emacs detect it?
Date: Fri, 04 Feb 2022 10:03:08 +0200
Message-ID: <83tudf2h4z.fsf@gnu.org>
References: <E1nA2O7-0005jJ-RT@fencepost.gnu.org> <87sftk49ih.fsf@yahoo.com>
 <ac1b33c3ee4372818f4f081ae0e83fb2@webmail.orcon.net.nz>
 <837dawt0h4.fsf@gnu.org> <E1nAlIo-00039J-V9@fencepost.gnu.org>
 <838rv9plyf.fsf@gnu.org> <E1nB8A3-0002U0-JI@fencepost.gnu.org>
 <837dasntoj.fsf@gnu.org> <E1nBr3H-0002nk-HU@fencepost.gnu.org>
 <834k5tl4a9.fsf@gnu.org> <87mtjkt6m9.fsf@gmail.com> <83ilu8htws.fsf@gnu.org>
 <E1nCZ9R-0005I0-5n@fencepost.gnu.org>
 <3E718CA2-889F-4AEE-B79C-EB3A221D1CB2@gnu.org>
 <E1nDQvp-0005vd-Sz@fencepost.gnu.org> <83o83wc7gs.fsf@gnu.org>
 <E1nE1eT-0006oH-WB@fencepost.gnu.org> <8335l5brov.fsf@gnu.org>
 <E1nENtO-000603-J6@fencepost.gnu.org> <83mtjc838i.fsf@gnu.org>
 <E1nElMz-0003cv-CT@fencepost.gnu.org> <83zgna7hyd.fsf@gnu.org>
 <E1nF6nB-0006D1-WE@fencepost.gnu.org> <83ee4l78rw.fsf@gnu.org>
 <E1nFTf2-0001mh-6s@fencepost.gnu.org> <E1nFpdn-00053c-5X@fencepost.gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
	logging-data="26016"; mail-complaints-to="usenet@ciao.gmane.io"
Cc: psainty@orcon.net.nz, luangruo@yahoo.com, kevin.legouguec@gmail.com,
 emacs-devel@gnu.org
To: rms@gnu.org
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Fri Feb 04 09:49:08 2022
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane-mx.org
Original-Received: from lists.gnu.org ([209.51.188.17])
	by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(Exim 4.92)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>)
	id 1nFuHE-0006bB-0N
	for ged-emacs-devel@m.gmane-mx.org; Fri, 04 Feb 2022 09:49:08 +0100
Original-Received: from localhost ([::1]:37894 helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>)
	id 1nFuHC-0007KL-Su
	for ged-emacs-devel@m.gmane-mx.org; Fri, 04 Feb 2022 03:49:06 -0500
Original-Received: from eggs.gnu.org ([209.51.188.92]:43696)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@gnu.org>) id 1nFtYo-0000Ac-I9
 for emacs-devel@gnu.org; Fri, 04 Feb 2022 03:03:20 -0500
Original-Received: from [2001:470:142:3::e] (port=44034 helo=fencepost.gnu.org)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@gnu.org>)
 id 1nFtYm-0001Cs-Mg; Fri, 04 Feb 2022 03:03:12 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-version:References:Subject:In-Reply-To:To:From:
 Date; bh=g4rsP2EgQsYZHv9ifyHUTtHbl3xpOOKOIznSvx98YcU=; b=OfYbColGqSIAdkNCVYc4
 HFfXxB2fVk9Dl8qDwaGsr+CC6lKCmks0sjnPgSmMA6s7kFN07FW0weZr4awG8ANV7vwic6TeGH6Hj
 A7526BwLGSIgvqQusZqUzrA2kjitHN6O2jnljgWFvQaUHTnywm5X5eOia4ezOBxsTGQGGkbSER1cz
 NHGVcRxgWHqR/FvAg1JKvfILFj0a+d86cFTL164HwbXe1nsqsa6EaNBMBy9AtbQXvz3kMePiFq/UL
 8GAMBb+VlUPRheaq18vcm6sNmIzilzOiIdpMFJZLXdlqd/QNWHqWYDAOkTBju5NsYXlXkgQvurzV+
 aBEB2qrrWtBQjQ==;
Original-Received: from [87.69.77.57] (port=4593 helo=home-c4e4a596f7)
 by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@gnu.org>)
 id 1nFtYc-0004Vu-EJ; Fri, 04 Feb 2022 03:03:04 -0500
In-Reply-To: <E1nFpdn-00053c-5X@fencepost.gnu.org> (message from Richard
 Stallman on Thu, 03 Feb 2022 22:52:07 -0500)
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
 <mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <https://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
 <mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org
Original-Sender: "Emacs-devel"
 <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>
Xref: news.gmane.io gmane.emacs.devel:285856
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/285856>

> From: Richard Stallman <rms@gnu.org>
> Cc: eliz@gnu.org, psainty@orcon.net.nz, luangruo@yahoo.com,
> 	emacs-devel@gnu.org, kevin.legouguec@gmail.com
> Date: Thu, 03 Feb 2022 22:52:07 -0500
> 
> It would be useful to be able to analyze and construct complex
> characters -- for instance, to operate on a-with-breve-and-tilde
> and find out that it represents an a with two diacritics.

This already exists; see below.  But you seem to have something
different in mind:

> So I propose a function, `diacriticize'.  Its arguments are
> characters, and if they can be graphically combined to make a single
> character, that's what diacriticize returns.  Otherwise, it returns
> nil.
> 
>   (diacriticize ?a ?~ ?˘) => ?ã˘
>   (diacriticize ?a ?Z) => nil
> 
> It could have an inverse function, criticanalyze, which given the
> character code for a character that is (in spirit) a composition,
> would return the characters it consists of:
> 
>   (criticanalyze ?ã˘) => (?a ?~ ?˘)
> 
> With these functions, latin1-display could figure out automatically
> which conversions to make.

I don't understand the specification of these functions.  How would
diacriticize decide/know that ?~ is equivalent to the ?̃ (U+0303
COMBINING TILDE) that is part of ?ã ?  We do have infrastructure in
place to decompose characters like ã into the base character ?a and
the combining diacritic(s): the call (ucs-normalize-NFD-string "ã")
returns a string of 2 characters, ?a and ?̃.  But how do you propose
to make the leap from ?̃ to ?~ ?
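
To make the gap concrete, here is a minimal sketch.  The helper
my-decompose is hypothetical, named only for this illustration; the
other calls are existing Emacs functions.  It shows that the canonical
decomposition of U+00E3 yields U+0303, and that the ASCII tilde U+007E
never appears in it:

```elisp
(require 'ucs-normalize)

(defun my-decompose (char)
  "Return the NFD decomposition of CHAR as a list of characters.
Hypothetical helper, for illustration only."
  (string-to-list (ucs-normalize-NFD-string (string char))))

;; U+00E3 is the precomposed LATIN SMALL LETTER A WITH TILDE.
(my-decompose #x00E3)                   ; => (97 771), i.e. ?a and U+0303

;; The second element is the combining mark, not the ASCII tilde.
(get-char-code-property #x0303 'name)   ; => "COMBINING TILDE"

;; U+007E (?~) is not part of the decomposition at all.
(memq ?~ (my-decompose #x00E3))         ; => nil
```

Nothing in this output relates U+0303 to U+007E, so any function
accepting ?~ as an argument would presumably need some additional
mapping from spacing characters to combining ones.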