From: Jean Louis
Newsgroups: gmane.emacs.help
Subject: Re: Unicode fonts - Re: Why do I find ^L in elisp code?
Date: Mon, 24 May 2021 23:19:27 +0300
To: Eli Zaretskii
Cc: help-gnu-emacs@gnu.org
In-Reply-To: <83r1hwc870.fsf@gnu.org>

* Eli Zaretskii [2021-05-24 21:13]:
> > Date: Mon, 24 May 2021 21:05:28 +0300
> > From: Jean Louis
> > Cc: help-gnu-emacs@gnu.org
> >
> > > Would you also like "реасе" to be supported by English screen
> > > readers, for example?
> >
> > Definitely, just that I don't understand the meaning of your
> > question. Do you mean that piece and peace would be spoken same?
>
> Look closer: that word I wrote is not "peace".

(•◡•) Good trick to demonstrate your point. That style is already used
extensively on social media: letters are taken from places where they
do not belong and used for expression. That is true and, IMHO, it is up
to artificial intelligence to try to decipher that. And it is possible.

There is the Mozilla Voice project, where people donate their voices
for voice recognition: https://voice.mozilla.org
There people speak and listen, and they tell the computer what the
recorded voice means. By the same principle people could provide
submissions, and even реасе could be interpreted as "peace" in English
if it appears in an English context.

Google does a similar thing on https://translate.google.com where it
asks users to correct translations.

𝗛𝗲𝗹𝗹𝗼 𝘁𝗵𝗲𝗿𝗲

𝗚𝗼𝗼𝗱 𝗲𝘅𝗮𝗺𝗽𝗹𝗲:
https://translate.google.com/?sl=auto&tl=it&text=%F0%9D%97%9B%F0%9D%97%B2%F0%9D%97%B9%F0%9D%97%B9%F0%9D%97%BC%20%F0%9D%98%81%F0%9D%97%B5%F0%9D%97%B2%F0%9D%97%BF%F0%9D%97%B2&op=translate

In that example one can see that Google's artificial intelligence
recognizes 𝗛𝗲𝗹𝗹𝗼 𝘁𝗵𝗲𝗿𝗲 as English. Click on the speech icon and it will
speak the English perfectly, and the 𝗵𝗲𝗹𝗹𝗼 𝘁𝗵𝗲𝗿𝗲 on the Italian side will
be spoken in English with an Italian accent. The fact is, Google's
artificial intelligence does recognize Mathematical Sans-Serif Bold.

There could be a more global, free software licensed artificial
intelligence that collects the meanings from people, whatever they
may be.

> > > You are judging characters by their appearance, which is incorrect.
> >
> > Yes, surely I understand it may be technically incorrect, though
> > humanely it gives a style even in those cases where text style cannot
> > be otherwise assigned.
>
> No, that's a slippery slope towards the so-called "confusables", see
>
>   https://websec.github.io/unicode-security-guide/visual-spoofing/

I understand your rejection as a programmer of Emacs, and that is fine
in that context. On the other side, though, Emacs is used by thousands
of artists who express themselves beyond technicalities. Programmers of
new software have to be aware of new developments and thus take Unicode
symbols into account.

One may call some of those "confusables", but the real problem is in
Unicode's fundamental design of those characters, or in their lack of
attributes. If 𝗔 is technically not A, humanely it is still "A"; it may
be displayed slightly differently, but it remains the letter A.

If Unicode assigned some attributes or additional type meanings to such
a character, programs would more easily get the information on how to
treat it. At present that is possible only on a higher level, by
telling the program how to treat a character, but Unicode could inject
the type into the character itself on a fundamental level.
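
As a side note on what is available today: the Unicode Character
Database already records one attribute of this kind. The Mathematical
Sans-Serif Bold letters carry a compatibility decomposition to the
plain letters, and NFKC normalization applies it. The Python sketch
below is only an illustration under that assumption, not something
proposed in this thread; `fold_styled` is a hypothetical helper name,
and the encoded string is the text parameter of the Google Translate
URL above. NFKC does not fold Cyrillic look-alikes into Latin, so it
does not by itself address "confusables".

```python
import unicodedata
from urllib.parse import unquote

# The "text" parameter of the Google Translate URL quoted in this message.
ENCODED = ("%F0%9D%97%9B%F0%9D%97%B2%F0%9D%97%B9%F0%9D%97%B9%F0%9D%97%BC"
           "%20%F0%9D%98%81%F0%9D%97%B5%F0%9D%97%B2%F0%9D%97%BF%F0%9D%97%B2")


def fold_styled(text):
    """Fold compatibility variants (such as Mathematical Sans-Serif
    Bold letters) to their base characters using NFKC normalization.

    fold_styled is a hypothetical helper name for this sketch.
    """
    return unicodedata.normalize("NFKC", text)


styled = unquote(ENCODED)  # percent-decoding assumes UTF-8, the default

# What the Unicode Character Database records about the first character.
print(unicodedata.name(styled[0]))
print(unicodedata.decomposition(styled[0]))

# The styled text folded back to plain letters.
print(fold_styled(styled))

# A Cyrillic small letter a (U+0430) has no such decomposition,
# so NFKC leaves it unchanged.
print(fold_styled("\u0430") == "\u0430")
```

A screen reader or search function could apply such a fold before
speaking or matching text, while the display keeps the styled form.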
The type could tell that a character is readable, or not readable, or
similar to other characters, and so on, and programs could interpret it
correctly.

That would be a fundamental solution to the problem, including the
"confusables": the type would be fundamental, downloaded from a central
place like Unicode, and programs would just need to read the type and
tell the user that pаураl.com is not equal to paypal.com. A mechanism
for that already exists, namely the user's preferred language when
browsing. It does not yet apply to internationalized domain names, but
it is up to browser authors to harmonize that.

If the user wishes to read the English language, then it is up to the
browser to say "No, this domain `pаураl.com' has some Cyrillic
characters, do you really wish to proceed?"

Computers need teaching. A browser could be instructed to watch out for
the alphabet that the user uses; a screen reader could be instructed by
the attribute of 𝗔 that it also represents the letter A, and so read it
properly instead of reading only the Latin alphabet.

Emacs has its properties that can keep such data, and as such it could
be an exemplary way of presenting such text or sets of characters.

Jean

Take action in Free Software Foundation campaigns:
https://www.fsf.org/campaigns

Sign an open letter in support of Richard M. Stallman
https://stallmansupport.org/
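
P.S. The browser check described above, warning when a domain mixes
Cyrillic letters into an otherwise Latin name, can be approximated with
data that already ships with Python. This is a rough sketch only: the
function names `scripts_in` and `looks_mixed` are hypothetical, and
taking the first word of a character's Unicode name as its script is an
assumption that works for Latin and Cyrillic but is not the real
Unicode Script property. The spoofed domain is written with escapes so
that the Cyrillic letters are visible in the source.

```python
import unicodedata


def scripts_in(label):
    """Return the approximate set of scripts used by the letters in label.

    Assumption for this sketch: the first word of a character's Unicode
    name ("LATIN", "CYRILLIC", ...) stands in for its script.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_mixed(domain):
    """True when the letters of domain come from more than one script."""
    return len(scripts_in(domain)) > 1


# "paypal.com" with Cyrillic a, u, er, a in place of Latin a, y, p, a.
spoof = "p\u0430\u0443\u0440\u0430l.com"

for domain in (spoof, "paypal.com"):
    if looks_mixed(domain):
        print(ascii(domain), "mixes scripts:", sorted(scripts_in(domain)))
    else:
        print(ascii(domain), "uses a single script")
```

A real browser would need more than this, for instance the user's
preferred languages and an allowance for domains written wholly in one
non-Latin script.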