From: Lars Ingebrigtsen
Newsgroups: gmane.emacs.devel
Subject: Re: Entering emojis
Date: Wed, 27 Oct 2021 14:44:05 +0200
Message-ID: <871r46d4dm.fsf@gnus.org>
To: Eli Zaretskii
Cc: rpluim@gmail.com, emacs-devel@gnu.org, gregory@heytings.org, stefankangas@gmail.com

Eli Zaretskii writes:

>> How am I supposed to go from GOLFER to that glyph? From POLICE OFFICER
>> it's no problem getting to "woman police officer: light skin tone",
>> because those use the same name in the UCS file and in the zwj file, but
>> GOLFING isn't the same as GOLFER.
>
> How many such cases are there? Can't you have a small database of
> such "translations"?

I'd prefer things to work automatically -- then there'll be no need to
maintain this stuff as Unicode adds new things every year. But... that's
perhaps a forlorn hope. We'll see; things seem to be working pretty
well now with:

>> So the first codepoint is what matters for determining the variants?
>
> Yes, AFAIK.

I'm now doing (some) mapping based on that instead, and that does indeed
fix the problem with golfing. But it seems like it might give some
false positives (i.e., it thinks that some things that shouldn't be
derived are derived), so more tweaking might be needed in the algorithm.

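To illustrate the idea of keying variants on the first codepoint rather
than on matching names, here is a minimal Python sketch. It is not the
Emacs code under discussion: the function name group_by_first_codepoint
and the hand-written SEQUENCES table are assumptions made up for this
example, and the "family" entry is only a guess at the kind of false
positive such a rule could produce.

```python
import unicodedata
from collections import defaultdict

# Hypothetical sample data, typed in by hand for illustration only.
# Keys are sequence names; values are the codepoint sequences.
SEQUENCES = {
    "man golfing": "\U0001F3CC\uFE0F\u200D\u2642\uFE0F",
    "woman golfing: light skin tone": "\U0001F3CC\U0001F3FB\u200D\u2640\uFE0F",
    "woman police officer: light skin tone": "\U0001F46E\U0001F3FB\u200D\u2640\uFE0F",
    # Starts with MAN, so a first-codepoint rule files it under MAN,
    # which is arguably not a "variant" of that emoji.
    "family: man, woman, boy": "\U0001F468\u200D\U0001F469\u200D\U0001F466",
}


def group_by_first_codepoint(sequences):
    """Group sequence names under the Unicode name of their first codepoint."""
    groups = defaultdict(list)
    for name, seq in sequences.items():
        base = unicodedata.name(seq[0])  # e.g. "GOLFER" for U+1F3CC
        groups[base].append(name)
    return dict(groups)


groups = group_by_first_codepoint(SEQUENCES)
for base, names in groups.items():
    print(base, "->", names)
```

With this rule, the "golfing" sequences end up under GOLFER even though
the names differ, which is the case that name matching could not handle.
The family sequence landing under MAN shows why some extra filtering
would likely be needed on top.
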
I think it's "basically working", but I need to rewrite that algo
anyway, because it's a bit of a mess with all the tweaking back and
forth, and I may just have confused myself...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no