From: Eli Zaretskii
Newsgroups: gmane.emacs.help
Subject: Re: "Unidecode" functionality in Emacs
Date: Tue, 20 Mar 2018 08:39:10 +0200
Message-ID: <83o9jjmesx.fsf@gnu.org>
References: <87bmfjgx55.fsf@iki.fi>
In-reply-to: <87bmfjgx55.fsf@iki.fi> (message from Teemu Likonen on Tue, 20 Mar 2018 06:59:34 +0200)
To: help-gnu-emacs@gnu.org

> From: Teemu Likonen
> Date: Tue, 20 Mar 2018 06:59:34 +0200
> Cc: Help Gnu Emacs mailing list
>
> I don't know of any built-in functions but external "iconv" tool can do
> similar thing for Latin scripts. Here's an example Emacs Lisp function
> wrapper for "iconv":
>
>     (defun tl-ascii-translit (string)
>       (with-temp-buffer
>         (insert string)
>         (call-process-region (point-min) (point-max)
>                              "iconv" t t nil "-t" "ASCII//TRANSLIT")
>         (buffer-substring-no-properties (point-min) (point-max))))
>
> Works for Latin scripts:
>
>     (tl-ascii-translit "Déjà vu") ;=> "Deja vu"
>     (tl-ascii-translit "北亰") ;=> "??"

iconv's "TRANSLIT" is not the transliteration that's sought here.
It's an attempt to present similar-looking characters when the
original character is not in the target character set (ASCII in the
above snippet). So it's small wonder this only works for European
scripts, because no ASCII character can ever "look like" characters
in other scripts.
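
To make the "look-alike" point concrete, here is a minimal sketch of the same idea done inside Emacs, without iconv. It assumes the ucs-normalize and seq libraries that ship with Emacs; the name my-strip-diacritics is made up for this illustration and is not an existing Emacs function. It decomposes each character (NFD) and drops the combining marks, which roughly matches what TRANSLIT yields for accented Latin letters, and it has the same limitation: it is not transliteration.

```elisp
(require 'ucs-normalize)
(require 'seq)

(defun my-strip-diacritics (string)
  "Return STRING with combining marks removed after NFD decomposition.
Hypothetical helper for illustration only: it maps accented Latin
letters to their base letters and leaves every other character alone."
  (concat
   (seq-remove
    ;; Drop nonspacing marks (general category Mn), e.g. U+0301.
    (lambda (ch)
      (eq (get-char-code-property ch 'general-category) 'Mn))
    ;; NFD splits "é" into "e" followed by a combining acute accent.
    (ucs-normalize-NFD-string string))))

(my-strip-diacritics "Déjà vu") ;=> "Deja vu"
```

Han characters such as those in "北亰" have no decomposition into an ASCII base letter plus marks, so the function returns them unchanged; getting "Bei Jing" out of them needs a per-script romanization table, which is what Unidecode-style tools provide.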