From: Teemu Likonen
Newsgroups: gmane.emacs.help
Subject: Re: "Unidecode" functionality in Emacs
Date: Tue, 20 Mar 2018 06:59:34 +0200
Message-ID: <87bmfjgx55.fsf@iki.fi>
To: John Mastro
Cc: Help Gnu Emacs mailing list

John Mastro [2018-03-19 15:04:29-07] wrote:

> There are "Unidecode" packages for Perl[1], Python[2], and Emacs[3]
> (derived from one another in that order). They each transliterate
> Unicode text to ASCII, e.g.:
>
>     (unidecode "Déjà vu")
>     ;=> "Deja vu"
>     (unidecode "北亰")
>     ;=> "Bei Jing "
>
> Does Emacs have equivalent functionality built-in?

I don't know of any built-in functions, but the external "iconv" tool
can do a similar thing for Latin scripts. Here's an example Emacs Lisp
function wrapper for "iconv":

    (defun tl-ascii-translit (string)
      "Transliterate STRING to ASCII with the external iconv tool."
      (with-temp-buffer
        (insert string)
        ;; Pipe the whole buffer through iconv and replace the buffer
        ;; contents with its output.
        (call-process-region (point-min) (point-max) "iconv" t t nil
                             "-t" "ASCII//TRANSLIT")
        (buffer-substring-no-properties (point-min) (point-max))))

It works for Latin scripts:

    (tl-ascii-translit "Déjà vu")
    ;=> "Deja vu"
    (tl-ascii-translit "北亰")
    ;=> "??"

-- 
/// Teemu Likonen - .-.. //
// PGP: 4E10 55DC 84E9 DFF6 13D7 8557 719D 69D3 2453 9450 ///
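The conversion that the wrapper delegates to can also be tried directly at a shell prompt, which helps when checking how a particular iconv behaves before calling it from Emacs. The following is only a minimal sketch: it assumes an iconv that understands the //TRANSLIT suffix (glibc and GNU libiconv do), the helper name ascii_translit is made up for illustration, and the explicit -f UTF-8 is an added assumption, since the wrapper in the message passes only -t and leaves the source encoding to iconv's default.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original message): transliterate
# UTF-8 text on stdin to ASCII, using the same target as the Lisp wrapper.
ascii_translit() {
    # -f UTF-8 is an assumption; the message's wrapper passes only -t.
    iconv -f UTF-8 -t ASCII//TRANSLIT
}

# "Deja vu" with accents; octal escapes keep this script pure ASCII.
printf 'D\303\251j\303\240 vu\n' | ascii_translit ||
    echo "iconv reported a conversion problem" >&2

# The CJK sample from the message, for which it reports "??".
printf '\345\214\227\344\272\260\n' | ascii_translit ||
    echo "iconv reported a conversion problem" >&2
```

The exact output can differ between iconv implementations and locales, so the results quoted in the message should be read as what one system produced rather than as guaranteed behaviour.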