From: Dmitry Antipov
Newsgroups: gmane.emacs.devel
Subject: Re: One more string functions change
Date: Sun, 29 Jun 2014 20:38:26 +0400
Message-ID: <53B04102.6060803@yandex.ru>
In-Reply-To: <83r427696f.fsf@gnu.org>
To: Eli Zaretskii
Cc: handa@gnu.org, emacs-devel@gnu.org

On 06/29/2014 07:13 PM, Eli Zaretskii wrote:

> It's possible that we should consider this now, but the answer to your
> question is not a trivial one in any case.  Emacs traditionally
> exposed to Lisp all the Unicode character properties, as char-tables.

Are these exposed properties really used from Lisp in a high-level,
user-defined manner? For example, is it desirable/possible to customize
related things via .emacs? Or is there a major/minor mode which relies
on the Lisp-visible character properties?

> If we decide to use ICU, we'd need to think what to do with those
> char-tables: remove them, populate them using ICU, something else?
> (Having these databases twice would be an unnecessary bloat, IMO.)

Yes, ICU itself is bloated enough. On my system, the shared library with
the compiled-in Unicode data is > 20M. Nevertheless, it's commonly
considered "not too bloated" even for relatively small systems like
modern Android-based gadgets.

> Some of these properties need to support very fast access (e.g., for
> bidi display), and the question is how fast is ICU in this regard.
> Also, many Unicode features are already implemented, so they should be
> reworked or refactored, or maybe the corresponding ICU features left
> unused.  And features that depend on Unicode, like font selection,
> will have to be adapted.

IIUC things are even worse, because ICU uses 16- and 32-bit quantities
to represent Unicode characters; this doesn't look too compatible with
our internal variable-size encoding, which is 1 to 5 bytes wide.

Dmitry