From: Jason Rumney
Newsgroups: gmane.emacs.devel
Subject: Re: highlighting non-ASCII characters
Date: Wed, 24 Mar 2010 13:14:13 +0800
Message-ID: <4BA99FA5.2030902@gnu.org>
In-Reply-To: <83iq8mgxjw.fsf@gnu.org>
To: Eli Zaretskii
Cc: tzz@lifelogs.com, Stefan Monnier, emacs-devel@gnu.org

On 24/03/2010 12:20, Eli Zaretskii wrote:
> If we go for such a metric, it would need to be augmented by a
> database of words where a small number of such characters is
> ``normal'', not to be highlighted.  This is for words like naïve.
> Otherwise the feature will be an annoyance.

It's also dependent on which characters they are - Cyrillic, Han,
Greek, Hebrew etc should be expected to appear in long runs, perhaps
with runs of ASCII and/or other characters interleaved.  Latin-1 on
the other hand would normally appear individually or in very short
runs mixed in with ASCII.

There is no single heuristic that can be used to identify
"suspicious" characters.
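
To make the point about run lengths concrete, here is a rough Python sketch (not Emacs code, and not anything proposed in this thread). The function names `script_of` and `script_runs` are made up for illustration, and taking the first word of the Unicode character name as the "script" is only an approximation I am assuming for the demo.

```python
import unicodedata
from itertools import groupby


def script_of(ch):
    """Rough script label for one character (an approximation, see above).

    ASCII characters are labelled "ASCII"; anything else gets the first
    word of its Unicode name, e.g. "LATIN", "CYRILLIC", "GREEK", "CJK".
    """
    if ord(ch) < 128:
        return "ASCII"
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def script_runs(text):
    """Return (script, run_length) pairs for consecutive same-script characters."""
    return [(script, len(list(group)))
            for script, group in groupby(text, key=script_of)]


if __name__ == "__main__":
    # A single accented Latin letter sits alone between ASCII runs.
    print(script_runs("na\u00efve"))
    # A Cyrillic word forms one long run with no ASCII at all.
    print(script_runs("\u043f\u0440\u0438\u0432\u0435\u0442"))
```

For "naïve" this gives an ASCII run of 2, a LATIN run of 1 and another ASCII run of 2, while the Cyrillic word comes out as one run of 6. A rule that flags isolated non-ASCII characters would flag the first, and a rule that flags long non-ASCII runs would flag the second, so any threshold would have to depend on the script.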