From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Eli Zaretskii"
Newsgroups: gmane.emacs.bugs
Subject: Re: user sees \xxx but is thwarted from searching for them
Date: Tue, 16 Apr 2002 15:32:11 +0300
Message-ID: <7377-Tue16Apr2002153210+0300-eliz@is.elta.co.il>
Reply-To: Eli Zaretskii
Original-To: David.Kastrup@t-online.de
Cc: bug-gnu-emacs@gnu.org
In-Reply-To: (message from David Kastrup on 16 Apr 2002 13:36:45 +0200)
X-Mailer: emacs 21.2.50 (via feedmail 8 I) and Blat ver 1.8.9
List-Id: Bug reports for GNU Emacs, the Swiss army knife of text editors
Xref: main.gmane.org gmane.emacs.bugs:705

> From: David Kastrup
> Newsgroups: gnu.emacs.bug
> Date: 16 Apr 2002 13:36:45 +0200
>
> eliz@is.elta.co.il (Eli Zaretskii) writes:
>
> > On 16 Apr 2002, Dan Jacobson wrote:
> >
> > > Anyway, the user sees a \. The user wants to hunt for a \. The user
> > > must have a Ph.D. to hunt for a \.
> >
> > Not really. `M-: (skip-chars-forward "\000-\177") RET' will do.
> > Wrapping this into a simple user command is left as an exercise for the
> > interested reader.
>
> That's exactly what Dan means by "must have a Ph.D.". It is easy, but
> non-obvious.

It's easy once you know what to do.  To _know_ it might require
specific knowledge, but to _use_ it does not.

> We have regular expressions like [::ascii::] or so, perhaps something
> like [::encodable-in-the-current-default-encoding::]
> [::not-encodable-in-latin2::] (look for better names) would be a
> first shot at making things easier to wrap into user accessible
> functions.

This was discussed in preparation for Emacs 21.1, and turned out to be
a very complex job.

The main problem is that, contrary to what users may expect, Emacs
does not actually know which characters prevent it from encoding the
buffer in the default coding systems.  The code which implements this
test (see the function select-safe-coding-system and its subroutines)
calls primitives that don't return this information.  Instead, they
return a list of encodings that can safely encode all of the
characters in the region; Emacs then compares that list with the list
of default and preferred encodings, and if these two lists don't
intersect, it pops up the question.

Several alternatives were suggested to show the offending characters,
but IIRC they were all non-trivial.

On top of that, all the effort to implement that will go down the
drain when Emacs switches to a Unicode-based internal representation
of characters.  And since Handa-san, who does most of the Mule-related
development, is currently busy working on Unicode support... well, you
can guess the rest.
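
For readers who want to try the two ideas discussed in this message,
here is a minimal sketch in Emacs Lisp.  It is an illustration under
stated assumptions, not code from Emacs itself: the names
my-skip-to-non-ascii and my-defaults-can-encode-p are hypothetical.
The first function wraps the skip-chars-forward call quoted above into
an interactive command.  The second is a much simplified picture of
the list comparison described above, assuming that
find-coding-systems-region returns base coding-system names; the real
select-safe-coding-system does considerably more.

```elisp
;; Hypothetical command name; not part of Emacs.
(defun my-skip-to-non-ascii ()
  "Move point forward to the next non-ASCII character, if there is one."
  (interactive)
  ;; Skip over every character in the range \000 through \177 (ASCII).
  (skip-chars-forward "\000-\177")
  (if (eobp)
      (message "No non-ASCII character after point")
    (message "Next non-ASCII character: %c" (following-char))))

;; Hypothetical helper; a simplified illustration only.
(defun my-defaults-can-encode-p (beg end defaults)
  "Return non-nil if some coding system in DEFAULTS can encode BEG..END.
DEFAULTS is a list of coding-system symbols."
  (let ((safe (find-coding-systems-region beg end))
        (ok nil))
    (if (equal safe '(undecided))
        ;; A text with no multibyte characters yields (undecided),
        ;; which means any coding system will do.
        t
      ;; Otherwise look for an intersection between the two lists.
      ;; Nothing here says WHICH characters caused a mismatch.
      (dolist (cs defaults ok)
        (when (memq (coding-system-base cs) safe)
          (setq ok t))))))
```

After evaluating this, `M-x my-skip-to-non-ascii RET' moves point to
the next non-ASCII character.  If point is already on such a
character, the command does not move; type C-f first to continue to
the following one.  Note that the second function only answers yes or
no, which is exactly the limitation described above.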