From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: Several serious problems
Date: Tue, 23 Jul 2002 22:35:46 +0900 (JST)
Sender: emacs-devel-admin@gnu.org
Message-ID: <200207231335.WAA25692@etlken.m17n.org>
References: <200207221711.g6MHBZo02496@aztec.santafe.edu>
In-Reply-To: <200207221711.g6MHBZo02496@aztec.santafe.edu> (message from Richard Stallman on Mon, 22 Jul 2002 11:11:35 -0600 (MDT))
Original-To: rms@gnu.org
Cc: spiegel@gnu.org, savannah-hackers@gnu.org, emacs-devel@gnu.org
List-Id: Emacs development discussions.
Xref: main.gmane.org gmane.emacs.devel:5991

In article <200207221711.g6MHBZo02496@aztec.santafe.edu>, Richard Stallman writes:
> I cannot save the file lisp/ChangeLog.  It specifies coding system
> iso-2022-7bit, but it contains something that cannot be encoded in that
> coding system.

It seems that this problem was already fixed.  As I also found
one unnecessary mule-unicode-0100-24ff char, I deleted it.

> I don't know any way to find the text that causes the
> problem; essentially I am helpless.

At least, (find-charset-region 1 (point-max)) will give you
some information.  If the returned value contains a
suspicious charset, we can search for it (if it's not
eight-bit-xxx) by:

  (re-search-forward (format "[%c-%c]"
                             (make-char CHARSET 32 32)
                             (make-char CHARSET 127 127)))

To search for eight-bit-control:

  (re-search-forward "[\200-\237]")

To search for eight-bit-graphic:

  (re-search-forward (string-as-multibyte "[\240-\377]"))

It's not sophisticated.  :-(

> We MUST do something to make it easier for users to cope with such a
> situation.  We talked about this a few weeks ago but nothing was done.

> Perhaps we could add a command which simply scans forward for the next
> run of characters that can't be saved in the specified coding system.
> The message you get in that situation could tell you about this
> command.  This would be a powerful solution, since you could easily
> find all the problems, not just the first one.  Highlighting all of
> them would also be a useful thing to do.

Do you mean a command something like this?

(defun check-coding-system-region (from to coding-system &optional max-num)
  "Check if the text after point is encodable by the specified coding system.
When called from a program, takes three arguments:
FROM, TO, and CODING-SYSTEM.  FROM and TO are buffer positions.
Value is a list of positions of characters that are not encodable by
CODING-SYSTEM.
Optional 4th argument MAX-NUM, if non-nil, limits the length of returned list.
By default, there's no limit."
  ;; Interactively: scan from point to the end of the buffer and stop
  ;; at the first unencodable character (MAX-NUM is 1).
  (interactive (list (point) (point-max)
                     (read-non-nil-coding-system "Coding-system: ")
                     1))
  (check-coding-system coding-system)
  (or (and coding-system
           (integerp (coding-system-type coding-system)))
      (error "Invalid coding system to check: %s" coding-system))
  (let ((safe-chars (coding-system-get coding-system 'safe-chars))
        (positions)
        (n 0))
    (save-excursion
      (save-restriction
        (narrow-to-region from to)
        (goto-char (point-min))
        (or max-num
            (setq max-num (- (point-max) (point-min))))
        (if (eq safe-chars t)
            ;; SAFE-CHARS is t: look only for eight-bit-control and
            ;; eight-bit-graphic characters.
            (let ((re (string-as-multibyte "[\200-\237\240-\377]")))
              (while (and (< n max-num)
                          (re-search-forward re nil t))
                (setq positions (cons (1- (point)) positions)
                      n (1+ n))))
          ;; Otherwise look up each non-ASCII character in SAFE-CHARS.
          (while (and (< n max-num)
                      (re-search-forward "[^\000-\177]" nil t))
            (or (aref safe-chars (preceding-char))
                (setq positions (cons (1- (point)) positions)
                      n (1+ n)))))))
    (if (interactive-p)
        (if (not positions)
            (message "All characters are encodable by %s" coding-system)
          (goto-char (car positions))
          (error "This character can't be encoded by %s" coding-system))
      (setq positions (nreverse positions)))))

---
Ken'ichi HANDA
handa@etl.go.jp