From: David Kastrup
To: handa@gnu.org (K. Handa)
Cc: emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: Creating a coding system
Date: Tue, 23 Dec 2014 10:25:57 +0100
Message-ID: <87h9wmd8ey.fsf@fencepost.gnu.org>
In-Reply-To: <87fvc63foe.fsf@gnu.org> (K. Handa's message of "Tue, 23 Dec 2014 17:59:13 +0900")

handa@gnu.org (K. Handa) writes:

> Hi, sorry for the late response.
>
> In article <87ppbeitcs.fsf@fencepost.gnu.org>, David Kastrup writes:
>> Ok, what am I doing wrong here?  Why does decode-coding-string not do
>> anything here?
>
>> (define-translation-table 'midi-decode-table
>>   (make-translation-table-from-alist
>>    (mapcar
>>     (lambda (p)
>>       (cons (car p) (string-to-vector (cdr p))))
>>     '(([144 0] . "c,,,,")
> [...]
>> (define-coding-system 'midi
>>   "This converts Midi note-on events to note names"
>>   :mnemonic ?M
>>   :coding-type 'charset
>>   :eol-type 'unix
>>   :decode-translation-table 'midi-decode-table
>>   :mime-text-unsuitable t)
>
> Please add
>   :charset-list '(iso-8859-1)
> to the arguments of define-coding-system.
>
> The translation table of a coding system works AFTER byte
> sequences are decoded into char sequences by the basic
> decoding routine which is specified by :coding-type (and the
> other additional attributes).  As it seems that you are
> expecting that the basic decoding routine decodes the byte
> 144 to the character 144, using the following set is good:
>   :coding-type 'charset
>   :charset-list '(iso-8859-1)

It's one of the things I got to work.
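For reference, the combination suggested above might be sketched roughly
as follows.  This is only a minimal illustration under stated
assumptions: the names midi-sketch and midi-sketch-decode-table are
hypothetical, and the table maps a single character to a single
character rather than reproducing the byte-sequence-to-note-name entries
of the original table.

```elisp
;; Hypothetical names, for illustration only.
;; The iso-8859-1 charset decoding turns byte 144 into character 144;
;; the translation table then rewrites that character to ?c.
(define-translation-table 'midi-sketch-decode-table
  (make-translation-table-from-alist
   '((144 . ?c))))

(define-coding-system 'midi-sketch
  "Sketch: decode bytes as iso-8859-1, then apply a translation table."
  :mnemonic ?M
  :coding-type 'charset
  :charset-list '(iso-8859-1)
  :eol-type 'unix
  :decode-translation-table 'midi-sketch-decode-table)

;; Decode a unibyte string that holds the single byte 144.
(decode-coding-string (unibyte-string 144) 'midi-sketch)
```

The point of the sketch is the ordering: the charset decoding runs
first, and the translation table only ever sees the resulting
characters, never the raw bytes.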
> The other method is to use CCL (i.e. :coding-type 'ccl), but,
> if the combination of the charset decoding and translation
> table works, it's faster than running CCL code.

The translation table is not happy about translating things to nothing.
Apparently that makes the calculation of the reverse translation go
wrong.

> If you need arithmetic or conditional operations, you have to use CCL,
> or :post-read-conversion.

> PS. Should I read the other mails of this thread?  I'm very
> sorry for this lazy attitude, but I don't have time to
> read all emacs-devel mails.

At this point it appears that I can manage with CCL.  It was a big
puzzler that data gets lost unless the CCL program is written as a
_loop_, and it's annoying that the documentation just mentions
(loop statement ...) as a construct without bothering to point out that
the loop will not actually loop.  "loop" apparently just places a label
you can jump to using (repeat) or its ilk.  So my first approaches lost
data when it was arriving fast, and it took about a day to figure out
why that was.

If you have time to spare on that topic, I'd rather you spend it on
putting some more info in the Elisp manual or at least the
define-ccl-program and define-coding-system doc strings.

At present, it is quite opaque what the :coding-type specification in
define-coding-system does, it is not clear how CCL code is run and
under which conditions, and the attribute :valids (apparently part of
coding-type ccl) is not documented at all.

And the following afterthought in define-ccl-program is quite opaque as
well:

    TRANSLATE :=
        (translate-character REG(table) REG(charset) REG(codepoint))
        | (translate-character SYMBOL REG(charset) REG(codepoint))
        ;; SYMBOL must refer to a table defined by `define-translation-table'.
    LOOKUP :=
        (lookup-character SYMBOL REG(charset) REG(codepoint))
        | (lookup-integer SYMBOL REG(integer))
        ;; SYMBOL refers to a table defined by `define-translation-hash-table'.
    MAP :=
        (iterate-multiple-map REG REG MAP-IDs)
        | (map-multiple REG REG (MAP-SET))
        | (map-single REG REG MAP-ID)
    MAP-IDs := MAP-ID ...
    MAP-SET := MAP-IDs | (MAP-IDs) MAP-SET
    MAP-ID := integer

It's not clear what the input and output of TRANSLATE and LOOKUP are;
the operation itself can only be guessed by looking at the _data_
structures given to `define-translation-hash-table' and
`define-translation-table'; and there is absolutely no guessing what the
MAP operations are.  Looking at the code in src/ccl.c creates more
confusion rather than less, as the mapping stuff is really complex and
only the mechanisms (if at all) are documented rather than the purpose.

There is also nothing in the doc string of `define-coding-system' or the
Elisp manual that would help in guessing what kind of options to choose
for what kind of task.  The purpose/definition of coding-type emacs-mule
(particularly post Emacs-23) is not given, nor what the various options
with coding-type iso-2022 are.  It is not clear when one would use
coding-type raw-text and when utf-8 (and how does utf-8 relate to
emacs-mule?).

-- 
David Kastrup