From: Kenichi Handa
Newsgroups: gmane.emacs.bugs
Subject: bug#11073: 24.0.94; BIDI-related crash in redisplay with certain byte sequences
Date: Mon, 26 Mar 2012 16:45:56 +0900
To: Eli Zaretskii
Cc: 11073@debbugs.gnu.org
References: <83sjgzvb6w.fsf@gnu.org>
In-Reply-To: <837gybupdf.fsf@gnu.org> (message from Eli Zaretskii on Fri, 23 Mar 2012 20:46:36 +0200)

In article <837gybupdf.fsf@gnu.org>, Eli Zaretskii writes:

> > Why do we need this unification?  Or rather, why do we need multiple
> > codepoints, which then forces us to unify them?

> That's something Handa-san (CC'ed) will be able to explain much better
> than I ever could.

It's a long story.

When I designed emacs-unicode (the version before it was merged into
the trunk, more than 10 years ago), the unification maps of CJK
charsets to Unicode were not stable.  In addition, there were various
conflicting policies on which character to unify with which.  One
reason for this confusion was that Unicode itself didn't define
mappings to/from such CJK charsets (JIS, GB, KSC).

The unification problem is not limited to Ideographic characters.
Many CJK charsets contain, for instance, full-width versions of Greek
characters, but Unicode doesn't distinguish them from the single-width
versions (though Unicode does have full-width versions of 'A'..'Z',
etc).  There were people who wanted to distinguish full-width Greek
chars from single-width chars.

There were also people who had iso-2022-7bit text files that
distinguish characters of the GB charset from those of the JIS
charset.  To edit such a file and write it back in its original form,
one has to disable unification of either GB or JIS (or both of them).

So, I decided at that time to give each CJK charset a unique code
space (above #x110000) in Emacs, and to allow users to freely
unify/disunify them with the Unicode code space (below #x110000) by
providing the function unify-charset.

FYI, http://www.unicode.org/reports/tr38/ describes some of the
difficulties of such mappings.

> AFAIU, there are good reasons to have some CJK
> characters on separate codepoints, because they need to be treated
> differently from their Unicode codepoints (perhaps a different choice
> of font to display them?)

That was one reason, but the current code pays attention to the
`charset' text property of each character to select a proper font.
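
As a rough illustration of the code-space design described above (each
CJK charset gets its own range above #x110000, and unification into the
Unicode range can be switched on or off per charset), here is a toy
model in Python.  It is only a sketch, not how Emacs implements this:
the class CharsetSpace, its unify method, and the base offsets
0x120000 and 0x130000 are hypothetical names and values chosen for the
example, and unify does not mirror the real signature of
unify-charset.  The one-entry mapping tables stand in for full
unification maps.

```python
UNICODE_LIMIT = 0x110000  # characters below this value are Unicode code points


class CharsetSpace:
    """Toy model of one legacy charset with a private code space."""

    def __init__(self, name, base, to_unicode):
        # `base` is a hypothetical start of this charset's private range;
        # it must lie above the Unicode range.
        if base < UNICODE_LIMIT:
            raise ValueError("private code space must start at or above 0x110000")
        self.name = name
        self.base = base
        self.to_unicode = dict(to_unicode)  # charset code -> Unicode code point
        self.unified = True

    def unify(self, enable=True):
        """Switch unification with Unicode on or off (hypothetical API)."""
        self.unified = enable

    def decode(self, code):
        """Map a charset code to the character code used internally."""
        if self.unified and code in self.to_unicode:
            return self.to_unicode[code]
        # Not unified: keep the character in the charset's private space.
        return self.base + code


# Illustrative one-entry tables: code 0x2621 mapped to GREEK CAPITAL
# LETTER ALPHA (U+0391) in both charsets.
jis = CharsetSpace("jis", 0x120000, {0x2621: 0x0391})
gb = CharsetSpace("gb", 0x130000, {0x2621: 0x0391})

both_unified = (jis.decode(0x2621), gb.decode(0x2621))

gb.unify(False)  # disunify GB so its characters stay distinguishable
gb_disunified = gb.decode(0x2621)
```

With both charsets unified, the two codes collapse to the same Unicode
character and their origin is lost; once GB is disunified, its
character stays in its private range, so a file that mixes GB and JIS
can be written back with each character in its original charset.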

---
Kenichi Handa
handa@m17n.org