From: Robert Pluim
Newsgroups: gmane.emacs.devel
Subject: Re: ucs-normalize and diacritics
Date: Thu, 26 Jul 2018 22:40:04 +0200
Message-ID: <87woth4tp7.fsf@gmail.com>
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
In-Reply-To: <83bmatu9f0.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 26 Jul 2018 21:41:23 +0300")

Eli Zaretskii writes:

>> From: Robert Pluim
>> Cc: emacs-devel@gnu.org
>> Date: Thu, 26 Jul 2018 10:40:45 +0200
>>
>> How about something like:
>>
>> As a special case, if the character lies in the range #x3fff80
>> through #x3fff9a (128 through 159 decimal, with prefix #x3fff), it
>> stands for a raw byte that does not correspond to any specific
>> displayable character. Such a character lies within the
>> @code{eight-bit-control} character set, and is displayed as an escaped
>> octal character code (0200 through 0237), or as an escaped hex
>> character code (x80 through x9a) if @code{display-raw-bytes-as-hex} is
>> non-@code{nil}.
>
> Thanks, but the original text was wrong in more than one sense, and
> needed a more thorough fix. I pushed a fix, please see if the new
> text is clear and accurate.

It's clear, but it's not 100% accurate as far as I can tell:

- C-x = shows 'raw-byte', not 'raw byte'
- It doesn't show this for the whole range 0200 to 0377, only for 0240
  to 0377, e.g. for 0200:

Char: € (4194176, #o17777600, #x3fff80, file #x80) point=1 of 256 (0%) column=0

C-u C-x = gives:

             position: 1 of 256 (0%), column: 0
            character: € (displayed as €) (codepoint 4194176, #o17777600, #x3fff80)
    preferred charset: tis620-2533 (TIS620.2533)
code point in charset: 0x80
               syntax: w    which means: word
             category: L:Left-to-right (strong)
             to input: type "C-x 8 RET 3fff80"
          buffer code: #x80
            file code: #x80 (encoded by coding system raw-text-unix)
              display: no font available

Character code properties: customize what to show
  general-category: Cn (Other, Not Assigned)
  decomposition: (4194176) ('€')

Regards

Robert
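
The numbers in the C-x = output above can be checked with a little arithmetic. What follows is a minimal sketch in Python rather than Emacs Lisp. It assumes that a raw byte in the range 0200 to 0377 maps to the codepoint obtained by adding #x3fff00, which is consistent with byte #x80 appearing as #x3fff80. The names `RAW_BYTE_OFFSET`, `raw_byte_to_codepoint`, `describe` and `escaped` are made up for illustration and are not Emacs functions.

```python
# Illustrative only: the offset is inferred from the output quoted above
# (byte #x80 <-> codepoint #x3fff80); all names here are hypothetical.
RAW_BYTE_OFFSET = 0x3FFF00


def raw_byte_to_codepoint(byte: int) -> int:
    """Map a raw byte (0x80..0xFF) to its assumed internal codepoint."""
    if not 0x80 <= byte <= 0xFF:
        raise ValueError("expected a byte in the range 0x80..0xFF")
    return RAW_BYTE_OFFSET + byte


def describe(byte: int) -> str:
    """Format the codepoint as decimal, octal and hex, like C-x = does."""
    cp = raw_byte_to_codepoint(byte)
    return f"{cp}, #o{cp:o}, #x{cp:x}"


def escaped(byte: int, as_hex: bool = False) -> str:
    """Escaped octal form, or hex form when as_hex is true."""
    return f"\\x{byte:x}" if as_hex else f"\\{byte:o}"


print(describe(0x80))               # 4194176, #o17777600, #x3fff80
print(escaped(0x80))                # \200
print(escaped(0x80, as_hex=True))   # \x80
```

Under that assumption, 0237 octal is #x9f (159 decimal), and the top of the full raw-byte range, 0377, lands on #x3fffff.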