From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From:
Newsgroups: gmane.emacs.help
Subject: Re: how to calculate the size of string in bytes?
Date: Tue, 18 Aug 2015 22:11:18 +0200
Message-ID: <20150818201118.GA26004@tuxteam.de>
References: <20150818101352.GA6744@tuxteam.de> <83mvxoll2g.fsf@gnu.org>
	<20150818144530.GB15783@tuxteam.de> <83k2sslk0d.fsf@gnu.org>
	<20150818160145.GA18309@tuxteam.de> <83fv3glfm0.fsf@gnu.org>
	<20150818193049.GA24519@tuxteam.de> <831tf0l6l5.fsf@gnu.org>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; x-action=pgp-signed
Content-Transfer-Encoding: 8bit
X-Trace: ger.gmane.org 1439928737 20447 80.91.229.3 (18 Aug 2015 20:12:17 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Tue, 18 Aug 2015 20:12:17 +0000 (UTC)
Cc: help-gnu-emacs@gnu.org
To: Eli Zaretskii
Original-X-From: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org Tue Aug 18 22:12:17 2015
Return-path:
Envelope-to: geh-help-gnu-emacs@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by plane.gmane.org with esmtp (Exim 4.69) (envelope-from )
	id 1ZRnEU-0004Gq-Fs for geh-help-gnu-emacs@m.gmane.org;
	Tue, 18 Aug 2015 22:11:42 +0200
Original-Received: from localhost ([::1]:59375 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ZRnET-0007rO-Mb for geh-help-gnu-emacs@m.gmane.org;
	Tue, 18 Aug 2015 16:11:41 -0400
Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:58511)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ZRnED-0007ng-JU for help-gnu-emacs@gnu.org;
	Tue, 18 Aug 2015 16:11:26 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned
	(Exim 4.71) (envelope-from )
	id 1ZRnEC-0002Bj-CD for help-gnu-emacs@gnu.org;
	Tue, 18 Aug 2015 16:11:25 -0400
Original-Received: from mail.tuxteam.de ([5.199.139.25]:36176 helo=tomasium.tuxteam.de)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ZRnE8-0002AX-7e;
	Tue, 18 Aug 2015 16:11:20 -0400
Original-Received: from tomas by tomasium.tuxteam.de with local
	(Exim 4.80) (envelope-from )
	id 1ZRnE6-0006nE-Ow;
	Tue, 18 Aug 2015 22:11:18 +0200
In-Reply-To: <831tf0l6l5.fsf@gnu.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-Received-From: 5.199.139.25
X-BeenThere: help-gnu-emacs@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Users list for the GNU Emacs text editor
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org
Original-Sender: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.help:106685
Archived-At:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, Aug 18, 2015 at 10:49:58PM +0300, Eli Zaretskii wrote:
> > Date: Tue, 18 Aug 2015 21:30:49 +0200
> > Cc: help-gnu-emacs@gnu.org
> > From:
> >
> > I was having difficulties in understanding you
>
> Sorry about that.  It's a complex issue to explain in a few words.

No need to be sorry. The fault's on me -- once I did my homework
things improved :-)

Thanks for your patience: very much appreciated.

> > Now I understand: Emacs's internal (raw) coding system can represent
> > "characters not expressible in utf-8".
>
> More accurately, it can represent characters outside the Unicode code
> space.
>
> And please don't call that "raw"; the internal representation of
> characters used by Emacs is known as 'utf-8-emacs'.

Ah, OK. Point taken.

> > The function encode-coding-string passes those bytes silently
> > through, outputting an invalid utf-8 sequence.
>
> Yes.  Although in interactive functions Emacs will normally complain
> and ask for a better encoding.

Understood.

> > So I venture the guess that when the Emacs buffer contains something
> > expressible as valid utf-8, 'utf-8 and 'raw are equivalent
>
> Yes.
>
> > (what about combining characters?)
>
> Emacs doesn't normalize/compose/decompose characters when it encodes
> text (with a notable exception of the utf-8-hfs encoding).
> Applications that want this should do that themselves, e.g. using the
> facilities in ucs-normalize.el.

Thanks: I learned quite a bit now :-)

regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEUEARECAAYFAlXTkWYACgkQBcgs9XrR2kaQbwCggSK12zVBjHiFowFVsddq36SJ
XmAAmON/V8XcGaUfjxW1llhEavSqcp0=
=fYz9
-----END PGP SIGNATURE-----