From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kaushal Modi
Newsgroups: gmane.emacs.devel
Subject: Re: Characters saved mismatch?
Date: Sun, 07 May 2017 12:46:48 +0000
To: Angelo Graziosi, Yuri Khan
Cc: Emacs developers
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:214667
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary=001a113befc8bf5bd1054eee84a3

--001a113befc8bf5bd1054eee84a3
Content-Type: text/plain; charset=UTF-8

On Sun, May 7, 2017, 5:16 AM Angelo Graziosi <angelo.graziosi@alice.it> wrote:

> Someone should explain the meaning of "Wrote ‘c:/msys64/tmp/foo.text’ (8
> characters)".

As others stated, that's because each newline is counted as 1 character too.

> If it refers to the number of characters, my example contains 6
> characters: f-o-o-b-a-r and not 8.

No, it contains 8 characters:

1. f
2. o
3. o
4. Newline (just 1 character; it does not matter whether it is 1 byte on
   Unix or 2 bytes on Windows. This is a character count, not a byte count.)
5. b
6. a
7. r
8. Newline

> As I wrote, in Windows Emacs uses DOS style, more precisely 'utf-8-dos'.
> That should mean 1 byte/ch and CR+LF for end line (RET). This means that
>
> foo RET
> bar RET
>
> should contain (3+2) * 2 = 10 bytes, as 'ls' shows.

As written above, Emacs sees the newline as just 1 character. Emacs is
printing the character count (each newline counts once), while ls is
printing the byte count (each newline is 2 bytes, CR+LF, on disk), hence
the difference.

Visualize that newline as just 1 character, like the "\n" used in regexps
to match newlines.

> Then, where does "Wrote ‘c:/msys64/tmp/foo.text’ (8 characters)" come
> from, on Windows?

As explained above.
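
To make the two counts concrete, here is a minimal sketch in Python. It is not Emacs code and does not show how Emacs computes its message; it only assumes the file holds exactly the two lines foo and bar, each ended by a newline, and the variable names are made up for illustration:

```python
# Text as an editor holds it in memory: each line ends with a single "\n" character.
buffer_text = "foo\nbar\n"

# Character count: f, o, o, newline, b, a, r, newline.
char_count = len(buffer_text)

# Bytes on disk with DOS line endings (as with 'utf-8-dos'): each "\n" is written as CR+LF.
dos_bytes = buffer_text.replace("\n", "\r\n").encode("utf-8")

# Bytes on disk with Unix line endings, for comparison: each "\n" stays a single LF.
unix_bytes = buffer_text.encode("utf-8")

print(char_count)       # 8
print(len(dos_bytes))   # 10
print(len(unix_bytes))  # 8
```

The 8 corresponds to the count in the "Wrote ... (8 characters)" message, and the 10 corresponds to the size ls reports for the same text written with DOS line endings.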
--

Kaushal Modi

--001a113befc8bf5bd1054eee84a3--