From: Eric Abrahamsen
Newsgroups: gmane.emacs.help
Subject: string-bytes and coding systems
Date: Wed, 08 Mar 2017 15:17:07 -0800
Message-ID: <87r327nyto.fsf@ericabrahamsen.net>
To: help-gnu-emacs@gnu.org

I'm writing a function that's supposed to wrap too-long text lines; the RFC says anything over 75 octets (excluding eol) needs to be wrapped, but multibyte characters must not be split.

Everything seems to be working fine, but I want to make sure I'm not making any dangerous assumptions about `string-bytes' and encoding. I'm essentially taking the `string-bytes' of each line, and if it's too long, popping characters off the end until it's fewer than 75 bytes.

My understanding/assumption is that `string-bytes' returns the number of bytes according to Emacs' internal coding system, which is close enough to utf-8 to make no difference. When this text gets written to file it will also be encoded as utf-8, ergo testing string lengths with `string-bytes' is going to always produce the right results in the final file.

Have I understood things correctly?

Thanks!
Eric
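
For concreteness, the trimming step described above could be sketched roughly as follows. This is only an illustration under assumptions, not the actual code: the function name `my-split-at-octets', its optional LIMIT argument, and the choice to allow exactly 75 bytes are all hypothetical, and bytes are counted with `string-bytes' exactly as the message describes.

```elisp
;; Hypothetical sketch; `my-split-at-octets' is an illustrative name,
;; not a function from the original code or from Emacs itself.
(defun my-split-at-octets (line &optional limit)
  "Split LINE so that the first piece is at most LIMIT bytes long.
LIMIT defaults to 75 (assumed here to be an inclusive maximum).
Return a cons cell (HEAD . REST); REST is the empty string when
LINE already fits.  Bytes are counted with `string-bytes', that
is, in Emacs' internal representation."
  (let ((limit (or limit 75))
        (end (length line)))
    ;; Drop whole characters from the end until the head fits, so a
    ;; multibyte character is never split in the middle.
    (while (> (string-bytes (substring line 0 end)) limit)
      (setq end (1- end)))
    (cons (substring line 0 end) (substring line end))))
```

Calling it on a line of thirty three-byte characters, for example (my-split-at-octets (make-string 30 ?\u4e2d)), returns a head of 25 characters (75 bytes) and a rest of 5 characters. Whether that count matches the bytes in the written file is precisely the assumption being asked about.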