From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Obsolete string-to-multibyte hard to replace
Date: Mon, 29 May 2017 09:01:24 -0400
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.50 (gnu/linux)
To: emacs-devel@gnu.org

I was looking at ruler-mode.el's bytecomp warning about
string-to-multibyte and it seems like there is no good way to write
cleaner code here:

    (let (...
          (ruler (propertize
                  (string-to-multibyte
                   (make-string w ruler-mode-basic-graduation-char))
                  ...))
          ...)
      [...]
      (aset ruler k (aref c (setq m (1- m))))

The problem is as follows: make-string returns a unibyte string if the
initial char is ASCII and a multibyte string otherwise.  That seems
innocuous, but it means that if the initial char
(ruler-mode-basic-graduation-char above) is ASCII, you can't later
insert a multibyte char into that string with `aset`.

Replacing string-to-multibyte with decode-coding-string is clearly wrong
in the above case (I mean it'll work, but it's even more of a hack than
using string-to-multibyte).
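
To make the make-string behavior above concrete, here is a minimal
sketch that only inspects the representation with multibyte-string-p.
It is not taken from ruler-mode.el; the helper name
`my/make-string-kind' and the sample characters are assumptions chosen
for illustration.

```elisp
;; Hypothetical helper, for illustration only (not part of Emacs).
(defun my/make-string-kind (char)
  "Return `multibyte' or `unibyte' for a 3-char string filled with CHAR."
  (if (multibyte-string-p (make-string 3 char))
      'multibyte
    'unibyte))

(my/make-string-kind ?-)        ; ASCII initial char     => unibyte
(my/make-string-kind ?\u00e9)   ; non-ASCII initial char => multibyte

;; The conversion ruler-mode.el relies on: same characters,
;; but the result is a multibyte string.
(multibyte-string-p (string-to-multibyte (make-string 3 ?-)))  ; => t
```

The representation thus depends on the value of the initial char, which
is why the ruler-mode.el code forces the conversion up front.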

While investigating it, I noticed that my local Emacs hacks include
a change of make-string so it always returns a multibyte string.  I think
I made this change because an all-ASCII multibyte string carries more
precise information than an all-ASCII unibyte string: with a multibyte
string, the fact that `x->size == x->size_byte` immediately tells us
this is an ASCII-only string, whereas with a unibyte string we can't
know whether it is ASCII-only without checking each and every byte.

I think it'd make sense to change make-string so it always returns
a multibyte string, and maybe to also introduce a new
make-unibyte-string.


        Stefan