From mboxrd@z Thu Jan 1 00:00:00 1970
From: "David M. Cooke"
Newsgroups: gmane.emacs.bugs
Subject: bug#9608: 24.0.50; Emacs lisp reader thinks no-break space is 0x08a0 (should be 0x00a0)
Date: Mon, 26 Sep 2011 17:00:34 -0700
Message-ID: <456C995A-EC64-43C6-A96E-FBF6004D9EDD@gmail.com>
Mime-Version: 1.0 (Apple Message framework v1244.3)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
To: 9608@debbugs.gnu.org
X-GNU-PR-Message: report 9608
X-GNU-PR-Package: emacs
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:51901

After reading through lread.c (I was writing an emacs lisp lexer for syntax-highlighting in pygments), I discovered it treats the unicode character U+08A0 as whitespace (with the comment "NBSP").
I believe this was meant to be U+00A0 (NO-BREAK SPACE), as the code point U+08A0 has no character assigned to it yet (it lies between the Samaritan and the Devanagari blocks).

Additionally, you can see this by running the following lisp code:

    (mapcar (lambda (sym) (string-as-unibyte (symbol-name sym)))
            (read "(a b c\u00a0d e\u08a0f g \u00a0 h i \u08a0 j)"))

This gives the result

    ("a" "b" "c\302\240d" "e" "f" "g" "\302\240" "h" "i" "j")

where we can see U+00A0 (utf-8: "\302\240") is being treated as a symbol-constituent character, whereas U+08A0 is whitespace.

The changes to the whitespace handling were introduced in bzr revision 78902 (on 2007-07-30, which is a few weeks after a discussion about handling NO-BREAK SPACE on the mailing list). I'm guessing using 0x8a0 was just a thinko.

cheers,
David M. Cooke

If Emacs crashed, and you have the Emacs process in the gdb debugger, please include the output from the following gdb commands: `bt full' and `xbacktrace'.
For information about debugging Emacs, please read the file /Applications/_Editors/Emacs.app/Contents/Resources/etc/DEBUG.
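
For anyone wanting to check the result quoted above outside Emacs, here is a minimal Python sketch (Python because the lexer mentioned above is for pygments). It is only a model of the behaviour described in this report, not the code in lread.c: it assumes that every code point up to 0x20 separates symbols, plus one extra code point passed in as a parameter. The names `split_symbols`, `as_unibyte`, and `SAMPLE` are made up for illustration.

```python
from typing import List

# Sample text from the report, without the enclosing parentheses.
SAMPLE = "a b c\u00a0d e\u08a0f g \u00a0 h i \u08a0 j"


def split_symbols(text: str, extra_space: int) -> List[str]:
    """Hypothetical reader model: split `text` into symbol names.

    Assumption: every code point <= 0x20 separates symbols, and so does
    the single additional code point `extra_space`. Every other
    character is treated as a symbol constituent.
    """
    names: List[str] = []
    current: List[str] = []
    for ch in text:
        if ord(ch) <= 0x20 or ord(ch) == extra_space:
            # A separator ends the symbol collected so far, if any.
            if current:
                names.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        names.append("".join(current))
    return names


def as_unibyte(name: str) -> str:
    """Show non-ASCII characters as octal escapes of their UTF-8 bytes."""
    return "".join(
        chr(b) if b < 0x80 else "\\%03o" % b for b in name.encode("utf-8")
    )


# Behaviour described in the report: U+08A0 is skipped as whitespace.
reported = [as_unibyte(n) for n in split_symbols(SAMPLE, 0x08A0)]
# Behaviour if the constant were U+00A0 instead.
intended = [as_unibyte(n) for n in split_symbols(SAMPLE, 0x00A0)]

print(reported)
print(intended)
```

With 0x08A0 as the extra separator, the model yields the same ten names as the result quoted above, including "c\302\240d". With 0x00A0, "c" and "d" come out as separate symbols and U+08A0 (UTF-8 "\340\242\240") stays inside the symbol names instead.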

In GNU Emacs 24.0.50.2 (x86_64-apple-darwin10.7.0, NS apple-appkit-1038.35)
 of 2011-05-27 on mars.lan
Windowing system distributor `Apple', version 10.3.1138
configured using `configure '--with-ns''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_CA.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
( s m a p c a r SPC ' s y m b o l - n a m e SPC ( e v a l SPC " ( a SPC b SPC
c \ u 0 0 a 0 d SPC e \ u 0 8 a 0 d f ) " ) ) C-j q # C-j q ' C-e C-j q " r e a d
C-e C-j SPC g SPC \ u 0 0 a 0 SPC h SPC i SPC \ u 0 8 a 0 SPC j C-e C-j
x r e m p o r p o r

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Entering debugger...
Back to top level.
Entering debugger...
Back to top level.
Entering debugger...
Back to top level.

Load-path shadows:
None found.
Features: (shadow sort gnus-util time-date mail-extr message format-spec rfc822 mml mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mailabbrev mail-utils gmm-utils mailheader emacsbug help-mode easymenu view debug tooltip ediff-hook vc-hooks lisp-float-type mwheel ns-win tool-bar dnd fontset image fringe lisp-mode register page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock syntax facemenu font-core frame cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer loaddefs button faces cus-face files text-properties overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote make-network-process dbusbind ns multi-tty emacs)