From: Stefan Monnier
Newsgroups: gmane.emacs.bugs
Subject: bug#15984: 24.3; Problem with combining characters in attachment filename
Date: Fri, 29 Nov 2013 10:04:04 -0500
To: nisse@lysator.liu.se (Niels Möller)
Cc: 15984@debbugs.gnu.org
References: <83iovc8eaq.fsf@gnu.org> <83a9gn8yoz.fsf@gnu.org> <831u1z8twg.fsf@gnu.org>
In-Reply-To: ("Niels Möller"'s message of "Fri, 29 Nov 2013 11:43:45 +0100")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)

> What I think is the right thing, is to allow a sequence of unicode
> values, e.g., "A" + combining character, or "A" + any random sequence of
> combining characters, intern this string, and treat this as a single
> "character".

For the Lisp-level notion of "character", I think this would require too
many deep changes.

> The idea is that this character object should correspond to what the
> user thinks of as a single character. E.g, one glyph per character, and
> treated as a unit by forward-char, and regexp matching with "." and
> character sets.

For forward-char, we do try to fake that behavior (e.g. a `forward-char'
command will skip over the whole A+ring combo) but not faithfully
(e.g. `C-u 2 forward-char' will also just skip that combo, and not the
subsequent char).  It's not perfect, but it seems "close enough" that it
hasn't proved problematic.

Adjusting . in regexps would indeed help solve some unexpected
behaviors.  We would probably want to keep the ability to match a single
"code point", so we'd need to introduce a new regexp operator.  Maybe we
could follow the lead of the POSIX collation thingy, IIRC, where [ϐ] in
case-folding mode wants to be able to match SS in a German locale.
So maybe [[:any:]] could match A+ring.

> E.g, there could be a mode which makes each and every unicode value a
> single character, which will then be displayed as separate glyphs,
> separate characters for regexp matching, etc.

I think we wouldn't want to use different modes (too coarse) but
different commands instead.
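
To make the code-point vs. "user-visible character" distinction
concrete, here is a minimal sketch in Python (not Emacs code).  The
helper name `clusters' is made up for illustration, and it only
approximates the idea: it attaches any code point with a nonzero
canonical combining class to the preceding base character, which is
much cruder than the full Unicode segmentation rules.

```python
import re
import unicodedata

# "A" + U+030A COMBINING RING ABOVE, then "b":
# three code points, but two characters as the user sees them.
text = "A\u030ab"


def clusters(s):
    """Group each base character with the combining marks that follow it.

    Hypothetical helper for illustration only: a "mark" here is any code
    point whose canonical combining class is nonzero.
    """
    out = []
    for ch in s:
        if out and unicodedata.combining(ch) != 0:
            out[-1] += ch      # attach the mark to the previous unit
        else:
            out.append(ch)     # start a new unit
    return out


code_points = len(text)        # counts code points
units = clusters(text)         # counts user-visible units

# "." consumes exactly one code point, so it cannot span A+ring.
dot_matches_combo = re.fullmatch(r".", "A\u030a") is not None

print(code_points, len(units), dot_matches_combo)
```

A regexp operator of the kind discussed above would behave like one
element of `units', while plain "." would keep matching one code point.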

In any case, a first step would be to find a name for that notion of
"multi character character".  "Grapheme cluster" doesn't sound too good
if we want to expose the concept to the end user.


        Stefan