From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Kastrup
Newsgroups: gmane.emacs.devel
Subject: Re: [Emacs-diffs] master 9ce1d38: Use curved quotes in core elisp diagnostics
Date: Sun, 30 Aug 2015 17:49:09 +0200
Message-ID: <87wpwc7p56.fsf@fencepost.gnu.org>
To: Eli Zaretskii
Cc: Stefan Monnier, rasmus@gmx.us, emacs-devel@gnu.org
In-Reply-To: <83pp2425zh.fsf@gnu.org> (Eli Zaretskii's message of "Sun, 30 Aug 2015 17:41:54 +0300")

Eli Zaretskii writes:

>> From: Stefan Monnier
>> Date: Sat, 29 Aug 2015 22:00:35 -0400
>> Cc: rasmus@gmx.us, emacs-devel@gnu.org
>>
>> > Then how are we supposed to handle similar issues, if no one else
>> > knows this, and never will?
>>
>> By designing a better solution, I guess.
>
> I'm not sure I understand: are you saying that it is fundamentally
> wrong or unclean to have a syntax category for word-constituent
> characters that cannot appear at word beginning or end? If so, please
> explain why you think so, because this situation happens with many
> characters in human languages, and is not really different from other
> similar syntax categories.
>
> If the idea is OK, and only its current implementation is not clean,
> then I see no reason to refrain from documenting the Lisp-level
> feature, because it will remain unchanged even when the implementation
> will be cleaned up.

For the record: LilyPond's definition of the lexical category "word" is
any sequence of ASCII letters and arbitrary non-ASCII (multibyte)
characters interrupted by isolated hyphens and underlines.

c--d is a note c with a dash-separated accent "-", followed by a note d.
c-d is a word of its own.

There is a bit of history to this where LilyPond had too many different
definitions of "word" depending on its current lexical mode.

The respective definitions in the (Flex-defined) lexer are:

    A        [a-zA-Z\200-\377]
    WORD     {A}([-_]{A}|{A})*
    COMMAND  \\{WORD}

The lexer is working on UTF-8 encoded bytes as input. Whenever a
pattern accepts anything outside of the ASCII range, a checking routine
makes sure that only proper UTF-8 is passed on.

At any rate, it would be cool if words could be matched solely by syntax
table. The "any non-ASCII character" bit might be impractical to
implement, but at least the word syntax inside of the ASCII range would
be nice.

-- 
David Kastrup
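
For illustration, the WORD rule quoted above can be tried out with a
byte-level regular expression. This is only a sketch: Python and the
helper name first_word are assumptions made here for the example, not
part of LilyPond, and the separate UTF-8 validity check mentioned above
is not reproduced (encoding a Python str always yields valid UTF-8).

```python
import re

# Byte-level transcription of the Flex definitions from the message:
#   A     [a-zA-Z\200-\377]
#   WORD  {A}([-_]{A}|{A})*
# Octal \200-\377 is the byte range 0x80-0xff, i.e. every byte that can
# occur inside a multibyte UTF-8 sequence.
A = rb"[a-zA-Z\x80-\xff]"
WORD = re.compile(A + rb"(?:[-_]" + A + rb"|" + A + rb")*")


def first_word(text: str) -> str:
    """Return the WORD found at the start of text, or "" if there is none.

    Hypothetical helper for this sketch; it matches on the UTF-8 bytes,
    as the lexer does, and decodes the match back to a str.
    """
    match = WORD.match(text.encode("utf-8"))
    return match.group(0).decode("utf-8") if match else ""


if __name__ == "__main__":
    for sample in ("c-d", "c--d", "foo_bar-baz rest", "-x"):
        print(repr(sample), "->", repr(first_word(sample)))
```

With these definitions "c-d" is consumed as one word, while for "c--d"
the match stops after "c": a hyphen or underline only continues a word
when it is immediately followed by another word character.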