From: "Pascal J. Bourguignon"
Newsgroups: gmane.emacs.help
Subject: Re: if vs. when vs. and: style question
Date: Wed, 01 Apr 2015 16:29:32 +0200
Organization: Informatimago
Message-ID: <87d23nx6pf.fsf@kuiper.lan.informatimago.com>
To: help-gnu-emacs@gnu.org

Rusi writes:

> On Wednesday, April 1, 2015 at 7:57:07 AM UTC+5:30, Emanuel Berg wrote:
>> Richard Wordingham writes:
>>
>> > One of the issues with using the full set of Unicode
>> > characters is that many are easily misread when
>> > there are no constraints. Many Greek capitals look
>> > just like Roman capitals, and Latin 'o', Greek 'ο'
>> > and Cyrillic 'о' may be indistinguishable. This is
>> > not a good idea for writing code.
>>
>> Good point. In addition, there are many Unicode chars
>> that aren't human language chars but instead are to be
>> used in geometric figures, in math and otherwise
>> scientific/engineering notation, and so on - and those
>> also collide (or almost so) with for example the
>> Latin 'o' and probably other letters as well.
>
> Of course -- Richard does use the phrase "FULL set of Unicode characters"
>
> Currently we see programming languages ALREADY SUPPORTING large swathes of the
> 1 million chars for identifier-chars -- mostly the 'Letter' and perhaps
> the 'number/digit' categories.

Quick, without looking it up, is: ➒ a digit? a letter? something else?
What about Ⅸ or ๙? Are they digits or letters?

> So there are two somewhat opposite points:
> 1. Supporting the Babel of human languages in programming identifiers is
> probably a mistake. In any case if a language must go that way, the choice of
> html seems more sane: active opt-in with (something like) a charset declaration
> rather than have the whole truckload thrown at someone unsuspecting.
> So if a А (cyrillic) and the usual A got mixed up, at the least you asked for it!!

Yes, mandatory declarations could solve some problems.

> 2. The basic 'infrastructure' of a language in C think "; {}()" operators, '#'
> the quotes themselves etc is drawn exclusively from ASCII for historical reasons
> that are 2015-irrelevant.

And have alternatives too:

> Now python (for example) has half a dozen 'quoteds'
> - strings "...
> - unicode strings u"..."
> - triple quoted strings (can contain newlines) """..."""
> - raw strings r"..." special chars like backslash are not special
> etc
>
> And the chars like « ‹ seem to be just calling for use

In German, they quote as: »Hallo«
In French, they quote as: « Salut ! »
In old books, they quote as: « One line,
                             « another
                             « final line »

The real problem introduced by Unicode is that not only does it have a
lot of complicated rules of its own, but the use of foreign-language
characters would have to come with the corresponding localized rules
too! There's no (contemporary) way any sane program can implement them
correctly, much less a program as unrelated to this (international
human language) domain as a programming language compiler.
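
For anyone who does want to look it up, here is a minimal Python sketch
of the digit-or-letter question and of the A/А mix-up discussed above.
It relies only on the standard unicodedata module. The helper names
describe and scripts_used are made up for this illustration, and
guessing a script from the first word of a character's Unicode name is
a rough heuristic, not a real Script property lookup.

```python
import unicodedata


def describe(ch):
    """Return (general category, numeric value or None) for one character."""
    return unicodedata.category(ch), unicodedata.numeric(ch, None)


def scripts_used(identifier):
    """Guess which scripts an identifier mixes.

    Heuristic (an assumption of this sketch): take the first word of each
    letter's Unicode name, e.g. 'LATIN', 'GREEK' or 'CYRILLIC'.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in identifier
        if ch.isalpha()
    }


if __name__ == "__main__":
    # The three "nines" from the message: U+2792, U+2168, U+0E59.
    for ch in "\u2792\u2168\u0e59":
        print(f"U+{ord(ch):04X}", describe(ch))

    # Latin 'A' followed by Cyrillic 'А' (U+0410): they look alike,
    # but they are different code points from different scripts.
    print(scripts_used("A\u0410"))
```

The three nines fall into three different general categories (No, Nl
and Nd), although all three carry the numeric value 9. A compiler that
wanted to warn about mixed-script identifiers would need proper Script
data rather than this name-based guess.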

I'm not saying that, once AI is running on your smartphones (instead of
on Apple, Google or IBM supercomputers), it won't be possible to have
it deal with that, even in compiler sources. But not now. It's too
early.

-- 
__Pascal Bourguignon__                 http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to keep
the man from touching the equipment.” -- Carl Bass CEO Autodesk