From: Yuri Khan
Newsgroups: gmane.emacs.help
Subject: Re: Getting Emacs to play nice with Hunspell and apostrophes
Date: Thu, 12 Jun 2014 12:43:24 +0700
To: Emanuel Berg
Cc: "help-gnu-emacs@gnu.org"

On Wed, Jun 11, 2014 at 10:20 PM, Emanuel Berg wrote:

> You still haven't said one word why anyone would
> benefit from using those chars instead of the standard
> " and ' (and ...) that works everywhere and that
> everyone is familiar with (having trained their eyes
> for them year-in, year-out).

The fact that everybody uses " and ' and ` is a historical artifact, a
workaround of sorts, due to the limitations of the mechanical
typewriter. We need not be affected by it any more.

There was no possibility of fitting all the required typographical
characters or accented letters onto the printing ball, so both quotes
(“ and ”) and the diaeresis got conflated into a straight quote ",
both single quotes (‘ and ’) into a straight single quote/apostrophe
', and the backtick ` and tilde ~ were there to facilitate typing
accented letters.

This limitation then crept into computers, because this way the
character set could be encoded in 7 bits. The computer keyboard was
just modeled after the typewriter keyboard, with a few extensions.

Then the inevitable struck: computers expanded from the US and UK into
Germany, Sweden, Finland, France, Canada, and then countries with
non-Latin scripts (Greek, Cyrillic, and CJK). And all of them wanted
to have dedicated code points for their characters, e.g. type a single
ä instead of [a, backspace-no-delete, "].

For a good while, we lived in a nightmare of ten thousand code pages.
In Russia, you could receive an email and see a jumble of utterly
meaningless words because the message could be re-encoded (or the
Content-Type charset= stripped or re-labeled) on any of the
intermediate servers; there existed programs which were able to
heuristically detect the chain of re-encodings applied on the way and
decode your message for you. You could order a book in an Internet
shop and have them completely b0rk up the encoding of the shipping
address:

http://cdn.imagepush.to/in/625x2090/i/3/30/301/24.jpg

Then somebody at the postal system might decode the characters and the
package would still be delivered to the intended address.

Now that every widely used operating system supports Unicode, we don’t
have an excuse for clinging to those workarounds of the past century.
We are not limited by the 7-bit ASCII encoding and can store texts in
their true form.
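
To make the code-page nightmare above a bit more concrete, here is a minimal Python sketch of a single mislabeling step and a crude heuristic for undoing it. The function names, the list of code pages, and the letter-frequency score are all assumptions made up for this illustration; the real decoder programs of the time handled whole chains of re-encodings and were certainly more sophisticated.

```python
# Hypothetical illustration only: one mislabeling step, not a full chain.
CODE_PAGES = ["koi8_r", "cp1251", "cp866", "iso8859_5"]

# A handful of the most frequent Russian letters (assumed scoring set).
COMMON_LETTERS = "оеаинтсрвл"


def garble(text, written_as, read_as):
    """Encode text in one code page and decode the bytes as another."""
    return text.encode(written_as).decode(read_as, errors="replace")


def russian_score(text):
    """Count how many characters are frequent lowercase Russian letters."""
    return sum(ch in COMMON_LETTERS for ch in text)


def recover(garbled):
    """Try every (read_as, written_as) pair and keep the best-scoring result."""
    best, best_score = garbled, russian_score(garbled)
    for read_as in CODE_PAGES:
        for written_as in CODE_PAGES:
            if read_as == written_as:
                continue
            try:
                candidate = garbled.encode(read_as).decode(written_as)
            except UnicodeError:
                # This pair cannot represent the text at all; skip it.
                continue
            score = russian_score(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best
```

For example, `garble("Привет, мир", "koi8_r", "cp1251")` produces a string of valid but meaningless Cyrillic letters, and `recover()` applied to that string is expected to hand back the original greeting. A longer chain would need the same search applied repeatedly.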

We are also not constrained by the typewriter keyboard: input methods
based on Compose or Level3 let us conveniently enter all the necessary
diverse characters. On X11/GNU/Linux in particular, these come bundled
with the system; on Windows, one has to install a third-party package.

Much of the software has already evolved to support Unicode. That
which hasn’t, has to catch up. From a spell checker, in particular, I
expect that it should (perhaps with an optional switch) be able to
flag as an error any spelling of “isn’t” where the character between n
and t is not the preferred apostrophe character U+2019.
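
As a rough sketch of the check I have in mind (this is not how Hunspell or ispell.el work, and the function name and the set of look-alike characters are my own assumptions), something like the following Python would flag any word whose internal apostrophe is not U+2019:

```python
import re

# The preferred apostrophe, U+2019 RIGHT SINGLE QUOTATION MARK.
PREFERRED_APOSTROPHE = "\u2019"

# Assumed set of characters commonly typed in its place:
# ASCII apostrophe, backtick, acute accent, left single quote.
LOOKALIKES = "'`\u00b4\u2018"

# One or more letters (no digits, no underscore).
LETTERS = r"[^\W\d_]+"

# A word with at least one look-alike character between letter runs.
BAD_WORD = re.compile(LETTERS + "(?:[" + LOOKALIKES + "]" + LETTERS + ")+")


def flag_apostrophes(text):
    """Return (offset, word) for each word using a non-preferred apostrophe."""
    return [(m.start(), m.group(0)) for m in BAD_WORD.finditer(text)]


def suggest(word):
    """Return the word with every look-alike replaced by U+2019."""
    for ch in LOOKALIKES:
        word = word.replace(ch, PREFERRED_APOSTROPHE)
    return word
```

With this, `flag_apostrophes("It isn't so")` reports the word at offset 3, while the same sentence typed with U+2019 yields an empty list. A real spell checker would of course also need the dictionary lookup; this only covers the character test.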