From: "Kelly Dean"
Newsgroups: gmane.emacs.devel
Subject: Re: Correspondence between web-pages and Info-pages
Date: Tue, 30 Dec 2014 11:17:45 +0000
To: Stefan Monnier
Cc: emacs-devel@gnu.org

Stefan Monnier wrote:
> Hey, I think this is a great idea: replace the "(emacs)Title"
> syntax with a URL. When passed to Info, these URL would be redirected
> to the local Info pages.
>
> The main downside is that those URLs would take up more space. But the
> upside is not just greater exposure of our HTML manuals to search
> engines, but also the removal of the ad-hoc (info "(emacs)Title") syntax.

Don't overlook two important parts of this: using the same name both for user input and for display, and using different names for different formats of a page (Info vs. HTML).

Web browsers have some useful navigation features:
0. An address bar, which shows the name of the currently displayed page.
1. A drop-down menu that shows the sequence of visited pages for the current buffer, and the current position within that sequence.
2. In the address bar, you can enter a new name and press enter to open that page.
3. The name shown is the same string as the string you enter to open the page by name.
4. You can copy the name that's shown.
5. Because of the preceding three features, you can save the name into a text file that you use as a list of bookmarks, paste the name back into the address bar to return to the page, and use the name to cite the page so your readers can open it; IOW, you can use the name to link to the page.
6. The name can include a hash mark and section name at the end, so that when you open the page, the browser jumps to the named section.

Emacs's Info browser has feature #0, but lacks the rest. Emacs's Info-history command partially provides #1, but doesn't show the actual link sequence that's traversed by Info-history-back and Info-history-forward. Instead of #2, Emacs makes you remember a command («g», for Info-goto-node) for entering the name of the page to open. Regarding #3, for example, I'm currently viewing the page with the shown name ⌜(elisp)Top > Keymaps > Translation Keymaps⌝[0], but that's effectively like an HTML page title; it isn't the name used for opening the page.

([0]: I actually had to manually transcribe that name, because incredibly, Emacs lacks feature #4. See bug #19471.)

Features #1 and #2 would be nice to have but aren't essential, #4 is essential but fortunately is easy to implement, and #6 is unnecessary if pages aren't too long. But the lack of #3, and consequently of #5, is the major problem. If you adopt URL syntax for page names, be sure to not only use it for Info-goto-node, but also display it in the address bar in the Info browser, e.g. ⌜http://gnu.org/emacs/24.4/docs/elisp/keymaps/translation_keymaps⌝, regardless of whatever other syntax (e.g. ⌜(elisp)Translation Keymaps⌝ as the short name) might also be usable to open the page. For #2, have the address bar be editable, and have Info-goto-node simply move focus to it.

There was a proposal somewhere in this ginormous thread to use the same name for both an Info page and an HTML page, and serve the Info page from the local cache but the HTML page via HTTP from the official server. That's a bad idea, because then the name's scope isn't global; instead, what the name resolves to depends on which system (local or remote-official) is queried.

Suppose you try to fix that by relying on the User-Agent or some other request header to choose which format to return: Emacs caches and uses the format returned when it sends ⌜Info⌝ for that header, and web browsers use the format returned when they send any other value. Then the URL is no longer the name of the page; instead, the URL+header is the name. That's a facepalm-inducing convention that's already a widespread plague that Emacs shouldn't exacerbate, akin to using URL+source-ip for page names in order to balkanize the web (conspicuous offenders include Google and CloudFlare).

You could instead conflate the protocol name and the page type name and say ⌜info:gnu.org/emacs/24.4/docs/elisp/keymaps/translation_keymaps⌝ if you want. That would still enable feature #3. Or instead append a ⌜.info⌝ extension to the end of the name, as is commonly done with HTML, though that could be misleading if the page doesn't have its own dedicated Info file. Both of these require you to replace the ⌜info⌝ in the name by ⌜http⌝ or ⌜html⌝ before sending the name to non-Emacs users.

I propose a cleaner solution: have the name with no type extension resolve to a redirect. Do a client-side redirect, not a server-side one: serve a consistent response to all clients (regardless of request headers), containing both a standard HTTP redirect that web browsers will follow, and a new Info-file header that Info browsers will follow (web browsers will ignore it).
The former points to a page with the same name but with ⌜.html⌝ appended, and the latter to the Info file that contains the requested Info page. This way, the extensionless URL is effectively the name of a directory from which browsers automatically choose one of two files, but the URL alone, not the URL plus a header, is the name of the directory, and the files have their own URLs.

When you receive page URLs from non-Emacs users, it's easy enough to chop off the ⌜.html⌝ extension. When you send them page URLs without the extension, their browsers will automatically redirect.

For example, if your browser (web or Info) sends this query for a documentation page:

GET /emacs/24.4/docs/elisp/keymaps/translation_keymaps HTTP/1.0
Host: gnu.org

then the response is:

HTTP/1.0 302 Found
Location: http://gnu.org/emacs/24.4/docs/elisp/keymaps/translation_keymaps.html
Info-file: http://gnu.org/emacs/24.4/docs/elisp.info

Web browsers will redirect to the URL in the Location header.

Info browsers will:
Fetch the file named in the Info-file header.
Chop the ⌜.info⌝ extension from the value of the Info-file header to get Info-base.
Chop Info-base from the front of the original page URL to get the name of the page (⌜/keymaps/translation_keymaps⌝ in this case) within the Info file.
Load that page from the file.
In the address bar, display the original page URL.

Info can send all web requests through a cache. Distribute Emacs with the cache preloaded with Info files, including the original URL for each of those files.

When Info queries the cache for a cached Info file, the cache returns a file descriptor for that file.

When Info queries for a noncached Info file, the cache downloads and caches it and returns a descriptor.

When Info queries for any URL that starts with a string matching the URL of a cached Info file (excluding the ⌜.info⌝ extension), and the query URL itself doesn't have a filename extension, the cache generates and returns ⌜Info-file: X⌝ where X is the URL of the Info file. Info then processes this as a redirect.

When Info queries for anything else, the cache sends the query to the named server and returns the response to Info. If the response is a redirect, Info processes it.

This way, no network traffic is necessary for cached files. This also lets the same cache serve web browsers, not just Info browsers. The cache could be preloaded with HTML files for people who really don't like Info, and both Info and HTML files for people who like both. The cache doesn't need to be a server; it can just be a library, like SQLite is, and integrated into Emacs if only Info and Eww use it.

Indirecting through the Info-file header enables splitting or combining Info files without affecting the page URLs. E.g. elisp.info could be split up so that «keymaps», etc. are in separate files, or elisp.info, emacs.info, and all the other Info files could be combined into one big docs.info file, but with either of those changes, the page URLs would remain unchanged.

It doesn't matter whether URLs are used in Info files (or in Texinfo files), or the Info browser just translates the names for input and display. What matters for users is just the Info browser's UI. But if Info files use only relative names, then the browser must know the original URL of the file in order to construct the URL for each page and show that name in the address bar. Therefore, the browser can't just search a path on the local system to find Info files, like it currently does when the user runs e.g. ⌜(info "(elisp)")⌝, unless the file format is changed to include its own URL.
Alternatively, and more cleanly, the browser could just query the cache and have the cache do the search, and the cache can return ⌜Info-file: X⌝ if it finds a match, which Info then processes as a redirect.

For any query without a version number embedded in the name, the server should respond with a redirect to the same name but with the latest version number embedded. This makes it easy to check for updates, and to link to the always-latest version of a page.

For non-English manuals, there's no need to embed the language name in the URL; just use the source-ip address of the request to choose which version to serve, like Google does. (Just checking if anybody is still awake.)
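
For concreteness, here is a minimal Python sketch of the two name-chopping steps that Info browsers perform, and of the cache's prefix rule for generating an ⌜Info-file: X⌝ redirect, as described earlier in this message. It is only an illustration under assumptions: the function names (info_base, page_name, cache_redirect) are hypothetical, Info-file is the header proposed here rather than an existing one, and requiring a ⌜/⌝ boundary after Info-base is an added tightening of the prefix match, not part of the proposal itself.

```python
import posixpath
from typing import Iterable, Optional
from urllib.parse import urlsplit

INFO_EXT = ".info"


def info_base(info_file_url: str) -> str:
    """Chop the .info extension from an Info-file header value."""
    if not info_file_url.endswith(INFO_EXT):
        raise ValueError("not an Info file URL: " + info_file_url)
    return info_file_url[: -len(INFO_EXT)]


def page_name(page_url: str, info_file_url: str) -> str:
    """Chop Info-base from the front of the page URL to get the page name."""
    base = info_base(info_file_url)
    if not page_url.startswith(base):
        raise ValueError("page URL is not under " + base)
    return page_url[len(base):]


def has_extension(url: str) -> bool:
    """True if the last path component of the URL has a filename extension."""
    return posixpath.splitext(urlsplit(url).path)[1] != ""


def cache_redirect(query_url: str, cached_info_urls: Iterable[str]) -> Optional[str]:
    """Return X for a generated "Info-file: X" redirect, or None if no rule applies.

    Assumption (not in the proposal): the match must end at a "/" boundary,
    so that .../elisp does not also match .../elisp2/...
    """
    if has_extension(query_url):
        return None  # e.g. a .html or .info URL names a file directly
    for info_url in cached_info_urls:
        base = info_base(info_url)
        if query_url == base or query_url.startswith(base + "/"):
            return info_url
    return None


if __name__ == "__main__":
    page = "http://gnu.org/emacs/24.4/docs/elisp/keymaps/translation_keymaps"
    info = "http://gnu.org/emacs/24.4/docs/elisp.info"
    print(page_name(page, info))
    print(cache_redirect(page, [info]))
```

With the example URLs from this message, page_name yields ⌜/keymaps/translation_keymaps⌝, and cache_redirect returns the elisp.info URL for the extensionless page URL but nothing for the same URL with ⌜.html⌝ appended.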