From: Juri Linkov
Newsgroups: gmane.emacs.devel
Subject: Re: Language identification
Date: Fri, 28 Aug 2009 22:16:30 +0300
Organization: JURTA
Message-ID: <877hwnoi8d.fsf@mail.jurta.org>
References: <87skfczqc8.fsf@mail.jurta.org>
In-Reply-To: (Stefan Monnier's message of "Fri, 28 Aug 2009 00:58:42 -0400")
To: Stefan Monnier
Cc: joakim@verona.se, Emacs Development

>>> I often wish that files would open in Emacs with correct mode
>>> more often when there is no file extension.
>> In `auto-mode-alist' you can see that with the exception of
>> `archive-mode', `doc-view-mode' and `image-mode', all remaining
>> modes are programming text modes.  It would be more useful
>> to identify file types for these modes that libmagic can't do.
>> Do you know a library that identifies programming languages?
>> Such a library might be implemented using a Bayesian classifier
>> trained on a sufficiently large corpus of different programming
>> languages.
>
> OTOH, how often do you see a file containing programming language code and
> yet without any extension?

More often with a non-standard extension than without any extension.
Also, there are conflicting extensions, e.g. ".pl" for both Perl
and Prolog (esp. SWI-Prolog).

-- 
Juri Linkov
http://www.jurta.org/emacs/