* using libmagic in Emacs?
@ 2009-08-18 18:35 joakim
  2009-08-18 19:23 ` Stefan Monnier
  2009-08-28  0:27 ` Language identification (was: using libmagic in Emacs) Juri Linkov
  0 siblings, 2 replies; 119+ messages in thread
From: joakim @ 2009-08-18 18:35 UTC (permalink / raw)
To: Emacs Development

(This is probably a FAQ but my google skills seem to fail me)

There are some operations in Emacs which try to do the same thing as the "libmagic" library, which is the core of the "file" utility, does. For instance, in "image.el" there is functionality to look at magic numbers in image files.

Also, I often wish that files would open in Emacs with the correct mode more often when there is no file extension.

Would there be interest in an Emacs patch for libmagic, or is there some obvious reason this hasn't been done yet? I envision this as being an interface with 2 implementations, a lisp fallback like today, and libmagic if available. I did a libmagick wrapper for Ocaml using Swig before so I have some familiarity with the API.

--
Joakim Verona

^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs?
From: Stefan Monnier @ 2009-08-18 19:23 UTC
To: joakim; +Cc: Emacs Development

> (This is probably a FAQ but my google skills seem to fail me)
> There are some operations in Emacs which try to do the same thing as
> the "libmagic" library, which is the core of the "file" utility, does.
> For instance, in "image.el" there is functionality to look at magic
> numbers in image files.
> Also, I often wish that files would open in Emacs with the correct mode
> more often when there is no file extension.
> Would there be interest in an Emacs patch for libmagic, or is there some
> obvious reason this hasn't been done yet? I envision this as being an
> interface with 2 implementations, a lisp fallback like today, and
> libmagic if available. I did a libmagick wrapper for Ocaml using Swig
> before so I have some familiarity with the API.

I think it's a good idea. It may require some non-trivial changes on the Lisp side, since libmagic's information is not quite the same as what Emacs currently uses: we'll probably want to use libmagic to get a MIME-type and then have a table mapping mime-types to major modes or some such.

        Stefan
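The two ideas in this exchange (a detector interface with a built-in fallback, and a table from MIME types to major modes) can be sketched roughly as follows. This is only an illustration in Python, not the proposed Emacs code: `fallback_detect`, `choose_mode`, the `detector` parameter and both tables are hypothetical names and contents invented for the example, and no real libmagic binding is called. A libmagic-backed function would be passed in as `detector` where available.

```python
from typing import Callable, Optional

# Hypothetical fallback table: leading magic bytes -> MIME type.
# It stands in for the "lisp fallback like today"; the entries are
# illustrative, not taken from image.el.
MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"#!/bin/sh", "text/x-shellscript"),
]

# Hypothetical MIME type -> major mode table (the mapping is an assumption).
MIME_TO_MODE = {
    "image/png": "image-mode",
    "image/jpeg": "image-mode",
    "image/gif": "image-mode",
    "application/pdf": "doc-view-mode",
    "text/x-shellscript": "sh-mode",
}

# A detector takes the first bytes of a file and returns a MIME type or None.
Detector = Callable[[bytes], Optional[str]]


def fallback_detect(head: bytes) -> Optional[str]:
    """Built-in detector: compare the start of the file with known magic numbers."""
    for magic, mime in MAGIC_NUMBERS:
        if head.startswith(magic):
            return mime
    return None


def choose_mode(head: bytes,
                detector: Optional[Detector] = None,
                default: str = "fundamental-mode") -> str:
    """Pick a major mode name for file contents starting with `head`.

    If an external detector (for example one backed by libmagic) is given,
    it is asked first; the built-in table is used when no detector is
    available or when it returns None.
    """
    mime = detector(head) if detector is not None else None
    if mime is None:
        mime = fallback_detect(head)
    return MIME_TO_MODE.get(mime, default)


if __name__ == "__main__":
    print(choose_mode(b"#!/bin/sh\necho hello\n"))
    print(choose_mode(b"no recognisable header"))
```

Keeping the detector behind a single function type means callers never need to know which implementation answered, which matches the "2 implementations" idea in the original message.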
* Re: using libmagic in Emacs?
From: Chong Yidong @ 2009-08-18 20:01 UTC
To: Stefan Monnier; +Cc: joakim, Emacs Development

Stefan Monnier <monnier@IRO.UMontreal.CA> writes:

>> Would there be interest in an Emacs patch for libmagic, or is there some
>> obvious reason this hasn't been done yet? I envision this as being an
>> interface with 2 implementations, a lisp fallback like today, and
>> libmagic if available. I did a libmagick wrapper for Ocaml using Swig
>> before so I have some familiarity with the API.
>
> I think it's a good idea. It may require some non-trivial changes on
> the Lisp side, since libmagic's information is not quite the same as
> what Emacs currently uses: we'll probably want to use libmagic to get
> a MIME-type and then have a table mapping mime-types to major modes or
> some such.

This development would probably have to take place in a separate branch.
* Re: using libmagic in Emacs?
From: joakim @ 2009-08-18 20:35 UTC
To: Chong Yidong; +Cc: Stefan Monnier, Emacs Development

Chong Yidong <cyd@stupidchicken.com> writes:

> Stefan Monnier <monnier@IRO.UMontreal.CA> writes:
>
>>> Would there be interest in an Emacs patch for libmagic, or is there some
>>> obvious reason this hasn't been done yet? I envision this as being an
>>> interface with 2 implementations, a lisp fallback like today, and
>>> libmagic if available. I did a libmagick wrapper for Ocaml using Swig
>>> before so I have some familiarity with the API.
>>
>> I think it's a good idea. It may require some non-trivial changes on
>> the Lisp side, since libmagic's information is not quite the same as
>> what Emacs currently uses: we'll probably want to use libmagic to get
>> a MIME-type and then have a table mapping mime-types to major modes or
>> some such.
>
> This development would probably have to take place in a separate
> branch.

I will work in my local git repos, and publish a patch here, much like the imagemagick patch and the xwidget patch. I can switch to bzr whenever that works.

The core libmagic lisp api should, however, be rather stand-alone and non-intrusive. Client code such as the image type recognition code can then be ported successively.

--
Joakim Verona
* Re: using libmagic in Emacs?
From: Stefan Monnier @ 2009-08-18 21:11 UTC
To: joakim; +Cc: Chong Yidong, Emacs Development

>>>> Would there be interest in an Emacs patch for libmagic, or is there some
>>>> obvious reason this hasn't been done yet? I envision this as being an
>>>> interface with 2 implementations, a lisp fallback like today, and
>>>> libmagic if available. I did a libmagick wrapper for Ocaml using Swig
>>>> before so I have some familiarity with the API.
>>>
>>> I think it's a good idea. It may require some non-trivial changes on
>>> the Lisp side, since libmagic's information is not quite the same as
>>> what Emacs currently uses: we'll probably want to use libmagic to get
>>> a MIME-type and then have a table mapping mime-types to major modes or
>>> some such.
>> This development would probably have to take place in a separate
>> branch.

I don't expect it to be too intrusive, so I think it can be done on the trunk, tho of course, each step needs to be planned with care.

> I will work in my local git repos, and publish a patch here, much like
> the imagemagick patch and the xwidget patch. I can switch to bzr
> whenever that works.

That's fine as well.

> The core libmagic lisp api should, however, be rather stand-alone and
> non-intrusive. Client code such as the image type recognition code can
> then be ported successively.

I think it's OK to install the code in CVS as soon as the Lisp API to libmagic is ready. Once that is done, we can decide which next steps to take.

        Stefan
* Re: using libmagic in Emacs?
From: Eli Zaretskii @ 2009-08-19 2:58 UTC
To: Stefan Monnier; +Cc: cyd, joakim, emacs-devel

> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Date: Tue, 18 Aug 2009 17:11:02 -0400
> Cc: Chong Yidong <cyd@stupidchicken.com>,
>     Emacs Development <emacs-devel@gnu.org>
>
> >>> I think it's a good idea. It may require some non-trivial changes on
> >>> the Lisp side, since libmagic's information is not quite the same as
> >>> what Emacs currently uses: we'll probably want to use libmagic to get
> >>> a MIME-type and then have a table mapping mime-types to major modes or
> >>> some such.
> >> This development would probably have to take place in a separate
> >> branch.
>
> I don't expect it to be too intrusive, so I think it can be done on the
> trunk, tho of course, each step needs to be planned with care.

So what is the rule for new features that can be installed on the trunk at this time? I thought only relatively minor and safe ones, but this one seems to break that rule, at least in my book.

If this one is okay, then why not something like bidirectional editing, for example?

Maybe we should simply decide right here and now that Emacs 23.2 will be delivered from the RC branch, and open the trunk for all changes, even not-so-safe ones?
* Re: using libmagic in Emacs?
From: Stefan Monnier @ 2009-08-19 3:21 UTC
To: Eli Zaretskii; +Cc: cyd, joakim, emacs-devel

>> >>> I think it's a good idea. It may require some non-trivial changes on
>> >>> the Lisp side, since libmagic's information is not quite the same as
>> >>> what Emacs currently uses: we'll probably want to use libmagic to get
>> >>> a MIME-type and then have a table mapping mime-types to major modes or
>> >>> some such.
>> >> This development would probably have to take place in a separate
>> >> branch.
>>
>> I don't expect it to be too intrusive, so I think it can be done on the
>> trunk, tho of course, each step needs to be planned with care.

> So what is the rule for new features that can be installed on the
> trunk at this time?

The rule is: anything is possible, but ones that aren't simple and safe need to get confirmation here first.

> I thought only relatively minor and safe ones,
> but this one seems to break that rule, at least in my book.

It looks pretty safe: the first step is to add the Lisp API, which should not impact any other code (tho it may cause temporary build failures, I guess). After that, set-auto-mode (and/or image.el, ...) will need to be tweaked to also take libmagic into account when available. This should also be fairly simple.

> If this one is okay, then why not something like bidirectional
> editing, for example?

I was thinking of bidi for Emacs-24, but if you have code ready for it, and if it's not too intrusive, I'd be willing to consider it.

> Maybe we should simply decide right here and now that Emacs 23.2 will
> be delivered from the RC branch, and open the trunk for all changes,
> even not-so-safe ones?

Yes, that's pretty much where we're at, I think, yes.

        Stefan
* Re: using libmagic in Emacs?
From: Chong Yidong @ 2009-08-19 13:47 UTC
To: Stefan Monnier; +Cc: Eli Zaretskii, joakim, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> I thought only relatively minor and safe ones,
>> but this one seems to break that rule, at least in my book.
>
> It looks pretty safe: the first step is to add the Lisp API, which
> should not impact any other code (tho it may cause temporary build
> failures, I guess). After that, set-auto-mode (and/or image.el, ...)
> will need to be tweaked to also take libmagic into account
> when available. This should also be fairly simple.

I don't think that sounds simple or safe; however, there's not enough information to know for sure, until Joakim posts the patch.

>> Maybe we should simply decide right here and now that Emacs 23.2 will
>> be delivered from the RC branch, and open the trunk for all changes,
>> even not-so-safe ones?
>
> Yes, that's pretty much where we're at, I think, yes.

Actually, if people want to start including more intrusive changes, I think we should cut a new branch from the current trunk. This would postpone the CEDET merge to 23.3.
* Re: using libmagic in Emacs?
From: joakim @ 2009-08-19 15:57 UTC
To: Chong Yidong; +Cc: Eli Zaretskii, Stefan Monnier, emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>>> I thought only relatively minor and safe ones,
>>> but this one seems to break that rule, at least in my book.
>>
>> It looks pretty safe: the first step is to add the Lisp API, which
>> should not impact any other code (tho it may cause temporary build
>> failures, I guess). After that, set-auto-mode (and/or image.el, ...)
>> will need to be tweaked to also take libmagic into account
>> when available. This should also be fairly simple.
>
> I don't think that sounds simple or safe; however, there's not enough
> information to know for sure, until Joakim posts the patch.
>
>>> Maybe we should simply decide right here and now that Emacs 23.2 will
>>> be delivered from the RC branch, and open the trunk for all changes,
>>> even not-so-safe ones?
>>
>> Yes, that's pretty much where we're at, I think, yes.
>
> Actually, if people want to start including more intrusive changes, I
> think we should cut a new branch from the current trunk. This would
> postpone the CEDET merge to 23.3.

Please don't postpone CEDET on my behalf! That would feel terrible.

--
Joakim Verona
* next bugfix release? [was: Re: using libmagic in Emacs?]
From: Dan Nicolaescu @ 2009-08-19 19:46 UTC
To: Chong Yidong; +Cc: Eli Zaretskii, Stefan Monnier, joakim, emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
> >> I thought only relatively minor and safe ones,
> >> but this one seems to break that rule, at least in my book.
> >
> > It looks pretty safe: the first step is to add the Lisp API, which
> > should not impact any other code (tho it may cause temporary build
> > failures, I guess). After that, set-auto-mode (and/or image.el, ...)
> > will need to be tweaked to also take libmagic into account
> > when available. This should also be fairly simple.
>
> I don't think that sounds simple or safe; however, there's not enough
> information to know for sure, until Joakim posts the patch.
>
> >> Maybe we should simply decide right here and now that Emacs 23.2 will
> >> be delivered from the RC branch, and open the trunk for all changes,
> >> even not-so-safe ones?
> >
> > Yes, that's pretty much where we're at, I think, yes.
>
> Actually, if people want to start including more intrusive changes, I
> think we should cut a new branch from the current trunk.

Are there any plans for the next release?

IMO we need to make a bug fix release ASAP, this bug:

http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=4146

warrants it. Not being able to set the C style in a file (or directory) local variable is a major annoyance for users.
* Re: next bugfix release?
From: Chong Yidong @ 2009-08-19 21:06 UTC
To: Dan Nicolaescu; +Cc: Eli Zaretskii, Stefan Monnier, joakim, emacs-devel

Dan Nicolaescu <dann@ics.uci.edu> writes:

> Are there any plans for the next release?

The plan (and Stefan agrees) is to spend about the same amount of time as we did between 22.1 and 22.2. This would put 23.2 around April next year.

> IMO we need to make a bug fix release ASAP, this bug:
>
> http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=4146
>
> warrants it. Not being able to set the C style in a file (or
> directory) local variable is a major annoyance for users.

This bug is pretty serious. Happily, it only affects the trunk; the 23.1 release is unaffected.

The most likely culprit is the 2009-07-18 change to cc-mode.el, which we did not apply to the release branch.
* Re: next bugfix release?
From: Dan Nicolaescu @ 2009-08-19 21:53 UTC
To: Chong Yidong; +Cc: Eli Zaretskii, Stefan Monnier, joakim, emacs-devel

Chong Yidong <cyd@stupidchicken.com> writes:

> Dan Nicolaescu <dann@ics.uci.edu> writes:
>
> > IMO we need to make a bug fix release ASAP, this bug:
> >
> > http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=4146
> >
> > warrants it. Not being able to set the C style in a file (or
> > directory) local variable is a major annoyance for users.
>
> This bug is pretty serious. Happily, it only affects the trunk; the
> 23.1 release is unaffected.

Good to hear that! (I don't have a 23.1 handy, just the CVS build...) I'll make a note of this in the bug.
* Re: next bugfix release?
From: Alan Mackenzie @ 2009-08-19 22:56 UTC
To: Chong Yidong
Cc: Eli Zaretskii, Dan Nicolaescu, Stefan Monnier, joakim, emacs-devel

Hi, Everybody!

On Wed, Aug 19, 2009 at 05:06:29PM -0400, Chong Yidong wrote:
> Dan Nicolaescu <dann@ics.uci.edu> writes:
> > Are there any plans for the next release?
> The plan (and Stefan agrees) is to spend about the same amount of time
> as we did between 22.1 and 22.2. This would put 23.2 around April next
> year.
> > IMO we need to make a bug fix release ASAP, this bug:
> > http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=4146
> > warrants it. Not being able to set the C style in a file (or
> > directory) local variable is a major annoyance for users.

OK. This is for me to fix, so I'm acknowledging having seen it.

> This bug is pretty serious. Happily, it only affects the trunk; the
> 23.1 release is unaffected.

Phew! Thanks for that!

> The most likely culprit is the 2009-07-18 change to cc-mode.el, which
> we did not apply to the release branch.

Yes. It's one of these @dfn{wallpaper paste} bugs, where when you press an area of wallpaper firmly to the wall, the paste under it pops up another bit of wallpaper somewhere else. It's going to be a horrible bug to fix. Indeed, it may not be fixable, in the sense of doing the Right Thing under every reasonable circumstance.

--
Alan Mackenzie (Nuremberg, Germany).
* Re: next bugfix release?
From: Nick Roberts @ 2009-08-19 23:16 UTC
To: Alan Mackenzie
Cc: Chong Yidong, joakim, emacs-devel, Dan Nicolaescu, Stefan Monnier, Eli Zaretskii

 > Yes. It's one of these @dfn{wallpaper paste} bugs, where when you press
 > an area of wallpaper firmly to the wall, the paste under it pops up
 > another bit of wallpaper somewhere else. It's going to be a horrible bug
 > to fix. Indeed, it may not be fixable, in the sense of doing the Right
 > Thing under every reasonable circumstance.

Such 'bubbles', usually called regressions, might be less likely to appear if, as Cyd suggested, there was a CC mode equivalent of compilation.txt. I can't find this post now, so apologies if it reached a logical conclusion. It might also be harder to implement than compilation.txt, as expressions are probably not as self contained, but some kind of test suite seems essential to prevent these issues from recurring.

--
Nick http://www.inet.net.nz/~nickrob
* Re: next bugfix release?
From: Lennart Borgman @ 2009-08-20 9:02 UTC
To: Nick Roberts
Cc: Chong Yidong, joakim, emacs-devel, Dan Nicolaescu, Stefan Monnier, Alan Mackenzie, Eli Zaretskii

On Thu, Aug 20, 2009 at 1:16 AM, Nick Roberts<nickrob@snap.net.nz> wrote:
> > Yes. It's one of these @dfn{wallpaper paste} bugs, where when you press
> > an area of wallpaper firmly to the wall, the paste under it pops up
> > another bit of wallpaper somewhere else. It's going to be a horrible bug
> > to fix. Indeed, it may not be fixable, in the sense of doing the Right
> > Thing under every reasonable circumstance.
>
> Such 'bubbles', usually called regressions, might be less likely to appear
> if, as Cyd suggested, there was a CC mode equivalent of compilation.txt.
> I can't find this post now, so apologies if it reached a logical conclusion.
> It might also be harder to implement than compilation.txt, as expressions
> are probably not as self contained, but some kind of test suite seems essential
> to prevent these issues from recurring.

Is not some kind of unit tests what we want here?

Adding a unit test for at least every serious and difficult bug seems to me the right thing to do.

There are some unit test frameworks on EmacsWiki. I have one modified version of one of these in nXhtml. And CEDET has some unit tests (which I have not looked at but I have run the test suite).
* Re: next bugfix release?
From: Eric M. Ludlam @ 2009-08-20 11:19 UTC
To: Lennart Borgman
Cc: Nick Roberts, joakim, emacs-devel, Dan Nicolaescu, Stefan Monnier, Alan Mackenzie, Eli Zaretskii, Chong Yidong

On Thu, 2009-08-20 at 11:02 +0200, Lennart Borgman wrote:
> On Thu, Aug 20, 2009 at 1:16 AM, Nick Roberts<nickrob@snap.net.nz> wrote:
> > > Yes. It's one of these @dfn{wallpaper paste} bugs, where when you press
> > > an area of wallpaper firmly to the wall, the paste under it pops up
> > > another bit of wallpaper somewhere else. It's going to be a horrible bug
> > > to fix. Indeed, it may not be fixable, in the sense of doing the Right
> > > Thing under every reasonable circumstance.
> >
> > Such 'bubbles', usually called regressions, might be less likely to appear
> > if, as Cyd suggested, there was a CC mode equivalent of compilation.txt.
> > I can't find this post now, so apologies if it reached a logical conclusion.
> > It might also be harder to implement than compilation.txt, as expressions
> > are probably not as self contained, but some kind of test suite seems essential
> > to prevent these issues from recurring.
>
>
> Is not some kind of unit tests what we want here?
>
> Adding a unit test for at least every serious and difficult bug seems
> to me the right thing to do.
>
>
> There are some unit test frameworks on EmacsWiki. I have one modified
> version of one of these in nXhtml. And CEDET has some unit tests
> (which I have not looked at but I have run the test suite).

Just as another vote for testing, CEDET floundered for a long time with performance and accuracy issues (say from 1996 through 2007 or so) until I started adding test suites.

Ever since, I've been able to do sweeping changes to the underpinnings for performance, bugs or new features, and know that everything still works by running a single "make" command. Every user reported bug I fix gets a new test in one of the pre-existing test suites as I work on the bug, and several folks on the mailing list now are very good at providing test snippets for me, keeping maintenance and overhead low.

I highly recommend doing the same for any complex task in Emacs.

Eric
* Re: next bugfix release?
From: Alan Mackenzie @ 2009-08-20 15:13 UTC
To: Nick Roberts
Cc: Chong Yidong, joakim, emacs-devel, Dan Nicolaescu, Stefan Monnier, Eli Zaretskii

Hi, Nick,

On Thu, Aug 20, 2009 at 11:16:18AM +1200, Nick Roberts wrote:
> > Yes. It's one of these @dfn{wallpaper paste} bugs, where when you
> > press an area of wallpaper firmly to the wall, the paste under it
> > pops up another bit of wallpaper somewhere else. It's going to be
> > a horrible bug to fix. Indeed, it may not be fixable, in the sense
> > of doing the Right Thing under every reasonable circumstance.

> Such 'bubbles', usually called regressions, ....

No, that's not what I meant (although it is also a regression). I was talking about the high complexity caused by the number of things that all have to work right at the same time. The "fix" I made to cc-mode.el in July fixed one problem but created another. Getting them all working simultaneously is going to be hard.

This complexity has increased recently due to the new feature "directory locals". I didn't become aware of this when it was introduced (my bad), and the person who wrote it wasn't aware of the trouble it would cause CC Mode (why should he be?).

The trouble is, there are too many ways of setting a CC Mode "style variable" (such as c-basic-offset), @xref{Config Basics,,, ccmode}. It is not always the last setting which should prevail over previous ones. It is a complexity which nobody would design; it has emerged as such over CC Mode's lifetime, and is now a mess. The .dir-locals feature may have pushed the complexity over the edge of what is manageable.

> ...., might be less likely to appear if, as Cyd suggested, there was a
> CC mode equivalent of compilation.txt.

Er, what's .../etc/compilation.txt about? It has an alleged explanation at the top, but that only makes sense if you already have some context. For example, you need to know why you'd "need matchers", and what sort of "matchers" they are.

> I can't find this post now, so apologies if it reached a logical
> conclusion. It might also be harder to implement than
> compilation.txt, as expressions are probably not as self contained,
> but some kind of test suite seems essential to prevent these issues from
> recurring.

CC Mode has an extensive test suite for (static) indentation and fontification. It doesn't have any such tests for things like initialisation of the mode, execution of CC Mode commands, or for indentation/fontification after buffer changes. I would very much welcome anybody stepping forward who had the time and energy to write these tests.

> Nick

--
Alan Mackenzie (Nuremberg, Germany).
* Re: next bugfix release?
From: Lennart Borgman @ 2009-08-20 15:47 UTC
To: Alan Mackenzie
Cc: Nick Roberts, joakim, emacs-devel, Dan Nicolaescu, Stefan Monnier, Eli Zaretskii, Chong Yidong

On Thu, Aug 20, 2009 at 5:13 PM, Alan Mackenzie<acm@muc.de> wrote:
> CC Mode has an extensive test suite for (static) indentation and
> fontification. It doesn't have any such tests for things like
> initialisation of the mode, execution of CC Mode commands, or for
> indentation/fontification after buffer changes. I would very much
> welcome anybody stepping forward who had the time and energy to write
> these tests.

The tests I use for nXhtml try to emulate commands (taking into account pre/post etc) which might help for doing this. They also try to fontify by explicitly calling the timers.

Maybe this can help. If someone wants to try I can explain more.
* Re: installing features on trunk (was: using libmagic in Emacs?)
From: Eli Zaretskii @ 2009-08-19 19:05 UTC
To: Stefan Monnier; +Cc: cyd, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: joakim@verona.se, cyd@stupidchicken.com, emacs-devel@gnu.org
> Date: Tue, 18 Aug 2009 23:21:13 -0400
>
> > I thought only relatively minor and safe ones,
> > but this one seems to break that rule, at least in my book.
>
> It looks pretty safe

As you see, even Yidong is not sure he agrees, and neither am I.

> I was thinking of bidi for Emacs-24

If history is of any significance, I may not live until Emacs 24. And for some strange reason, the burden of adding this feature seems to be on my shoulders and no one else's: no development happened in this direction for the last several years, even though most of the low-level code was sitting on a branch (courtesy of Handa-san) for the last 4 years.

So I'd prefer it to happen sooner rather than later, at least to the point where the foundations are in place and others can contribute the rest.

> but if you have code ready for it
> and if it's not too intrusive, I'd be willing to consider it.

It is not ``ready'' in the sense that it is not yet production quality. It does not yet support all the features of the Emacs display engine. But it can already display bidirectional text, for now only in a left-to-right paragraph and only if the text has no faces and overlays. The code that reorders characters for display isn't activated until you flip a buffer-local variable, and then only in that buffer. Is that ``not too intrusive'' enough?
* Bidi support
From: Stefan Monnier @ 2009-08-21 18:59 UTC
To: Eli Zaretskii; +Cc: cyd, emacs-devel

>> I was thinking of bidi for Emacs-24

> If history is of any significance, I may not live until Emacs 24. And
> for some strange reason, the burden of adding this feature seems to be
> on my shoulders and no one else's: no development happened in this
> direction for the last several years, even though most of the
> low-level code was sitting on a branch (courtesy of Handa-san) for the
> last 4 years.

Has this branch been kept up-to-date w.r.t the trunk? I'd guess not. In that case, someone should do it. I just took a look at that branch, and it doesn't look too terrible. It made me discover the variable direction-reversed which I didn't even know existed (and it seems that it currently has no effect :-( ).

> So I'd prefer it to happen sooner rather than later, at least to the
> point where the foundations are in place and others can contribute
> the rest.

Agreed. The more I think about it, the more I think we need to open a new branch for "what will become emacs-24". Kind of like what we did with the emacs-unicode branch. I think bidi should be one of the first features to install on that branch.

>> but if you have code ready for it
>> and if it's not too intrusive, I'd be willing to consider it.

> It is not ``ready'' in the sense that it is not yet production
> quality. It does not yet support all the features of the Emacs
> display engine. But it can already display bidirectional text, for
> now only in a left-to-right paragraph and only if the text has no
> faces and overlays. The code that reorders characters for display
> isn't activated until you flip a buffer-local variable, and then only
> in that buffer. Is that ``not too intrusive'' enough?

I think it will stay unstable for too long, so it's not good enough for the current trunk (which I'd like to keep for shorter-term changes).

        Stefan
* Re: Bidi support 2009-08-21 18:59 ` Bidi support Stefan Monnier @ 2009-08-21 20:44 ` Eli Zaretskii 2009-08-22 3:39 ` Stefan Monnier 2009-08-22 8:18 ` Jason Rumney 2009-08-22 5:39 ` Stephen J. Turnbull 1 sibling, 2 replies; 119+ messages in thread From: Eli Zaretskii @ 2009-08-21 20:44 UTC (permalink / raw) To: Stefan Monnier; +Cc: cyd, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: cyd@stupidchicken.com, emacs-devel@gnu.org > Date: Fri, 21 Aug 2009 14:59:24 -0400 > > Has this branch been kept up-to-date w.r.t the trunk? No. > I'd guess not. In that case, someone should do it. I already did. I have a source tree where the bidi code is merged with the current trunk, and I merge them as needed all the time. That's where I do all the development beyond what's on the bidi branch (which is dead, as far as I'm concerned). I can keep it that way forever, but I'd prefer to have it in the repository soon, because others might wish to work on it. > Agreed. The more I think about it, the more I think we need to open > a new branch for "what will become emacs-24". Kind of like what we did > with the emacs-unicode branch. I think bidi should be one of the first > features to install on that branch. Why not the other way around: make a branch for Emacs 23.x, and leave Emacs 24 on the trunk? I think Yidong suggested that, and I think it's a better idea. We never left the mainline of our development on a branch before. > I think it will stay unstable for too long, so it's not good enough for > the current trunk (which I'd like to keep for shorter-term changes). I guess that's a NO, but please note that this code, however unstable, is never executed unless the user flips a variable. So I don't see how it can destabilize the default configuration by being dead ballast. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Bidi support 2009-08-21 20:44 ` Eli Zaretskii @ 2009-08-22 3:39 ` Stefan Monnier 2009-08-22 8:18 ` Jason Rumney 1 sibling, 0 replies; 119+ messages in thread From: Stefan Monnier @ 2009-08-22 3:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cyd, emacs-devel > I already did. I have a source tree where the bidi code is merged > with the current trunk, and I merge them as needed all the time. Good, thank you. >> Agreed. The more I think about it, the more I think we need to open >> a new branch for "what will become emacs-24". Kind of like what we did >> with the emacs-unicode branch. I think bidi should be one of the first >> features to install on that branch. > Why not the other way around: make a branch for Emacs 23.x, and leave > Emacs 24 on the trunk? I think Yidong suggested that, and I think > it's a better idea. Sure. I tend to forget about CVS's idea that one of the branches is deemed special. So yes, "open a new branch for emacs-24" here would mean "create a new CVS branch for Emacs-23.2 and use the trunk for Emacs-24". [ I'm eagerly waiting to switch over to a system where branches are easier to use. ] > We never left the mainline of our development on a branch before. Actually we did for emacs-unicode ;-) Not that it matters, tho. >> I think it will stay unstable for too long, so it's not good enough for >> the current trunk (which I'd like to keep for shorter-term changes). > I guess that's a NO, but please note that this code, however unstable, > is never executed unless the user flips a variable. So I don't see > how it can destabilize the default configuration by being > dead ballast. I know, but somehow having such experimental code there makes me uneasy. Stefan ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Bidi support 2009-08-21 20:44 ` Eli Zaretskii 2009-08-22 3:39 ` Stefan Monnier @ 2009-08-22 8:18 ` Jason Rumney 1 sibling, 0 replies; 119+ messages in thread From: Jason Rumney @ 2009-08-22 8:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cyd, Stefan Monnier, emacs-devel Eli Zaretskii wrote: > Why not the other way around: make a branch for Emacs 23.x, and leave > Emacs 24 on the trunk? I think Yidong suggested that, and I think > it's a better idea. We never left the mainline of our development on > a branch before. > We did for both unicode and multi-tty. Once the code was usable on GNU/Linux and at least compilable and didn't break on all other platforms (which involved removing Carbon support, because no one wanted to work on it), it was merged back to the trunk. We should probably aim to keep the branch period short, but it would be useful to check what you have now into a new branch so others can try it before it goes on the trunk. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Bidi support 2009-08-21 18:59 ` Bidi support Stefan Monnier 2009-08-21 20:44 ` Eli Zaretskii @ 2009-08-22 5:39 ` Stephen J. Turnbull 2009-08-22 7:31 ` Eli Zaretskii 2009-08-24 1:45 ` Kenichi Handa 1 sibling, 2 replies; 119+ messages in thread From: Stephen J. Turnbull @ 2009-08-22 5:39 UTC (permalink / raw) To: Stefan Monnier; +Cc: Eli Zaretskii, cyd, emacs-devel

Stefan Monnier writes:

 > Agreed. The more I think about it, the more I think we need to open
 > a new branch for "what will become emacs-24".

That's what I thought, too, about XEmacs 21.5. I was wrong. We ended up having to uproot the trunk and move it to a branch, and graft the 21.5 branch back as the trunk. Long-term development belongs either on the trunk, or in feature branches. Not on a long-term development branch which collects several features.

Once you move to bzr, this will become less costly, but remember that in bzr (like CVS, but not so disastrously as CVS) some branches are more equal than others in the way they are presented to the user. It's not like git where you just do

    git branch emacs-23 master
    git branch -f master emacs-24
    git branch -D emacs-24          # optional

and everything's alright.

 > I think it will stay unstable for too long, so it's not good enough for
 > the current trunk (which I'd like to keep for shorter-term changes).

I know how you feel, but doing things this way is either going to be a lot of work (synching "short-term changes" from the trunk to the long-term branch -- in my experience, people work on features like unicode and bidi in spurts, and they're pervasive changes so conflicts are frequent if you come back every month or so), or discourage work on the long-term branch (conflict resolution is enthusiasm-draining). It also fails to encourage work on the instabilities by third parties.

I may be all wet; ask Ken'ichi and Miles how they feel about the long stalls on the unicode and lexbind branches.
And maybe bzr will be flexible enough to handle it, but you won't know that until Christmas, I would guess. I'm still finding new things I dislike about Mercurial 20 months after our conversion.... ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Bidi support 2009-08-22 5:39 ` Stephen J. Turnbull @ 2009-08-22 7:31 ` Eli Zaretskii 2009-08-24 1:45 ` Kenichi Handa 1 sibling, 0 replies; 119+ messages in thread From: Eli Zaretskii @ 2009-08-22 7:31 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: cyd, monnier, emacs-devel > From: "Stephen J. Turnbull" <stephen@xemacs.org> > Cc: Eli Zaretskii <eliz@gnu.org>, > cyd@stupidchicken.com, > emacs-devel@gnu.org > Date: Sat, 22 Aug 2009 14:39:59 +0900 > > I know how you feel, but doing things this way is either going to be a > lot of work (synching "short-term changes" from the trunk to the > long-term branch -- in my experience, people work on features like > unicode and bidi in spurts, and they're pervasive changes so conflicts > are frequent if you come back every month or so), or discourage work > on the long-term branch (conflict resolution is enthusiasm-draining). I merge once a week, for this very reason. So far, no conflicts, since the changes are limited to the display engine, where no active development happened for quite some time. Also, all the changes for now have the form if (!bidi) { old code } else { new code } and changes only happen in the `else' branch. This decreases the probability of conflicts even more (and also avoids destabilizing the stable code of yore). Once the infrastructure part is over, and people start changing Lisp packages, then yes, I guess the probability of conflict will soar. And as I wrote, I also think the trunk should be Emacs 24, while 23.x should be on a branch. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Bidi support 2009-08-22 5:39 ` Stephen J. Turnbull 2009-08-22 7:31 ` Eli Zaretskii @ 2009-08-24 1:45 ` Kenichi Handa 2009-08-24 3:12 ` Eli Zaretskii 2009-08-24 3:25 ` Stephen J. Turnbull 1 sibling, 2 replies; 119+ messages in thread From: Kenichi Handa @ 2009-08-24 1:45 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: cyd, eliz, monnier, emacs-devel In article <87bpm84dao.fsf@uwakimon.sk.tsukuba.ac.jp>, "Stephen J. Turnbull" <stephen@xemacs.org> writes: > Stefan Monnier writes: > Agreed. The more I think about it, the more I think we need to open > a new branch for "what will become emacs-24". > That's what I thought, too, about XEmacs 21.5. I was wrong. We ended > up having to uproot the trunk and move it to a branch, and graft the > 21.5 branch back as the trunk. Long-term development belongs either > on the trunk, or in feature branches. Not on a long-term development > branch which collects several features. Having a separate branch has at least one merit. As long as it is branched from a fairly stable version, a person working on the branch can be sure that any problem in that branch is caused by his change. For the case of bidi, if there's a plan for another big change in the display engine, the above merit is big. With a separate branch, people working on bidi code don't have to be bothered by bugs from that other change. Otherwise, I think having bidi code in the trunk is better. By the way, the case of emacs-unicode is very special. It simply can't be in the trunk while developing because the new unicode feature can't be toggled. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Bidi support 2009-08-24 1:45 ` Kenichi Handa @ 2009-08-24 3:12 ` Eli Zaretskii 2009-08-24 7:17 ` Kenichi Handa 2009-08-24 3:25 ` Stephen J. Turnbull 1 sibling, 1 reply; 119+ messages in thread From: Eli Zaretskii @ 2009-08-24 3:12 UTC (permalink / raw) To: Kenichi Handa; +Cc: cyd, stephen, monnier, emacs-devel > From: Kenichi Handa <handa@m17n.org> > CC: monnier@iro.umontreal.ca, eliz@gnu.org, cyd@stupidchicken.com, > emacs-devel@gnu.org > Date: Mon, 24 Aug 2009 10:45:05 +0900 > > By the way, the case of emacs-unicode is very special. It > simply can't be in the trunk while developing because the > new unicode feature can't be toggled. Exactly, and that's precisely why the bidi development _can_ be done on the trunk without the burden of distinguishing its bugs from the others: toggle the feature off, and if the bug persists, it's not from bidi. OTOH, the HUGE advantage of working on the trunk is that you don't need to merge all the time, or risk falling out of sync. You also don't inherit random bugs that existed at the time of the branch creation. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Bidi support 2009-08-24 3:12 ` Eli Zaretskii @ 2009-08-24 7:17 ` Kenichi Handa 0 siblings, 0 replies; 119+ messages in thread From: Kenichi Handa @ 2009-08-24 7:17 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cyd, emacs-devel, stephen, monnier In article <83y6p9ewgj.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > > By the way, the case of emacs-unicode is very special. It > > simply can't be in the trunk while developing because the > > new unicode feature can't be toggled. > Exactly, and that's precisely why the bidi development _can_ be done > on the trunk without the burden of distinguishing its bugs from the > others: toggle the feature off, and if the bug persists, it's not from > bidi. Even if that bug is not from bidi, as far as bidi support suffers from it, the work for bidi support must be suspended until that bug is fixed. For instance, please consider the situation that face-handling code gets unstable at some point. If we are working on a bug that bidi-text can't be displayed by a correct face (just a hypothetical bug), that work must be suspended. > OTOH, the HUGE advantage of working on the trunk is that you don't > need to merge all the time, or risk falling out of sync. You also > don't inherit random bugs that existed at the time of the branch > creation. I understand that merit too. I even tend to agree on having the bidi code in the trunk. I just wanted to point out the possibility of the above demerit. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Bidi support 2009-08-24 1:45 ` Kenichi Handa 2009-08-24 3:12 ` Eli Zaretskii @ 2009-08-24 3:25 ` Stephen J. Turnbull 1 sibling, 0 replies; 119+ messages in thread From: Stephen J. Turnbull @ 2009-08-24 3:25 UTC (permalink / raw) To: Kenichi Handa; +Cc: cyd, eliz, monnier, emacs-devel Kenichi Handa writes: > In article <87bpm84dao.fsf@uwakimon.sk.tsukuba.ac.jp>, "Stephen J. Turnbull" <stephen@xemacs.org> writes: > > > Stefan Monnier writes: > > Agreed. The more I think about it, the more I think we need to open > > a new branch for "what will become emacs-24". > > > That's what I thought, too, about XEmacs 21.5. I was wrong. We ended > > up having to uproot the trunk and move it to a branch, and graft the > > 21.5 branch back as the trunk. Long-term development belongs either > > on the trunk, or in feature branches. Not on a long-term development > > branch which collects several features. > Having a separate branch has at least one merit. As far as > it is branched from a fairly stable version, What you describe is what I mean by "feature branch". Feature branches are a tried and true way to work; I do not mean to say "don't use feature branches". > For the case of bidi, if there's a plan of another big > change in the display engine, the above merit is big. bidi already has a branch or a repository or something. It only needs to be canonized as "accepted in principle" for v24, and given an official URL. My understanding of what Stefan proposed is something different: that as the maintainers decide that some features are important to add, they be merged to the "for v24" branch: bidi with lexbind with .... > By the way, the case of emacs-unicode is very special. It > simply can't be in the trunk while developing because the > new unicode feature can't be toggled. By "can't", I guess you mean you didn't design it to be toggled? 
Ben Wing worked out how to have toggle-able buffer formats in 2002 (then disappeared from XEmacs, unfortunately, but the infrastructure is present). I'm not saying it's a good idea, but it's possible. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-18 19:23 ` Stefan Monnier 2009-08-18 20:01 ` Chong Yidong @ 2009-08-19 0:57 ` Juri Linkov 2009-08-20 3:42 ` Richard Stallman 2009-08-19 22:49 ` joakim 2 siblings, 1 reply; 119+ messages in thread From: Juri Linkov @ 2009-08-19 0:57 UTC (permalink / raw) To: Stefan Monnier; +Cc: joakim, Emacs Development >> There are some operations in Emacs which try to do the same thing as >> the "libmagic" library, which is the core of the "file" utility, does. >> >> For instance, in "image.el" there is functionality to look at magic >> numbers in image files. image.el doesn't recognize some rare JPEG formats, so libmagic will be useful here. >> Also, I often wish that files would open in Emacs with correct mode >> more often when there is no file extension. libmagic could supplement `magic-mode-alist' and `magic-fallback-mode-alist'. > I think it's a good idea. It may require some non-trivial changes on > the Lisp side, since libmagic's information is not quite the same as > what Emacs currently uses: we'll probably want to use libmagic to get > a MIME-type and then have a table mapping mime-types to major modes or > some such. gnus/mailcap.el contains a table mapping MIME-types to major modes. -- Juri Linkov http://www.jurta.org/emacs/ ^ permalink raw reply [flat|nested] 119+ messages in thread
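The two-stage lookup Stefan suggests (first obtain a MIME type, then map it to a major mode) might look roughly like the sketch below. It is written in Python purely for illustration and is not Emacs code. The table contents and the names `MIME_TO_MODE` and `mode_for_mime` are hypothetical; only the mode names `archive-mode` and `image-mode` come from this thread.

```python
# Hypothetical table: exact MIME types, plus "type/*" wildcard entries.
MIME_TO_MODE = {
    "application/zip": "archive-mode",
    "image/*": "image-mode",
}


def mode_for_mime(mime_type, table=MIME_TO_MODE):
    """Return the mode registered for MIME_TYPE, or None if there is none."""
    # An exact entry wins over a wildcard entry.
    if mime_type in table:
        return table[mime_type]
    # Otherwise fall back to the wildcard for the top-level type, if any.
    major = mime_type.split("/", 1)[0]
    return table.get(major + "/*")
```

Keeping the detection step separate from the table means the same table could in principle be fed by libmagic, by the `file' program, or by a Lisp fallback.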
* Re: using libmagic in Emacs? 2009-08-19 0:57 ` using libmagic in Emacs? Juri Linkov @ 2009-08-20 3:42 ` Richard Stallman 2009-08-22 23:36 ` Juri Linkov 0 siblings, 1 reply; 119+ messages in thread From: Richard Stallman @ 2009-08-20 3:42 UTC (permalink / raw) To: Juri Linkov; +Cc: monnier, joakim, emacs-devel >> For instance, in "image.el" there is functionality to look at magic >> numbers in image files. image.el doesn't recognize some rare JPEG formats, so libmagic will be useful here. Extending image.el might be a lot easier than other solutions. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-20 3:42 ` Richard Stallman @ 2009-08-22 23:36 ` Juri Linkov 2009-08-24 0:07 ` Richard Stallman 0 siblings, 1 reply; 119+ messages in thread From: Juri Linkov @ 2009-08-22 23:36 UTC (permalink / raw) To: rms; +Cc: monnier, joakim, emacs-devel

>     >> For instance, in "image.el" there is functionality to look at
>     >> magic numbers in image files.
>
>     image.el doesn't recognize some rare JPEG formats, so libmagic
>     will be useful here.
>
> Extending image.el might be a lot easier than other solutions.

The problem is that `image-jpeg-p' in image.el refuses to accept non-JFIF JPEG image files, whereas Emacs can correctly display them when the tests in `image-jpeg-p' are ignored.

Using libmagic means looking only at the first 2 bytes, 0xffd8 (the magic number of JPEG files), as described by the magic number file:

    0 beshort 0xffd8 JPEG image data

It seems this is enough to identify JPEG files. But I'm not confident about removing the additional tests from `image-jpeg-p'. We could keep the current rules in image.el as a fall-back when libmagic is not available.

-- Juri Linkov http://www.jurta.org/emacs/ ^ permalink raw reply [flat|nested] 119+ messages in thread
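The two-byte test Juri describes can be sketched as follows. This is an illustration in Python, not the code of `image-jpeg-p' or of libmagic; the names `JPEG_SOI` and `looks_like_jpeg` are made up for the example.

```python
# The two bytes matched by the magic rule "0 beshort 0xffd8".
JPEG_SOI = b"\xff\xd8"


def looks_like_jpeg(data):
    """Return True if DATA (bytes) begins with the JPEG magic number."""
    # Only the first two bytes are inspected; whatever follows is ignored.
    return data[:2] == JPEG_SOI
```

Because nothing after the second byte is examined, the check is more permissive than one that also requires a JFIF tag, which is exactly the trade-off Juri is unsure about.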
* Re: using libmagic in Emacs? 2009-08-22 23:36 ` Juri Linkov @ 2009-08-24 0:07 ` Richard Stallman 2009-08-24 0:17 ` Juri Linkov 2009-08-25 20:36 ` Juri Linkov 0 siblings, 2 replies; 119+ messages in thread From: Richard Stallman @ 2009-08-24 0:07 UTC (permalink / raw) To: Juri Linkov; +Cc: monnier, joakim, emacs-devel The problem is that `image-jpeg-p' in image.el refuses to accept non-JFIF JPEG image files whereas Emacs can correctly display them when tests in `image-jpeg-p' are ignored. Using libmagic means looking only for 2 bytes 0xffd8 (a magic number of JPEG files) as described by the magic number file: 0 beshort 0xffd8 JPEG image data It seems this is enough to determine JPEG files. But I'm not confident about removing additional tests from `image-jpeg-p'. We could keep the current rules in image.el as a fall-back when libmagic is not available. Whatever we do with the function `image-jpeg-p', we could easily make Emacs test these two bytes. It makes no sense to install code to link with libmagic just to handle that and a few other similar things. Meanwhile, for operations less common and important than visiting a file, running `file' is easy to do. Combining those two approaches seems much better than adding code to link with libmagic. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-24 0:07 ` Richard Stallman @ 2009-08-24 0:17 ` Juri Linkov 2009-08-24 7:33 ` joakim 2009-08-25 2:08 ` Richard Stallman 2009-08-25 20:36 ` Juri Linkov 1 sibling, 2 replies; 119+ messages in thread From: Juri Linkov @ 2009-08-24 0:17 UTC (permalink / raw) To: rms; +Cc: monnier, joakim, emacs-devel > Whatever we do with the function `image-jpeg-p', we could easily make > Emacs test these two bytes. It makes no sense to install code to link > with libmagic just to handle that and a few other similar things. > > Meanwhile, for operations less common and important than visiting a file, > running `file' is easy to do. > > Combining those two approaches seems much better than adding code to > link with libmagic. Of course, before adding code to link with libmagic we should analyze how useful it would be. I see its usefulness at least in the following areas: 1. Archive file types A popular way to create new archive file types nowadays is to register a new file extension for an existing data compression and archive format. For instance, Java archive files have the .jar extension but build on the ZIP file format, so they can be visited in Emacs with the help of `archive-mode'. Enterprise Java archives with the .ear extension and Web application Java archives with the .war extension are all based on the ZIP file format, as are OpenDocument files with extensions .odt .ods .odb .odp .odg .odf, Firefox add-ons (.xpi), Keyhole Markup (.kmz), and many other file types that could potentially be opened in Emacs if they were identified as archive files by libmagic. We can't track and add all new formats. This is the main task of libmagic. 2. Image file types Using ImageMagick, Emacs can support over 100 image file formats. It won't be possible to recognize all of them without libmagic. 3. MIME-types handling Emacs can process different MIME types detected by libmagic. 
Even when Emacs has no special handling for a file type, it is still useful to let Emacs run an external program associated with its MIME-type for users who prefer running programs (including GUI programs) from Emacs instead of using a window manager's application menu. -- Juri Linkov http://www.jurta.org/emacs/ ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-24 0:17 ` Juri Linkov @ 2009-08-24 7:33 ` joakim 2009-08-25 2:08 ` Richard Stallman 1 sibling, 0 replies; 119+ messages in thread From: joakim @ 2009-08-24 7:33 UTC (permalink / raw) To: Juri Linkov; +Cc: emacs-devel, rms, monnier Juri Linkov <juri@jurta.org> writes: > > Of course, before adding code to link with libmagic we should analyze > how useful it would be. I see its usefulness at least in the following > areas: > > 1. Archive file types ... > > We can't track and add all new formats. This is the main task of libmagic. > > 2. Image file types > > Using ImageMagick in Emacs can support over 100 image file formats. > It won't be possible to recognize all of them without libmagic. > > 3. MIME-types handling > > Emacs can process different MIME types detected by libmagic. > Even when Emacs has no special handling for a file type, it is > still useful to let Emacs run an external program associated > with its MIME-type for users who prefer running programs > (including GUI programs) from Emacs instead of using a window > manager's application menu. I agree completely with Juri; it is these two cases that motivate me to work on the libmagic support. There are a great many formats Emacs is able to open but not to recognize, such as all the different Java archive formats Juri mentions. There are compressed image formats like SVG. Also, if we decide to merge the imagemagick patch, hundreds of new image file formats will be supported, which libmagic will help identify. Notice that I do not propose to replace Emacs's current file recognition, only to expand it when libmagic is available. We can also expand the current handling a bit to take care of a few well-known cases, such as the jpeg one. Without libmagic, Emacs will behave as today, or slightly better. With libmagic, many new file formats will open correctly. 
I honestly can't see any drawbacks with this approach, other than me consuming some list resources while developing the patch. I hope to be able to reciprocate with improved documentation for developing Emacs primitives. Let's also not forget the friendly competition from other free text editors. Emacs IMHO does the right thing with files more often than other editors; let's improve on this strength! -- Joakim Verona ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-24 0:17 ` Juri Linkov 2009-08-24 7:33 ` joakim @ 2009-08-25 2:08 ` Richard Stallman 2009-08-25 2:19 ` Miles Bader ` (3 more replies) 1 sibling, 4 replies; 119+ messages in thread From: Richard Stallman @ 2009-08-25 2:08 UTC (permalink / raw) To: Juri Linkov; +Cc: monnier, joakim, emacs-devel For instance, Java archive files have the .jar extension but build on the ZIP file format, so they can be visited in Emacs with the help of `archive-mode'. Enterprise Java archives with the .ear extension and Web application Java archives with the .war extension all are based on the ZIP file format as well as OpenDocument files with extensions .odt .ods .odb .odp .odg .odf, Firefox add-ons (.xpi), Keyhole Markup (.kmz), and many other file types that can be potentially opened in Emacs if were identified as archive files by libmagic. How hard would it be to change the code in Emacs to recognize these using the existing mechanism? Using ImageMagick in Emacs can support over 100 image file formats. It won't possible to recognize all them without libmagic. Maybe this is useful. Is there an easy way to recognize files that could be passed to ImageMagick? Emacs can process different MIME-type detected by libmagic. That is not useful for visiting files in Emacs, since Emacs has no special handling for many of these mime types. If some other Lisp code is interested in the mime type of a file, there is a much easier way to find it out: run `file'. To complicate Emacs with another library just to make that operation a little faster is a step in the wrong direction. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-25 2:08 ` Richard Stallman @ 2009-08-25 2:19 ` Miles Bader 2009-08-25 5:09 ` joakim ` (2 more replies) 2009-08-25 17:36 ` Stefan Monnier ` (2 subsequent siblings) 3 siblings, 3 replies; 119+ messages in thread From: Miles Bader @ 2009-08-25 2:19 UTC (permalink / raw) To: rms; +Cc: Juri Linkov, monnier, joakim, emacs-devel

Richard Stallman <rms@gnu.org> writes:
> If some other Lisp code is interested in the mime type of a file,
> there is a much easier way to find it out: run `file'.

Well, not entirely trivial of course, since Emacs then needs to interpret the results from the file command.

> To complicate Emacs with another library just to make that operation
> a little faster is a step in the wrong direction.

I wonder how hard it would be to have some elisp that understands the "magic" rules that the file command uses... at first glance, they don't seem particularly complex; e.g., this is the first entry from "/usr/share/file/magic":

    0 lelong 0xc3cbc6c5 RISC OS Chunk data
    >12 string OBJ_ \b, AOF object
    >12 string LIB_ \b, ALF library

If there were such elisp code, Emacs could use any magic rules file on the system, and in addition could distribute its own (perhaps smaller) set.

-Miles -- 自らを空にして、心を開く時、道は開かれる ^ permalink raw reply [flat|nested] 119+ messages in thread
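To give a feel for how much work Miles's idea involves, here is a minimal sketch of an interpreter for a tiny subset of the text-format magic rules. It is written in Python for illustration only, and it is neither elisp nor libmagic's actual algorithm. It assumes whitespace-separated fields and handles only equality tests for the `string`, `beshort`, `leshort`, `belong` and `lelong` types, `>` continuation levels, and a leading `\b` in the message. All function names are hypothetical, and the real format documented in magic(5) has many more types, operators and escapes.

```python
import struct

# struct formats for the few numeric magic types this sketch understands.
NUMERIC = {"beshort": ">H", "leshort": "<H", "belong": ">I", "lelong": "<I"}

# Rules quoted in this thread; raw strings keep the literal backslash in "\b".
RULES = [
    r"0 lelong 0xc3cbc6c5 RISC OS Chunk data",
    r">12 string OBJ_ \b, AOF object",
    r">12 string LIB_ \b, ALF library",
    r"0 beshort 0xffd8 JPEG image data",
]


def parse_rule(line):
    """Split one rule line into (level, offset, type, test, message)."""
    parts = line.split(None, 3)
    parts += [""] * (4 - len(parts))      # the message field may be absent
    offset, kind, test, message = parts
    level = len(offset) - len(offset.lstrip(">"))
    return level, int(offset.lstrip(">"), 0), kind, test, message


def rule_matches(data, offset, kind, test):
    """Return True if the bytes of DATA at OFFSET equal TEST."""
    if kind == "string":
        want = test.encode("latin-1")
        return data[offset:offset + len(want)] == want
    fmt = NUMERIC[kind]                   # KeyError for unsupported types
    if len(data) < offset + struct.calcsize(fmt):
        return False
    (value,) = struct.unpack_from(fmt, data, offset)
    return value == int(test, 0)


def describe(data, rule_lines=RULES):
    """Return the description built from the first matching entry, or None."""
    out = ""
    depth = -1                            # deepest level matched so far
    for line in rule_lines:
        level, offset, kind, test, message = parse_rule(line)
        if level == 0 and out:
            break                         # an earlier top-level rule matched
        if level > depth + 1:
            continue                      # parent rule did not match
        if rule_matches(data, offset, kind, test):
            depth = level
            if message.startswith("\\b"):
                out += message[2:]        # "\b" suppresses the joining space
            elif message:
                out += (" " if out else "") + message
        else:
            depth = min(depth, level - 1)
    return out or None
```

The core loop is short, which supports Miles's first impression, but most of the effort in a real implementation would go into the parts this sketch leaves out.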
* Re: using libmagic in Emacs? 2009-08-25 2:19 ` Miles Bader @ 2009-08-25 5:09 ` joakim 2009-08-25 13:27 ` James Cloos 2009-08-25 21:41 ` Thien-Thi Nguyen 2 siblings, 0 replies; 119+ messages in thread From: joakim @ 2009-08-25 5:09 UTC (permalink / raw) To: Miles Bader; +Cc: Juri Linkov, emacs-devel, rms, monnier Miles Bader <miles@gnu.org> writes: > Richard Stallman <rms@gnu.org> writes: >> If some other Lisp code is interested in the mime type of a file, >> there is a much easier way to find it out: run `file'. > > Well, not entirely trivial of course, since Emacs then needs to > interpret the results from the file command. > >> To complicate Emacs with another library just to make that operation >> a little faster is a step in the wrong direction. > > I wonder how hard it would be to have some elisp that understands the > "magic" rules that the file command uses... at first glance, they don't > seem particularly complex; e.g., this is the first entry from > "/usr/share/file/magic": > > 0 lelong 0xc3cbc6c5 RISC OS Chunk data > >12 string OBJ_ \b, AOF object > >12 string LIB_ \b, ALF library > > If there was such elisp code, Emacs could use any magic rules file on > the system, and in addition, could distribute it's own (perhaps smaller) > set. I agree that the main benefit of using libmagic is getting access to the libmagic database. > -Miles -- Joakim Verona ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-25 2:19 ` Miles Bader 2009-08-25 5:09 ` joakim @ 2009-08-25 13:27 ` James Cloos 2009-08-25 21:41 ` Thien-Thi Nguyen 2 siblings, 0 replies; 119+ messages in thread From: James Cloos @ 2009-08-25 13:27 UTC (permalink / raw) To: emacs-devel; +Cc: Juri Linkov, monnier, rms, joakim, Miles Bader >>>>> "Miles" == Miles Bader <miles@gnu.org> writes: Miles> I wonder how hard it would be to have some elisp that understands Miles> the "magic" rules that the file command uses... It shouldn't be too hard, but note that the current versions of file do not install the text-format magic file, but rather a compiled .mgc file instead. The format of the text file is documented in the magic(4) man page. (Which ought to be in section 5....) I don't see any documentation of the .mgc files, but there may be in the src. -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6 ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-25 2:19 ` Miles Bader 2009-08-25 5:09 ` joakim 2009-08-25 13:27 ` James Cloos @ 2009-08-25 21:41 ` Thien-Thi Nguyen 2 siblings, 0 replies; 119+ messages in thread From: Thien-Thi Nguyen @ 2009-08-25 21:41 UTC (permalink / raw) To: emacs-devel () Miles Bader <miles@gnu.org> () Tue, 25 Aug 2009 11:19:37 +0900 If there was such elisp code, Emacs could use any magic rules file on the system, and in addition, could distribute it's own (perhaps smaller) set. Some proof-of-concept grade Scheme code that (re)processes the magic rules format into sexps, and also does the non-char-encoding side of file(1) is at [0]. Converted rules (v1, v2) are at [1]. (Aside) I think the best candidate for libfoo integration with Emacs would be libguile. In contrast, when i read of other libfoo proposals, i can't help but feel somewhat deflated. thi _______________________________________________ [0] http://www.gnuvola.org/software/ttn-do/ (see file magic.scm in the tarball) [1] http://www.gnuvola.org/data/index.html (see entry "de-uglified magic file") ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-25 2:08 ` Richard Stallman 2009-08-25 2:19 ` Miles Bader @ 2009-08-25 17:36 ` Stefan Monnier 2009-08-25 20:37 ` Juri Linkov 2009-08-29 23:19 ` Juri Linkov 3 siblings, 0 replies; 119+ messages in thread From: Stefan Monnier @ 2009-08-25 17:36 UTC (permalink / raw) To: rms; +Cc: Juri Linkov, joakim, emacs-devel > If some other Lisp code is interested in the mime type of a file, > there is a much easier way to find it out: run `file'. Actually, running `file' might be a useful fallback (especially if run via process-file so it also works on remote files, contrary to libmagic). Also, looking at the way the magic database works, it seems that we might want to run `file-type-by-magic' on a buffer's content as well. As for Miles's suggestion to interpret the magic file directly, I think it's a bad idea: let's not reinvent the wheel, the code that parses this database and uses it already exists, it's in libmagic, so we should just use it. Stefan ^ permalink raw reply [flat|nested] 119+ messages in thread
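The `file' fallback Stefan mentions amounts to running the program and reading one line of output. A minimal sketch, in Python rather than elisp and therefore without the remote-file handling that `process-file' would give, could look like this. The function name `mime_type_via_file` is hypothetical, it assumes a `file` implementation that accepts the `--brief` and `--mime-type` options, and error handling is deliberately minimal.

```python
import shutil
import subprocess


def mime_type_via_file(path):
    """Ask file(1) for the MIME type of PATH; return None if that fails."""
    if shutil.which("file") is None:
        return None                       # no `file' program on this system
    result = subprocess.run(
        # "--" keeps a PATH that starts with "-" from being read as an option.
        ["file", "--brief", "--mime-type", "--", path],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None
```

Asking for the MIME type rather than the free-form description sidesteps most of the output parsing Miles was worried about.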
* Re: using libmagic in Emacs?
From: Juri Linkov @ 2009-08-25 20:37 UTC
To: rms
Cc: monnier, joakim, emacs-devel

>     For instance, Java archive files have the .jar extension but build on
>     the ZIP file format, so they can be visited in Emacs with the help of
>     `archive-mode'. Enterprise Java archives with the .ear extension
>     and Web application Java archives with the .war extension all are
>     based on the ZIP file format as well as OpenDocument files with
>     extensions .odt .ods .odb .odp .odg .odf, Firefox add-ons (.xpi),
>     Keyhole Markup (.kmz), and many other file types that can be potentially
>     opened in Emacs if they were identified as archive files by libmagic.
>
> How hard would it be to change the code in Emacs to recognize these
> using the existing mechanism?

Not hard at all. For the ZIP file format it's just one line in
`magic-fallback-mode-alist'.

Unlike `image-type-auto-detected-p' from the preloaded image.el,
we can't use `archive-find-type' in `magic-fallback-mode-alist'
because arc-mode.el is not preloaded. So we have to copy archive
magic numbers manually from arc-mode.el to `magic-fallback-mode-alist'.

Index: lisp/files.el
===================================================================
RCS file: /sources/emacs/emacs/lisp/files.el,v
retrieving revision 1.1069
diff -u -r1.1069 files.el
--- lisp/files.el 17 Aug 2009 23:40:22 -0000 1.1069
+++ lisp/files.el 25 Aug 2009 20:36:28 -0000
@@ -2399,6 +2399,7 @@
 
 (defvar magic-fallback-mode-alist
   `((image-type-auto-detected-p . image-mode)
+    ("\\(PK00\\)?[P]K\003\004" . archive-mode) ; zip
     ;; The < comes before the groups (but the first) to reduce backtracking.
     ;; TODO: UTF-16 <?xml may be preceded by a BOM 0xff 0xfe or 0xfe 0xff.
     ;; We use [ \t\r\n] instead of `\\s ' to make regex overflow less likely.

-- 
Juri Linkov
http://www.jurta.org/emacs/

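For readers who want to see the byte-level test that the one-line regexp above encodes, here is a minimal sketch in C. It is not part of the patch: the function name `looks_like_zip' is hypothetical, and the code only assumes what the regexp itself says, namely an optional "PK00" prefix followed by the local-file-header signature "PK\003\004" at the very start of the data.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the first bytes of BUF look like a ZIP archive: the
   signature "PK\003\004", optionally preceded by the 4-byte "PK00"
   marker, as in the regexp "\\(PK00\\)?[P]K\003\004".
   The function name is hypothetical, chosen for this sketch.  */
int
looks_like_zip (const unsigned char *buf, size_t len)
{
  static const unsigned char sig[4] = { 'P', 'K', 0x03, 0x04 };

  /* Skip the optional "PK00" prefix only if a full signature can follow.  */
  if (len >= 8 && memcmp (buf, "PK00", 4) == 0)
    {
      buf += 4;
      len -= 4;
    }
  return len >= 4 && memcmp (buf, sig, 4) == 0;
}
```

The same check would cover .jar, .war, .odt and the other ZIP-based formats listed above, since they share the container signature.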
* Re: using libmagic in Emacs?
From: Juri Linkov @ 2009-08-29 23:19 UTC
To: rms
Cc: monnier, joakim, emacs-devel

> If some other Lisp code is interested in the mime type of a file,
> there is a much easier way to find it out: run `file'.

I agree that running `file' is a simpler solution. So I propose
at least the following patch to add content-based MIME-type
identification to one particular function `mailcap-file-default-commands'
that needs this when filename extension-based identification fails:

Index: lisp/gnus/mailcap.el
===================================================================
RCS file: /sources/emacs/emacs/lisp/gnus/mailcap.el,v
retrieving revision 1.26
diff -c -r1.26 mailcap.el
*** lisp/gnus/mailcap.el 5 Jan 2009 03:22:07 -0000 1.26
--- lisp/gnus/mailcap.el 29 Aug 2009 23:18:45 -0000
***************
*** 1030,1037 ****
            ;; All unique MIME types from file extensions
            (mailcap-delete-duplicates
             (mapcar (lambda (file)
!                      (mailcap-extension-to-mime
!                       (file-name-extension file t)))
                     files)))
           (all-mime-info
            ;; All MIME info lists
--- 1030,1042 ----
            ;; All unique MIME types from file extensions
            (mailcap-delete-duplicates
             (mapcar (lambda (file)
!                      (or (mailcap-extension-to-mime
!                           (file-name-extension file t))
!                          (replace-regexp-in-string
!                           ".*:\\s-*\\(.*\\)\\s-*" "\\1"
!                           (shell-command-to-string
!                            (concat "file --mime "
!                                    (shell-quote-argument file))))))
                     files)))
           (all-mime-info
            ;; All MIME info lists

-- 
Juri Linkov
http://www.jurta.org/emacs/

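The patch above relies on splitting a line of `file --mime' output of the shape "name: type; charset=...". A minimal sketch of that parsing step in C follows. It is an illustration, not the patch's method: the function name is hypothetical, and unlike the regexp in the patch it drops the "; charset=..." parameter and rejects results that do not look like a MIME type (such as a "cannot open" message).

```c
#include <stddef.h>
#include <string.h>

/* Copy the MIME type from one line of `file --mime' output, e.g.
   "/tmp/tst.xcf: application/octet-stream; charset=binary", into OUT.
   Return 1 on success, 0 if LINE does not have a "name: type" shape,
   the result does not look like a MIME type, or OUT is too small.
   Name and signature are hypothetical.  */
int
mime_type_from_file_output (const char *line, char *out, size_t outsize)
{
  /* A MIME type contains no colon, so the last colon ends the name.  */
  const char *colon = strrchr (line, ':');
  const char *start, *end;
  size_t n;

  if (colon == NULL)
    return 0;
  start = colon + 1;
  while (*start == ' ' || *start == '\t')
    start++;
  /* The type ends at the ";" introducing parameters, or at end of line.  */
  end = start + strcspn (start, ";\r\n");
  while (end > start && (end[-1] == ' ' || end[-1] == '\t'))
    end--;
  n = (size_t) (end - start);
  if (n == 0 || n >= outsize)
    return 0;
  /* Reject error messages: a type has a "/" and no embedded spaces.  */
  if (memchr (start, '/', n) == NULL || memchr (start, ' ', n) != NULL)
    return 0;
  memcpy (out, start, n);
  out[n] = '\0';
  return 1;
}
```

Returning 0 for anything that is not a plausible type gives the caller an explicit signal to fall back to extension-based identification.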
* Re: using libmagic in Emacs?
From: Eli Zaretskii @ 2009-08-30 3:09 UTC
To: Juri Linkov
Cc: joakim, emacs-devel

> From: Juri Linkov <juri@jurta.org>
> Date: Sun, 30 Aug 2009 02:19:12 +0300
> Cc: monnier@IRO.UMontreal.CA, joakim@verona.se, emacs-devel@gnu.org
>
> I agree that running `file' is a simpler solution.

PLEASE do not base Emacs infrastructure on external programs, unless
they come with Emacs. `file' is not available on every platform, and
even on those it is, the quality and extent of its database is unclear
and so cannot be relied upon.

I really don't understand why linking against a simple free library is
an issue, but if it is, we should find a different solution using some
database internal to Emacs, as we did until now.

In any case, invoking external programs without being smart about
their non-existence is not something we should have in Emacs.

* Re: using libmagic in Emacs?
From: Juri Linkov @ 2009-08-30 20:54 UTC
To: Eli Zaretskii
Cc: joakim, emacs-devel

>> I agree that running `file' is a simpler solution.
>
> PLEASE do not base Emacs infrastructure on external programs, unless
> they come with Emacs.

There are many features in Emacs that depend on external programs,
e.g. `ls' for dired, `find' and `grep', `man', `ispell', `shell',
`gdb', `diff', VCS tools, etc.

> `file' is not available on every platform, and even on those it is,

When `libmagic' is available, then usually `file' is available as well
on the same platform.

> the quality and extent of its database is unclear and so cannot be
> relied upon.
>
> I really don't understand why linking against a simple free library is
> an issue, but if it is, we should find a different solution using some
> database internal to Emacs, as we did until now.

If someone thinks adding the magic number database to Emacs is
important, then fine, let's do it. But this doesn't preclude
using `file' since we already have many such pairs of external
programs and their emulations for the case when an external program
is not available, e.g. `ls' and ls-lisp, `grep' and multi-occur,
`shell' and eshell, etc.

> In any case, invoking external programs without being smart about
> their non-existence is not something we should have in Emacs.

My patch fails gracefully when `file' is not available, I tried
removing `file' without any problem. The function just returns nil.

-- 
Juri Linkov
http://www.jurta.org/emacs/

* Re: using libmagic in Emacs?
From: Eli Zaretskii @ 2009-08-31 2:49 UTC
To: Juri Linkov
Cc: joakim, emacs-devel

> From: Juri Linkov <juri@jurta.org>
> Cc: emacs-devel@gnu.org, joakim@verona.se
> Date: Sun, 30 Aug 2009 23:54:34 +0300
>
> >> I agree that running `file' is a simpler solution.
> >
> > PLEASE do not base Emacs infrastructure on external programs, unless
> > they come with Emacs.
>
> There are many features in Emacs that depend on external programs,
> e.g. `ls' for dired, `find' and `grep', `man', `ispell', `shell',
> `gdb', `diff', VCS tools, etc.

I said infrastructure, not features. Of those you mentioned only `ls'
is used in infrastructure, and that's precisely why we had to write
ls-lisp.el.

> > `file' is not available on every platform, and even on those it is,
>
> When `libmagic' is available, then usually `file' is available as well
> on the same platform.

No. libmagic can be bundled and compiled with Emacs. `file' needs to
be present.

> > In any case, invoking external programs without being smart about
> > their non-existence is not something we should have in Emacs.
>
> My patch fails gracefully when `file' is not available, I tried
> removing `file' without any problem. The function just returns nil.

And that is graceful how?

* Re: using libmagic in Emacs?
From: Juri Linkov @ 2009-08-31 16:17 UTC
To: Eli Zaretskii
Cc: joakim, emacs-devel

>> > PLEASE do not base Emacs infrastructure on external programs, unless
>> > they come with Emacs.
>>
>> There are many features in Emacs that depend on external programs,
>> e.g. `ls' for dired, `find' and `grep', `man', `ispell', `shell',
>> `gdb', `diff', VCS tools, etc.
>
> I said infrastructure, not features. Of those you mentioned only `ls'
> is used in infrastructure, and that's precisely why we had to write
> ls-lisp.el.

`mailcap-file-default-commands' is not infrastructure. It's one
small feature.

>> My patch fails gracefully when `file' is not available, I tried
>> removing `file' without any problem. The function just returns nil.
>
> And that is graceful how?

When it returns nil, then default commands are the same as provided
by mailcap for plain text files, i.e. `less', `konqueror', etc.

-- 
Juri Linkov
http://www.jurta.org/emacs/

* Re: using libmagic in Emacs?
From: Eli Zaretskii @ 2009-08-31 17:58 UTC
To: Juri Linkov
Cc: joakim, emacs-devel

> From: Juri Linkov <juri@jurta.org>
> Cc: emacs-devel@gnu.org, joakim@verona.se
> Date: Mon, 31 Aug 2009 19:17:52 +0300
>
> >> > PLEASE do not base Emacs infrastructure on external programs, unless
> >> > they come with Emacs.
> >>
> >> There are many features in Emacs that depend on external programs,
> >> e.g. `ls' for dired, `find' and `grep', `man', `ispell', `shell',
> >> `gdb', `diff', VCS tools, etc.
> >
> > I said infrastructure, not features. Of those you mentioned only `ls'
> > is used in infrastructure, and that's precisely why we had to write
> > ls-lisp.el.
>
> `mailcap-file-default-commands' is not infrastructure. It's one
> small feature.

A feature is something that stands on its own. I don't think
`mailcap-file-default-commands' qualifies, nor would any other
function that returns something related to a file's type. A feature
would _use_ these to actually do something useful with the file.

> >> My patch fails gracefully when `file' is not available, I tried
> >> removing `file' without any problem. The function just returns nil.
> >
> > And that is graceful how?
>
> When it returns nil, then default commands are the same as provided
> by mailcap for plain text files, i.e. `less', `konqueror', etc.

Which is clearly un-graceful in my book.

* Re: using libmagic in Emacs?
From: Richard Stallman @ 2009-09-01 12:16 UTC
To: Eli Zaretskii
Cc: juri, joakim, emacs-devel

    > `mailcap-file-default-commands' is not infrastructure. It's one
    > small feature.

    A feature is something that stands on its own. I don't think
    `mailcap-file-default-commands' qualifies, nor would any other
    function that returns something related to a file's type. A feature
    would _use_ these to actually do something useful with the file.

You're right; but the point is that this is used in very specific
features, not in something general and basic such as find-file.

* Re: using libmagic in Emacs?
From: Stefan Monnier @ 2009-09-01 16:12 UTC
To: rms
Cc: juri, Eli Zaretskii, joakim, emacs-devel

> You're right; but the point is that this is used in very specific
> features, not in something general and basic such as find-file.

My intended use of libmagic is as a sidekick to magic-mode-alist, so
very much used by find-file.

        Stefan

* Re: using libmagic in Emacs?
From: Richard Stallman @ 2009-09-01 21:20 UTC
To: Stefan Monnier
Cc: juri, eliz, joakim, emacs-devel

    > You're right; but the point is that this is used in very specific
    > features, not in something general and basic such as find-file.

    My intended use of libmagic is as a sidekick to magic-mode-alist, so
    very much used by find-file.

Depending on libmagic is too high a price to pay for saving a small
amount of work. It's better to do the small amount of work needed to
recognize the few file types we want to recognize, using the existing
mechanisms.

* Re: using libmagic in Emacs?
From: Stefan Monnier @ 2009-09-03 19:42 UTC
To: rms
Cc: juri, eliz, joakim, emacs-devel

>> You're right; but the point is that this is used in very specific
>> features, not in something general and basic such as find-file.

>     My intended use of libmagic is as a sidekick to magic-mode-alist, so
>     very much used by find-file.

> Depending on libmagic is too high a price to pay for saving a small
> amount of work. It's better to do the small amount of work needed to
> recognize the few file types we want to recognize, using the existing
> mechanisms.

I'd rather let other people do this work. That's why libmagic makes
a lot of sense.

        Stefan

* Re: using libmagic in Emacs?
From: Richard Stallman @ 2009-09-04 7:52 UTC
To: Stefan Monnier
Cc: juri, eliz, joakim, emacs-devel

    > Depending on libmagic is too high a price to pay for saving a small
    > amount of work. It's better to do the small amount of work needed to
    > recognize the few file types we want to recognize, using the existing
    > mechanisms.

    I'd rather let other people do this work. That's why libmagic makes
    a lot of sense.

It's a small amount of work, and the burden of distributing and using
libmagic will be a persistent annoyance.

* Re: using libmagic in Emacs?
From: Richard Stallman @ 2009-08-31 22:21 UTC
To: Eli Zaretskii
Cc: juri, joakim, emacs-devel

    > There are many features in Emacs that depend on external programs,
    > e.g. `ls' for dired, `find' and `grep', `man', `ispell', `shell',
    > `gdb', `diff', VCS tools, etc.

    I said infrastructure, not features.

I don't expect any general infrastructure to use `file', just a few
commands. If those commands have to do their best without `file' on
some systems, it won't be a big deal.

    > When `libmagic' is available, then usually `file' is available as well
    > on the same platform.

    No. libmagic can be bundled and compiled with Emacs.

I don't think that price is worth paying for the use of libmagic.
It is not important enough.

* Re: using libmagic in Emacs?
From: Richard Stallman @ 2009-08-31 3:33 UTC
To: Eli Zaretskii
Cc: juri, joakim, emacs-devel

    PLEASE do not base Emacs infrastructure on external programs, unless
    they come with Emacs. `file' is not available on every platform, and

libmagic isn't either.

    even on those it is, the quality and extent of its database is unclear
    and so cannot be relied upon.

On systems that have libmagic, isn't it the case that file uses libmagic?

    I really don't understand why linking against a simple free library is
    an issue, but if it is, we should find a different solution using some
    database internal to Emacs, as we did until now.

It shouldn't be a big deal. Only a few specialized places should want
to run `file' on a file.

* Re: using libmagic in Emacs?
From: Chong Yidong @ 2009-08-31 15:03 UTC
To: rms
Cc: juri, Eli Zaretskii, joakim, emacs-devel

Richard Stallman <rms@gnu.org> writes:

>     PLEASE do not base Emacs infrastructure on external programs, unless
>     they come with Emacs. `file' is not available on every platform, and
>
> libmagic isn't either.
>
>     even on those it is, the quality and extent of its database is unclear
>     and so cannot be relied upon.
>
> On systems that have libmagic, isn't it the case that file uses libmagic?
>
>     I really don't understand why linking against a simple free library is
>     an issue, but if it is, we should find a different solution using some
>     database internal to Emacs, as we did until now.
>
> It shouldn't be a big deal. Only a few specialized places should want
> to run `file' on a file.

No one has yet explained why we shouldn't deal with such failures by
simply fixing Emacs' file detection, instead of relying on libmagic or
file. Given that libmagic/file isn't always going to be present, and
there's no intention of removing our file detection code, why not make
sure the latter works?

Are there any situations we are inherently incapable of detecting, and
why?

* Re: using libmagic in Emacs?
From: Juri Linkov @ 2009-08-31 16:19 UTC
To: Chong Yidong
Cc: Eli Zaretskii, rms, joakim, emacs-devel

> No one has yet explained why we shouldn't deal with such failures by
> simply fixing Emacs' file detection, instead of relying on libmagic or
> file. Given that libmagic/file isn't always going to be present, and
> there's no intention of removing our file detection code, why not make
> sure the latter works?

Sure, we need to support the existing file detection code for the case
of absent libmagic/file. This means manually copying necessary parts
of the magic database to `magic-fallback-mode-alist',
`image-type-header-regexps', `archive-find-type'.

> Are there any situations we are inherently incapable of detecting,
> and why?

One situation is when the number of formats to copy from the magic
database is too high, e.g. for all ImageMagick formats. Also the magic
database has simple rules for detecting C, Lisp and Perl programs, but
these rules are unreliable.

Another useful case for libmagic/file is getting MIME types and
mailcap commands to run on them as demonstrated by my patch. But it
seems this feature is not important enough to justify adding libmagic
to Emacs instead of running `file'.

-- 
Juri Linkov
http://www.jurta.org/emacs/

* Re: using libmagic in Emacs?
From: Stefan Monnier @ 2009-08-31 23:47 UTC
To: Chong Yidong
Cc: juri, Eli Zaretskii, rms, joakim, emacs-devel

> No one has yet explained why we shouldn't deal with such failures by
> simply fixing Emacs' file detection, instead of relying on libmagic or

Because we don't want to keep reinventing the wheel, do we?

        Stefan

* Re: using libmagic in Emacs?
From: Eli Zaretskii @ 2009-09-01 3:16 UTC
To: Stefan Monnier
Cc: juri, cyd, rms, joakim, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 31 Aug 2009 19:47:14 -0400
> Cc: juri@jurta.org, Eli Zaretskii <eliz@gnu.org>, rms@gnu.org, joakim@verona.se,
>     emacs-devel@gnu.org
>
> > No one has yet explained why we shouldn't deal with such failures by
> > simply fixing Emacs' file detection, instead of relying on libmagic or
>
> Because we don't want to keep reinventing the wheel, do we?

What wheel is that? Emacs has been consulting various internal
databases since time immemorial. We do the same with Unicode
databases as well.

* Re: using libmagic in Emacs?
From: Stefan Monnier @ 2009-09-01 5:37 UTC
To: Eli Zaretskii
Cc: juri, cyd, rms, joakim, emacs-devel

>> > No one has yet explained why we shouldn't deal with such failures by
>> > simply fixing Emacs' file detection, instead of relying on libmagic or
>> Because we don't want to keep reinventing the wheel, do we?
> What wheel is that? Emacs has been consulting various internal
> databases since time immemorial. We do the same with Unicode
> databases as well.

Yes, Emacs reinvented many wheels, and even preinvented some.
That doesn't make reinvention any more desirable.

        Stefan

* Re: using libmagic in Emacs?
From: Richard Stallman @ 2009-09-01 12:16 UTC
To: Chong Yidong
Cc: juri, eliz, joakim, emacs-devel

    > It shouldn't be a big deal. Only a few specialized places should want
    > to run `file' on a file.

    No one has yet explained why we shouldn't deal with such failures by
    simply fixing Emacs' file detection, instead of relying on libmagic or
    file.

I don't think we're talking about the same question. "Emacs' file
type detection" is used inside find-file. I'm proposing the use of
`file' in other specific features, not in find-file.

* Re: using libmagic in Emacs?
From: Juri Linkov @ 2009-08-25 20:36 UTC
To: rms
Cc: monnier, joakim, emacs-devel

>     The problem is that `image-jpeg-p' in image.el refuses to accept
>     non-JFIF JPEG image files whereas Emacs can correctly display them
>     when tests in `image-jpeg-p' are ignored.
>
>     Using libmagic means looking only for 2 bytes 0xffd8 (a magic number
>     of JPEG files) as described by the magic number file:
>
>     0 beshort 0xffd8 JPEG image data
>
>     It seems this is enough to determine JPEG files. But I'm not confident
>     about removing additional tests from `image-jpeg-p'. We could keep the
>     current rules in image.el as a fall-back when libmagic is not available.
>
> Whatever we do with the function `image-jpeg-p', we could easily make
> Emacs test these two bytes. It makes no sense to install code to link
> with libmagic just to handle that and a few other similar things.

The following patch changes `image-type-header-regexps' to test
only two bytes of the JPEG magic number:

Index: lisp/image.el
===================================================================
RCS file: /sources/emacs/emacs/lisp/image.el,v
retrieving revision 1.87
diff -u -r1.87 image.el
--- lisp/image.el 24 Feb 2009 10:29:00 -0000 1.87
+++ lisp/image.el 25 Aug 2009 20:33:16 -0000
@@ -43,7 +43,7 @@
 static \\(unsigned \\)?char \\1_bits" . xbm)
     ("\\`\\(?:MM\0\\*\\|II\\*\0\\)" . tiff)
     ("\\`[\t\n\r ]*%!PS" . postscript)
-    ("\\`\xff\xd8" . (image-jpeg-p . jpeg))
+    ("\\`\xff\xd8" . jpeg)
     (,(let* ((incomment-re "\\(?:[^-]\\|-[^-]\\)")
              (comment-re (concat "\\(?:!--" incomment-re "*-->[ \t\r\n]*<\\)")))
         (concat "\\(?:<\\?xml[ \t\r\n]+[^>]*>\\)?[ \t\r\n]*<"

-- 
Juri Linkov
http://www.jurta.org/emacs/

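As an illustration of how little the two-byte test involves, here is a minimal sketch in C of the check that the magic entry "0 beshort 0xffd8" and the regexp "\\`\xff\xd8" both describe. The function name is hypothetical; nothing beyond the two leading bytes is examined, which is exactly why non-JFIF JPEG files pass.

```c
#include <stddef.h>

/* Return 1 if BUF starts with the JPEG start-of-image marker
   0xff 0xd8, i.e. the big-endian short 0xffd8 at offset 0.
   The function name is hypothetical.  */
int
starts_with_jpeg_soi (const unsigned char *buf, size_t len)
{
  return len >= 2 && buf[0] == 0xff && buf[1] == 0xd8;
}
```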
* Re: using libmagic in Emacs?
From: joakim @ 2009-08-19 22:49 UTC
To: Stefan Monnier
Cc: Emacs Development

Stefan Monnier <monnier@IRO.UMontreal.CA> writes:

>
> I think it's a good idea. It may require some non-trivial changes on
> the Lisp side, since libmagic's information is not quite the same as
> what Emacs currently uses: we'll probably want to use libmagic to get
> a MIME-type and then have a table mapping mime-types to major modes or
> some such.
>
>
> Stefan

I attach an early draft filemagic patch.

Some notes:

- The mime type info usually is less granular than the free
  text info:

  file --mime /tmp/tst.xcf
  /tmp/tst.xcf: application/octet-stream; charset=binary

  file /tmp/tst.xcf
  /tmp/tst.xcf: GIMP XCF image data, version 0, 640 x 480, RGB Color

  This is dependent on the file magic info file used.

- We can probably have much fun debating what the interface should look
  like at the lisp level. Any ideas?

[-- Attachment #2: filemagic1.patch --]
[-- Type: text/x-patch, Size: 5605 bytes --]

diff --git a/configure.in b/configure.in
index f4096db..cb74523 100644
--- a/configure.in
+++ b/configure.in
@@ -137,6 +137,8 @@ OPTION_DEFAULT_ON([xft],[don't use XFT for anti aliased fonts])
 OPTION_DEFAULT_ON([libotf],[don't use libotf for OpenType font support])
 OPTION_DEFAULT_ON([m17n-flt],[don't use m17n-flt for text shaping])
 
+OPTION_DEFAULT_ON([filemagic],[don't compile with filemagic support])
+
 OPTION_DEFAULT_ON([toolkit-scroll-bars],[don't use Motif or Xaw3d scroll bars])
 OPTION_DEFAULT_ON([xaw3d],[don't use Xaw3d])
 OPTION_DEFAULT_ON([xim],[don't use X11 XIM])
@@ -2223,6 +2225,19 @@ if test x"$ac_cv_func_alloca_works" != xyes; then
    AC_MSG_ERROR( [a system implementation of alloca is required] )
 fi
 
+
+HAVE_LIBMAGIC=no
+if test "${with_filemagic}" != "no"; then
+  #libmagic support
+  AC_CHECK_HEADERS(magic.h, [ AC_CHECK_LIB(magic,magic_open,HAVE_LIBMAGIC=yes) ])
+fi
+
+if test "${HAVE_LIBMAGIC}" = "yes"; then
+  LIBMAGIC=-lmagic
+  AC_SUBST(LIBMAGIC)
+  AC_DEFINE(HAVE_LIBMAGIC, 1, [Define to 1 if using libmagic.])
+fi
+
 # fmod, logb, and frexp are found in -lm on most systems.
 # On HPUX 9.01, -lm does not contain logb, so check for sqrt.
 AC_CHECK_LIB(m, sqrt)
@@ -2954,6 +2969,7 @@ echo " Does Emacs use -lpng? ${HAVE_PNG}"
 echo " Does Emacs use -lrsvg-2? ${HAVE_RSVG}"
 echo " Does Emacs use -lgpm? ${HAVE_GPM}"
 echo " Does Emacs use -ldbus? ${HAVE_DBUS}"
+echo " Does Emacs use -lmagic? ${HAVE_LIBMAGIC}"
 
 echo " Does Emacs use -lfreetype? ${HAVE_FREETYPE}"
 echo " Does Emacs use -lm17n-flt? ${HAVE_M17N_FLT}"
diff --git a/src/Makefile.in b/src/Makefile.in
index 425cf98..b80255a 100644
--- a/src/Makefile.in
+++ b/src/Makefile.in
@@ -420,6 +420,7 @@ LIBX= $(LIBXMENU) LD_SWITCH_X_SITE
 #endif /* not HAVE_LIBRESOLV */
 
 LIBSOUND= @LIBSOUND@
+LIBMAGIC= @LIBMAGIC@
 CFLAGS_SOUND= @CFLAGS_SOUND@
 
 RSVG_LIBS= @RSVG_LIBS@
@@ -511,6 +512,12 @@ MSDOS_OBJ = dosfns.o msdos.o w16select.o xmenu.o
 #endif
 #endif
 
+#ifdef HAVE_LIBMAGIC
+FILEMAGIC_OBJ = filemagic.o
+#else
+FILEMAGIC_OBJ =
+#endif
+
 #ifdef CYGWIN
 CYGWIN_OBJ = sheap.o
 #endif
@@ -551,7 +558,7 @@ obj= dispnew.o frame.o scroll.o xdisp.o menu.o $(XMENU_OBJ) window.o \
         syntax.o UNEXEC bytecode.o \
         process.o callproc.o \
         region-cache.o sound.o atimer.o \
-        doprnt.o strftime.o intervals.o textprop.o composite.o md5.o \
+        doprnt.o strftime.o intervals.o textprop.o composite.o md5.o ${FILEMAGIC_OBJ} \
         $(MSDOS_OBJ) $(NS_OBJ) $(CYGWIN_OBJ) $(FONT_DRIVERS)
 
 /* Object files used on some machine or other.
@@ -878,7 +885,7 @@ SOME_MACHINE_LISP = ../lisp/mouse.elc \
    duplicated symbols. If the standard libraries were compiled
    with GCC, we might need gnulib again after them. */
 
-LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(RSVG_LIBS) $(DBUS_LIBS) \
+LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(LIBMAGIC) $(RSVG_LIBS) $(DBUS_LIBS) \
    LIBGPM LIBRESOLV LIBS_SYSTEM LIBS_MACHINE LIBS_TERMCAP \
    LIBS_DEBUG $(GETLOADAVG_LIBS) \
    @FREETYPE_LIBS@ @FONTCONFIG_LIBS@ @LIBOTF_LIBS@ @M17N_FLT_LIBS@ \
diff --git a/src/config.in b/src/config.in
index 404e00b..c966a09 100644
--- a/src/config.in
+++ b/src/config.in
@@ -262,6 +262,9 @@ along with GNU Emacs. If not, see <http://www.gnu.org/licenses/>. */
 /* Define to 1 if you have the gpm library (-lgpm). */
 #undef HAVE_GPM
 
+/* Define to 1 if you have the filemagic library (-lmagic). */
+#undef HAVE_LIBMAGIC
+
 /* Define to 1 if you have the `grantpt' function. */
 #undef HAVE_GRANTPT
 
diff --git a/src/emacs.c b/src/emacs.c
index 657465d..03d7744 100644
--- a/src/emacs.c
+++ b/src/emacs.c
@@ -1683,6 +1683,9 @@ main (int argc, char **argv)
       syms_of_window ();
       syms_of_xdisp ();
       syms_of_font ();
+#ifdef HAVE_LIBMAGIC
+      syms_of_filemagic();
+#endif
 #ifdef HAVE_WINDOW_SYSTEM
       syms_of_fringe ();
       syms_of_image ();
diff --git a/src/filemagic.c b/src/filemagic.c
new file mode 100644
index 0000000..1dcf065
--- /dev/null
+++ b/src/filemagic.c
@@ -0,0 +1,65 @@
+#include <magic.h>
+#include <config.h>
+#include <stdio.h>
+#include <math.h>
+#include <ctype.h>
+
+#include "lisp.h"
+/*
+
+ */
+/*
+     magic_t
+     magic_open(int flags);
+
+     void
+     magic_close(magic_t cookie);
+
+     const char *
+     magic_error(magic_t cookie);
+
+     int
+     magic_errno(magic_t cookie);
+
+     const char *
+     magic_file(magic_t cookie, const char *filename);
+
+     const char *
+     magic_buffer(magic_t cookie, const void *buffer, size_t length);
+
+     int
+     magic_setflags(magic_t cookie, int flags);
+
+     int
+     magic_check(magic_t cookie, const char *filename);
+
+     int
+     magic_compile(magic_t cookie, const char *filename);
+
+     int
+     magic_load(magic_t cookie, const char *filename);
+*/
+
+
+DEFUN ("file-magic-file", Ffile_magic_file, Sfile_magic_file, 1,1,0,
+       doc: /* return libmagic file description for filename */)
+     (filename)
+     Lisp_Object filename;
+{
+  if (!STRINGP (filename)) return Qnil;
+  printf("filename:%s\n",SDATA(filename));
+  magic_t cookie= magic_open(MAGIC_MIME_TYPE);
+  magic_load(cookie,NULL);
+  printf("cookie:%d\n",cookie);
+  char *rvs=magic_file(cookie, SDATA(filename));
+  printf("rvs:%s\n",rvs);
+  Lisp_Object rv=intern(rvs);
+  magic_close(cookie);
+  return rv;
+}
+
+void
+syms_of_filemagic ()
+{
+  defsubr (&Sfile_magic_file);
+}

-- 
Joakim Verona

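Stefan's suggestion of a table mapping MIME types to major modes can be pictured with a small lookup sketch. This is only an illustration in C under stated assumptions: the struct, the function name and every table entry are hypothetical, and in Emacs such a table would more naturally be an alist on the Lisp side. Unknown or overly generic types (such as the application/octet-stream reported for the XCF file above) yield NULL so that the caller can fall back to the existing detection.

```c
#include <stddef.h>
#include <string.h>

/* One row of a hypothetical MIME-type to major-mode table.  */
struct mime_mode
{
  const char *mime;
  const char *mode;
};

/* Illustrative entries only; a real table would be larger.  */
static const struct mime_mode mime_mode_table[] = {
  { "application/zip", "archive-mode" },
  { "image/jpeg", "image-mode" },
  { "image/png", "image-mode" },
  { "text/x-c", "c-mode" },
};

/* Return the mode name for MIME, or NULL when the type is not in
   the table, so the caller can fall back to other detection.  */
const char *
mode_for_mime_type (const char *mime)
{
  size_t i;

  for (i = 0; i < sizeof mime_mode_table / sizeof mime_mode_table[0]; i++)
    if (strcmp (mime, mime_mode_table[i].mime) == 0)
      return mime_mode_table[i].mode;
  return NULL;
}
```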
* Re: using libmagic in Emacs? 2009-08-19 22:49 ` joakim @ 2009-08-19 23:20 ` Dan Nicolaescu 2009-08-20 1:03 ` Stephen J. Turnbull ` (2 subsequent siblings) 3 siblings, 0 replies; 119+ messages in thread From: Dan Nicolaescu @ 2009-08-19 23:20 UTC (permalink / raw) To: joakim; +Cc: Stefan Monnier, Emacs Development joakim@verona.se writes: > Stefan Monnier <monnier@IRO.UMontreal.CA> writes: > > > > I think it's a good idea. It may require some non-trivial changes on > > the Lisp side, since libmagic's information is not quite the same as > > what Emacs currently uses: we'll probably want to use libmagic to get > > a MIME-type and then have a table mapping mime-types to major modes or > > some such. > > > > > > Stefan > > I attach an early draft filemagic patch. > > Some notes: > > - The mime type info usualy is less granular than the free > text info: > > file --mime /tmp/tst.xcf > /tmp/tst.xcf: application/octet-stream; charset=binary > > file /tmp/tst.xcf > /tmp/tst.xcf: GIMP XCF image data, version 0, 640 x 480, RGB Color > > This is dependent on the file magic info file used. > > - We can probably have much fun debating what the interface should look > like at the lisp level. Any ideas? 
> > diff --git a/configure.in b/configure.in > index f4096db..cb74523 100644 > --- a/configure.in > +++ b/configure.in > @@ -137,6 +137,8 @@ OPTION_DEFAULT_ON([xft],[don't use XFT for anti aliased fonts]) > OPTION_DEFAULT_ON([libotf],[don't use libotf for OpenType font support]) > OPTION_DEFAULT_ON([m17n-flt],[don't use m17n-flt for text shaping]) > > +OPTION_DEFAULT_ON([filemagic],[don't compile with filemagic support]) > + > OPTION_DEFAULT_ON([toolkit-scroll-bars],[don't use Motif or Xaw3d scroll bars]) > OPTION_DEFAULT_ON([xaw3d],[don't use Xaw3d]) > OPTION_DEFAULT_ON([xim],[don't use X11 XIM]) > @@ -2223,6 +2225,19 @@ if test x"$ac_cv_func_alloca_works" != xyes; then > AC_MSG_ERROR( [a system implementation of alloca is required] ) > fi > > + > +HAVE_LIBMAGIC=no > +if test "${with_filemagic}" != "no"; then > + #libmagic support > + AC_CHECK_HEADERS(magic.h, [ AC_CHECK_LIB(magic,magic_open,HAVE_LIBMAGIC=yes) ]) > +fi > + > +if test "${HAVE_LIBMAGIC}" = "yes"; then > + LIBMAGIC=-lmagic > + AC_SUBST(LIBMAGIC) > + AC_DEFINE(HAVE_LIBMAGIC, 1, [Define to 1 if using libmagic.]) > +fi > + > # fmod, logb, and frexp are found in -lm on most systems. > # On HPUX 9.01, -lm does not contain logb, so check for sqrt. > AC_CHECK_LIB(m, sqrt) > @@ -2954,6 +2969,7 @@ echo " Does Emacs use -lpng? ${HAVE_PNG}" > echo " Does Emacs use -lrsvg-2? ${HAVE_RSVG}" > echo " Does Emacs use -lgpm? ${HAVE_GPM}" > echo " Does Emacs use -ldbus? ${HAVE_DBUS}" > +echo " Does Emacs use -lmagic? ${HAVE_LIBMAGIC}" > > echo " Does Emacs use -lfreetype? ${HAVE_FREETYPE}" > echo " Does Emacs use -lm17n-flt? 
${HAVE_M17N_FLT}" > diff --git a/src/Makefile.in b/src/Makefile.in > index 425cf98..b80255a 100644 > --- a/src/Makefile.in > +++ b/src/Makefile.in > @@ -420,6 +420,7 @@ LIBX= $(LIBXMENU) LD_SWITCH_X_SITE > #endif /* not HAVE_LIBRESOLV */ > > LIBSOUND= @LIBSOUND@ > +LIBMAGIC= @LIBMAGIC@ > CFLAGS_SOUND= @CFLAGS_SOUND@ > > RSVG_LIBS= @RSVG_LIBS@ > @@ -511,6 +512,12 @@ MSDOS_OBJ = dosfns.o msdos.o w16select.o xmenu.o > #endif > #endif > > +#ifdef HAVE_LIBMAGIC > +FILEMAGIC_OBJ = filemagic.o > +#else > +FILEMAGIC_OBJ = > +#endif Can you please avoid adding new #ifdefs here? (we are trying to get rid of them). Maybe use @FILEMAGIC_OBJ@ ? Or even use the file unconditionally, just add the proper #ifdefs to make it be empty if not HAVE_LIBMAGIC ? ^ permalink raw reply [flat|nested] 119+ messages in thread
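A minimal sketch of the second suggestion above (compile the file unconditionally and let the preprocessor empty out the libmagic part), written as plain C rather than as an Emacs primitive. The helper name `probe_mime_type` and its buffer-based interface are hypothetical and do not appear in the patch; only `magic_open`, `magic_load`, `magic_file`, `magic_close` and `MAGIC_MIME_TYPE` come from libmagic itself.

```c
#include <stddef.h>
#include <string.h>

#ifdef HAVE_LIBMAGIC
#include <magic.h>
#endif

/* Hypothetical helper, not part of the patch: copy the MIME type of
   FILENAME into BUF (of size BUFLEN).  Return 0 on success, -1 on
   failure or when built without libmagic.  */
int
probe_mime_type (const char *filename, char *buf, size_t buflen)
{
#ifdef HAVE_LIBMAGIC
  magic_t cookie;
  const char *result;
  int status = -1;

  if (filename == NULL || buf == NULL || buflen == 0)
    return -1;

  cookie = magic_open (MAGIC_MIME_TYPE);
  if (cookie == NULL)
    return -1;

  /* A NULL database name asks libmagic for its default database.  */
  if (magic_load (cookie, NULL) == 0)
    {
      result = magic_file (cookie, filename);
      /* The returned string belongs to the cookie, so copy it out
         before closing the cookie.  */
      if (result != NULL && strlen (result) < buflen)
        {
          strcpy (buf, result);
          status = 0;
        }
    }

  magic_close (cookie);
  return status;
#else
  /* Built without libmagic: the function still exists, so callers and
     the Makefile need no conditionals, but it always reports failure.  */
  (void) filename;
  if (buf != NULL && buflen > 0)
    buf[0] = '\0';
  return -1;
#endif
}
```

With this shape the object file can be listed in the Makefile unconditionally; whether a stub or an entirely empty translation unit is preferable is a judgment call the thread leaves open.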
* Re: using libmagic in Emacs? 2009-08-19 22:49 ` joakim 2009-08-19 23:20 ` Dan Nicolaescu @ 2009-08-20 1:03 ` Stephen J. Turnbull 2009-08-20 3:12 ` Eli Zaretskii 2009-08-20 13:57 ` Stefan Monnier 2009-08-20 18:32 ` Richard Stallman 3 siblings, 1 reply; 119+ messages in thread From: Stephen J. Turnbull @ 2009-08-20 1:03 UTC (permalink / raw) To: joakim; +Cc: Stefan Monnier, Emacs Development joakim@verona.se writes: > - We can probably have much fun debating what the interface should look > like at the lisp level. Any ideas? I think the main interface should be just "file-magic", even if it's not in file.el. It's analogous to "file-attributes", etc. "file-magic-file" sounds like it returns the magic file (file*s* possible?) used. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-20 1:03 ` Stephen J. Turnbull @ 2009-08-20 3:12 ` Eli Zaretskii 2009-08-20 4:50 ` Stephen J. Turnbull 2009-08-20 18:32 ` Richard Stallman 0 siblings, 2 replies; 119+ messages in thread From: Eli Zaretskii @ 2009-08-20 3:12 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: monnier, joakim, emacs-devel > From: "Stephen J. Turnbull" <stephen@xemacs.org> > Date: Thu, 20 Aug 2009 10:03:24 +0900 > Cc: Stefan Monnier <monnier@IRO.UMontreal.CA>, > Emacs Development <emacs-devel@gnu.org> > > joakim@verona.se writes: > > > - We can probably have much fun debating what the interface should look > > like at the lisp level. Any ideas? > > I think the main interface should be just "file-magic", even if it's > not in file.el. It's analogous to "file-attributes", etc. Actually, I think the interface should be `file-type' or some such. Like `file-attributes' that is a wrapper for `stat', the API name should have a good semantic value, instead of just inheriting the name of the low-level C functions it uses to do the job. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-20 3:12 ` Eli Zaretskii @ 2009-08-20 4:50 ` Stephen J. Turnbull 2009-08-20 18:20 ` Eli Zaretskii 2009-08-20 18:32 ` Richard Stallman 1 sibling, 1 reply; 119+ messages in thread From: Stephen J. Turnbull @ 2009-08-20 4:50 UTC (permalink / raw) To: Eli Zaretskii; +Cc: monnier, joakim, emacs-devel Eli Zaretskii writes: > > I think the main interface should be just "file-magic", even if it's > > not in file.el. It's analogous to "file-attributes", etc. > > Actually, I think the interface should be `file-type' or some such. Seriously, "file-type" is a terrible, ambiguous name. DOS vs. Unix, the extension of the file's name (!!), what program uses it, text vs. binary, MIME type, endianness of the platform, Unicode vs. legacy coding, copyleft vs. permissive vs. proprietary vs. public domain, I could go on. Sure, it's about "files", but the "type" is what file(1) infers from magic numbers in the file, no more and no less ... and exactly the people you expect to say "huh?" will proceed to guess anything but the truth about the semantics of `file-type'. > Like `file-attributes' that is a wrapper for `stat', the API name > should have a good semantic value, instead of just inheriting the name > of the low-level C functions it uses to do the job. `file-magic' does have a good semantic value. It means "look at the first few bytes of a file and infer various metadata about it, based on a published database of 'magic numbers'." It tells not only the purpose but the exact semantics. I suppose the more explicit `file-type-by-magic' might be better. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-20 4:50 ` Stephen J. Turnbull @ 2009-08-20 18:20 ` Eli Zaretskii 2009-08-21 0:19 ` Stephen J. Turnbull 0 siblings, 1 reply; 119+ messages in thread From: Eli Zaretskii @ 2009-08-20 18:20 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: monnier, joakim, emacs-devel > From: "Stephen J. Turnbull" <stephen@xemacs.org> > Cc: monnier@IRO.UMontreal.CA, > joakim@verona.se, > emacs-devel@gnu.org > Date: Thu, 20 Aug 2009 13:50:13 +0900 > > Eli Zaretskii writes: > > > > I think the main interface should be just "file-magic", even if it's > > > not in file.el. It's analogous to "file-attributes", etc. > > > > Actually, I think the interface should be `file-type' or some such. > > Seriously, "file-type" is a terrible, ambiguous name. DOS vs. Unix, > the extension of the file's name (!!), what program uses it, text > vs. binary, MIME type, endianness of the platform, Unicode vs. legacy > coding, copyleft vs. permissive vs. proprietary vs. public domain, I > could go on. Sure, it's about "files", but the "type" is what file(1) > infers from magic numbers in the file, no more and no less ... and > exactly the people you expect to say "huh?" will proceed to guess > anything but the truth about the semantics of `file-type'. Maybe so, but still "man file" shows this at the very first line: file - determine file type > I suppose the more explicit `file-type-by-magic' might be better. I'm okay with that as well, but maybe `file-type-by-magic-signature' is even better (if we indeed want to advertise its inner workings). But if this function will fall back on something else if libmagic is not available, then I think this name is not a good one. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-20 18:20 ` Eli Zaretskii @ 2009-08-21 0:19 ` Stephen J. Turnbull 0 siblings, 0 replies; 119+ messages in thread From: Stephen J. Turnbull @ 2009-08-21 0:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: monnier, joakim, emacs-devel Eli Zaretskii writes: > Maybe so, but still "man file" shows this at the very first line: > > file - determine file type Sure. file does more than invoke libmagic, though, and in the Unix context the ambiguous "type" is not so problematic; it is basically defined by "what file(1) does." For the broader audience, if "magic" is unacceptable, they probably don't know what file(1) does. > But if this function will fall back on something else if libmagic is > not available, then I think this name is not a good one. This function should be available in a form which only uses libmagic. Whatever other tests are needed should be done in Lisp, and I see no reason to hide the libmagic functionality in an "-internal" or otherwise "not for regular use" form. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-20 3:12 ` Eli Zaretskii 2009-08-20 4:50 ` Stephen J. Turnbull @ 2009-08-20 18:32 ` Richard Stallman 2009-08-21 19:10 ` Stefan Monnier 1 sibling, 1 reply; 119+ messages in thread From: Richard Stallman @ 2009-08-20 18:32 UTC (permalink / raw) To: Eli Zaretskii; +Cc: stephen, monnier, joakim, emacs-devel Actually, I think the interface should be `file-type' or some such. If it will return the MIME type, `file-mime-type' seems like a good name. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs?
  2009-08-20 18:32 ` Richard Stallman
@ 2009-08-21 19:10 ` Stefan Monnier
  2009-08-22 5:03 ` Stephen J. Turnbull
  0 siblings, 1 reply; 119+ messages in thread
From: Stefan Monnier @ 2009-08-21 19:10 UTC (permalink / raw)
  To: rms; +Cc: Eli Zaretskii, stephen, joakim, emacs-devel

>     Actually, I think the interface should be `file-type' or some such.
> If it will return the MIME type, `file-mime-type' seems like a good
> name.

We need two functions:
- one implemented in C which provides the info from libmagic, no more no less.
- one implemented in Elisp which uses the previous one as well as other
  techniques (maybe even auto-mode-alist).

So yes, the second one can be called `file-mime-type', but the first
would better be called `file-magic' or somesuch.

        Stefan

^ permalink raw reply [flat|nested] 119+ messages in thread
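The two-layer split described above can be sketched as follows. It is written in C only to stay in the language of the patch; the thread proposes writing the upper layer in Elisp. Every name here (`magic_probe_fn`, `fake_magic_probe`, `file_mime_type`, the extension table) is hypothetical, and the extension lookup merely stands in for "other techniques" such as auto-mode-alist.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical shape of the low-level primitive: return a MIME type
   for FILENAME, or NULL if nothing could be determined.  */
typedef const char *(*magic_probe_fn) (const char *filename);

/* Stand-in for the libmagic-backed primitive, so the sketch runs
   without libmagic.  It claims every file is a PNG image.  */
const char *
fake_magic_probe (const char *filename)
{
  (void) filename;
  return "image/png";
}

struct ext_rule
{
  const char *ext;
  const char *mime;
};

/* Illustrative fallback table; the entries are assumptions.  */
static const struct ext_rule ext_rules[] = {
  { ".txt", "text/plain" },
  { ".png", "image/png" },
  { ".c", "text/x-c" }
};

static const char *
mime_from_extension (const char *filename)
{
  const char *dot = strrchr (filename, '.');
  size_t i;

  /* Ignore a dot that belongs to a directory name.  */
  if (dot == NULL || strchr (dot, '/') != NULL)
    return NULL;
  for (i = 0; i < sizeof ext_rules / sizeof ext_rules[0]; i++)
    if (strcmp (dot, ext_rules[i].ext) == 0)
      return ext_rules[i].mime;
  return NULL;
}

/* Upper layer: ask the primitive first, then fall back.  PROBE may be
   NULL, which models a build without libmagic.  */
const char *
file_mime_type (const char *filename, magic_probe_fn probe)
{
  const char *mime = NULL;

  if (filename == NULL)
    return NULL;
  if (probe != NULL)
    mime = probe (filename);
  if (mime == NULL)
    mime = mime_from_extension (filename);
  return mime;
}
```

Keeping the primitive behind a separate entry point means the fallback policy can change without touching the libmagic wrapper, which is the point of the split.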
* Re: using libmagic in Emacs? 2009-08-21 19:10 ` Stefan Monnier @ 2009-08-22 5:03 ` Stephen J. Turnbull 2009-08-23 1:03 ` Stefan Monnier 0 siblings, 1 reply; 119+ messages in thread From: Stephen J. Turnbull @ 2009-08-22 5:03 UTC (permalink / raw) To: Stefan Monnier; +Cc: Eli Zaretskii, rms, joakim, emacs-devel Stefan Monnier writes: > We need two functions: > - one implemented in C which provides the info from libmagic, no more no less. +1 > So yes, the second one can be called `file-mime-type', but the first > would better be called `file-magic' or somesuch. Eli doesn't like that, and he's convinced me. `file-type-by-magic' or some such, I think. Even if the reader hasn't had the benefit of a classical education, they will have some idea of what's going on. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-22 5:03 ` Stephen J. Turnbull @ 2009-08-23 1:03 ` Stefan Monnier 0 siblings, 0 replies; 119+ messages in thread From: Stefan Monnier @ 2009-08-23 1:03 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: Eli Zaretskii, rms, joakim, emacs-devel >> So yes, the second one can be called `file-mime-type', but the first >> would better be called `file-magic' or somesuch. > Eli doesn't like that, and he's convinced me. `file-type-by-magic' > or some such, I think. Even if the reader hasn't had the benefit of a > classical education, they will have some idea of what's going on. As long as `magic' is in the name, that's OK. Stefan ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-19 22:49 ` joakim 2009-08-19 23:20 ` Dan Nicolaescu 2009-08-20 1:03 ` Stephen J. Turnbull @ 2009-08-20 13:57 ` Stefan Monnier 2009-08-20 19:19 ` joakim 2009-08-20 18:32 ` Richard Stallman 3 siblings, 1 reply; 119+ messages in thread From: Stefan Monnier @ 2009-08-20 13:57 UTC (permalink / raw) To: joakim; +Cc: Emacs Development >> I think it's a good idea. It may require some non-trivial changes on >> the Lisp side, since libmagic's information is not quite the same as >> what Emacs currently uses: we'll probably want to use libmagic to get >> a MIME-type and then have a table mapping mime-types to major modes or >> some such. > I attach an early draft filemagic patch. > Some notes: > - The mime type info usualy is less granular than the free > text info: We can provide both. I think we'd want 2 functions: one to get the free text info, which just returns a string (or nil), and another to get the MIME info, which returns a cons, whose car is a symbol such as application/octet-stream, and whose cdr is an alist representing the additional optional info. > file --mime /tmp/tst.xcf > /tmp/tst.xcf: application/octet-stream; charset=binary This would look like (application/octet-stream (charset . "binary")) A few more comments: - please follow the GNU coding convention. I.e. put spaces where they need to be (e.g. around parens and operators). - don't bother with a new file. Just put it into fileio.c. - as someone else mentioned, CPP macros in the Makefile.in are things we'd like to get rid of, so please don't put more of them there. Use autoconf's m4 macros instead, thank you. Stefan ^ permalink raw reply [flat|nested] 119+ messages in thread
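To make the proposed return shape concrete, here is a small sketch that splits a `file --mime` style string into the type and its charset parameter, i.e. the two pieces that would become the car and the alist entry in `(application/octet-stream (charset . "binary"))`. The struct, the function names and the 64-byte field limit are assumptions for illustration, and only the `charset` parameter is handled, whereas the proposed alist would carry any parameter.

```c
#include <stddef.h>
#include <string.h>

#define MIME_FIELD_MAX 64

/* Hypothetical result type for this sketch.  */
struct mime_info
{
  char type[MIME_FIELD_MAX];    /* e.g. "application/octet-stream" */
  char charset[MIME_FIELD_MAX]; /* e.g. "binary"; empty if absent */
};

/* Copy the LEN bytes at SRC into DST (of size DSTLEN), trimming blanks
   at both ends.  Return 0 on success, -1 if the result does not fit.  */
static int
copy_trimmed (char *dst, size_t dstlen, const char *src, size_t len)
{
  while (len > 0 && (*src == ' ' || *src == '\t'))
    {
      src++;
      len--;
    }
  while (len > 0 && (src[len - 1] == ' ' || src[len - 1] == '\t'))
    len--;
  if (len >= dstlen)
    return -1;
  memcpy (dst, src, len);
  dst[len] = '\0';
  return 0;
}

/* Parse "TYPE; charset=VALUE" into OUT.  Return 0 on success, -1 on
   bad arguments or overlong fields.  */
int
parse_mime_line (const char *line, struct mime_info *out)
{
  static const char key[] = "charset=";
  const char *semi;
  const char *param;

  if (line == NULL || out == NULL)
    return -1;
  out->type[0] = '\0';
  out->charset[0] = '\0';

  semi = strchr (line, ';');
  if (semi == NULL)
    return copy_trimmed (out->type, sizeof out->type, line, strlen (line));

  if (copy_trimmed (out->type, sizeof out->type, line,
                    (size_t) (semi - line)) != 0)
    return -1;

  param = strstr (semi + 1, key);
  if (param == NULL)
    return 0;
  param += sizeof key - 1;
  return copy_trimmed (out->charset, sizeof out->charset, param,
                       strcspn (param, ";"));
}
```

Note that libmagic can also report the type and the encoding separately (via `MAGIC_MIME_TYPE` and `MAGIC_MIME_ENCODING`), in which case no string splitting is needed at all.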
* Re: using libmagic in Emacs? 2009-08-20 13:57 ` Stefan Monnier @ 2009-08-20 19:19 ` joakim 2009-08-20 22:08 ` Andreas Schwab 0 siblings, 1 reply; 119+ messages in thread From: joakim @ 2009-08-20 19:19 UTC (permalink / raw) To: Stefan Monnier; +Cc: Emacs Development [-- Attachment #1: Type: text/plain, Size: 1693 bytes --] Stefan Monnier <monnier@iro.umontreal.ca> writes: >>> I think it's a good idea. It may require some non-trivial changes on >>> the Lisp side, since libmagic's information is not quite the same as >>> what Emacs currently uses: we'll probably want to use libmagic to get >>> a MIME-type and then have a table mapping mime-types to major modes or >>> some such. > >> I attach an early draft filemagic patch. > >> Some notes: > >> - The mime type info usually is less granular than the free >> text info: > > We can provide both. I think we'd want 2 functions: one to get the free > text info, which just returns a string (or nil), and another to get the > MIME info, which returns a cons, whose car is a symbol such as > application/octet-stream, and whose cdr is an alist representing > the additional optional info. > >> file --mime /tmp/tst.xcf >> /tmp/tst.xcf: application/octet-stream; charset=binary > > This would look like (application/octet-stream (charset . "binary")) > > A few more comments: > - please follow the GNU coding convention. I.e. put spaces where > they need to be (e.g. around parens and operators). > - don't bother with a new file. Just put it into fileio.c. > - as someone else mentioned, CPP macros in the Makefile.in are things > we'd like to get rid of, so please don't put more of them there. > Use autoconf's m4 macros instead, thank you. > > > Stefan Find attached a new, slightly improved patch that follows the suggestions from the list. However, I see now that I didn't read your text above properly: the function I wrote returns a list of 3 elements, (MIME_TYPE MIME_ENCODING DESCRIPTION). Do we still need 2 functions as you write above? 
[-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: filemagic2.patch --] [-- Type: text/x-patch, Size: 5232 bytes --] diff --git a/configure.in b/configure.in index f4096db..cb74523 100644 --- a/configure.in +++ b/configure.in @@ -137,6 +137,8 @@ OPTION_DEFAULT_ON([xft],[don't use XFT for anti aliased fonts]) OPTION_DEFAULT_ON([libotf],[don't use libotf for OpenType font support]) OPTION_DEFAULT_ON([m17n-flt],[don't use m17n-flt for text shaping]) +OPTION_DEFAULT_ON([filemagic],[don't compile with filemagic support]) + OPTION_DEFAULT_ON([toolkit-scroll-bars],[don't use Motif or Xaw3d scroll bars]) OPTION_DEFAULT_ON([xaw3d],[don't use Xaw3d]) OPTION_DEFAULT_ON([xim],[don't use X11 XIM]) @@ -2223,6 +2225,19 @@ if test x"$ac_cv_func_alloca_works" != xyes; then AC_MSG_ERROR( [a system implementation of alloca is required] ) fi + +HAVE_LIBMAGIC=no +if test "${with_filemagic}" != "no"; then + #libmagic support + AC_CHECK_HEADERS(magic.h, [ AC_CHECK_LIB(magic,magic_open,HAVE_LIBMAGIC=yes) ]) +fi + +if test "${HAVE_LIBMAGIC}" = "yes"; then + LIBMAGIC=-lmagic + AC_SUBST(LIBMAGIC) + AC_DEFINE(HAVE_LIBMAGIC, 1, [Define to 1 if using libmagic.]) +fi + # fmod, logb, and frexp are found in -lm on most systems. # On HPUX 9.01, -lm does not contain logb, so check for sqrt. AC_CHECK_LIB(m, sqrt) @@ -2954,6 +2969,7 @@ echo " Does Emacs use -lpng? ${HAVE_PNG}" echo " Does Emacs use -lrsvg-2? ${HAVE_RSVG}" echo " Does Emacs use -lgpm? ${HAVE_GPM}" echo " Does Emacs use -ldbus? ${HAVE_DBUS}" +echo " Does Emacs use -lmagic? ${HAVE_LIBMAGIC}" echo " Does Emacs use -lfreetype? ${HAVE_FREETYPE}" echo " Does Emacs use -lm17n-flt? 
${HAVE_M17N_FLT}" diff --git a/src/Makefile.in b/src/Makefile.in index 425cf98..33d1a14 100644 --- a/src/Makefile.in +++ b/src/Makefile.in @@ -420,6 +420,7 @@ LIBX= $(LIBXMENU) LD_SWITCH_X_SITE #endif /* not HAVE_LIBRESOLV */ LIBSOUND= @LIBSOUND@ +LIBMAGIC= @LIBMAGIC@ CFLAGS_SOUND= @CFLAGS_SOUND@ RSVG_LIBS= @RSVG_LIBS@ @@ -878,7 +879,7 @@ SOME_MACHINE_LISP = ../lisp/mouse.elc \ duplicated symbols. If the standard libraries were compiled with GCC, we might need gnulib again after them. */ -LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(RSVG_LIBS) $(DBUS_LIBS) \ +LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(LIBMAGIC) $(RSVG_LIBS) $(DBUS_LIBS) \ LIBGPM LIBRESOLV LIBS_SYSTEM LIBS_MACHINE LIBS_TERMCAP \ LIBS_DEBUG $(GETLOADAVG_LIBS) \ @FREETYPE_LIBS@ @FONTCONFIG_LIBS@ @LIBOTF_LIBS@ @M17N_FLT_LIBS@ \ diff --git a/src/config.in b/src/config.in index 404e00b..c966a09 100644 --- a/src/config.in +++ b/src/config.in @@ -262,6 +262,9 @@ along with GNU Emacs. If not, see <http://www.gnu.org/licenses/>. */ /* Define to 1 if you have the gpm library (-lgpm). */ #undef HAVE_GPM +/* Define to 1 if you have the filemagic library (-lmagic). */ +#undef HAVE_LIBMAGIC + /* Define to 1 if you have the `grantpt' function. */ #undef HAVE_GRANTPT diff --git a/src/fileio.c b/src/fileio.c index 3702d4c..375502e 100644 --- a/src/fileio.c +++ b/src/fileio.c @@ -205,6 +205,10 @@ Lisp_Object Vdirectory_sep_char; int write_region_inhibit_fsync; #endif +#ifdef HAVE_LIBMAGIC +#include <magic.h> +#endif + /* Non-zero means call move-file-to-trash in Fdelete_file or Fdelete_directory. */ int delete_by_moving_to_trash; @@ -2997,6 +3001,45 @@ DEFUN ("unix-sync", Funix_sync, Sunix_sync, 0, 0, "", #endif /* HAVE_SYNC */ +#ifdef HAVE_LIBMAGIC +DEFUN ("file-magic-file", Ffile_magic_file, Sfile_magic_file, 1,1,0, + doc: /* Return (MIME_TYPE MIME_ENCODING DESCRIPTION) for FILENAME. +Return nil on error. 
*/) + (filename) + Lisp_Object filename; +{ + magic_t cookie=NULL; + if (!STRINGP (filename)) goto libmagic_error; + char* f = SDATA (filename); + char* rvs; + cookie = magic_open (MAGIC_NONE); + magic_load (cookie,NULL); //load default database + + magic_setflags (cookie, MAGIC_MIME_TYPE); + rvs = magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + Lisp_Object file_mime = intern (rvs); + + magic_setflags (cookie, MAGIC_MIME_ENCODING); + rvs=magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + Lisp_Object file_encoding = intern(rvs); + + magic_setflags (cookie, MAGIC_NONE); + rvs=magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + + Lisp_Object file_freetext = make_specified_string (rvs, strlen(rvs), strlen(rvs), NULL); + Lisp_Object rv = Fcons (file_mime, Fcons (file_encoding, Fcons (file_freetext, Qnil))); + + magic_close (cookie); + return rv; + libmagic_error: + if (cookie != NULL) magic_close (cookie); + return Qnil; +} +#endif + DEFUN ("file-newer-than-file-p", Ffile_newer_than_file_p, Sfile_newer_than_file_p, 2, 2, 0, doc: /* Return t if file FILE1 is newer than file FILE2. If FILE1 does not exist, the answer is nil; @@ -5781,6 +5824,9 @@ When non-nil, the function `move-file-to-trash' will be used by #ifdef HAVE_SYNC defsubr (&Sunix_sync); #endif +#ifdef HAVE_LIBMAGIC + defsubr (&Sfile_magic_file); +#endif } /* arch-tag: 64ba3fd7-f844-4fb2-ba4b-427eb928786c [-- Attachment #3: Type: text/plain, Size: 20 bytes --] -- Joakim Verona ^ permalink raw reply related [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-20 19:19 ` joakim @ 2009-08-20 22:08 ` Andreas Schwab 2009-08-21 9:55 ` joakim 0 siblings, 1 reply; 119+ messages in thread From: Andreas Schwab @ 2009-08-20 22:08 UTC (permalink / raw) To: joakim; +Cc: Stefan Monnier, Emacs Development joakim@verona.se writes: > diff --git a/configure.in b/configure.in > index f4096db..cb74523 100644 > --- a/configure.in > +++ b/configure.in > @@ -137,6 +137,8 @@ OPTION_DEFAULT_ON([xft],[don't use XFT for anti aliased fonts]) > OPTION_DEFAULT_ON([libotf],[don't use libotf for OpenType font support]) > OPTION_DEFAULT_ON([m17n-flt],[don't use m17n-flt for text shaping]) > > +OPTION_DEFAULT_ON([filemagic],[don't compile with filemagic support]) IMHO the option should be named libmagic, since that's how the library is named. > diff --git a/src/config.in b/src/config.in > index 404e00b..c966a09 100644 > --- a/src/config.in > +++ b/src/config.in > @@ -262,6 +262,9 @@ along with GNU Emacs. If not, see <http://www.gnu.org/licenses/>. */ > /* Define to 1 if you have the gpm library (-lgpm). */ > #undef HAVE_GPM > > +/* Define to 1 if you have the filemagic library (-lmagic). */ > +#undef HAVE_LIBMAGIC > + > /* Define to 1 if you have the `grantpt' function. */ > #undef HAVE_GRANTPT > This is generated by autoheader. > diff --git a/src/fileio.c b/src/fileio.c > index 3702d4c..375502e 100644 > --- a/src/fileio.c > +++ b/src/fileio.c > @@ -205,6 +205,10 @@ Lisp_Object Vdirectory_sep_char; > int write_region_inhibit_fsync; > #endif > > +#ifdef HAVE_LIBMAGIC > +#include <magic.h> > +#endif > + > /* Non-zero means call move-file-to-trash in Fdelete_file or > Fdelete_directory. */ > int delete_by_moving_to_trash; > @@ -2997,6 +3001,45 @@ DEFUN ("unix-sync", Funix_sync, Sunix_sync, 0, 0, "", > > #endif /* HAVE_SYNC */ > > +#ifdef HAVE_LIBMAGIC > +DEFUN ("file-magic-file", Ffile_magic_file, Sfile_magic_file, 1,1,0, > + doc: /* Return (MIME_TYPE MIME_ENCODING DESCRIPTION) for FILENAME. > +Return nil on error. 
*/) > + (filename) > + Lisp_Object filename; > +{ > + magic_t cookie=NULL; > + if (!STRINGP (filename)) goto libmagic_error; Just use CHECK_STRING. > + char* f = SDATA (filename); > + char* rvs; No C99 features yet. Be careful with raw string pointers and GC. > + cookie = magic_open (MAGIC_NONE); > + magic_load (cookie,NULL); //load default database if (cookie == NULL) ? > + > + magic_setflags (cookie, MAGIC_MIME_TYPE); > + rvs = magic_file (cookie, f); > + if (rvs == NULL) goto libmagic_error; Use report_file_error, provided that magic_file sets errno appropriately. > + Lisp_Object file_freetext = make_specified_string (rvs, strlen(rvs), strlen(rvs), NULL); Use build_string. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-20 22:08 ` Andreas Schwab @ 2009-08-21 9:55 ` joakim 2009-08-21 11:01 ` Eli Zaretskii 2009-08-21 13:19 ` Andreas Schwab 0 siblings, 2 replies; 119+ messages in thread From: joakim @ 2009-08-21 9:55 UTC (permalink / raw) To: Andreas Schwab; +Cc: Stefan Monnier, Emacs Development [-- Attachment #1: Type: text/plain, Size: 2964 bytes --] New libmagic patch, mostly fixing Andreas's concerns, and some more error handling. I don't understand the autoheader comment below though. When I originally compiled, the config.h wasn't generated with the libmagic info included; did I do something wrong? Is autoheader supposed to generate config.in? When does that happen? /Joakim Andreas Schwab <schwab@linux-m68k.org> writes: > joakim@verona.se writes: > >> diff --git a/configure.in b/configure.in >> index f4096db..cb74523 100644 >> --- a/configure.in >> +++ b/configure.in >> @@ -137,6 +137,8 @@ OPTION_DEFAULT_ON([xft],[don't use XFT for anti aliased fonts]) >> OPTION_DEFAULT_ON([libotf],[don't use libotf for OpenType font support]) >> OPTION_DEFAULT_ON([m17n-flt],[don't use m17n-flt for text shaping]) >> >> +OPTION_DEFAULT_ON([filemagic],[don't compile with filemagic support]) > > IMHO the option should be named libmagic, since that's how the library > is named. > >> diff --git a/src/config.in b/src/config.in >> index 404e00b..c966a09 100644 >> --- a/src/config.in >> +++ b/src/config.in >> @@ -262,6 +262,9 @@ along with GNU Emacs. If not, see <http://www.gnu.org/licenses/>. */ >> /* Define to 1 if you have the gpm library (-lgpm). */ >> #undef HAVE_GPM >> >> +/* Define to 1 if you have the filemagic library (-lmagic). */ >> +#undef HAVE_LIBMAGIC >> + >> /* Define to 1 if you have the `grantpt' function. */ >> #undef HAVE_GRANTPT >> > > This is generated by autoheader. 
> >> diff --git a/src/fileio.c b/src/fileio.c >> index 3702d4c..375502e 100644 >> --- a/src/fileio.c >> +++ b/src/fileio.c >> @@ -205,6 +205,10 @@ Lisp_Object Vdirectory_sep_char; >> int write_region_inhibit_fsync; >> #endif >> >> +#ifdef HAVE_LIBMAGIC >> +#include <magic.h> >> +#endif >> + >> /* Non-zero means call move-file-to-trash in Fdelete_file or >> Fdelete_directory. */ >> int delete_by_moving_to_trash; >> @@ -2997,6 +3001,45 @@ DEFUN ("unix-sync", Funix_sync, Sunix_sync, 0, 0, "", >> >> #endif /* HAVE_SYNC */ >> >> +#ifdef HAVE_LIBMAGIC >> +DEFUN ("file-magic-file", Ffile_magic_file, Sfile_magic_file, 1,1,0, >> + doc: /* Return (MIME_TYPE MIME_ENCODING DESCRIPTION) for FILENAME. >> +Return nil on error. */) >> + (filename) >> + Lisp_Object filename; >> +{ >> + magic_t cookie=NULL; >> + if (!STRINGP (filename)) goto libmagic_error; > > Just use CHECK_STRING. > >> + char* f = SDATA (filename); >> + char* rvs; > > No C99 features yet. Be careful with raw string pointers and GC. > >> + cookie = magic_open (MAGIC_NONE); >> + magic_load (cookie,NULL); //load default database > > if (cookie == NULL) ? > >> + >> + magic_setflags (cookie, MAGIC_MIME_TYPE); >> + rvs = magic_file (cookie, f); >> + if (rvs == NULL) goto libmagic_error; > > Use report_file_error, provided that magic_file sets errno appropriately. > >> + Lisp_Object file_freetext = make_specified_string (rvs, strlen(rvs), strlen(rvs), NULL); > > Use build_string. > > Andreas. 
[-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: filemagic3.patch --] [-- Type: text/x-patch, Size: 5540 bytes --] diff --git a/configure.in b/configure.in index f4096db..49a3f15 100644 --- a/configure.in +++ b/configure.in @@ -137,6 +137,8 @@ OPTION_DEFAULT_ON([xft],[don't use XFT for anti aliased fonts]) OPTION_DEFAULT_ON([libotf],[don't use libotf for OpenType font support]) OPTION_DEFAULT_ON([m17n-flt],[don't use m17n-flt for text shaping]) +OPTION_DEFAULT_ON([libmagic],[don't compile with libmagic support]) + OPTION_DEFAULT_ON([toolkit-scroll-bars],[don't use Motif or Xaw3d scroll bars]) OPTION_DEFAULT_ON([xaw3d],[don't use Xaw3d]) OPTION_DEFAULT_ON([xim],[don't use X11 XIM]) @@ -2223,6 +2225,19 @@ if test x"$ac_cv_func_alloca_works" != xyes; then AC_MSG_ERROR( [a system implementation of alloca is required] ) fi + +HAVE_LIBMAGIC=no +if test "${with_libmagic}" != "no"; then + #libmagic support + AC_CHECK_HEADERS(magic.h, [ AC_CHECK_LIB(magic,magic_open,HAVE_LIBMAGIC=yes) ]) +fi + +if test "${HAVE_LIBMAGIC}" = "yes"; then + LIBMAGIC=-lmagic + AC_SUBST(LIBMAGIC) + AC_DEFINE(HAVE_LIBMAGIC, 1, [Define to 1 if using libmagic.]) +fi + # fmod, logb, and frexp are found in -lm on most systems. # On HPUX 9.01, -lm does not contain logb, so check for sqrt. AC_CHECK_LIB(m, sqrt) @@ -2954,6 +2969,7 @@ echo " Does Emacs use -lpng? ${HAVE_PNG}" echo " Does Emacs use -lrsvg-2? ${HAVE_RSVG}" echo " Does Emacs use -lgpm? ${HAVE_GPM}" echo " Does Emacs use -ldbus? ${HAVE_DBUS}" +echo " Does Emacs use -lmagic? ${HAVE_LIBMAGIC}" echo " Does Emacs use -lfreetype? ${HAVE_FREETYPE}" echo " Does Emacs use -lm17n-flt? 
${HAVE_M17N_FLT}" diff --git a/src/Makefile.in b/src/Makefile.in index 425cf98..33d1a14 100644 --- a/src/Makefile.in +++ b/src/Makefile.in @@ -420,6 +420,7 @@ LIBX= $(LIBXMENU) LD_SWITCH_X_SITE #endif /* not HAVE_LIBRESOLV */ LIBSOUND= @LIBSOUND@ +LIBMAGIC= @LIBMAGIC@ CFLAGS_SOUND= @CFLAGS_SOUND@ RSVG_LIBS= @RSVG_LIBS@ @@ -878,7 +879,7 @@ SOME_MACHINE_LISP = ../lisp/mouse.elc \ duplicated symbols. If the standard libraries were compiled with GCC, we might need gnulib again after them. */ -LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(RSVG_LIBS) $(DBUS_LIBS) \ +LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(LIBMAGIC) $(RSVG_LIBS) $(DBUS_LIBS) \ LIBGPM LIBRESOLV LIBS_SYSTEM LIBS_MACHINE LIBS_TERMCAP \ LIBS_DEBUG $(GETLOADAVG_LIBS) \ @FREETYPE_LIBS@ @FONTCONFIG_LIBS@ @LIBOTF_LIBS@ @M17N_FLT_LIBS@ \ diff --git a/src/config.in b/src/config.in index 404e00b..c966a09 100644 --- a/src/config.in +++ b/src/config.in @@ -262,6 +262,9 @@ along with GNU Emacs. If not, see <http://www.gnu.org/licenses/>. */ /* Define to 1 if you have the gpm library (-lgpm). */ #undef HAVE_GPM +/* Define to 1 if you have the filemagic library (-lmagic). */ +#undef HAVE_LIBMAGIC + /* Define to 1 if you have the `grantpt' function. */ #undef HAVE_GRANTPT diff --git a/src/fileio.c b/src/fileio.c index 3702d4c..67d271c 100644 --- a/src/fileio.c +++ b/src/fileio.c @@ -205,6 +205,10 @@ Lisp_Object Vdirectory_sep_char; int write_region_inhibit_fsync; #endif +#ifdef HAVE_LIBMAGIC +#include <magic.h> +#endif + /* Non-zero means call move-file-to-trash in Fdelete_file or Fdelete_directory. */ int delete_by_moving_to_trash; @@ -2997,6 +3001,52 @@ DEFUN ("unix-sync", Funix_sync, Sunix_sync, 0, 0, "", #endif /* HAVE_SYNC */ +#ifdef HAVE_LIBMAGIC +DEFUN ("libmagic-file-internal", Flibmagic_file_internal, Slibmagic_file_internal, 1,1,0, + doc: /* Return (MIME_TYPE MIME_ENCODING DESCRIPTION) for FILENAME_OR_BUFFER. +Return nil on error. 
*/) + (filename_or_buffer) + Lisp_Object filename_or_buffer; +{ + CHECK_STRING_OR_BUFFER (filename_or_buffer); + magic_t cookie=NULL; + char* f = NULL; + const char* rvs; + + if (STRINGP (filename_or_buffer)) + f = SDATA (filename_or_buffer); + if (BUFFERP (filename_or_buffer)) + f = SDATA (XBUFFER (filename_or_buffer)->filename); + cookie = magic_open (MAGIC_ERROR); + if (cookie == NULL) goto libmagic_error; + magic_load (cookie, NULL); //load default database + + magic_setflags (cookie, MAGIC_MIME_TYPE | MAGIC_ERROR); + rvs = magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + Lisp_Object file_mime = intern (rvs); + + magic_setflags (cookie, MAGIC_MIME_ENCODING | MAGIC_ERROR); + rvs=magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + Lisp_Object file_encoding = intern(rvs); + + magic_setflags (cookie, MAGIC_NONE | MAGIC_ERROR); + rvs=magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + + Lisp_Object file_freetext = build_string (rvs); + Lisp_Object rv = Fcons (file_mime, Fcons (file_encoding, Fcons (file_freetext, Qnil))); + + magic_close (cookie); + return rv; + libmagic_error: + if (cookie != NULL) magic_close (cookie); + report_file_error("Libmagic error",Qnil); + return Qnil; +} +#endif + DEFUN ("file-newer-than-file-p", Ffile_newer_than_file_p, Sfile_newer_than_file_p, 2, 2, 0, doc: /* Return t if file FILE1 is newer than file FILE2. If FILE1 does not exist, the answer is nil; @@ -5781,6 +5831,9 @@ When non-nil, the function `move-file-to-trash' will be used by #ifdef HAVE_SYNC defsubr (&Sunix_sync); #endif +#ifdef HAVE_LIBMAGIC + defsubr (&Slibmagic_file_internal); +#endif } /* arch-tag: 64ba3fd7-f844-4fb2-ba4b-427eb928786c [-- Attachment #3: Type: text/plain, Size: 19 bytes --] -- Joakim Verona ^ permalink raw reply related [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs?
  2009-08-21 9:55 ` joakim
@ 2009-08-21 11:01 ` Eli Zaretskii
  2009-08-21 17:38 ` joakim
  2009-08-21 19:18 ` Stefan Monnier
  2009-08-21 13:19 ` Andreas Schwab
  1 sibling, 2 replies; 119+ messages in thread
From: Eli Zaretskii @ 2009-08-21 11:01 UTC (permalink / raw)
To: joakim; +Cc: schwab, monnier, emacs-devel

> From: joakim@verona.se
> Date: Fri, 21 Aug 2009 11:55:45 +0200
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,
>     Emacs Development <emacs-devel@gnu.org>
>
> Is autoheader supposed to generate config.in?

Yes.

> When does that happen?

Either run autoheader by hand, or run autoreconf (which AFAIK is
supposed to run autoheader as well).

> +DEFUN ("libmagic-file-internal", Flibmagic_file_internal, Slibmagic_file_internal, 1,1,0,
> +       doc: /* Return (MIME_TYPE MIME_ENCODING DESCRIPTION) for FILENAME_OR_BUFFER.
> +Return nil on error. */)

This doc string "needs work"(TM). Please use the doc string of
visited-file-name as an example.

> +     (filename_or_buffer)
> +     Lisp_Object filename_or_buffer;

Using a `_' in an argument is un-Lisp'y (IMO).

> +{
> +  CHECK_STRING_OR_BUFFER (filename_or_buffer);
> +  magic_t cookie=NULL;
> +  char* f = NULL;
> +  const char* rvs;
> +
> +  if (STRINGP (filename_or_buffer))
> +    f = SDATA (filename_or_buffer);
> +  if (BUFFERP (filename_or_buffer))
> +    f = SDATA (XBUFFER (filename_or_buffer)->filename);
> +  cookie = magic_open (MAGIC_ERROR);
> +  if (cookie == NULL) goto libmagic_error;
> +  magic_load (cookie, NULL); //load default database
> +
> +  magic_setflags (cookie, MAGIC_MIME_TYPE | MAGIC_ERROR);
> +  rvs = magic_file (cookie, f);

You need to encode file names before you pass them to C APIs. Use
ENCODE_FILE to do that; see file-attributes for an example of how this
is done.

> +  if (rvs == NULL) goto libmagic_error;
> +  Lisp_Object file_mime = intern (rvs);

You cannot declare variables in the middle of a block: Emacs does not
require a C99 compiler yet and needs to support C90 or even older
compilers, which will reject this code.

> +  magic_setflags (cookie, MAGIC_MIME_ENCODING | MAGIC_ERROR);
> +  rvs=magic_file (cookie, f);
> +  if (rvs == NULL) goto libmagic_error;
> +  Lisp_Object file_encoding = intern(rvs);

Is file_encoding supposed to be a valid encoding, one of those for
which Emacs has a coding-system? If so, perhaps you should make sure
you indeed return a valid coding-system or its alias, or otherwise
say in the doc string that it's not guaranteed to be valid (so that
the caller should validate it before using).
* Re: using libmagic in Emacs?
  2009-08-21 11:01 ` Eli Zaretskii
@ 2009-08-21 17:38 ` joakim
  2009-08-21 17:46 ` Rupert Swarbrick
  ` (2 more replies)
  2009-08-21 19:18 ` Stefan Monnier
  1 sibling, 3 replies; 119+ messages in thread
From: joakim @ 2009-08-21 17:38 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: schwab, monnier, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1651 bytes --]

New filemagic patch, mostly fixing Eli's concerns.

Eli Zaretskii <eliz@gnu.org> writes:

>> +DEFUN ("libmagic-file-internal", Flibmagic_file_internal, Slibmagic_file_internal, 1,1,0,
>> +       doc: /* Return (MIME_TYPE MIME_ENCODING DESCRIPTION) for FILENAME_OR_BUFFER.
>> +Return nil on error. */)

Renamed entry point to libmagic-file-internal since it's meant to be
for internal use by a Lisp wrapper, yet to be written. Should that be
a new file, BTW?

> This doc string "needs work"(TM). Please use the doc string of
> visited-file-name as an example.

I worked on this.

>
>> +     (filename_or_buffer)
>> +     Lisp_Object filename_or_buffer;
>
> Using a `_' in an argument is un-Lisp'y (IMO).

Ok.

> You need to encode file names before you pass them to C APIs. Use
> ENCODE_FILE to do that; see file-attributes for an example of how this
> is done.

Ok.

>> +  if (rvs == NULL) goto libmagic_error;
>> +  Lisp_Object file_mime = intern (rvs);
>
> You cannot declare variables in the middle of a block: Emacs does not
> require a C99 compiler yet and need to support C90 or even older
> compilers, which will reject this code.

I'm having trouble remembering not to use C99. Is there some convenient
compiler flag to force lower versions? Fixed the errors I saw.

> Is file_encoding supposed to be a valid encoding, one of those for
> which Emacs has a coding-system? If so, perhaps you should make sure
> you indeed return a valid coding-system or its alias, or otherwise
> tell in the doc string that it's not guaranteed to be valid (so that
> the caller should validate it before using).
I described a bit more in the doc string. Ok? [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: filemagic3.patch --] [-- Type: text/x-patch, Size: 5892 bytes --] diff --git a/configure.in b/configure.in index f4096db..49a3f15 100644 --- a/configure.in +++ b/configure.in @@ -137,6 +137,8 @@ OPTION_DEFAULT_ON([xft],[don't use XFT for anti aliased fonts]) OPTION_DEFAULT_ON([libotf],[don't use libotf for OpenType font support]) OPTION_DEFAULT_ON([m17n-flt],[don't use m17n-flt for text shaping]) +OPTION_DEFAULT_ON([libmagic],[don't compile with libmagic support]) + OPTION_DEFAULT_ON([toolkit-scroll-bars],[don't use Motif or Xaw3d scroll bars]) OPTION_DEFAULT_ON([xaw3d],[don't use Xaw3d]) OPTION_DEFAULT_ON([xim],[don't use X11 XIM]) @@ -2223,6 +2225,19 @@ if test x"$ac_cv_func_alloca_works" != xyes; then AC_MSG_ERROR( [a system implementation of alloca is required] ) fi + +HAVE_LIBMAGIC=no +if test "${with_libmagic}" != "no"; then + #libmagic support + AC_CHECK_HEADERS(magic.h, [ AC_CHECK_LIB(magic,magic_open,HAVE_LIBMAGIC=yes) ]) +fi + +if test "${HAVE_LIBMAGIC}" = "yes"; then + LIBMAGIC=-lmagic + AC_SUBST(LIBMAGIC) + AC_DEFINE(HAVE_LIBMAGIC, 1, [Define to 1 if using libmagic.]) +fi + # fmod, logb, and frexp are found in -lm on most systems. # On HPUX 9.01, -lm does not contain logb, so check for sqrt. AC_CHECK_LIB(m, sqrt) @@ -2954,6 +2969,7 @@ echo " Does Emacs use -lpng? ${HAVE_PNG}" echo " Does Emacs use -lrsvg-2? ${HAVE_RSVG}" echo " Does Emacs use -lgpm? ${HAVE_GPM}" echo " Does Emacs use -ldbus? ${HAVE_DBUS}" +echo " Does Emacs use -lmagic? ${HAVE_LIBMAGIC}" echo " Does Emacs use -lfreetype? ${HAVE_FREETYPE}" echo " Does Emacs use -lm17n-flt? 
${HAVE_M17N_FLT}" diff --git a/src/Makefile.in b/src/Makefile.in index 425cf98..33d1a14 100644 --- a/src/Makefile.in +++ b/src/Makefile.in @@ -420,6 +420,7 @@ LIBX= $(LIBXMENU) LD_SWITCH_X_SITE #endif /* not HAVE_LIBRESOLV */ LIBSOUND= @LIBSOUND@ +LIBMAGIC= @LIBMAGIC@ CFLAGS_SOUND= @CFLAGS_SOUND@ RSVG_LIBS= @RSVG_LIBS@ @@ -878,7 +879,7 @@ SOME_MACHINE_LISP = ../lisp/mouse.elc \ duplicated symbols. If the standard libraries were compiled with GCC, we might need gnulib again after them. */ -LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(RSVG_LIBS) $(DBUS_LIBS) \ +LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(LIBMAGIC) $(RSVG_LIBS) $(DBUS_LIBS) \ LIBGPM LIBRESOLV LIBS_SYSTEM LIBS_MACHINE LIBS_TERMCAP \ LIBS_DEBUG $(GETLOADAVG_LIBS) \ @FREETYPE_LIBS@ @FONTCONFIG_LIBS@ @LIBOTF_LIBS@ @M17N_FLT_LIBS@ \ diff --git a/src/fileio.c b/src/fileio.c index 3702d4c..cbb0461 100644 --- a/src/fileio.c +++ b/src/fileio.c @@ -205,6 +205,10 @@ Lisp_Object Vdirectory_sep_char; int write_region_inhibit_fsync; #endif +#ifdef HAVE_LIBMAGIC +#include <magic.h> +#endif + /* Non-zero means call move-file-to-trash in Fdelete_file or Fdelete_directory. */ int delete_by_moving_to_trash; @@ -2997,6 +3001,77 @@ DEFUN ("unix-sync", Funix_sync, Sunix_sync, 0, 0, "", #endif /* HAVE_SYNC */ +#ifdef HAVE_LIBMAGIC +DEFUN ("libmagic-file-internal", Flibmagic_file_internal, Slibmagic_file_internal, 1,1,0, + doc: /* Return (MIME-TYPE MIME-ENCODING DESCRIPTION) for +FILENAME-OR-BUFFER using libmagic. If FILENAME-OR-BUFFER is a file, +return information about the file. If FILENAME-OR-BUFFER is a buffer, +return information about the file of the buffer. MIME-TYPE and +MIME-ENCODING are the mime type and mime encoding as determined by +libmagic. DESCRIPTION is the human readable descripton offered by +libmagic for the file. + +The default libmagic database is used, and the quality of information +given depends on your version of that database. Often the mime type is +less exact than the description. 
+ + */) + (filename_or_buffer) + Lisp_Object filename_or_buffer; +{ + CHECK_STRING_OR_BUFFER (filename_or_buffer); + magic_t cookie=NULL; + char* f = NULL; + const char* rvs; + Lisp_Object file_freetext; + Lisp_Object rv; + Lisp_Object file_mime; + Lisp_Object file_encoding; + + Lisp_Object filename, absname, encoded_absname; + struct gcpro gcpro1; + + GCPRO1 (f); + + if (STRINGP (filename_or_buffer)) + filename = filename_or_buffer; + if (BUFFERP (filename_or_buffer)) + filename = XBUFFER (filename_or_buffer)->filename; + absname = Fexpand_file_name (filename, current_buffer->directory); + f = SDATA(ENCODE_FILE (absname)); + + cookie = magic_open (MAGIC_ERROR); + if (cookie == NULL) goto libmagic_error; + magic_load (cookie, NULL); /* load default database */ + + magic_setflags (cookie, MAGIC_MIME_TYPE | MAGIC_ERROR); + rvs = magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + file_mime = intern (rvs); + + magic_setflags (cookie, MAGIC_MIME_ENCODING | MAGIC_ERROR); + rvs=magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + file_encoding = intern(rvs); + + magic_setflags (cookie, MAGIC_NONE | MAGIC_ERROR); + rvs=magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + + file_freetext = build_string (rvs); + rv = Fcons (file_mime, Fcons (file_encoding, Fcons (file_freetext, Qnil))); + + magic_close (cookie); + UNGCPRO; + return rv; + libmagic_error: + if (cookie != NULL) magic_close (cookie); + report_file_error("Libmagic error",Qnil); + UNGCPRO; + return Qnil; +} +#endif + DEFUN ("file-newer-than-file-p", Ffile_newer_than_file_p, Sfile_newer_than_file_p, 2, 2, 0, doc: /* Return t if file FILE1 is newer than file FILE2. 
If FILE1 does not exist, the answer is nil; @@ -5781,6 +5856,9 @@ When non-nil, the function `move-file-to-trash' will be used by #ifdef HAVE_SYNC defsubr (&Sunix_sync); #endif +#ifdef HAVE_LIBMAGIC + defsubr (&Slibmagic_file_internal); +#endif } /* arch-tag: 64ba3fd7-f844-4fb2-ba4b-427eb928786c [-- Attachment #3: Type: text/plain, Size: 19 bytes --] -- Joakim Verona ^ permalink raw reply related [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs?
  2009-08-21 17:38 ` joakim
@ 2009-08-21 17:46 ` Rupert Swarbrick
  2009-08-21 18:31 ` Andreas Schwab
  2009-08-21 18:42 ` Eli Zaretskii
  2 siblings, 0 replies; 119+ messages in thread
From: Rupert Swarbrick @ 2009-08-21 17:46 UTC (permalink / raw)
To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 393 bytes --]

joakim@verona.se writes:

> I'm having trouble remembering not to use C99. Is there some convenient
> compiler flag to force lower versions? Fixed the errors I saw.
>

Maybe -ansi combined with -pedantic, for gcc.

rupert@hake:~/tmp gcc -ansi -pedantic -o test test.c
test.c: In function ‘main’:
test.c:10: warning: ISO C90 forbids mixed declarations and code

Rupert
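As an illustration of the rule behind that gcc warning, here is a minimal sketch of a function written in C90 style. The helper `count_slashes` is hypothetical and does not come from the patch; it only shows the layout: every declaration sits at the top of the block, and executable statements follow.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: count the '/' characters
   in PATH.  Laid out so that "gcc -ansi -pedantic" has nothing to
   complain about.  */
static size_t
count_slashes (const char *path)
{
  /* C90: all declarations in a block precede the first statement.  */
  size_t n = 0;
  const char *p;

  assert (path != NULL);        /* first executable statement */

  for (p = path; *p != '\0'; p++)
    if (*p == '/')
      n++;

  return n;
}
```

Moving a declaration such as `const char *p;` below the `assert` line should be enough to reproduce the "mixed declarations and code" warning quoted above.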
* Re: using libmagic in Emacs?
  2009-08-21 17:38 ` joakim
  2009-08-21 17:46 ` Rupert Swarbrick
@ 2009-08-21 18:31 ` Andreas Schwab
  2009-08-21 19:13 ` Drew Adams
  2009-08-21 18:42 ` Eli Zaretskii
  2 siblings, 1 reply; 119+ messages in thread
From: Andreas Schwab @ 2009-08-21 18:31 UTC (permalink / raw)
To: joakim; +Cc: Eli Zaretskii, monnier, emacs-devel

joakim@verona.se writes:

> +#ifdef HAVE_LIBMAGIC
> +DEFUN ("libmagic-file-internal", Flibmagic_file_internal, Slibmagic_file_internal, 1,1,0,
> +       doc: /* Return (MIME-TYPE MIME-ENCODING DESCRIPTION) for

The first doc line needs to be a complete sentence fitting in about 75
columns. You should only mention the argument here, and explain the
structure of the return value in the second sentence.

> +  char* f = NULL;
> +  const char* rvs;
> +  Lisp_Object file_freetext;
> +  Lisp_Object rv;
> +  Lisp_Object file_mime;
> +  Lisp_Object file_encoding;
> +
> +  Lisp_Object filename, absname, encoded_absname;
> +  struct gcpro gcpro1;
> +
> +  GCPRO1 (f);

You cannot GCPRO pointers, only Lisp_Objects.

> +  if (STRINGP (filename_or_buffer))
> +    filename = filename_or_buffer;
> +  if (BUFFERP (filename_or_buffer))
> +    filename = XBUFFER (filename_or_buffer)->filename;
> +  absname = Fexpand_file_name (filename, current_buffer->directory);
> +  f = SDATA(ENCODE_FILE (absname));

Since ENCODE_FILE can GC you need to protect every Lisp_Object variable
used around the call, especially all that are Lisp_Strings.

> + libmagic_error:
> +  if (cookie != NULL) magic_close (cookie);
> +  report_file_error("Libmagic error",Qnil);

You need to make sure that errno is preserved from the failed operation
and that you get a meaningful errno in the first place.

Andreas.

--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
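The last point, preserving errno across cleanup, is a general C idiom: save errno immediately after the failing call, because anything that runs afterwards may overwrite it. The sketch below uses plain stdio instead of libmagic so that it stands alone; `open_or_errno` and `noisy_cleanup` are hypothetical names, and the cleanup function only simulates a call that clobbers errno. libmagic also offers `magic_errno` and `magic_error` for querying the last error recorded on a cookie, which may be a better source than the global errno in the actual patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for cleanup work (closing handles, freeing
   memory) that may overwrite errno as a side effect.  */
static void
noisy_cleanup (void)
{
  errno = 0;
}

/* Hypothetical helper: try to open PATH for reading.  Return 0 on
   success, or the errno value of the failed fopen.  On failure the
   same value is also left in errno for the caller.  */
static int
open_or_errno (const char *path)
{
  FILE *fp;
  int saved_errno;

  assert (path != NULL);

  fp = fopen (path, "r");
  if (fp == NULL)
    {
      saved_errno = errno;      /* capture before cleanup can clobber it */
      noisy_cleanup ();
      errno = saved_errno;      /* restore for whoever reports the error */
      return saved_errno;
    }

  fclose (fp);
  return 0;
}
```

In the patch, the analogous order would be: record the error state, close the cookie, restore the error state, and only then report.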
* RE: using libmagic in Emacs?
  2009-08-21 18:31 ` Andreas Schwab
@ 2009-08-21 19:13 ` Drew Adams
  0 siblings, 0 replies; 119+ messages in thread
From: Drew Adams @ 2009-08-21 19:13 UTC (permalink / raw)
To: 'Andreas Schwab', joakim
Cc: 'Eli Zaretskii', monnier, emacs-devel

> The first doc line needs to be a complete sentence
> fitting in about 75 columns.

No, <= 67 chars, not 75 chars. This is especially important for
resizing of windows or frames to fit their widest line.

From (elisp) Documentation Tips:

  Format the documentation string so that it fits in an Emacs window
  on an 80-column screen. It is a good idea for most lines to be no
  wider than 60 characters. The first line should not be wider than
  67 characters or it will look bad in the output of `apropos'.

See also bugs #4162 and #3227.
* Re: using libmagic in Emacs?
  2009-08-21 17:38 ` joakim
  2009-08-21 17:46 ` Rupert Swarbrick
  2009-08-21 18:31 ` Andreas Schwab
@ 2009-08-21 18:42 ` Eli Zaretskii
  2009-08-21 21:48 ` joakim
  2 siblings, 1 reply; 119+ messages in thread
From: Eli Zaretskii @ 2009-08-21 18:42 UTC (permalink / raw)
To: joakim; +Cc: schwab, monnier, emacs-devel

> From: joakim@verona.se
> Cc: schwab@linux-m68k.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> Date: Fri, 21 Aug 2009 19:38:15 +0200
>
> Renamed entry point to libmagic-file-internal since it's meant to be
> for internal use by a Lisp wrapper, yet to be written. Should that be
> a new file, BTW?

files.el sounds good enough to me.

> I'm having trouble remembering not to use C99. Is there some convenient
> compiler flag to force lower versions?

I think -std=c89 should do what you want (assuming you use GCC).

> I described a bit more in the doc string. Ok?

I suggest the following variation of it:

   doc: /* Return a list describing the argument FILE-OR-BUFFER.

If FILE-OR-BUFFER is a file name, return information about that file.
If FILE-OR-BUFFER is a buffer, return information about the buffer's file.

The return value is a list of the form

  (MIME-TYPE MIME-ENCODING DESCRIPTION)

MIME-TYPE and MIME-ENCODING are the MIME type and encoding suitable
for the file's contents, as determined by libmagic.
DESCRIPTION is the human readable description of the file type offered by
libmagic.

The function throws a file-error if libmagic cannot determine one of
the elements of the above list.

The default libmagic database is used, and the quality of information
given depends on your version of that database. Often the MIME type is
less exact than the description. */)

Two more comments:

  . I am not sure you need to push the file-or-buffer dichotomy to the
    C level. It is easy enough to do that in Lisp, or even in the
    application, for that matter. Why complicate a primitive to do
    such a simple job?

  . You didn't say in the doc string whether MIME-ENCODING is
    guaranteed to be a valid Emacs coding-system. I think a user will
    be desperate to know that.

Thanks for working on this.
* Re: using libmagic in Emacs?
  2009-08-21 18:42 ` Eli Zaretskii
@ 2009-08-21 21:48 ` joakim
  2009-08-21 22:46 ` Andreas Schwab
  0 siblings, 1 reply; 119+ messages in thread
From: joakim @ 2009-08-21 21:48 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: schwab, monnier, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 892 bytes --]

> I suggest the following variation of it:
>
>    doc: /* Return a list describing the argument FILE-OR-BUFFER.
>
> If FILE-OR-BUFFER is a file name, return information about that file.
...
> given depends on your version of that database. Often the MIME type is
> less exact than the description. */)

I used this description.

> Two more comments:
>
>   . I am not sure you need to push the file-or-buffer dichotomy to the
>     C level. It is easy enough to do that in Lisp, or even in the
>     application, for that matter. Why complicate a primitive to do
>     such a simple job?

Ok, removed it.

>   . You didn't say in the doc string whether MIME-ENCODING is
>     guaranteed to be a valid Emacs coding-system. I think a user will
>     be desperate to know that.

Did as Stefan suggested: return a string.

Also improved the GCPRO stuff, as Andreas suggested.
-- Joakim Verona [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: filemagic4.patch --] [-- Type: text/x-patch, Size: 5853 bytes --] diff --git a/configure.in b/configure.in index f4096db..49a3f15 100644 --- a/configure.in +++ b/configure.in @@ -137,6 +137,8 @@ OPTION_DEFAULT_ON([xft],[don't use XFT for anti aliased fonts]) OPTION_DEFAULT_ON([libotf],[don't use libotf for OpenType font support]) OPTION_DEFAULT_ON([m17n-flt],[don't use m17n-flt for text shaping]) +OPTION_DEFAULT_ON([libmagic],[don't compile with libmagic support]) + OPTION_DEFAULT_ON([toolkit-scroll-bars],[don't use Motif or Xaw3d scroll bars]) OPTION_DEFAULT_ON([xaw3d],[don't use Xaw3d]) OPTION_DEFAULT_ON([xim],[don't use X11 XIM]) @@ -2223,6 +2225,19 @@ if test x"$ac_cv_func_alloca_works" != xyes; then AC_MSG_ERROR( [a system implementation of alloca is required] ) fi + +HAVE_LIBMAGIC=no +if test "${with_libmagic}" != "no"; then + #libmagic support + AC_CHECK_HEADERS(magic.h, [ AC_CHECK_LIB(magic,magic_open,HAVE_LIBMAGIC=yes) ]) +fi + +if test "${HAVE_LIBMAGIC}" = "yes"; then + LIBMAGIC=-lmagic + AC_SUBST(LIBMAGIC) + AC_DEFINE(HAVE_LIBMAGIC, 1, [Define to 1 if using libmagic.]) +fi + # fmod, logb, and frexp are found in -lm on most systems. # On HPUX 9.01, -lm does not contain logb, so check for sqrt. AC_CHECK_LIB(m, sqrt) @@ -2954,6 +2969,7 @@ echo " Does Emacs use -lpng? ${HAVE_PNG}" echo " Does Emacs use -lrsvg-2? ${HAVE_RSVG}" echo " Does Emacs use -lgpm? ${HAVE_GPM}" echo " Does Emacs use -ldbus? ${HAVE_DBUS}" +echo " Does Emacs use -lmagic? ${HAVE_LIBMAGIC}" echo " Does Emacs use -lfreetype? ${HAVE_FREETYPE}" echo " Does Emacs use -lm17n-flt? 
${HAVE_M17N_FLT}" diff --git a/src/Makefile.in b/src/Makefile.in index 425cf98..33d1a14 100644 --- a/src/Makefile.in +++ b/src/Makefile.in @@ -420,6 +420,7 @@ LIBX= $(LIBXMENU) LD_SWITCH_X_SITE #endif /* not HAVE_LIBRESOLV */ LIBSOUND= @LIBSOUND@ +LIBMAGIC= @LIBMAGIC@ CFLAGS_SOUND= @CFLAGS_SOUND@ RSVG_LIBS= @RSVG_LIBS@ @@ -878,7 +879,7 @@ SOME_MACHINE_LISP = ../lisp/mouse.elc \ duplicated symbols. If the standard libraries were compiled with GCC, we might need gnulib again after them. */ -LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(RSVG_LIBS) $(DBUS_LIBS) \ +LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(LIBMAGIC) $(RSVG_LIBS) $(DBUS_LIBS) \ LIBGPM LIBRESOLV LIBS_SYSTEM LIBS_MACHINE LIBS_TERMCAP \ LIBS_DEBUG $(GETLOADAVG_LIBS) \ @FREETYPE_LIBS@ @FONTCONFIG_LIBS@ @LIBOTF_LIBS@ @M17N_FLT_LIBS@ \ diff --git a/src/fileio.c b/src/fileio.c index 3702d4c..48a1ccd 100644 --- a/src/fileio.c +++ b/src/fileio.c @@ -205,6 +205,10 @@ Lisp_Object Vdirectory_sep_char; int write_region_inhibit_fsync; #endif +#ifdef HAVE_LIBMAGIC +#include <magic.h> +#endif + /* Non-zero means call move-file-to-trash in Fdelete_file or Fdelete_directory. */ int delete_by_moving_to_trash; @@ -2997,6 +3001,78 @@ DEFUN ("unix-sync", Funix_sync, Sunix_sync, 0, 0, "", #endif /* HAVE_SYNC */ +#ifdef HAVE_LIBMAGIC +DEFUN ("libmagic-file-internal", Flibmagic_file_internal, Slibmagic_file_internal, 1,1,0, + doc: /* Return a list describing the argument FILENAME. + + + The return value is a list of the form + + (MIME-TYPE MIME-ENCODING DESCRIPTION) + + MIME-TYPE and MIME-ENCODING are the MIME type and encoding suitable + for the file's contents, as determined by libmagic. + DESCRIPTION is the human readable descripton of the file type offered by + libmagic. + + The function throws a file-error if libmagic cannot determine one of + the elements of the above list. + + The default libmagic database is used, and the quality of information + given depends on your version of that database. 
Often the MIME type is + less exact than the description. */) + (filename) + Lisp_Object filename; +{ + CHECK_STRING (filename); + magic_t cookie=NULL; + char* f = NULL; + const char* rvs; + Lisp_Object file_description; + Lisp_Object file_mime; + Lisp_Object file_encoding; + Lisp_Object rv; + + Lisp_Object absname, encoded_absname; + struct gcpro gcpro1, gcpro2, gcpro3, gcpro4, gcpro5, gcpro6; + + GCPRO6 (file_description, file_mime, file_encoding, rv, absname, encoded_absname); + + absname = Fexpand_file_name (filename, current_buffer->directory); + f = SDATA(ENCODE_FILE (absname)); + + cookie = magic_open (MAGIC_ERROR); + if (cookie == NULL) goto libmagic_error; + magic_load (cookie, NULL); /* load default database */ + + magic_setflags (cookie, MAGIC_MIME_TYPE | MAGIC_ERROR); + rvs = magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + file_mime = intern (rvs); + + magic_setflags (cookie, MAGIC_MIME_ENCODING | MAGIC_ERROR); + rvs=magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + file_encoding = build_string(rvs); + + magic_setflags (cookie, MAGIC_NONE | MAGIC_ERROR); + rvs=magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + + file_description = build_string (rvs); + rv = Fcons (file_mime, Fcons (file_encoding, Fcons (file_description, Qnil))); + + magic_close (cookie); + UNGCPRO; + return rv; + libmagic_error: + report_file_error("Libmagic error",Qnil); + if (cookie != NULL) magic_close (cookie); + UNGCPRO; + return Qnil; +} +#endif + DEFUN ("file-newer-than-file-p", Ffile_newer_than_file_p, Sfile_newer_than_file_p, 2, 2, 0, doc: /* Return t if file FILE1 is newer than file FILE2. 
If FILE1 does not exist, the answer is nil; @@ -5781,6 +5857,9 @@ When non-nil, the function `move-file-to-trash' will be used by #ifdef HAVE_SYNC defsubr (&Sunix_sync); #endif +#ifdef HAVE_LIBMAGIC + defsubr (&Slibmagic_file_internal); +#endif } /* arch-tag: 64ba3fd7-f844-4fb2-ba4b-427eb928786c ^ permalink raw reply related [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs?
  2009-08-21 21:48 ` joakim
@ 2009-08-21 22:46 ` Andreas Schwab
  2009-08-22 20:18 ` joakim
  0 siblings, 1 reply; 119+ messages in thread
From: Andreas Schwab @ 2009-08-21 22:46 UTC (permalink / raw)
To: joakim; +Cc: Eli Zaretskii, monnier, emacs-devel

joakim@verona.se writes:

> +  GCPRO6 (file_description, file_mime, file_encoding, rv, absname, encoded_absname);

That's too much. You only need to protect variables used around calls
that can GC. Arguments to lisp functions are implicitly protected. For
example, there are no function calls during the lifetime of absname.
And encoded_absname is completely unused.

> + libmagic_error:
> +  report_file_error("Libmagic error",Qnil);
> +  if (cookie != NULL) magic_close (cookie);

report_file_error throws, so you leak a resource.

Andreas.

--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
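The leak described here comes from the order of operations: once an error function that throws has been called, nothing after it runs. A minimal standalone sketch of the safe order follows. It simulates the throw with setjmp/longjmp, and every name in it (`acquire_handle`, `release_handle`, `signal_error`, `probe`, `run_probe`) is a hypothetical stand-in rather than an Emacs or libmagic function. Emacs's own C code has unwind-protect machinery for this situation, which the sketch does not attempt to reproduce.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf error_target;    /* where signal_error jumps to */
static int open_handles = 0;    /* handles acquired but not yet released */

/* Hypothetical stand-ins for magic_open and magic_close.  */
static void *
acquire_handle (void)
{
  void *h = malloc (1);
  if (h != NULL)
    open_handles++;
  return h;
}

static void
release_handle (void *h)
{
  assert (h != NULL);
  free (h);
  open_handles--;
}

/* Stand-in for an error function that throws: it never returns.  */
static void
signal_error (void)
{
  longjmp (error_target, 1);
}

/* Acquire a handle, optionally fail, and never leak the handle.  */
static int
probe (int should_fail)
{
  void *handle = acquire_handle ();

  if (handle == NULL)
    return -1;

  if (should_fail)
    {
      release_handle (handle);  /* clean up first ...  */
      signal_error ();          /* ... then throw; nothing after this runs */
    }

  release_handle (handle);
  return 0;
}

/* Run probe; return 1 if it signalled an error, else its result.  */
static int
run_probe (int should_fail)
{
  if (setjmp (error_target) != 0)
    return 1;
  return probe (should_fail);
}
```

Swapping the two lines inside the `should_fail` branch reproduces the bug pointed out above: the handle count stays at one after the error path.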
* Re: using libmagic in Emacs?
  2009-08-21 22:46 ` Andreas Schwab
@ 2009-08-22 20:18 ` joakim
  2009-08-22 23:13 ` Ken Raeburn
  2009-08-23 3:24 ` Eli Zaretskii
  0 siblings, 2 replies; 119+ messages in thread
From: joakim @ 2009-08-22 20:18 UTC (permalink / raw)
To: Andreas Schwab; +Cc: Eli Zaretskii, monnier, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 835 bytes --]

Andreas Schwab <schwab@linux-m68k.org> writes:

> joakim@verona.se writes:
>
>> +  GCPRO6 (file_description, file_mime, file_encoding, rv, absname, encoded_absname);
>
> That's too much. You only need to protect variables used around calls
> that can GC. Arguments to lisp functions are implicitly protected. For
> example, there are no function calls during the lifetime of absname.
> And encoded_absname is completely unused.

It seems to me I only need to protect f, which I would do by GCPRO'ing
absname. Since this is apparently wrong, I will leave it as it is,
since it doesn't hurt to GCPRO too much. (?)

>> + libmagic_error:
>> +  report_file_error("Libmagic error",Qnil);
>> +  if (cookie != NULL) magic_close (cookie);
>
> report_file_error throws, so you leak a resource.

Fixed, I think.

> Andreas.
-- Joakim Verona [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: filemagic5.patch --] [-- Type: text/x-patch, Size: 5853 bytes --] diff --git a/configure.in b/configure.in index f4096db..49a3f15 100644 --- a/configure.in +++ b/configure.in @@ -137,6 +137,8 @@ OPTION_DEFAULT_ON([xft],[don't use XFT for anti aliased fonts]) OPTION_DEFAULT_ON([libotf],[don't use libotf for OpenType font support]) OPTION_DEFAULT_ON([m17n-flt],[don't use m17n-flt for text shaping]) +OPTION_DEFAULT_ON([libmagic],[don't compile with libmagic support]) + OPTION_DEFAULT_ON([toolkit-scroll-bars],[don't use Motif or Xaw3d scroll bars]) OPTION_DEFAULT_ON([xaw3d],[don't use Xaw3d]) OPTION_DEFAULT_ON([xim],[don't use X11 XIM]) @@ -2223,6 +2225,19 @@ if test x"$ac_cv_func_alloca_works" != xyes; then AC_MSG_ERROR( [a system implementation of alloca is required] ) fi + +HAVE_LIBMAGIC=no +if test "${with_libmagic}" != "no"; then + #libmagic support + AC_CHECK_HEADERS(magic.h, [ AC_CHECK_LIB(magic,magic_open,HAVE_LIBMAGIC=yes) ]) +fi + +if test "${HAVE_LIBMAGIC}" = "yes"; then + LIBMAGIC=-lmagic + AC_SUBST(LIBMAGIC) + AC_DEFINE(HAVE_LIBMAGIC, 1, [Define to 1 if using libmagic.]) +fi + # fmod, logb, and frexp are found in -lm on most systems. # On HPUX 9.01, -lm does not contain logb, so check for sqrt. AC_CHECK_LIB(m, sqrt) @@ -2954,6 +2969,7 @@ echo " Does Emacs use -lpng? ${HAVE_PNG}" echo " Does Emacs use -lrsvg-2? ${HAVE_RSVG}" echo " Does Emacs use -lgpm? ${HAVE_GPM}" echo " Does Emacs use -ldbus? ${HAVE_DBUS}" +echo " Does Emacs use -lmagic? ${HAVE_LIBMAGIC}" echo " Does Emacs use -lfreetype? ${HAVE_FREETYPE}" echo " Does Emacs use -lm17n-flt? 
${HAVE_M17N_FLT}" diff --git a/src/Makefile.in b/src/Makefile.in index 425cf98..33d1a14 100644 --- a/src/Makefile.in +++ b/src/Makefile.in @@ -420,6 +420,7 @@ LIBX= $(LIBXMENU) LD_SWITCH_X_SITE #endif /* not HAVE_LIBRESOLV */ LIBSOUND= @LIBSOUND@ +LIBMAGIC= @LIBMAGIC@ CFLAGS_SOUND= @CFLAGS_SOUND@ RSVG_LIBS= @RSVG_LIBS@ @@ -878,7 +879,7 @@ SOME_MACHINE_LISP = ../lisp/mouse.elc \ duplicated symbols. If the standard libraries were compiled with GCC, we might need gnulib again after them. */ -LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(RSVG_LIBS) $(DBUS_LIBS) \ +LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(LIBMAGIC) $(RSVG_LIBS) $(DBUS_LIBS) \ LIBGPM LIBRESOLV LIBS_SYSTEM LIBS_MACHINE LIBS_TERMCAP \ LIBS_DEBUG $(GETLOADAVG_LIBS) \ @FREETYPE_LIBS@ @FONTCONFIG_LIBS@ @LIBOTF_LIBS@ @M17N_FLT_LIBS@ \ diff --git a/src/fileio.c b/src/fileio.c index 3702d4c..48a1ccd 100644 --- a/src/fileio.c +++ b/src/fileio.c @@ -205,6 +205,10 @@ Lisp_Object Vdirectory_sep_char; int write_region_inhibit_fsync; #endif +#ifdef HAVE_LIBMAGIC +#include <magic.h> +#endif + /* Non-zero means call move-file-to-trash in Fdelete_file or Fdelete_directory. */ int delete_by_moving_to_trash; @@ -2997,6 +3001,78 @@ DEFUN ("unix-sync", Funix_sync, Sunix_sync, 0, 0, "", #endif /* HAVE_SYNC */ +#ifdef HAVE_LIBMAGIC +DEFUN ("libmagic-file-internal", Flibmagic_file_internal, Slibmagic_file_internal, 1,1,0, + doc: /* Return a list describing the argument FILENAME. + + + The return value is a list of the form + + (MIME-TYPE MIME-ENCODING DESCRIPTION) + + MIME-TYPE and MIME-ENCODING are the MIME type and encoding suitable + for the file's contents, as determined by libmagic. + DESCRIPTION is the human readable descripton of the file type offered by + libmagic. + + The function throws a file-error if libmagic cannot determine one of + the elements of the above list. + + The default libmagic database is used, and the quality of information + given depends on your version of that database. 
Often the MIME type is + less exact than the description. */) + (filename) + Lisp_Object filename; +{ + CHECK_STRING (filename); + magic_t cookie=NULL; + char* f = NULL; + const char* rvs; + Lisp_Object file_description; + Lisp_Object file_mime; + Lisp_Object file_encoding; + Lisp_Object rv; + + Lisp_Object absname, encoded_absname; + struct gcpro gcpro1, gcpro2, gcpro3, gcpro4, gcpro5, gcpro6; + + GCPRO6 (file_description, file_mime, file_encoding, rv, absname, encoded_absname); + + absname = Fexpand_file_name (filename, current_buffer->directory); + f = SDATA(ENCODE_FILE (absname)); + + cookie = magic_open (MAGIC_ERROR); + if (cookie == NULL) goto libmagic_error; + magic_load (cookie, NULL); /* load default database */ + + magic_setflags (cookie, MAGIC_MIME_TYPE | MAGIC_ERROR); + rvs = magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + file_mime = intern (rvs); + + magic_setflags (cookie, MAGIC_MIME_ENCODING | MAGIC_ERROR); + rvs=magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + file_encoding = build_string(rvs); + + magic_setflags (cookie, MAGIC_NONE | MAGIC_ERROR); + rvs=magic_file (cookie, f); + if (rvs == NULL) goto libmagic_error; + + file_description = build_string (rvs); + rv = Fcons (file_mime, Fcons (file_encoding, Fcons (file_description, Qnil))); + + magic_close (cookie); + UNGCPRO; + return rv; + libmagic_error: + report_file_error("Libmagic error",Qnil); + if (cookie != NULL) magic_close (cookie); + UNGCPRO; + return Qnil; +} +#endif + DEFUN ("file-newer-than-file-p", Ffile_newer_than_file_p, Sfile_newer_than_file_p, 2, 2, 0, doc: /* Return t if file FILE1 is newer than file FILE2. 
If FILE1 does not exist, the answer is nil; @@ -5781,6 +5857,9 @@ When non-nil, the function `move-file-to-trash' will be used by #ifdef HAVE_SYNC defsubr (&Sunix_sync); #endif +#ifdef HAVE_LIBMAGIC + defsubr (&Slibmagic_file_internal); +#endif } /* arch-tag: 64ba3fd7-f844-4fb2-ba4b-427eb928786c ^ permalink raw reply related [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-22 20:18 ` joakim @ 2009-08-22 23:13 ` Ken Raeburn 2009-08-23 23:38 ` joakim 2009-08-23 3:24 ` Eli Zaretskii 1 sibling, 1 reply; 119+ messages in thread From: Ken Raeburn @ 2009-08-22 23:13 UTC (permalink / raw) To: joakim; +Cc: emacs-devel emacs-devel On Aug 22, 2009, at 16:18, joakim@verona.se wrote: > Andreas Schwab <schwab@linux-m68k.org> writes: > joakim@verona.se writes: >> >>> + GCPRO6 (file_description, file_mime, file_encoding, rv, >>> absname, encoded_absname); >> >> That's too much. You only need to protect variables used around >> calls >> that can GC. Arguments to lisp functions are implicitly >> protected. For >> example, there are no function calls during the lifetime of absname. >> And encoded_absname is completely unused. > > It seems to me I only need to protect f, which I would do by GCPRO:ing > absname. Since this is aparently wrong, I will leave it like it is, > since it doesnt hurt to GCPRO too much. (?) If ENCODE_FILE returns a new lisp string object, you need to GCPRO that object, not absname. After the call to ENCODE_FILE, absname is unused, so it won't need protection. In fact, it looks to me like after that point, GC isn't possible, so I'm not sure anything needs GCPRO'tection here. > >>> + libmagic_error: >>> + report_file_error("Libmagic error",Qnil); >>> + if (cookie != NULL) magic_close (cookie); >> >> report_file_error throws, so you leak a resource. > > Fixed I think. No, you're still trying to call magic_close after report_file_error returns, which it won't. > +{ > + CHECK_STRING (filename); > + magic_t cookie=NULL; > + char* f = NULL; CHECK_STRING is executable code, and should be moved down after the variable declarations. 
> + const char* rvs; > + Lisp_Object file_description; > + Lisp_Object file_mime; > + Lisp_Object file_encoding; > + Lisp_Object rv; > + > + Lisp_Object absname, encoded_absname; > + struct gcpro gcpro1, gcpro2, gcpro3, gcpro4, gcpro5, gcpro6; > + > + GCPRO6 (file_description, file_mime, file_encoding, rv, absname, > encoded_absname); It seems to be common -- I'm not sure if it's required, but I would conservatively assume so -- for any local variable GCPRO'd to get initialized before any possible call to garbage collection, so the precise(?) garbage collector won't be scanning random stack values and thinking they're lisp objects; if the initialization is unclear, often that means simply assigning Qnil just before or after the GCPRO call. Different GC marking strategies are used on different platforms, so the lack of an obvious problem on one platform doesn't mean the code will work okay on another. Ken ^ permalink raw reply [flat|nested] 119+ messages in thread
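Editorial sketch: the resource leak raised above (calling magic_close after report_file_error, which does not return) can be reproduced outside Emacs. The following is a minimal model, not Emacs code: setjmp/longjmp stands in for the Lisp error machinery, and every name in it (throw_error, handle_open, handle_close, probe_leaky, probe_fixed, leaked_after) is hypothetical. It assumes only that the error function never returns and that the cleanup call may clobber errno.

```c
#include <errno.h>
#include <setjmp.h>
#include <stdlib.h>

/* Hypothetical stand-in for the handler that catches a Lisp error.  */
static jmp_buf error_handler;

/* Number of handles currently open, so that a leak is observable.  */
static int open_handles = 0;

static int *
handle_open (void)
{
  int *h = malloc (sizeof *h);
  if (h != NULL)
    open_handles++;
  return h;
}

static void
handle_close (int *h)
{
  free (h);
  open_handles--;
}

/* Never returns, like report_file_error.  */
static void
throw_error (void)
{
  longjmp (error_handler, 1);
}

/* Leaky version: the close is placed after a call that never returns.  */
static int
probe_leaky (int fail)
{
  int *h = handle_open ();
  if (h == NULL || fail)
    {
      throw_error ();
      if (h != NULL)
        handle_close (h);       /* dead code, never reached */
    }
  handle_close (h);
  return 0;
}

/* Fixed version: release the handle first, keep errno, then throw.  */
static int
probe_fixed (int fail)
{
  int *h = handle_open ();
  if (h == NULL || fail)
    {
      int saved_errno = errno;  /* cleanup may overwrite errno */
      if (h != NULL)
        handle_close (h);
      errno = saved_errno;
      throw_error ();
    }
  handle_close (h);
  return 0;
}

/* Run PROBE under a handler and return how many handles stay open.  */
static int
leaked_after (int (*probe) (int), int fail)
{
  open_handles = 0;
  if (setjmp (error_handler) == 0)
    probe (fail);
  return open_handles;
}
```

Calling leaked_after (probe_leaky, 1) leaves one handle open, while leaked_after (probe_fixed, 1) leaves none. The ordering in probe_fixed (save errno, close, restore errno, throw) is the same ordering the later patch revisions in this thread use around magic_close.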
* Re: using libmagic in Emacs? 2009-08-22 23:13 ` Ken Raeburn @ 2009-08-23 23:38 ` joakim 2009-08-24 3:05 ` Eli Zaretskii 0 siblings, 1 reply; 119+ messages in thread From: joakim @ 2009-08-23 23:38 UTC (permalink / raw) To: Ken Raeburn; +Cc: emacs-devel emacs-devel [-- Attachment #1: Type: text/plain, Size: 2131 bytes --] Ken Raeburn <raeburn@raeburn.org> writes: > On Aug 22, 2009, at 16:18, joakim@verona.se wrote: >> Andreas Schwab <schwab@linux-m68k.org> writes: >> joakim@verona.se writes: >>> >>>> + GCPRO6 (file_description, file_mime, file_encoding, rv, >>>> absname, encoded_absname); >>> >>> That's too much. You only need to protect variables used around >>> calls >>> that can GC. Arguments to lisp functions are implicitly protected. >>> For >>> example, there are no function calls during the lifetime of absname. >>> And encoded_absname is completely unused. >> >> It seems to me I only need to protect f, which I would do by GCPRO:ing >> absname. Since this is aparently wrong, I will leave it like it is, >> since it doesnt hurt to GCPRO too much. (?) > > If ENCODE_FILE returns a new lisp string object, you need to GCPRO > that object, not absname. After the call to ENCODE_FILE, absname is > unused, so it won't need protection. In fact, it looks to me like > after that point, GC isn't possible, so I'm not sure anything needs > GCPRO'tection here. > I seem to be having trouble with GCPRO. Now I've marked the places I believe might gc in the code. >>> report_file_error throws, so you leak a resource. >> >> Fixed I think. > > No, you're still trying to call magic_close after report_file_error > returns, which it won't. Maybe I sent the wrong patch revision last time, better now then? > CHECK_STRING is executable code, and should be moved down after the > variable declarations. same here. 
> It seems to be common -- I'm not sure if it's required, but I would > conservatively assume so -- for any local variable GCPRO'd to get > initialized before any possible call to garbage collection, so the > precise(?) garbage collector won't be scanning random stack values and > thinking they're lisp objects; if the initialization is unclear, often > that means simply assigning Qnil just before or after the GCPRO call. > > Different GC marking strategies are used on different platforms, so > the lack of an obvious problem on one platform doesn't mean the code > will work okay on another. > > Ken -- Joakim Verona [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: filemagic6.patch --] [-- Type: text/x-patch, Size: 5988 bytes --] diff --git a/configure.in b/configure.in index f4096db..49a3f15 100644 --- a/configure.in +++ b/configure.in @@ -137,6 +137,8 @@ OPTION_DEFAULT_ON([xft],[don't use XFT for anti aliased fonts]) OPTION_DEFAULT_ON([libotf],[don't use libotf for OpenType font support]) OPTION_DEFAULT_ON([m17n-flt],[don't use m17n-flt for text shaping]) +OPTION_DEFAULT_ON([libmagic],[don't compile with libmagic support]) + OPTION_DEFAULT_ON([toolkit-scroll-bars],[don't use Motif or Xaw3d scroll bars]) OPTION_DEFAULT_ON([xaw3d],[don't use Xaw3d]) OPTION_DEFAULT_ON([xim],[don't use X11 XIM]) @@ -2223,6 +2225,19 @@ if test x"$ac_cv_func_alloca_works" != xyes; then AC_MSG_ERROR( [a system implementation of alloca is required] ) fi + +HAVE_LIBMAGIC=no +if test "${with_libmagic}" != "no"; then + #libmagic support + AC_CHECK_HEADERS(magic.h, [ AC_CHECK_LIB(magic,magic_open,HAVE_LIBMAGIC=yes) ]) +fi + +if test "${HAVE_LIBMAGIC}" = "yes"; then + LIBMAGIC=-lmagic + AC_SUBST(LIBMAGIC) + AC_DEFINE(HAVE_LIBMAGIC, 1, [Define to 1 if using libmagic.]) +fi + # fmod, logb, and frexp are found in -lm on most systems. # On HPUX 9.01, -lm does not contain logb, so check for sqrt. 
AC_CHECK_LIB(m, sqrt) @@ -2954,6 +2969,7 @@ echo " Does Emacs use -lpng? ${HAVE_PNG}" echo " Does Emacs use -lrsvg-2? ${HAVE_RSVG}" echo " Does Emacs use -lgpm? ${HAVE_GPM}" echo " Does Emacs use -ldbus? ${HAVE_DBUS}" +echo " Does Emacs use -lmagic? ${HAVE_LIBMAGIC}" echo " Does Emacs use -lfreetype? ${HAVE_FREETYPE}" echo " Does Emacs use -lm17n-flt? ${HAVE_M17N_FLT}" diff --git a/src/Makefile.in b/src/Makefile.in index 425cf98..33d1a14 100644 --- a/src/Makefile.in +++ b/src/Makefile.in @@ -420,6 +420,7 @@ LIBX= $(LIBXMENU) LD_SWITCH_X_SITE #endif /* not HAVE_LIBRESOLV */ LIBSOUND= @LIBSOUND@ +LIBMAGIC= @LIBMAGIC@ CFLAGS_SOUND= @CFLAGS_SOUND@ RSVG_LIBS= @RSVG_LIBS@ @@ -878,7 +879,7 @@ SOME_MACHINE_LISP = ../lisp/mouse.elc \ duplicated symbols. If the standard libraries were compiled with GCC, we might need gnulib again after them. */ -LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(RSVG_LIBS) $(DBUS_LIBS) \ +LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(LIBMAGIC) $(RSVG_LIBS) $(DBUS_LIBS) \ LIBGPM LIBRESOLV LIBS_SYSTEM LIBS_MACHINE LIBS_TERMCAP \ LIBS_DEBUG $(GETLOADAVG_LIBS) \ @FREETYPE_LIBS@ @FONTCONFIG_LIBS@ @LIBOTF_LIBS@ @M17N_FLT_LIBS@ \ diff --git a/src/fileio.c b/src/fileio.c index 3702d4c..0061981 100644 --- a/src/fileio.c +++ b/src/fileio.c @@ -205,6 +205,10 @@ Lisp_Object Vdirectory_sep_char; int write_region_inhibit_fsync; #endif +#ifdef HAVE_LIBMAGIC +#include <magic.h> +#endif + /* Non-zero means call move-file-to-trash in Fdelete_file or Fdelete_directory. */ int delete_by_moving_to_trash; @@ -2997,6 +3001,81 @@ DEFUN ("unix-sync", Funix_sync, Sunix_sync, 0, 0, "", #endif /* HAVE_SYNC */ +#ifdef HAVE_LIBMAGIC +DEFUN ("libmagic-file-internal", Flibmagic_file_internal, Slibmagic_file_internal, 1,1,0, + doc: /* Return a list describing the argument FILENAME. 
+ + + The return value is a list of the form + + (MIME-TYPE MIME-ENCODING-NAME DESCRIPTION) + + MIME-TYPE and MIME-ENCODING-NAME are the MIME type and encoding + suitable for the file's contents, as determined by libmagic. + DESCRIPTION is the human readable description of the file type + offered by libmagic. + + The function throws a file-error if libmagic cannot determine one of + the elements of the above list. + + The default libmagic database is used, and the quality of + information given depends on your version of that database. Often + the MIME type is less exact than the description. */) + (filename) + Lisp_Object filename; +{ + magic_t cookie=NULL; + char* f = NULL; + const char* returnvaluestr; + Lisp_Object file_description; + Lisp_Object file_mime; + Lisp_Object file_encoding; + Lisp_Object returnvalue; + Lisp_Object absname; + int errsave; + struct gcpro gcpro1, gcpro2; + + CHECK_STRING (filename); + + GCPRO2 (filename, absname); + absname = Fexpand_file_name (filename, current_buffer->directory); //might gc + f = SDATA(ENCODE_FILE (absname));//might gc + UNGCPRO; + + cookie = magic_open (MAGIC_ERROR); + if (cookie == NULL) goto libmagic_error; + magic_load (cookie, NULL); /* load default database */ + + magic_setflags (cookie, MAGIC_MIME_TYPE | MAGIC_ERROR); + returnvaluestr = magic_file (cookie, f); + if (returnvaluestr == NULL) goto libmagic_error; + file_mime = intern (returnvaluestr); + + magic_setflags (cookie, MAGIC_MIME_ENCODING | MAGIC_ERROR); + returnvaluestr=magic_file (cookie, f); + if (returnvaluestr == NULL) goto libmagic_error; + file_encoding = build_string(returnvaluestr); + + magic_setflags (cookie, MAGIC_NONE | MAGIC_ERROR); + returnvaluestr=magic_file (cookie, f); + if (returnvaluestr == NULL) goto libmagic_error; + + file_description = build_string (returnvaluestr); + returnvalue = Fcons (file_mime, Fcons (file_encoding, Fcons (file_description, Qnil))); //might gc + + magic_close (cookie); + + return returnvalue; + libmagic_error: 
+ UNGCPRO; + errsave=errno; + if (cookie != NULL) magic_close (cookie); + errno=errsave; + report_file_error("Libmagic error",Qnil); + return Qnil; +} +#endif + DEFUN ("file-newer-than-file-p", Ffile_newer_than_file_p, Sfile_newer_than_file_p, 2, 2, 0, doc: /* Return t if file FILE1 is newer than file FILE2. If FILE1 does not exist, the answer is nil; @@ -5781,6 +5860,9 @@ When non-nil, the function `move-file-to-trash' will be used by #ifdef HAVE_SYNC defsubr (&Sunix_sync); #endif +#ifdef HAVE_LIBMAGIC + defsubr (&Slibmagic_file_internal); +#endif } /* arch-tag: 64ba3fd7-f844-4fb2-ba4b-427eb928786c ^ permalink raw reply related [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-23 23:38 ` joakim @ 2009-08-24 3:05 ` Eli Zaretskii 2009-08-24 12:30 ` joakim 0 siblings, 1 reply; 119+ messages in thread From: Eli Zaretskii @ 2009-08-24 3:05 UTC (permalink / raw) To: joakim; +Cc: raeburn, emacs-devel > From: joakim@verona.se > Date: Mon, 24 Aug 2009 01:38:26 +0200 > Cc: emacs-devel emacs-devel <emacs-devel@gnu.org> > > + f = SDATA(ENCODE_FILE (absname));//might gc No C99 comments, please. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-24 3:05 ` Eli Zaretskii @ 2009-08-24 12:30 ` joakim 0 siblings, 0 replies; 119+ messages in thread From: joakim @ 2009-08-24 12:30 UTC (permalink / raw) To: Eli Zaretskii; +Cc: raeburn, emacs-devel [-- Attachment #1: Type: text/plain, Size: 279 bytes --] Eli Zaretskii <eliz@gnu.org> writes: >> From: joakim@verona.se >> Date: Mon, 24 Aug 2009 01:38:26 +0200 >> Cc: emacs-devel emacs-devel <emacs-devel@gnu.org> >> >> + f = SDATA(ENCODE_FILE (absname));//might gc > > No C99 comments, please. This time I compiled with -std=c89. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: filemagic7.patch --] [-- Type: text/x-patch, Size: 6002 bytes --] diff --git a/configure.in b/configure.in index e578f76..90e67e7 100644 --- a/configure.in +++ b/configure.in @@ -137,6 +137,8 @@ OPTION_DEFAULT_ON([xft],[don't use XFT for anti aliased fonts]) OPTION_DEFAULT_ON([libotf],[don't use libotf for OpenType font support]) OPTION_DEFAULT_ON([m17n-flt],[don't use m17n-flt for text shaping]) +OPTION_DEFAULT_ON([libmagic],[don't compile with libmagic support]) + OPTION_DEFAULT_ON([toolkit-scroll-bars],[don't use Motif or Xaw3d scroll bars]) OPTION_DEFAULT_ON([xaw3d],[don't use Xaw3d]) OPTION_DEFAULT_ON([xim],[don't use X11 XIM]) @@ -2225,6 +2227,19 @@ if test x"$ac_cv_func_alloca_works" != xyes; then AC_MSG_ERROR( [a system implementation of alloca is required] ) fi + +HAVE_LIBMAGIC=no +if test "${with_libmagic}" != "no"; then + #libmagic support + AC_CHECK_HEADERS(magic.h, [ AC_CHECK_LIB(magic,magic_open,HAVE_LIBMAGIC=yes) ]) +fi + +if test "${HAVE_LIBMAGIC}" = "yes"; then + LIBMAGIC=-lmagic + AC_SUBST(LIBMAGIC) + AC_DEFINE(HAVE_LIBMAGIC, 1, [Define to 1 if using libmagic.]) +fi + # fmod, logb, and frexp are found in -lm on most systems. # On HPUX 9.01, -lm does not contain logb, so check for sqrt. AC_CHECK_LIB(m, sqrt) @@ -2959,6 +2974,7 @@ echo " Does Emacs use -lpng? ${HAVE_PNG}" echo " Does Emacs use -lrsvg-2? 
${HAVE_RSVG}" echo " Does Emacs use -lgpm? ${HAVE_GPM}" echo " Does Emacs use -ldbus? ${HAVE_DBUS}" +echo " Does Emacs use -lmagic? ${HAVE_LIBMAGIC}" echo " Does Emacs use -lfreetype? ${HAVE_FREETYPE}" echo " Does Emacs use -lm17n-flt? ${HAVE_M17N_FLT}" diff --git a/src/Makefile.in b/src/Makefile.in index 567bff9..ac30bc2 100644 --- a/src/Makefile.in +++ b/src/Makefile.in @@ -422,6 +422,7 @@ LIBX= $(LIBXMENU) LD_SWITCH_X_SITE #endif /* not HAVE_LIBRESOLV */ LIBSOUND= @LIBSOUND@ +LIBMAGIC= @LIBMAGIC@ CFLAGS_SOUND= @CFLAGS_SOUND@ RSVG_LIBS= @RSVG_LIBS@ @@ -880,7 +881,7 @@ SOME_MACHINE_LISP = ../lisp/mouse.elc \ duplicated symbols. If the standard libraries were compiled with GCC, we might need gnulib again after them. */ -LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(RSVG_LIBS) $(DBUS_LIBS) \ +LIBES = $(LOADLIBES) $(LIBS) $(LIBX) $(LIBSOUND) $(LIBMAGIC) $(RSVG_LIBS) $(DBUS_LIBS) \ LIBGPM LIBRESOLV LIBS_SYSTEM LIBS_MACHINE LIBS_TERMCAP \ LIBS_DEBUG $(GETLOADAVG_LIBS) \ @FREETYPE_LIBS@ @FONTCONFIG_LIBS@ @LIBOTF_LIBS@ @M17N_FLT_LIBS@ \ diff --git a/src/fileio.c b/src/fileio.c index 3702d4c..568728b 100644 --- a/src/fileio.c +++ b/src/fileio.c @@ -205,6 +205,10 @@ Lisp_Object Vdirectory_sep_char; int write_region_inhibit_fsync; #endif +#ifdef HAVE_LIBMAGIC +#include <magic.h> +#endif + /* Non-zero means call move-file-to-trash in Fdelete_file or Fdelete_directory. */ int delete_by_moving_to_trash; @@ -2997,6 +3001,81 @@ DEFUN ("unix-sync", Funix_sync, Sunix_sync, 0, 0, "", #endif /* HAVE_SYNC */ +#ifdef HAVE_LIBMAGIC +DEFUN ("libmagic-file-internal", Flibmagic_file_internal, Slibmagic_file_internal, 1,1,0, + doc: /* Return a list describing the argument FILENAME. + + + The return value is a list of the form + + (MIME-TYPE MIME-ENCODING-NAME DESCRIPTION) + + MIME-TYPE and MIME-ENCODING-NAME are the MIME type and encoding + suitable for the file's contents, as determined by libmagic. 
+ DESCRIPTION is the human readable description of the file type + offered by libmagic. + + The function throws a file-error if libmagic cannot determine one of + the elements of the above list. + + The default libmagic database is used, and the quality of + information given depends on your version of that database. Often + the MIME type is less exact than the description. */) + (filename) + Lisp_Object filename; +{ + magic_t cookie=NULL; + char* f = NULL; + const char* returnvaluestr; + Lisp_Object file_description; + Lisp_Object file_mime; + Lisp_Object file_encoding; + Lisp_Object returnvalue; + Lisp_Object absname; + int errsave; + struct gcpro gcpro1; + + CHECK_STRING (filename); + + GCPRO1 (absname); + absname = Fexpand_file_name (filename, current_buffer->directory); /* might gc */ + absname = ENCODE_FILE (absname);/* might gc */ + f = SDATA(absname); + + cookie = magic_open (MAGIC_ERROR); + if (cookie == NULL) goto libmagic_error; + magic_load (cookie, NULL); /* load default database */ + + magic_setflags (cookie, MAGIC_MIME_TYPE | MAGIC_ERROR); + returnvaluestr = magic_file (cookie, f); + if (returnvaluestr == NULL) goto libmagic_error; + file_mime = intern (returnvaluestr); + + magic_setflags (cookie, MAGIC_MIME_ENCODING | MAGIC_ERROR); + returnvaluestr=magic_file (cookie, f); + if (returnvaluestr == NULL) goto libmagic_error; + file_encoding = build_string(returnvaluestr); + + magic_setflags (cookie, MAGIC_NONE | MAGIC_ERROR); + returnvaluestr=magic_file (cookie, f); + if (returnvaluestr == NULL) goto libmagic_error; + + file_description = build_string (returnvaluestr); + returnvalue = Fcons (file_mime, Fcons (file_encoding, Fcons (file_description, Qnil))); /* might gc */ + + magic_close (cookie); + UNGCPRO; + return returnvalue; + libmagic_error: + UNGCPRO; + errsave=errno; + if (cookie != NULL) magic_close (cookie); + errno=errsave; + report_file_error("Libmagic error",Qnil); + return Qnil; +} +#endif + DEFUN ("file-newer-than-file-p", 
Ffile_newer_than_file_p, Sfile_newer_than_file_p, 2, 2, 0, doc: /* Return t if file FILE1 is newer than file FILE2. If FILE1 does not exist, the answer is nil; @@ -5781,6 +5860,9 @@ When non-nil, the function `move-file-to-trash' will be used by #ifdef HAVE_SYNC defsubr (&Sunix_sync); #endif +#ifdef HAVE_LIBMAGIC + defsubr (&Slibmagic_file_internal); +#endif } /* arch-tag: 64ba3fd7-f844-4fb2-ba4b-427eb928786c [-- Attachment #3: Type: text/plain, Size: 19 bytes --] -- Joakim Verona ^ permalink raw reply related [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-22 20:18 ` joakim 2009-08-22 23:13 ` Ken Raeburn @ 2009-08-23 3:24 ` Eli Zaretskii 1 sibling, 0 replies; 119+ messages in thread From: Eli Zaretskii @ 2009-08-23 3:24 UTC (permalink / raw) To: joakim; +Cc: schwab, monnier, emacs-devel > From: joakim@verona.se > Cc: Eli Zaretskii <eliz@gnu.org>, monnier@iro.umontreal.ca, > emacs-devel@gnu.org > Date: Sat, 22 Aug 2009 22:18:26 +0200 > > + magic_setflags (cookie, MAGIC_MIME_TYPE | MAGIC_ERROR); > + rvs = magic_file (cookie, f); > + if (rvs == NULL) goto libmagic_error; > + file_mime = intern (rvs); > + > + magic_setflags (cookie, MAGIC_MIME_ENCODING | MAGIC_ERROR); > + rvs=magic_file (cookie, f); > + if (rvs == NULL) goto libmagic_error; > + file_encoding = build_string(rvs); Since you are returning strings for MIME type and MIME encoding, not symbols, I suggest stating that in the doc string. Normally, when we say "SOMETHING is an encoding", we mean it's a symbol. Alternatively, use MIME-ENCODING-NAME etc., to indicate that it's just a name of the thing, not the thing itself. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-21 11:01 ` Eli Zaretskii 2009-08-21 17:38 ` joakim @ 2009-08-21 19:18 ` Stefan Monnier 1 sibling, 0 replies; 119+ messages in thread From: Stefan Monnier @ 2009-08-21 19:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: schwab, joakim, emacs-devel >> + magic_setflags (cookie, MAGIC_MIME_ENCODING | MAGIC_ERROR); >> + rvs=magic_file (cookie, f); >> + if (rvs == NULL) goto libmagic_error; >> + Lisp_Object file_encoding = intern(rvs); > Is file_encoding supposed to be a valid encoding, one of those for > which Emacs has a coding-system? If so, perhaps you should make sure > you indeed return a valid coding-system or its alias, or otherwise > tell in the doc string that it's not guaranteed to be valid (so that > the caller should validate it before using). The simplest route is to return a string rather than a symbol. That should clearly convey the idea that this may or may not be a valid coding-system. Stefan ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-21 9:55 ` joakim 2009-08-21 11:01 ` Eli Zaretskii @ 2009-08-21 13:19 ` Andreas Schwab 1 sibling, 0 replies; 119+ messages in thread From: Andreas Schwab @ 2009-08-21 13:19 UTC (permalink / raw) To: joakim; +Cc: Stefan Monnier, Emacs Development joakim@verona.se writes: > I don't understand the autoheader comment below though. When I originally > compiled, the config.h wasn't generated with the libmagic info included, > did I do something wrong? Is autoheader supposed to generate config.in? > When does that happen? Configure with --enable-maintainer-mode. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-19 22:49 ` joakim ` (2 preceding siblings ...) 2009-08-20 13:57 ` Stefan Monnier @ 2009-08-20 18:32 ` Richard Stallman 2009-08-20 20:27 ` Reiner Steib 2009-08-21 19:16 ` Stefan Monnier 3 siblings, 2 replies; 119+ messages in thread From: Richard Stallman @ 2009-08-20 18:32 UTC (permalink / raw) To: joakim; +Cc: monnier, emacs-devel If we go this route, we should not load gnus/mailcap.el. It contains lots of other stuff. So we ought to separate out and preload the right part of it, such as the variable `mailcap-mime-data'. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-20 18:32 ` Richard Stallman @ 2009-08-20 20:27 ` Reiner Steib 2009-08-21 14:08 ` Richard Stallman 2009-08-21 19:16 ` Stefan Monnier 1 sibling, 1 reply; 119+ messages in thread From: Reiner Steib @ 2009-08-20 20:27 UTC (permalink / raw) To: Richard Stallman; +Cc: monnier, joakim, emacs-devel On Thu, Aug 20 2009, Richard Stallman wrote: > If we go this route, we should not load gnus/mailcap.el. (mailcap.el doesn't load anything else, AFAICS.) > It contains lots of other stuff. Ironically last year some stuff from dired-aux.el was moved to mailcap.el and is used in `minibuffer-default-add-shell-commands' from simple.el. > So we ought to separate out and preload the right part of it, such > as the variable `mailcap-mime-data'. The initial value of `mailcap-mime-data' is a fall-back (for systems without proper mailcap files). To make it useful, probably `mailcap-parse-mailcaps' and related functions are necessary: $ emacs-23-1 -Q -f ielm -l mailcap ... ELISP> (with-temp-buffer (insert (pp-to-string mailcap-mime-data)) (point-max)) 4993 ELISP> (mailcap-parse-mailcaps) t ELISP> (with-temp-buffer (insert (pp-to-string mailcap-mime-data)) (point-max)) 40322 ELISP> system-type gnu/linux Bye, Reiner. -- ,,, (o o) ---ooO-(_)-Ooo--- | PGP key available | http://rsteib.home.pages.de/ ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-20 20:27 ` Reiner Steib @ 2009-08-21 14:08 ` Richard Stallman 0 siblings, 0 replies; 119+ messages in thread From: Richard Stallman @ 2009-08-21 14:08 UTC (permalink / raw) To: Reiner Steib; +Cc: monnier, joakim, emacs-devel The initial value of `mailcap-mime-data' is a fall-back (for systems without proper mailcap files). To make it useful, probably `mailcap-parse-mailcaps' and related functions are necessary: There is no need for all that to be standardly loaded/used in Emacs. All that Emacs needs is to recognize the file types that Emacs has special handling for. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: using libmagic in Emacs? 2009-08-20 18:32 ` Richard Stallman 2009-08-20 20:27 ` Reiner Steib @ 2009-08-21 19:16 ` Stefan Monnier 1 sibling, 0 replies; 119+ messages in thread From: Stefan Monnier @ 2009-08-21 19:16 UTC (permalink / raw) To: rms; +Cc: joakim, emacs-devel > If we go this route, we should not load gnus/mailcap.el. Nobody suggested otherwise until now (actually no one mentioned anything close to mailcap in this thread, AFAICT). Could you expand on what use you were thinking of making of mailcap in this context? Stefan ^ permalink raw reply [flat|nested] 119+ messages in thread
* Language identification (was: using libmagic in Emacs) 2009-08-18 18:35 using libmagic in Emacs? joakim 2009-08-18 19:23 ` Stefan Monnier @ 2009-08-28 0:27 ` Juri Linkov 2009-08-28 4:58 ` Language identification Stefan Monnier ` (2 more replies) 1 sibling, 3 replies; 119+ messages in thread From: Juri Linkov @ 2009-08-28 0:27 UTC (permalink / raw) To: joakim; +Cc: Emacs Development > I often wish that files would open in Emacs with correct mode > more often when there is no file extension. In `auto-mode-alist' you can see that with the exception of `archive-mode', `doc-view-mode' and `image-mode', all remaining modes are programming text modes. It would be more useful to identify file types for these modes that libmagic can't do. Do you know a library that identifies programming languages? Such a library might be implemented using a Bayesian classifier trained on a sufficiently large corpus of different programming languages. -- Juri Linkov http://www.jurta.org/emacs/ ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Language identification 2009-08-28 0:27 ` Language identification (was: using libmagic in Emacs) Juri Linkov @ 2009-08-28 4:58 ` Stefan Monnier 2009-08-28 9:00 ` Stephen J. Turnbull 2009-08-28 19:16 ` Juri Linkov 2009-08-28 6:45 ` Alex Ott 2009-08-28 6:46 ` Alex Ott 2 siblings, 2 replies; 119+ messages in thread From: Stefan Monnier @ 2009-08-28 4:58 UTC (permalink / raw) To: Juri Linkov; +Cc: joakim, Emacs Development >> I often wish that files would open in Emacs with correct mode >> more often when there is no file extension. > In `auto-mode-alist' you can see that with the exception of > `archive-mode', `doc-view-mode' and `image-mode', all remaining > modes are programming text modes. It would be more useful > to identify file types for these modes that libmagic can't do. > Do you know a library that identifies programming languages? > Such a library might be implemented using a Bayesian classifier > trained on a sufficiently large corpus of different programming > languages. OTOH, how often do you see a file containing programming language code and yet without any extension? Stefan ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Language identification 2009-08-28 4:58 ` Language identification Stefan Monnier @ 2009-08-28 9:00 ` Stephen J. Turnbull 2009-08-28 14:56 ` Stefan Monnier 2009-08-29 0:46 ` Richard Stallman 2009-08-28 19:16 ` Juri Linkov 1 sibling, 2 replies; 119+ messages in thread From: Stephen J. Turnbull @ 2009-08-28 9:00 UTC (permalink / raw) To: Stefan Monnier; +Cc: Juri Linkov, joakim, Emacs Development Stefan Monnier writes: > OTOH, how often do you see a file containing programming language code and > yet without any extension? Extremely frequently. The great majority that I see are correctly identified by file(1) (I believe using libmagic), however, by parsing the shebang. There are also cases of multiple extensions, where I've seen (for example) foo.c.inc used for C implementation code that is used in multiple contexts (perhaps with different behavior according to #ifdefs). This would not be recognized by typical Emacs extension parsing since although it matches something like "\.c\>", it doesn't match the more usual idioms of "\.c$" or "\.c\'". ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Language identification 2009-08-28 9:00 ` Stephen J. Turnbull @ 2009-08-28 14:56 ` Stefan Monnier 2009-08-29 4:11 ` Stephen J. Turnbull 2009-08-29 0:46 ` Richard Stallman 1 sibling, 1 reply; 119+ messages in thread From: Stefan Monnier @ 2009-08-28 14:56 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: Juri Linkov, joakim, Emacs Development >> OTOH, how often do you see a file containing programming language code and >> yet without any extension? > Extremely frequently. In what kind of circumstance? > The great majority that I see are correctly identified by file(1) (I > believe using libmagic), however, by parsing the shebang. Oh, so they're executables with a shebang. That's OK we don't need `file' for that since we have interpreter-mode-alist. Emacs should already DTRT for them. > There are also cases of multiple extensions, where I've seen (for > example) foo.c.inc used for C implementation code that is used in > multiple contexts (perhaps with different behavior according to > #ifdefs). This would not be recognized by typical Emacs extension > parsing since although it matches something like "\.c\>", it doesn't > match the more usual idioms of "\.c$" or "\.c\'". I've had (setq auto-mode-alist (append auto-mode-alist '(("\\.[^/.]+\\'" ignore t)))) in my .emacs for eons to cover such cases. Stefan ^ permalink raw reply [flat|nested] 119+ messages in thread
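Editorial sketch: the auto-mode-alist entry quoted above works because an element of the form (REGEXP FUNCTION NON-NIL) makes Emacs search the list again using the part of the file name that did not match REGEXP. Appended at the end, it acts as a fallback: when no known extension matches, drop the last extension and retry, so foo.c.inc is looked up again as foo.c. The C below models that loop. The table and all function names are illustrative assumptions; the real list holds regexps rather than literal suffixes.

```c
#include <stddef.h>
#include <string.h>

struct ext_mode
{
  const char *ext;
  const char *mode;
};

/* Illustrative table only.  */
static const struct ext_mode known_modes[] = {
  { ".c", "c-mode" },
  { ".el", "emacs-lisp-mode" },
  { ".py", "python-mode" },
};

/* Return the mode whose extension ends the first LEN bytes of NAME,
   or NULL if none does.  */
static const char *
lookup_suffix (const char *name, size_t len)
{
  size_t i;
  for (i = 0; i < sizeof known_modes / sizeof known_modes[0]; i++)
    {
      size_t n = strlen (known_modes[i].ext);
      if (len >= n && memcmp (name + len - n, known_modes[i].ext, n) == 0)
        return known_modes[i].mode;
    }
  return NULL;
}

/* Return LEN minus a trailing ".ext", or LEN if there is none.  This
   follows the regexp in the quoted entry: a dot, then one or more
   characters that are neither '/' nor '.', up to the end.  */
static size_t
strip_last_extension (const char *name, size_t len)
{
  size_t i = len;
  while (i > 0 && name[i - 1] != '/' && name[i - 1] != '.')
    i--;
  if (i > 0 && name[i - 1] == '.' && i < len)
    return i - 1;
  return len;
}

/* Try the table first; if nothing matches, strip one extension and
   try again until no extension is left.  */
const char *
mode_for_file_name (const char *name)
{
  size_t len = strlen (name);
  for (;;)
    {
      const char *mode = lookup_suffix (name, len);
      size_t shorter;
      if (mode != NULL)
        return mode;
      shorter = strip_last_extension (name, len);
      if (shorter == len)
        return NULL;
      len = shorter;
    }
}
```

Under these assumptions, both "foo.c" and "foo.c.inc" resolve to c-mode, while "README" resolves to nothing. Because the table is consulted before any stripping, a file whose last extension is already known is never affected by the fallback.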
* Re: Language identification 2009-08-28 14:56 ` Stefan Monnier @ 2009-08-29 4:11 ` Stephen J. Turnbull 2009-08-29 14:21 ` Chong Yidong 0 siblings, 1 reply; 119+ messages in thread From: Stephen J. Turnbull @ 2009-08-29 4:11 UTC (permalink / raw) To: Stefan Monnier; +Cc: Juri Linkov, joakim, Emacs Development Stefan Monnier writes: > > The great majority that I see are correctly identified by file(1) (I > > believe using libmagic), however, by parsing the shebang. > > Oh, so they're executables with a shebang. That's OK we don't need > `file' for that since we have interpreter-mode-alist. Emacs should > already DTRT for them. Sure. Maybe there's a better way. Maybe libmagic is it. Maybe not. However, you asked "how often do you see files containing programming languages without an extension?" The answer is, it's very common, but the most common case is Unix command scripts with shebangs, which file handles just as well as Emacs does. > I've had > > (setq auto-mode-alist (append auto-mode-alist '(("\\.[^/.]+\\'" ignore t)))) > > in my .emacs for eons to cover such cases. Well, maybe it's time to move it from your .emacs to core emacs? ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Language identification 2009-08-29 4:11 ` Stephen J. Turnbull @ 2009-08-29 14:21 ` Chong Yidong 0 siblings, 0 replies; 119+ messages in thread From: Chong Yidong @ 2009-08-29 14:21 UTC (permalink / raw) To: Stephen J. Turnbull Cc: Juri Linkov, Stefan Monnier, joakim, Emacs Development "Stephen J. Turnbull" <stephen@xemacs.org> writes: > However, you asked "how often do you see files containing programming > languages without an extension?" The answer is, it's very common, but > the most common case is Unix command scripts with shebangs, which file > handles just as well as Emacs does. I guess the question should be, how often do you see such files that Emacs can't handle as well as libmagic? Even in such situations, I think the response should be to improve Emacs' file handling anyway, since libmagic is not available on all platforms. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Language identification 2009-08-28 9:00 ` Stephen J. Turnbull 2009-08-28 14:56 ` Stefan Monnier @ 2009-08-29 0:46 ` Richard Stallman 2009-08-29 4:13 ` Stephen J. Turnbull 1 sibling, 1 reply; 119+ messages in thread From: Richard Stallman @ 2009-08-29 0:46 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: juri, monnier, joakim, emacs-devel > OTOH, how often do you see a file containing programming language code and > yet without any extension? Extremely frequently. Why do these files not identify the language explicitly? It is easy to do. The great majority that I see are correctly identified by file(1) (I believe using libmagic), however, by parsing the shebang. That statement suggests that there are exceptions. I would expect there are, because guessing the programming language is an unreliable solution. Emacs uses a reliable solution: users should identify the language either with the file name, or inside the file with a -*- line or a local variables list. It takes very little work to make a file say what its language is, and the result is to identify the language reliably from then on. I don't think we should switch from our reliable solution to guessing. Is there a reason why users don't use the existing reliable mechanism? Is there a real difficulty with using it? ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Language identification 2009-08-29 0:46 ` Richard Stallman @ 2009-08-29 4:13 ` Stephen J. Turnbull 2009-08-29 15:28 ` Stefan Monnier 0 siblings, 1 reply; 119+ messages in thread From: Stephen J. Turnbull @ 2009-08-29 4:13 UTC (permalink / raw) To: rms; +Cc: juri, monnier, joakim, emacs-devel Richard Stallman writes: > Is there a reason why users don't use the existing reliable mechanism? Many of the authors don't use Emacs, is my guess. ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Language identification 2009-08-29 4:13 ` Stephen J. Turnbull @ 2009-08-29 15:28 ` Stefan Monnier 2009-08-29 16:27 ` Stephen J. Turnbull 0 siblings, 1 reply; 119+ messages in thread From: Stefan Monnier @ 2009-08-29 15:28 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: juri, rms, joakim, emacs-devel >> Is there a reason why users don't use the existing reliable mechanism? > Many of the authors don't use Emacs, is my guess. You mean they're still using `ed'? Boggles the mind! Stefan ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Language identification 2009-08-29 15:28 ` Stefan Monnier @ 2009-08-29 16:27 ` Stephen J. Turnbull 0 siblings, 0 replies; 119+ messages in thread From: Stephen J. Turnbull @ 2009-08-29 16:27 UTC (permalink / raw) To: Stefan Monnier; +Cc: juri, rms, joakim, emacs-devel Stefan Monnier writes: > >> Is there a reason why users don't use the existing reliable mechanism? > > Many of the authors don't use Emacs, is my guess. > > You mean they're still using `ed'? Boggles the mind! Sure, you could have too, and that probably would have been for the best. But freedom to choose is what GNU's all about, and I'm glad you had the choice (even though you failed to choose ed(1), the Godfather of editors). If-my-tongue-were-any-further-in-cheek-it-would-be-in-Mozambique-ly y'rs ^ permalink raw reply [flat|nested] 119+ messages in thread
* Re: Language identification
From: Juri Linkov @ 2009-08-28 19:16 UTC
To: Stefan Monnier; +Cc: joakim, Emacs Development

>>> I often wish that files would open in Emacs with correct mode
>>> more often when there is no file extension.
>> In `auto-mode-alist' you can see that with the exception of
>> `archive-mode', `doc-view-mode' and `image-mode', all remaining
>> modes are programming text modes. It would be more useful
>> to identify file types for these modes that libmagic can't do.
>> Do you know a library that identifies programming languages?
>> Such a library might be implemented using a Bayesian classifier
>> trained on a sufficiently large corpus of different programming
>> languages.
>
> OTOH, how often do you see a file containing programming language code and
> yet without any extension?

More often with a non-standard extension than without any extension.

Also there are conflicting extensions like e.g. ".pl" for both
Perl and Prolog (esp. SWI-Prolog).

-- 
Juri Linkov
http://www.jurta.org/emacs/
* Re: Language identification
From: Stefan Monnier @ 2009-08-29 1:12 UTC
To: Juri Linkov; +Cc: joakim, Emacs Development

> Also there are conflicting extensions like e.g. ".pl" for both
> Perl and Prolog (esp. SWI-Prolog).

That's indeed an interesting case, where content-based mode choice
might make sense. Thanks for reminding me of it.

        Stefan
* Re: Language identification
From: Richard Stallman @ 2009-08-30 16:01 UTC
To: Stefan Monnier; +Cc: juri, joakim, emacs-devel

    > Also there are conflicting extensions like e.g. ".pl" for both
    > Perl and Prolog (esp. SWI-Prolog).

    That's indeed an interesting case, where content-based mode choice
    might make sense. Thanks for reminding me of it.

I'd prefer a very specific feature for choosing between these two
languages to use of a general mechanism. The specific feature would
probably be more reliable, and people would not be tempted to use it
for other issues where it is not the right approach.

But it would be even better to convince people to use distinct
extensions.
* Re: Language identification
From: Richard Stallman @ 2009-08-29 20:20 UTC
To: Juri Linkov; +Cc: monnier, joakim, emacs-devel

    > OTOH, how often do you see a file containing programming language code and
    > yet without any extension?

    More often with a non-standard extension than without any extension.

So why not rename the files, or put in -*- lines?

    Also there are conflicting extensions like e.g. ".pl" for both
    Perl and Prolog (esp. SWI-Prolog).

Perhaps we should promote .plg for Prolog.
* Re: Language identification
From: Juri Linkov @ 2009-08-29 22:48 UTC
To: rms; +Cc: monnier, joakim, emacs-devel

>     > OTOH, how often do you see a file containing programming language code and
>     > yet without any extension?
>
>     More often with a non-standard extension than without any extension.
>
> So why not rename the files, or put in -*- lines?

Often this is not possible when files are not under my control.

>     Also there are conflicting extensions like e.g. ".pl" for both
>     Perl and Prolog (esp. SWI-Prolog).
>
> Perhaps we should promote .plg for Prolog.

I'd rather prefer to promote changing the Perl file extension since
Prolog is older than Perl :)

But I think neither is realistic. Currently I use this hack in .emacs
to distinguish between Perl and Prolog:

  (add-hook 'find-file-hooks
            (lambda ()
              (when (and (looking-at "#")
                         (string-match "Prolog" mode-name))
                (perl-mode))))

since almost all Perl files begin with a comment, even library files
that have no shebangs. But I agree such guessing is unreliable.

-- 
Juri Linkov
http://www.jurta.org/emacs/
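The same first-character heuristic can also be applied at the point where the mode is chosen, instead of correcting it afterwards from a hook. The sketch below is one possible arrangement, not something proposed in the thread: the names `my-buffer-starts-with-hash-p' and `my-perl-or-prolog-mode' are hypothetical, and it assumes the stock `perl-mode' and `prolog-mode' are available. Like the hack above, it is a guess and can misfire, for example on a Prolog script that begins with a #! line.

```elisp
;; Hypothetical helpers for illustration; neither is part of Emacs.

(defun my-buffer-starts-with-hash-p ()
  "Return non-nil if the current buffer begins with a \"#\" character.
This is the heuristic from the message above: Perl files usually open
with a comment or a #! line, Prolog files usually do not."
  (save-excursion
    (save-restriction
      (widen)
      (goto-char (point-min))
      (looking-at "#"))))

(defun my-perl-or-prolog-mode ()
  "Select `perl-mode' or `prolog-mode' from the buffer contents."
  (interactive)
  (if (my-buffer-starts-with-hash-p)
      (perl-mode)
    (prolog-mode)))

;; Route every ".pl" file through the dispatcher.  `add-to-list' puts
;; the new entry at the front of the list, so it is consulted before
;; any existing entry for the same extension.
(add-to-list 'auto-mode-alist '("\\.pl\\'" . my-perl-or-prolog-mode))
```

For anyone adapting the original hook instead: `find-file-hooks' has been an obsolete alias for `find-file-hook' since Emacs 22.1, so the singular name is the one to use in new code.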
* Re: Language identification
From: Richard Stallman @ 2009-08-31 3:32 UTC
To: Juri Linkov; +Cc: monnier, joakim, emacs-devel

    since almost all Perl files begin with a comment, even library files
    that have no shebangs.

What is a "shebang"?
* Re: Language identification
From: David Kastrup @ 2009-08-31 8:42 UTC
To: emacs-devel

Richard Stallman <rms@gnu.org> writes:

>     since almost all Perl files begin with a comment, even library files
>     that have no shebangs.
>
> What is a "shebang"?

#!

-- 
David Kastrup
* Re: Language identification
From: Jan D. @ 2009-08-31 8:59 UTC
To: rms@gnu.org
Cc: Juri Linkov, monnier@iro.umontreal.ca, joakim@verona.se, emacs-devel@gnu.org

On 31 Aug 2009 at 05:32, Richard Stallman <rms@gnu.org> wrote:

>    since almost all Perl files begin with a comment, even library files
>    that have no shebangs.
>
> What is a "shebang"?
>

#!

        Jan D
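To spell the two-character answer out a little: a shebang is the `#!' at the very start of a script, followed by the path of the interpreter that should run it. Emacs already consults that line when choosing a mode, through `interpreter-mode-alist'. The snippet below is a small sketch of that behaviour, assuming a stock Emacs with `perl-mode' available; the helper name `my-demo-mode-from-shebang' is made up for this example.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-demo-mode-from-shebang ()
  "Return the major mode Emacs selects for a buffer with a #! line."
  (with-temp-buffer
    ;; No file name and no -*- line: only the interpreter named on
    ;; the first line is available to pick the mode.
    (insert "#!/usr/bin/perl\n"
            "print \"hello\\n\";\n")
    (set-auto-mode)
    major-mode))
```

This only helps for executable scripts; library files normally carry no #! line, which is the case the quoted sentence is about.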
* Re: Language identification
From: Richard Stallman @ 2009-08-31 3:33 UTC
To: Juri Linkov; +Cc: monnier, joakim, emacs-devel

    > Also there are conflicting extensions like e.g. ".pl" for both
    > Perl and Prolog (esp. SWI-Prolog).
    >
    > Perhaps we should promote .plg for Prolog.

    I'd rather prefer to promote changing the Perl file extension since
    Prolog is older than Perl :)

    But I think neither is realistic.

Why so pessimistic? I'm sure lots of people find this conflict
annoying, and not just Emacs users. Solving it would help everyone
affected, which mainly means Prolog users, since tools are more likely
to assume Perl.

We should not give up on the best solution without at least proposing
it. Is anyone here active in the Prolog community?
* Re: Language identification
From: Alex Ott @ 2009-08-28 6:45 UTC
To: Juri Linkov; +Cc: joakim, Emacs Development

Hello

N-gram algorithms could be used to identify languages. They are
simpler than Bayes and require a smaller database.

Juri Linkov at "Fri, 28 Aug 2009 03:27:35 +0300" wrote:
 >> I often wish that files would open in Emacs with correct mode
 >> more often when there is no file extension.

 JL> In `auto-mode-alist' you can see that with the exception of
 JL> `archive-mode', `doc-view-mode' and `image-mode', all remaining
 JL> modes are programming text modes. It would be more useful
 JL> to identify file types for these modes that libmagic can't do.

 JL> Do you know a library that identifies programming languages?
 JL> Such a library might be implemented using a Bayesian classifier
 JL> trained on a sufficiently large corpus of different programming
 JL> languages.

-- 
With best wishes, Alex Ott, MBA
http://alexott.blogspot.com/
http://xtalk.msk.su/~ott/
http://alexott-ru.blogspot.com/
* Re: Language identification
From: Alex Ott @ 2009-08-28 6:46 UTC
To: Juri Linkov; +Cc: joakim, Emacs Development

Sorry, I missed that this was about programming languages, not real
languages.

Juri Linkov at "Fri, 28 Aug 2009 03:27:35 +0300" wrote:
 >> I often wish that files would open in Emacs with correct mode
 >> more often when there is no file extension.

 JL> In `auto-mode-alist' you can see that with the exception of
 JL> `archive-mode', `doc-view-mode' and `image-mode', all remaining
 JL> modes are programming text modes. It would be more useful
 JL> to identify file types for these modes that libmagic can't do.

 JL> Do you know a library that identifies programming languages?
 JL> Such a library might be implemented using a Bayesian classifier
 JL> trained on a sufficiently large corpus of different programming
 JL> languages.

-- 
With best wishes, Alex Ott, MBA
http://alexott.blogspot.com/
http://xtalk.msk.su/~ott/
http://alexott-ru.blogspot.com/
* Re: Language identification
From: Juri Linkov @ 2009-08-28 19:08 UTC
To: Alex Ott; +Cc: joakim, Emacs Development

>>> In `auto-mode-alist' you can see that with the exception of
>>> `archive-mode', `doc-view-mode' and `image-mode', all remaining
>>> modes are programming text modes. It would be more useful
>>> to identify file types for these modes that libmagic can't do.
>>> Do you know a library that identifies programming languages?
>>> Such a library might be implemented using a Bayesian classifier
>>> trained on a sufficiently large corpus of different programming
>>> languages.
>>
>> N-gram algorithms could be used to identify languages. They are
>> simpler than Bayes and require a smaller database.
>
> Sorry, I missed that this was about programming languages, not real
> languages.

It would be interesting to try using N-gram algorithms for programming
languages and see how well they perform. For example, the most
frequently used bigram "/*" belongs to C, the most frequently used
trigram ";;;" belongs to Lisp, etc.

-- 
Juri Linkov
http://www.jurta.org/emacs/
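To make the N-gram idea a little more concrete, here is a toy sketch in Emacs Lisp. It is only an illustration of the general technique (count character n-grams, then score a text by how many of its n-grams also occur in each language's sample), not anything proposed or tested in the thread. All function names are hypothetical, and the one-line "training" strings in the usage example are made up rather than taken from a real corpus.

```elisp
;; Hypothetical helpers for illustration; none of this is part of Emacs.

(defun my-ngram-counts (text n)
  "Return a hash table mapping each N-character substring of TEXT to its count."
  (let ((table (make-hash-table :test #'equal))
        (limit (- (length text) n))
        (i 0))
    (while (<= i limit)
      (let ((gram (substring text i (+ i n))))
        (puthash gram (1+ (gethash gram table 0)) table))
      (setq i (1+ i)))
    table))

(defun my-ngram-overlap (profile sample)
  "Return how many n-gram occurrences in SAMPLE are also keys of PROFILE.
Both arguments are tables as returned by `my-ngram-counts'."
  (let ((score 0))
    (maphash (lambda (gram count)
               (when (gethash gram profile)
                 (setq score (+ score count))))
             sample)
    score))

(defun my-guess-language (text profiles n)
  "Return the name in PROFILES whose training text best matches TEXT.
PROFILES is an alist of (NAME . TRAINING-STRING); N is the n-gram length.
Return nil when TEXT shares no n-gram with any training string."
  (let ((sample (my-ngram-counts text n))
        best
        (best-score 0))
    (dolist (entry profiles best)
      (let ((score (my-ngram-overlap (my-ngram-counts (cdr entry) n)
                                     sample)))
        (when (> score best-score)
          (setq best (car entry)
                best-score score))))))
```

A call might look like this, with deliberately tiny stand-in samples:

```elisp
(my-guess-language
 "/* init */ int main(void) { return 0; }"
 '((c    . "/* comment */ int x; /* more */ return x;")
   (lisp . ";;; Commentary: (defun f (x) (car x)) ;;; Code:"))
 3)
```

A usable classifier would need profiles built from a large body of code per language and some normalization for text length; how well that performs on real files is exactly the open question raised above.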