* CEDET merge question @ 2009-09-05 16:28 Chong Yidong 2009-09-05 17:22 ` David Engster ` (2 more replies) 0 siblings, 3 replies; 29+ messages in thread From: Chong Yidong @ 2009-09-05 16:28 UTC (permalink / raw) To: emacs-devel I have a question about CEDET that hopefully someone on this list, who has more experience using CEDET than me, can help answer (I've been corresponding with Eric Ludlam, but he's gone on vacation). The Semantic parser appears to have two major "back-ends", bovine and wisent, which are used to generate Semantic tags. Does anyone know how crucial these packages are, and whether one or the other (or both) can be dropped or somehow trimmed down? I ask because the CEDET merge already involves an uncomfortably large amount of code, and it's rather dismaying to see these two big code trees "embedded" in subdirectories of Semantic. (Wisent, for instance, appears to be an entire Elisp reimplementation of Bison...) (The CVS branch I'm using for the CEDET merge is not yet suitable for general testing; I'll inform the list when it's ready.) ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-05 16:28 CEDET merge question Chong Yidong @ 2009-09-05 17:22 ` David Engster 2009-09-05 20:53 ` Chong Yidong 2009-09-06 15:37 ` Richard Stallman 2009-09-08 8:11 ` joakim 2 siblings, 1 reply; 29+ messages in thread From: David Engster @ 2009-09-05 17:22 UTC (permalink / raw) To: Chong Yidong; +Cc: emacs-devel Chong Yidong <cyd@stupidchicken.com> writes: > The Semantic parser appears to have two major "back-ends", bovine and > wisent, which are used to generate Semantic tags. Does anyone know how > crucial these packages are, and whether one or the other (or both) can be > dropped or somehow trimmed down? I think it depends on whether people should be able to edit and compile the grammars themselves, using only Emacs proper. The bovine/wisent parsers and major-modes are crucial for development, but I think they are not necessarily needed for the resulting parser; I may be wrong though, especially when it comes to the Wisent parser, which I'm not familiar with at all. For example, the file semantic/bovine/c.by is the Bison grammar for C/C++ parsing. During CEDET's make process, the 'bovine' code generates the file semantic/bovine/semantic-c-by.el, which is the resulting C(++) lexer in Emacs Lisp. This file is then required by semantic-c.el. Therefore, I would think that including the resulting semantic-c-by.el should be enough for the C parser to be working. As mentioned above, there are also the major-modes for bison/wisent in CEDET (bovine-grammar.el, wisent-grammar.el) which are needed for writing and debugging the grammar files. I think those would also not necessarily be needed in Emacs. However, if people would like to extend or fix grammar files (or write new ones), they would then have to get CEDET from CVS. > (Wisent, for instance, appears to be an entire Elisp reimplementation > of Bison...) Yes, it is exactly that. :-) Regards, David ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-05 17:22 ` David Engster @ 2009-09-05 20:53 ` Chong Yidong 2009-09-05 23:08 ` David Engster 0 siblings, 1 reply; 29+ messages in thread From: Chong Yidong @ 2009-09-05 20:53 UTC (permalink / raw) To: David Engster; +Cc: emacs-devel David Engster <deng@randomsample.de> writes: > The bovine/wisent parsers and major-modes are crucial for development, > but I think they are not necessarily needed for the resulting parser; I > may be wrong though, especially when it comes to the Wisent parser, > which I'm not familiar with at all. > > For example, the file semantic/bovine/c.by is the Bison grammar for > C/C++ parsing. During CEDET's make process, the 'bovine' code generates > the file semantic/bovine/semantic-c-by.el, which is the resulting C(++) > lexer in Emacs Lisp. This file is then required by > semantic-c.el. Therefore, I would think that including the resulting > semantic-c-by.el should be enough for the C parser to be working. I see. I think it's better for us to merge just the generated Lisp grammar files, leaving the grammar development for upstream. It's an awful lot of infrastructure to pull in, considering that CEDET development won't be carried out in our repository anyway. Do you know if the bovine and wisent parsers are mutually replaceable? For instance, the default parser seems to be bovine; would it be a big deal if we included just the bovine parser? ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-05 20:53 ` Chong Yidong @ 2009-09-05 23:08 ` David Engster 0 siblings, 0 replies; 29+ messages in thread From: David Engster @ 2009-09-05 23:08 UTC (permalink / raw) To: Chong Yidong; +Cc: emacs-devel Chong Yidong <cyd@stupidchicken.com> writes: > Do you know if the bovine and wisent parsers are mutually replaceable? > For instance, the default parser seems to be bovine; would it be a big > deal if we included just the bovine parser? I don't think it makes much sense to include just the Bovine parser. If you look at http://cedet.sourceforge.net/languagesupport.shtml you'll see the currently supported languages in CEDET, together with their current status regarding completion, project support etc. The grammar column shows the type of grammar, "LL" or "LALR". The former is done with Bovine/Bison, the latter with Wisent. So Bison isn't really the default, but it's the older one, and especially the C/C++ support is pretty stable by now (there's also a Wisent parser for C, but it doesn't support C++ and AFAIK is currently not used). Some of the Wisent grammars are in the contrib directory, which probably means they basically work, but lack further infrastructure in Semantic. But I think the Wisent grammars work pretty much the same as the Bison ones, i.e., during CEDET's compilation a file 'wisent-<LANG>-wy.el' is created, which contains the actual parser. -David ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-05 16:28 CEDET merge question Chong Yidong 2009-09-05 17:22 ` David Engster @ 2009-09-06 15:37 ` Richard Stallman 2009-09-06 17:46 ` Ken Raeburn 2009-09-08 8:11 ` joakim 2 siblings, 1 reply; 29+ messages in thread From: Richard Stallman @ 2009-09-06 15:37 UTC (permalink / raw) To: Chong Yidong; +Cc: emacs-devel I ask because the CEDET merge already involves an uncomfortably large amount of code, and it's rather dismaying to see these two big code trees "embedded" in subdirectories of Semantic. (Wisent, for instance, appears to be an entire Elisp reimplementation of Bison...) Is it possible to use Bison itself rather than implement the same functionality differently? Or perhaps add an option to Bison to output its data in whatever format is convenient? ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-06 15:37 ` Richard Stallman @ 2009-09-06 17:46 ` Ken Raeburn 2009-09-06 21:11 ` David Engster 2009-09-07 13:34 ` Richard Stallman 0 siblings, 2 replies; 29+ messages in thread From: Ken Raeburn @ 2009-09-06 17:46 UTC (permalink / raw) To: rms; +Cc: Chong Yidong, emacs-devel On Sep 6, 2009, at 11:37, Richard Stallman wrote: > I ask because the CEDET merge already involves an uncomfortably > large > amount of code, and it's rather dismaying to see these two big code > trees "embedded" in subdirectories of Semantic. (Wisent, for > instance, > appears to be an entire Elisp reimplementation of Bison...) > > Is it possible to use Bison itself rather than implement the > same functionality differently? Or perhaps add an option > to Bison to output its data in whatever format is convenient? Guile is also using a translation/reimplementation of Bison in Scheme. I haven't looked at the CEDET code, but Guile's version wants the grammar input using Scheme (s-expression) syntax. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-06 17:46 ` Ken Raeburn @ 2009-09-06 21:11 ` David Engster 2009-09-06 22:26 ` Ken Raeburn 2009-09-07 13:33 ` Richard Stallman 2009-09-07 13:34 ` Richard Stallman 1 sibling, 2 replies; 29+ messages in thread From: David Engster @ 2009-09-06 21:11 UTC (permalink / raw) To: Ken Raeburn; +Cc: Chong Yidong, rms, emacs-devel Ken Raeburn <raeburn@raeburn.org> writes: > On Sep 6, 2009, at 11:37, Richard Stallman wrote: >> Is it possible to use Bison itself rather than implement the >> same functionality differently? Or perhaps add an option >> to Bison to output its data in whatever format is convenient? > > Guile is also using a translation/reimplementation of Bison in Scheme. > I haven't looked at the CEDET code, but Guile's version wants the > grammar input using Scheme (s-expression) syntax. CEDET uses Bison grammars which are extended through "Optional Lambda Expressions" (OLE). They produce the actual tags, which are the basic objects resulting from the parsing stage. I don't think this can be easily replaced by Bison itself or Guile. But there's really not that much additional framework associated with Bison/Bovine. In the 'bovine' subdirectory, there are the actual grammar files (like c.by, erlang.by, etc.), and the major- and debugging-modes (bovine-grammar.el, bovine-debug.el). I think they are really only needed for developing and testing grammars. The file semantic-bovine.el contains the parsing core and is crucial. Then, there are files which deal with language-specific issues, for example semantic-c.el, semantic-erlang.el, semantic-java.el, etc.. These files contain overrides and helper functions to deal with stuff which usually differs between languages, like smart completion, local variables, namespaces and scoping issues, special preprocessor macros, etc. These files are only crucial for parsing the named language. 
The file semantic-gcc.el sets up stuff like system include paths for C/C++ by looking at the local gcc installation; it's very helpful for people using gcc. The files semantic-skel.el and skeleton.by are just there to get people started developing their own grammars and overrides; I think they can be safely dropped. As mentioned in my previous mail, the files semantic-<LANG>-by.el result from the compilation of the *.by files and could probably just be provided 'as is' in Emacs, without the additional grammar developing framework (this also implies that files could not just get synced from CEDET CVS to Emacs, but would need a compilation step in between). I can't speak for Eric here, of course. Maybe there's some not-so-obvious dependency, or another good reason to include the full grammar framework. Regards, David ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-06 21:11 ` David Engster @ 2009-09-06 22:26 ` Ken Raeburn 2009-09-07 13:33 ` Richard Stallman 1 sibling, 0 replies; 29+ messages in thread From: Ken Raeburn @ 2009-09-06 22:26 UTC (permalink / raw) To: David Engster; +Cc: Chong Yidong, rms, emacs-devel On Sep 6, 2009, at 17:11, David Engster wrote: > Ken Raeburn <raeburn@raeburn.org> writes: >> On Sep 6, 2009, at 11:37, Richard Stallman wrote: >>> Is it possible to use Bison itself rather than implement the >>> same functionality differently? Or perhaps add an option >>> to Bison to output its data in whatever format is convenient? >> >> Guile is also using a translation/reimplementation of Bison in >> Scheme. >> I haven't looked at the CEDET code, but Guile's version wants the >> grammar input using Scheme (s-expression) syntax. > > CEDET uses Bison grammars which are extended through "Optional Lambda > Expressions" (OLE). They produce the actual tags, which are the basic > objects resulting from the parsing stage. I don't think this can be > easily replaced by Bison itself or Guile. Sorry, I didn't mean to suggest replacing it with Guile, more that, if the requirements were similar enough, Bison extensions to support both CEDET and Guile might be possible. But if you're extending the grammar with Lisp code, that may not be feasible.... Ken ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-06 21:11 ` David Engster 2009-09-06 22:26 ` Ken Raeburn @ 2009-09-07 13:33 ` Richard Stallman 2009-09-12 12:49 ` Eric M. Ludlam 1 sibling, 1 reply; 29+ messages in thread From: Richard Stallman @ 2009-09-07 13:33 UTC (permalink / raw) To: David Engster; +Cc: cyd, raeburn, emacs-devel CEDET uses Bison grammars which are extended through "Optional Lambda Expressions" (OLE). They produce the actual tags, which are the basic objects resulting from the parsing stage. I don't think this can be easily replaced by Bison itself or Guile. Why is it hard to add these to Bison? It can handle embedded C code, so why not embedded Lisp code? It should be straightforward to make such changes. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-07 13:33 ` Richard Stallman @ 2009-09-12 12:49 ` Eric M. Ludlam 2009-09-12 13:37 ` Miles Bader ` (3 more replies) 0 siblings, 4 replies; 29+ messages in thread From: Eric M. Ludlam @ 2009-09-12 12:49 UTC (permalink / raw) To: rms; +Cc: cyd, raeburn, David Engster, emacs-devel On Mon, 2009-09-07 at 09:33 -0400, Richard Stallman wrote: > CEDET uses Bison grammars which are extended through "Optional Lambda > Expressions" (OLE). They produce the actual tags, which are the basic > objects resulting from the parsing stage. I don't think this can be > easily replaced by Bison itself or Guile. > > Why is it hard to add these to Bison? > It can handle embedded C code, so why not embedded Lisp code? > It should be straightforward to make such changes. I don't know how bison works, but I would assume that bison parses basic C code (thus replacing $1 with some other piece of code.) In the same way, it would need to be taught about Emacs Lisp, Scheme, or any other language someone might want. Bison also outputs the code needed for traversing the generated parser table. When creating more than one parser in one application (ie - any scripting language case) this would be detrimental since it is basically the same code for every parser, which is wasteful. That said, I do think that it is possible, and maybe even desirable to do such a thing. The end result, however, would involve rather extreme changes to bison, and possibly flex if flex is also used. As others have pointed out, there are newer parser technologies available too such as PEG. How much of that is fad vs fabulous, I don't really know. What I do know is that the CEDET tools don't care much about the specifics of the parser. The parser tools it does have are to make it easy to create new parsers so Emacs can support a large number of languages. A very similar question to "why not make bison support Emacs Lisp output", is "why not have gcc support tagging output". 
If gcc supported a tagging output format with the details needed for CEDET to get its job done, it could just call out to gcc instead of parsing it in Emacs. CEDET would then magically support a lot more languages. There are a huge number of tools out there trying to do what gcc does, like ctags, etags, ectags, cscope, gnu global, doxygen, and idutils. What's worse is that none of them work well. Of course, an Emacs Lisp parser can do lots of other things besides create tags. That's just what it is currently used for. Eric ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-12 12:49 ` Eric M. Ludlam @ 2009-09-12 13:37 ` Miles Bader 2009-09-13 16:39 ` Richard Stallman 2009-09-12 16:34 ` David Engster ` (2 subsequent siblings) 3 siblings, 1 reply; 29+ messages in thread From: Miles Bader @ 2009-09-12 13:37 UTC (permalink / raw) To: eric; +Cc: cyd, raeburn, rms, David Engster, emacs-devel "Eric M. Ludlam" <eric@siege-engine.com> writes: > As others have pointed out, there are newer parser technologies > available too such as PEG. How much of that is fad vs fabulous, I don't > really know. It's only anecdotal, but my experience with Lpeg is that it's a lot more convenient and approachable than olde-style stuff like bison/flex, and not obviously any less powerful. In any case, it seems clear that some thought should be given before putting any significant effort into bison/flex. -Miles -- Fast, small, soon; pick any 2. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-12 13:37 ` Miles Bader @ 2009-09-13 16:39 ` Richard Stallman 2009-09-14 11:22 ` tomas 0 siblings, 1 reply; 29+ messages in thread From: Richard Stallman @ 2009-09-13 16:39 UTC (permalink / raw) To: Miles Bader; +Cc: cyd, raeburn, emacs-devel, deng, eric What is Lpeg, and what does it do? ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-13 16:39 ` Richard Stallman @ 2009-09-14 11:22 ` tomas 2009-09-14 12:15 ` Miles Bader 0 siblings, 1 reply; 29+ messages in thread From: tomas @ 2009-09-14 11:22 UTC (permalink / raw) To: Richard Stallman; +Cc: deng, cyd, emacs-devel, raeburn, eric, Miles Bader On Sun, Sep 13, 2009 at 12:39:44PM -0400, Richard Stallman wrote: > What is Lpeg, and what does it do? PEG stands for "Parsing Expression Grammars" and it is a grammar notation which basically represents formally a recursive descent parser. They are said to be a bit more powerful than context free grammars and (usually) more expressive. The most salient point for us "old-timers" is probably that the choices are "ordered" -- this has some price, but we get something for that: the distinction between lexer and parser becomes more flexible. The relevant paper seems to be [1]. It seems that they are very nice to bind to a language. LPEG is the implementation of PEGs to be used in Lua. [1] <http://pdos.csail.mit.edu/~baford/packrat/popl04/peg-popl04.pdf> Regards -- tomás ^ permalink raw reply [flat|nested] 29+ messages in thread
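[Editorial sketch] The remark that PEG choices are "ordered" can be illustrated with a few lines of Python. This is only a toy set of parser combinators written for this note; the names `lit`, `seq` and `choice` are hypothetical and have nothing to do with LPEG's actual API. Each parser takes a text and a position and returns the new position on success or `None` on failure.

```python
def lit(s):
    """Match the literal string s at the current position."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def seq(*parsers):
    """Match each parser in turn; fail as soon as one fails."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    """PEG ordered choice: the first alternative that succeeds wins,
    and the later alternatives are never reconsidered."""
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse

# ("a" / "ab") "c": the choice commits to "a", so "c" is then tried against "b".
first_a_then_ab = seq(choice(lit("a"), lit("ab")), lit("c"))
# ("ab" / "a") "c": the longer alternative is tried first.
first_ab_then_a = seq(choice(lit("ab"), lit("a")), lit("c"))

print(first_a_then_ab("abc", 0))  # None
print(first_ab_then_a("abc", 0))  # 3
```

A context-free grammar with the same two alternatives would accept "abc" in either order; with ordered choice the grammar author decides which alternative takes priority, which is presumably the "price" mentioned above.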
* Re: CEDET merge question 2009-09-14 11:22 ` tomas @ 2009-09-14 12:15 ` Miles Bader 2009-09-14 20:04 ` tomas 0 siblings, 1 reply; 29+ messages in thread From: Miles Bader @ 2009-09-14 12:15 UTC (permalink / raw) To: tomas; +Cc: Richard Stallman, deng, cyd, emacs-devel, raeburn, eric tomas@tuxteam.de writes: > PEG stands for "Parsing Expression Grammars" and it is a grammar > notation which basically represents formally a recursive descent parser. > > They are said to be a bit more powerful than context free grammars and > (usually) more expressive. The most salient point for us "old-timers" > is probably that the choices are "ordered" -- this has some price, but > we get something for that: the distinction between lexer and parser > becomes more flexible. The relevant paper seems to be [1]. > > LPEG is the implementation of PEGs to be used in Lua. > > [1] <http://pdos.csail.mit.edu/~baford/packrat/popl04/peg-popl04.pdf> Note that while LPEG is a PEG parser, it's _not_ a packrat parser (as in [1]); the packrat algorithm is just an implementation technique. I've appended a copy of a message I sent a while ago to emacs-devel on the same subject (LPEG vs. packrat). Note that I think it's not just the implementation technique which is interesting about LPEG, but also the very nice manner in which it's integrated with the language and made available for use. It's just an amazingly powerful and handy tool. I recommend reading the LPEG web page, where it gives a quick overview of it. Since Lua in many ways feels quite similar to Lisp, I think an elisp version would be similarly very natural and powerful. One difference though -- for Lua, LPEG uses overloaded operators for building up grammars; in elisp, it would probably be better to just use s-expressions to represent grammars, using backquotes to embed non-literal values. 
[earlier message:] You also might be interested in Roberto Ierusalimschy's paper on the implementation of LPEG, which is a PEG implementation for Lua: http://www.inf.puc-rio.br/~roberto/docs/peg.pdf Note that LPEG does _not_ use the packrat algorithm, as apparently it presents some serious practical problems for common uses of parsing tools: In 2002, Ford proposed Packrat [5], an adaptation of the original algorithm that uses lazy evaluation to avoid that inefficiency. Even with this improvement, however, the space complexity of the algorithm is still linear on the subject size (with a somewhat big constant), even in the best case. As its author himself recognizes, this makes the algorithm not befitting for parsing “large amounts of relatively flat” data ([5], p. 57). However, unlike parsing tools, regular-expression tools aim exactly at large amounts of relatively flat data. To avoid these difficulties, we did not use the Packrat algorithm for LPEG. To implement LPEG we created a virtual parsing machine, not unlike Knuth’s parsing machine [15], where each pattern is represented as a program for the machine. The program is somewhat similar to a recursive-descendent parser (with limited backtracking) for the pattern, but it uses an explicit stack instead of recursion. The general LPEG page is here: http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html -Miles -- Back, n. That part of your friend which it is your privilege to contemplate in your adversity. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-14 12:15 ` Miles Bader @ 2009-09-14 20:04 ` tomas 0 siblings, 0 replies; 29+ messages in thread From: tomas @ 2009-09-14 20:04 UTC (permalink / raw) To: Miles Bader Cc: Richard Stallman, deng, cyd, emacs-devel, tomas, eric, raeburn On Mon, Sep 14, 2009 at 09:15:02PM +0900, Miles Bader wrote: [...] > Note that while LPEG is a PEG parser, it's _not_ a packrat parser > [...] Thanks, Miles, for the (as usual, clearly expounded) insights. Regards -- tomás ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-12 12:49 ` Eric M. Ludlam 2009-09-12 13:37 ` Miles Bader @ 2009-09-12 16:34 ` David Engster 2009-09-13 16:39 ` Richard Stallman 2009-09-13 16:40 ` Richard Stallman 3 siblings, 0 replies; 29+ messages in thread From: David Engster @ 2009-09-12 16:34 UTC (permalink / raw) To: eric; +Cc: cyd, raeburn, rms, emacs-devel Eric M. Ludlam <eric@siege-engine.com> writes: > On Mon, 2009-09-07 at 09:33 -0400, Richard Stallman wrote: >> CEDET uses Bison grammars which are extended through "Optional Lambda >> Expressions" (OLE). They produce the actual tags, which are the basic >> objects resulting from the parsing stage. I don't think this can be >> easily replaced by Bison itself or Guile. >> >> Why is it hard to add these to Bison? >> It can handle embedded C code, so why not embedded Lisp code? >> It should be straightforward to make such changes. [...] > A very similar question to "why not make bison support Emacs Lisp > output", is "why not have gcc support tagging output". > > If gcc supported a tagging output format with the details needed for > CEDET to get its job done, it could just call out to gcc instead of > parsing it in Emacs. CEDET would then magically support a lot more > languages. Yes, I think that would be the way to go. Some time ago, I looked at a way to add Fortran 90/95 parsing to CEDET. It seems there's no free Bison grammar out there, but there is for example g95-xml [1], which apparently reuses the g95 parser and produces a XML output file, which could then be converted to Emacs Lisp tags. Also, in gfortran, there is a debug option '-fdump-parse-tree', which seems to produce an output almost usable by Semantic (most importantly, it's missing any source code information like line numbers, etc.). Similar to g95-xml, there's gcc-xml [2], which uses gcc's C++ parser to output a XML file. But it seems its development has stalled, and it currently can't parse templates, for example. 
One problem with this approach is how the parser reacts to 'code in progress', meaning syntactically incorrect code which is, for example, lacking some closing statements. I think that g95-xml just aborted in this case, which is why I never went further with this project. -David [1] http://sourceforge.net/projects/g95-xml/ [2] http://www.gccxml.org/HTML/Index.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-12 12:49 ` Eric M. Ludlam 2009-09-12 13:37 ` Miles Bader 2009-09-12 16:34 ` David Engster @ 2009-09-13 16:39 ` Richard Stallman 2009-09-13 17:38 ` Eric M. Ludlam 2009-09-13 16:40 ` Richard Stallman 3 siblings, 1 reply; 29+ messages in thread From: Richard Stallman @ 2009-09-13 16:39 UTC (permalink / raw) To: eric; +Cc: cyd, raeburn, deng, emacs-devel A very similar question to "why not make bison support Emacs Lisp output", is "why not have gcc support tagging output". Could you please explain what you mean by that? ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-13 16:39 ` Richard Stallman @ 2009-09-13 17:38 ` Eric M. Ludlam 2009-09-14 18:28 ` Richard Stallman 0 siblings, 1 reply; 29+ messages in thread From: Eric M. Ludlam @ 2009-09-13 17:38 UTC (permalink / raw) To: rms; +Cc: cyd, raeburn, deng, emacs-devel On Sun, 2009-09-13 at 12:39 -0400, Richard Stallman wrote: > A very similar question to "why not make bison support Emacs Lisp > output", is "why not have gcc support tagging output". > > Could you please explain what you mean by that? Sure. Etags, ctags, gnu global, idutils and cscope all have parsers of some sort that parse C and C++ code. Some use regexp matchers. Others have primitive parsers. gcc, of course, has a full language compliant parser which it uses to compile code. I'm not a gcc expert, but I assume that as it parses, it keeps track of the various symbols (functions, variables, namespaces, etc) and where they are. (ie - debug info for gdb). As such, it should be possible for gcc to easily output text representing a tags file. Etags style would be fairly simple. The output of exuberant ctags is more complex. The data needed by CEDET is more complex still, but is still a subset of everything that gcc needs to know. For CEDET, if gcc saw this file:
-------------
int main(int argc, char *argv[])
{
}
-------------
it would be handy (for my application) for it to output:
--------------
(("main" function
  (:arguments
   (("argc" variable (:type "int") [ 11 20 ])
    ("argv" variable (:pointer 1 :dereference 1 :type "char") [ 21 34] ))
   :type "int")
  [2 38]))
---------------
though, to be honest, any text output that is very regular would be fine. The part that makes this imperfect is that in Emacs, a file that needs parsing may be in the middle of an edit. Handling these cases can be a bit tricky for my simplified parser, and gcc doesn't have that editing information available. 
To handle this, the CEDET tools have different ways to parse files, such as "on save", and can track when a file is unparsable and take alternate actions when that happens. Eric ^ permalink raw reply [flat|nested] 29+ messages in thread
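[Editorial sketch] To make the shape of the tag output in the message above concrete, here is a small Python sketch that builds the same nested structure and prints it as an s-expression. It is not part of CEDET or gcc; the helpers `Sym`, `render` and `tag` are hypothetical names invented for this note, and the bounds are simply the numbers quoted in the message.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sym:
    """A bare Lisp symbol such as function or :type (printed without quotes)."""
    name: str

def render(obj):
    """Render nested Python data as an Emacs Lisp-style s-expression."""
    if isinstance(obj, Sym):
        return obj.name
    if isinstance(obj, str):
        return '"' + obj.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(obj, int):
        return str(obj)
    if isinstance(obj, tuple):   # tuples stand for vectors, e.g. [start end]
        return "[" + " ".join(render(x) for x in obj) + "]"
    if isinstance(obj, list):    # lists stand for Lisp lists
        return "(" + " ".join(render(x) for x in obj) + ")"
    raise TypeError(f"cannot render {obj!r}")

def tag(name, tag_class, attributes, start, end):
    """Build one tag: (NAME CLASS (:key value ...) [START END])."""
    plist = []
    for key, value in attributes.items():
        plist += [Sym(":" + key), value]
    return [name, Sym(tag_class), plist, (start, end)]

argc = tag("argc", "variable", {"type": "int"}, 11, 20)
argv = tag("argv", "variable",
           {"pointer": 1, "dereference": 1, "type": "char"}, 21, 34)
main = tag("main", "function", {"arguments": [argc, argv], "type": "int"}, 2, 38)

output = render([main])
print(output)
```

The point of the sketch is only that the format is regular: every tag is a name, a class, a property list and a pair of buffer positions, so any external tool able to emit those four pieces could in principle feed a consumer like the one described.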
* Re: CEDET merge question 2009-09-13 17:38 ` Eric M. Ludlam @ 2009-09-14 18:28 ` Richard Stallman 0 siblings, 0 replies; 29+ messages in thread From: Richard Stallman @ 2009-09-14 18:28 UTC (permalink / raw) To: eric; +Cc: cyd, raeburn, deng, emacs-devel Etags, ctags, gnu global, idutils and cscope all have parsers of some sort that parse C and C++ code. Some use regexp matchers. Others have primitive parsers. gcc, of course, has a full language compliant parser which it uses to compile code. I'm not a gcc expert, but I assume that as it parses, it keeps track of the various symbols (functions, variables, namespaces, etc) and where they are. (ie - debug info for gdb). Now I know what you are talking about. This idea seems very appealing, but it has a grave flaw. The flaw comes from the way GCC handles input: it does preprocessing first, and real parsing operates only on the output of preprocessing. So the output that GCC can easily make would describe only the output of preprocessing. Definitions and calls which are not actually compiled won't be seen at all. Macros and references to them won't be seen at all. What etags does now is much better, because it avoids that problem. It is true that output from GCC would give more details about types, etc., and would avoid getting confused in a few strange situations. So there is indeed an advantage to generating the output from GCC. But the disadvantage is much more important. I designed a way to make GCC analyze and report on macros and on the code that's not compiled in. That would get the best of both aspects. But this is not a small job. Please don't ask me to write more details unless you're prepared to do a substantial amount of work and study the GCC parsing code carefully. ^ permalink raw reply [flat|nested] 29+ messages in thread
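[Editorial sketch] The flaw described in the message above (a tagger that runs after preprocessing never sees uncompiled code or macros) can be shown with a toy example in Python. Nothing here reflects how GCC or etags is actually implemented: `toy_preprocess` is a deliberately crude stand-in that only understands `#ifdef`/`#endif` and drops other directives, and `names` is a naive regexp scan invented for this note.

```python
import re

SOURCE = """\
#define SQUARE(x) ((x) * (x))
int used(void) { return SQUARE(2); }
#ifdef RARE_PLATFORM
int unused(void) { return 0; }
#endif
"""

def toy_preprocess(text, defined=()):
    """Keep only lines in active #ifdef regions and drop all directives.
    Macro expansion is not modelled; only the loss of information matters here."""
    out, active = [], [True]
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#ifdef"):
            active.append(active[-1] and stripped.split()[1] in defined)
        elif stripped.startswith("#endif"):
            active.pop()
        elif stripped.startswith("#"):
            continue                      # e.g. #define lines disappear
        elif active[-1]:
            out.append(line)
    return "\n".join(out)

FUNCTION = re.compile(r"^\w+[ \t]+(\w+)[ \t]*\(", re.M)
MACRO = re.compile(r"^#define[ \t]+(\w+)", re.M)

def names(text):
    """Naive regexp scan for function and macro definitions."""
    return FUNCTION.findall(text) + MACRO.findall(text)

print(names(SOURCE))                  # ['used', 'unused', 'SQUARE']
print(names(toy_preprocess(SOURCE)))  # ['used']
```

Scanning the raw source finds the macro and the function inside the inactive `#ifdef`; scanning after preprocessing finds only what would actually be compiled, which is the trade-off the message describes.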
* Re: CEDET merge question 2009-09-12 12:49 ` Eric M. Ludlam ` (2 preceding siblings ...) 2009-09-13 16:39 ` Richard Stallman @ 2009-09-13 16:40 ` Richard Stallman 3 siblings, 0 replies; 29+ messages in thread From: Richard Stallman @ 2009-09-13 16:40 UTC (permalink / raw) To: eric; +Cc: cyd, raeburn, deng, emacs-devel I don't know how bison works, but I would assume that bison parses basic C code (thus replacing $1 with some other piece of code.) In the same way, it would need to be taught about Emacs Lisp, Scheme, or any other language someone might want. Bison parses grammar definition files, which can contain segments of code. Normally the syntax for a segment of code is {...}. Bison generates tables for a parser, and puts the segments of code into a function to do the parsing. Normally that function is written in C. However, using a different language and different syntax is just a superficial change. The end result, however, would involve rather extreme changes to bison, and possibly flex if flex is also used. Oh no. The complex parts of Bison would not be changed at all. Only some of the parser and the output code. These are the parts that are easy to understand, without even minimal knowledge of parsing. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-06 17:46 ` Ken Raeburn 2009-09-06 21:11 ` David Engster @ 2009-09-07 13:34 ` Richard Stallman 1 sibling, 0 replies; 29+ messages in thread From: Richard Stallman @ 2009-09-07 13:34 UTC (permalink / raw) To: Ken Raeburn; +Cc: cyd, emacs-devel > Is it possible to use Bison itself rather than implement the > same functionality differently? Or perhaps add an option > to Bison to output its data in whatever format is convenient? Guile is also using a translation/reimplementation of Bison in Scheme. That may be wasteful too, but it is a separate issue. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-05 16:28 CEDET merge question Chong Yidong 2009-09-05 17:22 ` David Engster 2009-09-06 15:37 ` Richard Stallman @ 2009-09-08 8:11 ` joakim 2009-09-08 9:07 ` Lennart Borgman 2009-09-08 14:41 ` Chong Yidong 2 siblings, 2 replies; 29+ messages in thread From: joakim @ 2009-09-08 8:11 UTC (permalink / raw) To: Chong Yidong; +Cc: Tom Tromey, emacs-devel Chong Yidong <cyd@stupidchicken.com> writes: > I have a question about CEDET that hopefully someone on this list, who > has more experience using CEDET than me, can help answer (I've been > corresponding with Eric Ludlam, but he's gone on vacation). > > The Semantic parser appears to have two major "back-ends", bovine and > wisent, which are used to generate Semantic tags. Does anyone know how > crucial these packages are, and whether one or the other (or both) can be > dropped or somehow trimmed down? > > I ask because the CEDET merge already involves an uncomfortably large > amount of code, and it's rather dismaying to see these two big code > trees "embedded" in subdirectories of Semantic. (Wisent, for instance, > appears to be an entire Elisp reimplementation of Bison...) > Emacs hackers would still need easy access to these tools. Maybe this is a further case for including something like Tom Tromey's ELPA in Emacs? If we had something like that by default it wouldn't be a big deal to distribute tools like these to Emacs hackers. > > (The CVS branch I'm using for the CEDET merge is not yet suitable for > general testing; I'll inform the list when it's ready.) > -- Joakim Verona ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: CEDET merge question 2009-09-08 8:11 ` joakim @ 2009-09-08 9:07 ` Lennart Borgman 2009-09-08 9:09 ` Lennart Borgman 2009-09-08 14:41 ` Chong Yidong 1 sibling, 1 reply; 29+ messages in thread From: Lennart Borgman @ 2009-09-08 9:07 UTC To: joakim; +Cc: Tom Tromey, Chong Yidong, emacs-devel On Tue, Sep 8, 2009 at 10:11 AM, <joakim@verona.se> wrote: > > Emacs hackers would still need easy access to these tools. Maybe this is > a further case for including something like Tom Tromey's ELPA in Emacs? > If we had something like that by default, it wouldn't be a big deal to > distribute tools like these to Emacs hackers. But does ELPA have the necessary version info structure?
* Re: CEDET merge question 2009-09-08 9:07 ` Lennart Borgman @ 2009-09-08 9:09 ` Lennart Borgman 0 siblings, 0 replies; 29+ messages in thread From: Lennart Borgman @ 2009-09-08 9:09 UTC To: joakim; +Cc: Tom Tromey, Chong Yidong, emacs-devel On Tue, Sep 8, 2009 at 11:07 AM, Lennart Borgman <lennart.borgman@gmail.com> wrote: > On Tue, Sep 8, 2009 at 10:11 AM, <joakim@verona.se> wrote: >> >> Emacs hackers would still need easy access to these tools. Maybe this is >> a further case for including something like Tom Tromey's ELPA in Emacs? >> If we had something like that by default, it wouldn't be a big deal to >> distribute tools like these to Emacs hackers. > > > But does ELPA have the necessary version info structure? Eh, sorry. I mean, wouldn't it be better to have a tool to install directly from the repository where this part of CEDET is? I think it would be rather easy to write such a tool, which accesses the web interface of the repository just for downloading.
* Re: CEDET merge question 2009-09-08 8:11 ` joakim 2009-09-08 9:07 ` Lennart Borgman @ 2009-09-08 14:41 ` Chong Yidong 2009-09-08 15:10 ` joakim 2009-09-08 21:21 ` Romain Francoise 1 sibling, 2 replies; 29+ messages in thread From: Chong Yidong @ 2009-09-08 14:41 UTC To: joakim; +Cc: Tom Tromey, emacs-devel joakim@verona.se writes: > Emacs hackers would still need easy access to these tools. Maybe this is > a further case for including something like Tom Tromey's ELPA in Emacs? Eric's still going to develop CEDET in his repository, so if you'll be hacking on CEDET, I think you should use the version of CEDET he has installed, instead of the version that will eventually be bundled with Emacs. This is a practical matter: since CEDET is such a large and complicated package, we shouldn't make changes directly to our copy of it, apart from those that are necessary to adapt it to Emacs' conventions and build system (which is what I've been working on). Instead, changes should be applied first to Eric's repository, then merged back into our tree.
* Re: CEDET merge question 2009-09-08 14:41 ` Chong Yidong @ 2009-09-08 15:10 ` joakim 2009-09-08 17:18 ` Chong Yidong 2009-09-08 21:21 ` Romain Francoise 1 sibling, 1 reply; 29+ messages in thread From: joakim @ 2009-09-08 15:10 UTC To: Chong Yidong; +Cc: Tom Tromey, emacs-devel Chong Yidong <cyd@stupidchicken.com> writes: > joakim@verona.se writes: > >> Emacs hackers would still need easy access to these tools. Maybe this is >> a further case for including something like Tom Tromey's ELPA in Emacs? > > Eric's still going to develop CEDET in his repository, so if you'll be > hacking on CEDET, I think you should use the version of CEDET he has > installed, instead of the version that will eventually be bundled with > Emacs. > > This is a practical matter: since CEDET is such a large and complicated > package, we shouldn't make changes directly to our copy of it, apart > from those that are necessary to adapt it to Emacs' conventions and > build system (which is what I've been working on). Instead, changes > should be applied first to Eric's repository, then merged back into our > tree. I will manage. I was thinking more of newcomers to the project who would like to contribute grammars, for instance. -- Joakim Verona
* Re: CEDET merge question 2009-09-08 15:10 ` joakim @ 2009-09-08 17:18 ` Chong Yidong 0 siblings, 0 replies; 29+ messages in thread From: Chong Yidong @ 2009-09-08 17:18 UTC To: joakim; +Cc: Tom Tromey, emacs-devel joakim@verona.se writes: >> This is a practical matter: since CEDET is such a large and complicated >> package, we shouldn't make changes directly to our copy of it, apart >> from those that are necessary to adapt it to Emacs' conventions and >> build system (which is what I've been working on). Instead, changes >> should be applied first to Eric's repository, then merged back into our >> tree. > > I will manage. I was thinking more of newcomers to the project who > would like to contribute grammars, for instance. I agree that it would be good to make it easier for newcomers to hack on the parsing infrastructure. The ELPA suggestion is a good one, but I think it's a bit ambitious to implement it in the 23.2 timeframe.
* Re: CEDET merge question 2009-09-08 14:41 ` Chong Yidong 2009-09-08 15:10 ` joakim @ 2009-09-08 21:21 ` Romain Francoise 2009-09-08 22:27 ` Chong Yidong 1 sibling, 1 reply; 29+ messages in thread From: Romain Francoise @ 2009-09-08 21:21 UTC To: Chong Yidong; +Cc: emacs-devel Chong Yidong <cyd@stupidchicken.com> writes: > This is a practical matter: since CEDET is such a large and > complicated package, we shouldn't make changes directly to our > copy of it, apart from those that are necessary to adapt it to > Emacs' conventions and build system (which is what I've been > working on). Instead, changes should be applied first to Eric's > repository, then merged back into our tree. If it's so large and complicated that we can't handle it like the rest of Emacs, is it really a good idea to merge it in? I don't think we have any other packages where the rule is "make the change upstream first". That sounds like a liability to me.
* Re: CEDET merge question 2009-09-08 21:21 ` Romain Francoise @ 2009-09-08 22:27 ` Chong Yidong 0 siblings, 0 replies; 29+ messages in thread From: Chong Yidong @ 2009-09-08 22:27 UTC To: Romain Francoise; +Cc: emacs-devel Romain Francoise <romain@orebokech.com> writes: > I don't think we have any other packages where the rule is "make the > change upstream first". That sounds like a liability to me. Actually, that's the situation for Org mode. If you want to *develop* Org mode, I would encourage you to work on the upstream version, not the version in Emacs. This refers to development, not bug fixes (I apologize if my prior post caused confusion). In the preceding discussion, Joakim was talking about writing new Semantic grammars, i.e. development. Bugfixes can of course be applied to the version in the Emacs repository. (Though bugfixes should, in most cases, also be pushed upstream.) In the future, we will want to integrate CEDET more deeply into Emacs. When that time comes, we'll need a new arrangement. But for the time being, I'd prefer to treat CEDET more like (say) Org mode than (say) Calendar.
Thread overview: 29+ messages 2009-09-05 16:28 CEDET merge question Chong Yidong 2009-09-05 17:22 ` David Engster 2009-09-05 20:53 ` Chong Yidong 2009-09-05 23:08 ` David Engster 2009-09-06 15:37 ` Richard Stallman 2009-09-06 17:46 ` Ken Raeburn 2009-09-06 21:11 ` David Engster 2009-09-06 22:26 ` Ken Raeburn 2009-09-07 13:33 ` Richard Stallman 2009-09-12 12:49 ` Eric M. Ludlam 2009-09-12 13:37 ` Miles Bader 2009-09-13 16:39 ` Richard Stallman 2009-09-14 11:22 ` tomas 2009-09-14 12:15 ` Miles Bader 2009-09-14 20:04 ` tomas 2009-09-12 16:34 ` David Engster 2009-09-13 16:39 ` Richard Stallman 2009-09-13 17:38 ` Eric M. Ludlam 2009-09-14 18:28 ` Richard Stallman 2009-09-13 16:40 ` Richard Stallman 2009-09-07 13:34 ` Richard Stallman 2009-09-08 8:11 ` joakim 2009-09-08 9:07 ` Lennart Borgman 2009-09-08 9:09 ` Lennart Borgman 2009-09-08 14:41 ` Chong Yidong 2009-09-08 15:10 ` joakim 2009-09-08 17:18 ` Chong Yidong 2009-09-08 21:21 ` Romain Francoise 2009-09-08 22:27 ` Chong Yidong