Hello all,

Thanks for opening this can of, er, threads. I was going to ask about these things myself soon in any case, because it's clear that js2-mode is not doing a very effective job of surfacing its rich information in Emacs. This is partly my fault, but it is also partly due to some issues with font-lock that I'll describe in nauseating detail.

There are several important ideas being conflated in this thread that I think need to be teased apart before we can talk responsibly about any of them. I've called out the top five conflations in sections below delimited by roman numerals. This is all in some sense an elaboration of what Eric Ludlam just posted, to which I can only add my miserable +1.

Stephen Eilert wrote:
> I do not think that was done without a very good reason (and there's a lengthy post explaining it), unless the author is a complete masochist.

I don't think of myself that way. Here, as requested, is a lengthy post explaining my approach. For the record, it could have been much lengthier, and I have lengthy replies ready for all your objections and concerns. (Just in case you were wondering.) I really do want to get this resolved, though.

I. Asynchronous parsing

js2-mode performs both syntactic and (some) semantic analysis. It knows, for instance, when you're using a symbol that's not defined in its file. js2-mode does not currently understand project structure, but I'm doing some work in this area, and it may at some point gather semantic information collected from several files.

Because this analysis requires parsing the entire file at least once (see my discussion of partial/incremental parsing below), and it may someday involve looking at symbol tables from other files, it seemed best to run the parse asynchronously, so as not to interfere with the user's editing.

One byproduct of having an accurate parser and symbol table is that you can obtain style runs with relatively small effort, so js2-mode does its own highlighting.

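As a rough illustration of the asynchronous approach described above, here is a minimal sketch that defers a whole-buffer parse until Emacs is idle, so that typing is never blocked by the parser. It assumes an idle timer restarted from after-change-functions; the post does not say which mechanism js2-mode actually uses, and every `my-reparse-' name (and the 0.2 second delay) is hypothetical. The parse itself is a placeholder counter. Note that the work still runs on the main thread; it is deferred rather than concurrent.

```elisp
;; Hypothetical sketch: none of these names belong to js2-mode.

(defvar my-reparse-delay 0.2
  "Assumed idle delay, in seconds, before re-parsing.")

(defvar-local my-reparse-timer nil
  "Pending idle timer for the current buffer, or nil.")

(defvar-local my-reparse-count 0
  "Number of completed parses; stands in for a real parse result.")

(defun my-reparse-buffer (buffer)
  "Re-parse BUFFER if it is still live.  The parse is a placeholder."
  (when (buffer-live-p buffer)
    (with-current-buffer buffer
      (setq my-reparse-timer nil)
      ;; A real mode would rebuild its AST and symbol table here.
      (setq my-reparse-count (1+ my-reparse-count)))))

(defun my-reparse-schedule (&rest _ignored)
  "Restart the idle timer so parsing waits until the user pauses."
  (when (timerp my-reparse-timer)
    (cancel-timer my-reparse-timer))
  (setq my-reparse-timer
        (run-with-idle-timer my-reparse-delay nil
                             #'my-reparse-buffer (current-buffer))))

(defun my-reparse-setup ()
  "Schedule a re-parse after every change in the current buffer."
  (add-hook 'after-change-functions #'my-reparse-schedule nil t))
```

Calling `my-reparse-setup' from a mode hook would be one way to wire this up. Because each edit cancels the pending timer, a burst of typing results in a single parse once the user pauses.
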
The downside is that this highlighting information is unavailable at font-lock time, and it is not available piecewise -- it's all-or-nothing.

There is a relatively simple alternative that might appease Daniel: I could have js2-mode simply not do any highlighting by default, except for errors and warnings. We'd use whatever highlighting is provided by espresso-mode, and users would be able to choose between espresso-highlighting and js2-mode highlighting. With the former, they'd get "instantaneous" font-locking, albeit not as rich as what js2-mode can provide. This would be trivial to change.

I am actively maintaining js2-mode, and the only reason I haven't checked in any changes since my initial commit to the trunk is inexperience: I'm trying to get a handle on how many changes people tend to aggregate before checking in a change to any given mode. But I have several fixes (including some patches contributed from users) that are ready to commit, and more on the way.

Errors and warnings would still need to be asynchronous (if they're enabled). So, too, would the imenu outline and my in-progress buffer-based outline, which is somewhat nicer than the imenu one.

But I think the main objection to js2-mode revolves around its highlighting, correct? If so, AND if we can solve the font-lock integration issues, AND if we can fix the multi-mode issues (II below), then I'm hopeful that js2-mode might become a reasonable choice as the default editing mode for JavaScript.

I think espresso-mode is a fine fallback position. Anything but java-mode! The default today is java-mode, and I had no qualms about replacing it as the default for JavaScript.

Note: diagnostic messages in js2-mode are highlighted using overlays. I tried using overlays for all highlighting but it was unacceptably slow and had a tendency to crash Emacs. But there are usually not prohibitively many errors and warnings, since the error-recovery algorithm is somewhat coarse-grained.
So error-reporting works independently of font-lock.

II. Multi-mode support

JavaScript is especially needful of mumamo (or equivalent) multi-mode support, because much of the JavaScript in the wild is embedded in HTML, in template files, even in strings in other languages.

js2-mode does not support mumamo (or mmm-mode, with which I am currently more familiar) because doing so requires js2-mode's lexer to support ignoring parts of the buffer. I do not think this would be very hard to implement, but I have not done it yet.

If I don't get to it before the next version of Emacs launches, then I think this should effectively disqualify js2-mode from being the default JavaScript mode. It would be an inconsistent user experience to have one JavaScript mode in .js files and another mode for JavaScript inside multi-mode-enabled files.

I'm ready to give it a try, though, and I'll ping Lennart offline about integrating the two somehow.

III. Incremental and partial parsing

Lennart and others have asked whether it is possible for js2-mode to support partial or incremental parsing. The short answer is "incremental: yes; partial: no".

nxml-mode, last I checked, does incremental parsing. It parses ahead in the buffer, but then stops and saves its state. If you jump forward in the buffer, it resumes and continues the parse until some point beyond the section you're viewing.

js2-mode could do it this way without much additional effort. I chose not to because once you've decided to use background parsing, it doesn't seem like an especially useful optimization. But I could see it being helpful in some cases, such as when you're editing near the top of a large file -- as long as the whole file isn't encased in some top-level expression, which unfortunately is often the case in JS.

Partial parsing is a different beast entirely. The goal of a partial parser is to re-parse the minimum amount necessary, given some region that has changed.
I've dug into this a bit, because originally I wanted to support it in js2-mode. I even made some progress on an implementation.

While a few production parsers (for Java and JavaScript) have implemented partial parsing, the vast majority of them do not support it -- instead, they re-parse from the top. They do this because the incremental benefit of partial parsing is debatable, assuming you're time- and resource-constrained, as most of us are.

I took a close look at Eclipse and IntelliJ, and even asked some of their users to characterize the highlighting behavior of the IDE. Without exception, the IDE users had internalized a ~1000 ms delay in highlighting and error reporting as part of their normal workflow, and they uniformly described it as "instant{aneous}" until I made them time it.

I've been an Emacs user for 20+ years now, and like many I found the idea of a parsing delay to be somewhere between "undesirable" and "sickening". But the majority of programmers today have apparently learned not to notice delays of ~1 sec as long as it never interferes with their typing or indentation (see IV below).

So after looking at my ~8000 lines of elisp devoted to parsing JavaScript, I weighed it and decided not to support partial parsing. It's certainly possible to support it, but I think my time would be better spent on things that average users are more likely to notice. YMMV, of course.

The upshot is that if I'm going to support mumamo, it will need to work within js2-mode's existing full-reparse framework. I can think of various ways to make it work, though, and as I mentioned I'll talk to Lennart about it.

IV. Indentation

The indentation in js2-mode is broken. I'll be the first to say it.

It is based on the indentation in Karl Landström's mode, which does a better job for JavaScript than any indenter based on cc-engine, but that doesn't mean it's a good job. And it's essentially unconfigurable.
espresso-mode shares this problem, which means that for this important use case it is not an improvement over js2-mode.

Daniel's objections to js2-mode's non-interaction with font-lock apply equally to the non-interaction with cc-engine's indentation configuration system. The indent configuration for JavaScript should share as many settings as practical with cc-mode.

I actually made a serious attempt to generate the `c-style-alist' data structure for js2-mode using the parse tree, but ran into three issues:

1) it's much harder than I thought it would be, even with a full parse tree available. I had some 2000 lines of elisp invested in it when I pooped out, to be perfectly frank.

2) `c-style-alist' (like font-lock) does not have enough semantic variables to encompass the range of indentation contexts that JavaScript programmers care about. I think we'd need to add 5-10 more, although it's been 18 months since I looked into it.

3) indentation in "normal" Emacs modes also runs synchronously as the user types. Waiting 500-800 msec or more for the parse to finish is (I think) not acceptable for indentation. For small files the parse time is acceptable, but it would not be generally scalable.

#3 is the reason I gave up on #1. It didn't seem to be worth the effort to produce an accurate but slow indenter.

I don't know exactly how to solve this problem. I have lots of ideas, but it appears there are few low-hanging fruit in this space.

V. Font Lock framework design problems

There seems to be a common misconception flitting about to the effect that font-lock is perfect and will never need to change. This is a somewhat paradoxical viewpoint in view of the corpses littering the path to jit-lock, which include font-lock, fast-lock, lazy-lock, and vapor-lock.

Each decade we've had a cadre of people claiming that *-lock meets everyone's needs, and then it gets rewritten anyway. So it's hard to understand how it remains such a popular viewpoint.
I'll make yet another attempt to dispel it, since once we're past the emotional stumbling blocks, font-lock may be able to evolve again.

Va) Inadequate/insufficient style names

There are not enough font-lock faces to represent all the semantic style runs that are identifiable to "real" language analyzers.

js2-mode makes several semantic distinctions not available in most Emacs modes, although such distinctions are available in JDEE and other CEDET-enabled modes, so js2-mode is by no means alone in its needs.

In addition to the autoloaded font-lock faces, which js2-mode uses whenever possible, js2-mode defines several new faces, including:

* function parameters
* "class" instance members (in JS, prototype and instance props)
* local variables
* undeclared variables
* private members (although I implemented it poorly -- see below)
* html/xml tags, attr names and delimiters -- used both for html in jsdoc comments and for E4X literals
* doc tags such as those typically found in javadoc/jsdoc comments
* warnings, errors, and informational diagnostics

I do not expect that this set is all-inclusive -- over time as js2-mode and similar modes get smarter, they will be able to make other semantic distinctions that users may wish to customize independently. Given that Emacs is the most configurable editor on the planet, I do not see any reason to entertain arguments to the contrary.

Vb) Ad-hoc default faces that are not being autoloaded

There are some modes (e.g. sgml-mode, html-mode, nxml-mode) that define their own versions of some of the xml/html faces, but it did not seem right to make js2-mode `require' one of these modes just to get at ad-hoc "standard" definitions for these faces.

We should define standard faces for xml/html tags and entities, and for any other faces that are effectively defined by 2 or more modes.

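A minimal sketch of what such shared definitions could look like, assuming they simply inherit their attributes from existing font-lock faces until a user customizes them. The two `my-markup-' face names are hypothetical; they are not defined by Emacs or by any of the modes mentioned above.

```elisp
;; Hypothetical shared faces for XML/HTML markup.  The names are
;; illustrative only; the parent faces and the `font-lock-faces'
;; customization group are the standard ones from font-lock.el.

(defface my-markup-tag-face
  '((t :inherit font-lock-function-name-face))
  "Face for XML/HTML tag names, intended to be shared across modes."
  :group 'font-lock-faces)

(defface my-markup-attribute-face
  '((t :inherit font-lock-variable-name-face))
  "Face for XML/HTML attribute names, intended to be shared across modes."
  :group 'font-lock-faces)
```

With definitions like these living in one commonly loaded place, a mode could refer to the faces without having to load sgml-mode or nxml-mode, and users would customize one face rather than one per mode.
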
Vc) Additional semantic styles not needed by JavaScript

I have other language modes in progress, and together they define an ever larger set of semantic styles. The set of available font-lock names should try to encompass the _union_ of the needs of most languages, not the intersection.

There should, for instance, be a font-lock-symbol-face for languages with distinguished symbols such as Lisp, Scheme and Ruby.

I think this is relatively easy to fix, provided a little thought goes into choosing the new faces. Vd and Ve below should help clarify why it requires greater than zero thought.

Vd) Composable semantic styles

Some font-lock faces represent "primary" semantic roles, in a vague way. For instance, there is a font-lock-function-name-face, and this is different from font-lock-variable-name-face. While in some languages (including JavaScript) the distinction is not necessarily exact, they can usually be reconciled -- e.g. being a function is a more "important" property of an identifier than being a variable.

Most of the font-lock faces represent very common primary roles: strings, comments, keywords, types, preprocessor macros. But not all. font-lock-constant-face is actually orthogonal to the primary role. A class or method or parameter can be const or non-const in some languages.

The semantic notion of public/private/protected/package/friend visibility is another example. So is "abstract"/"pure virtual".

Emacs supports composable faces (a style run may have multiple faces, and the attributes compose according to predefined rules), but font-lock provides neither consistent nor adequate support for this notion.

Ve) Ambiguous semantic styles

At least one of the face names is ambiguous -- it's not clear what font-lock-builtin-face is actually supposed to highlight. The result is that different language modes use it for different kinds of entities.
If you customize the face for one mode, you may wind up with unsatisfying results in another mode due to the differences in relative weighting/distribution of semantic types across languages.

As a hypothetical example, someone might enhance python-mode to use font-lock-builtin-face to highlight True/False/None and possibly "self", since they're not keywords but they are all handled specially by the runtime. (font-lock-type-face might be better for this, but since they're not really classes, you could argue it either way.) These tokens appear relatively infrequently in Python. If someone else were to use it to highlight functions implemented in C in elisp, there would be a lot more of that face appearing in elisp buffers, and it might not be easy to choose one face that looks nice in both situations.

Regardless of the fate of js2-mode, font-lock needs to add more semantic faces. By default these new faces might simply inherit face attributes from their "syntactic parents" -- e.g. the faces for locals, parameters, instance and static vars might all inherit the settings for `font-lock-variable-name-face'. But users should be able to differentiate among them when the information is available.

Vf) No font-lock interface for setting exact style runs

I could be mistaken here -- if so, please correct me. My limited understanding of font-lock and its main entry-point mechanisms such as font-lock-keywords and font-lock-apply-highlight, all of which use the MATCH-HIGHLIGHT data structure, is that they are not quite powerful enough for my needs in their current incarnation.

This issue is independent of asynchronous parsing -- I think that even if my parser were instantaneous, I would still have this issue.

The problem is that I need a way, in a given font-lock redisplay, to say "highlight the region from X to Y with text properties {Z}". This use case does not seem like it should be inordinately difficult to support, but it does not seem to be supported today.

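To make the request concrete, here is a minimal sketch of the operation being asked for: given a list of precomputed style runs, apply each one as a `face' text property. The function name `my-apply-style-runs' and the (START END FACE) run format are hypothetical; they are not js2-mode's or font-lock's API. The sketch sidesteps font-lock entirely, which is exactly the limitation: with font-lock-mode enabled, a later refontification may clear or override these properties.

```elisp
;; Hypothetical sketch: apply precomputed style runs directly,
;; outside of font-lock.

(defun my-apply-style-runs (runs)
  "Apply RUNS to the current buffer.
RUNS is a list of (START END FACE) triples; this format is an
assumption made for the sketch."
  ;; Avoid marking the buffer modified or recording undo entries.
  (with-silent-modifications
    (dolist (run runs)
      ;; Clamp each run to the accessible portion of the buffer.
      (let ((start (max (point-min) (nth 0 run)))
            (end (min (point-max) (nth 1 run)))
            (face (nth 2 run)))
        (when (< start end)
          (put-text-property start end 'face face))))))
```

In a buffer containing "var x = 1;", evaluating (my-apply-style-runs '((1 4 font-lock-keyword-face) (5 6 font-lock-variable-name-face))) would put the keyword face on "var" and the variable-name face on "x".
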
When I assert that it's not possible, I understand that it's _theoretically_ possible. Given a JavaScript file with 2500 style runs, assuming I had that information available at font-lock time, I could return a matcher that contains 2500 regular expressions, each one of which is tailored to match one and exactly one region in the buffer.

In practice, however, I am not aware of a way to do this that is either clean or efficient.

If this simple feature were supported, I would have a great deal more incentive to try to get my parsing to be fast enough to work within the time constraints users expect from font-lock.

Vg) Lack of differentiation between mode- and minor-mode styles

One of the most common complaints from the thousands of users of js2-mode, most of whom have exercised enough self-restraint to use the term "work in progress" in preference to "abomination", is that js2-mode has poor support for minor modes that do their work with font-lock -- 80-column highlighters being a popular example, although there are others.

The fundamental problem here is that the font-lock framework does not differentiate between the mode's syntax highlighting and the keywords installed by minor modes and by user code. Instead, it merges them.

As far as I can tell, the officially supported mechanism for adding additional font-lock patterns is `font-lock-add-keywords'. This either appends or prepends the keywords to the defaults.

It might be possible to reverse-engineer it, for instance by manually diffing the buffer's font-lock-defaults and font-lock-keywords and trying to figure out which ones were added by participants other than the major mode.

Even if it's possible, it's not clear that it always works now, and would always work in the future. For one thing, it's possible (as Daniel observes) to bypass this mechanism and call font-lock-apply-highlight directly, which makes the reverse-engineering even more cumbersome and fragile.

(Vf) is the reason (Vg) is a problem for js2-mode.
font-lock-defaults does not seem to be a very satisfactory way to apply 2000-10000 precise style runs to a buffer, so I do all my own highlighting, and it doesn't include style-run contributions from minor modes.

I've made some halfhearted attempts to hack around the problem, but they've proven fragile. If font-lock were to support (Vf), then I think (Vg) should "just work".

VI. Summary

I've called out some of the main integration issues I've encountered. I've penned several major and minor language modes, not just js2-mode, and I've chosen to whine here about the problems that could best be classified as "problem themes".

I'm around, and I'm available for nontrivial work. If group consensus is that js2-mode isn't ready yet, I'm happy to keep hacking on it and taking user patches and feedback until Emacs 24 rolls around.

But it would be nice to have more direct support for modes like mine. I'm willing to do my end of it, but I'm always oversubscribed, and I've already signed up to support mouse-enter and mouse-left text props as part of another js2-mode-related thread. So a little help would go a long way.

-steve