Hello Emacs maintainers, A few years ago, I took up the challenge of implementing indentation support for the JavaScript extension “JSX” (by Facebook, for their “React” UI library), and we got my patches merged into js-mode. Now I think it’s time we reexamined and began to improve Emacs’ JSX+React support. My initial implementation was not too great, because it failed to indent JSX code properly in many common scenarios. My hueristics for finding HTML-like code inside a JS buffer were too conservative, due to a (perhaps excessive/misguided) attention to performance in particular areas of the code. Since my patch’s release, I’ve been watching from a distance as the bug reports have piled up. A package called “rjsx-mode” came onto the scene not too later, which seems to solve indentation problems using an AST (building on js2-mode’s AST); however, according to Steve Yegge’s post-mortem on js2-mode (http://steve-yegge.blogspot.com/2008/03/js2-mode-new-javascript-mode-for-emacs.html), relying on an AST being available after every keystroke is probably not optimal for performance, especially with large files. rjsx-mode also performs syntax highlighting (again, using its AST), which js-mode still does not do at all for JSX. Recently, I finally resolved to “do my laundry,” with respect to addressing the JSX indentation issues accumulated within debbugs and the js2-mode GitHub repo (where lots of js-mode bugs accidentally get filed). In the past week, I assembled some test cases, drafted some algorithms, and proceeded to implement a number of improvements to the indentation logic. As I started playing with the code again, many new ideas for improving JSX+React support began popping into my head, beyond just indentation improvements. I wanted to share those ideas here, and see if anyone wants to thumbs-up them before I go ahead and implement them (perhaps avoiding some bad ideas and/or rework). Also, I wanted to share my work-in-progress indentation reimplementation, and determine if I am not going completely in the wrong direction, and if an alternate plan I’ll also present for indentation is at all viable in comparison. Proposition: Comprehensive JSX support in Emacs First: Consider reworking obtuse APIs JSX indentation support is currently enabled by activating “js-jsx-mode”, using a function called “js-jsx-indent-line”, and “sgml-” variables control the indentation of the JSX code. I think introducing JSX support in the form of a derivative mode may have been a mistake. It wasn’t initially obvious to some people to use this mode. There was not an auto-mode-alist item for it, either. It also had the effect of creating a “matrix” of modes, where, for instance, Flycheck has to support all the different JavaScript modes like this: :modes (js-mode js-jsx-mode js2-mode js2-jsx-mode js3-mode rjsx-mode) which seems messy to me. And what if we want to support more JavaScript syntax extensions, like Facebook’s “Flow,” and support the use of multiple extensions at once? We wouldn’t want to also have a js-flow-mode, js-jsx-flow-mode, js2-flow-mode, js2-jsx-flow-mode… therefore, I think it’d be better to start scaling back, and instead enable JSX, Flow, et al support with a variable. Also, upon reflection, I’m becoming more certain that controlling JavaScript indentation (even HTML-like indentation) with sgml- variables was unintuitive and therefore also a mistake. I think JSX should simply be indented using js-indent-level, perhaps optionally in combination with a js-jsx-attribute-offset, to complement the sgml-attribute-offset that we would otherwise be eliminating as an option. However, having added sgml-attribute-offset to Emacs myself merely as a precaution after a minor breaking change I made to SGML indentation a few years ago, I think we should let users actually demand JSX attribute indentation deltas, rather than assume they desire that level of granularity (considering we went so long without it in sgml-mode). Rather, users have indicated in bug reports that they’d prefer better defaults for the indentation (assuming indentation not matching “official” conventions is simply “erroneous;” and what a blessing to have conventions). I think JSX shoud “just work” in Emacs; i.e., with js-mode, whenever a user opens a file using JSX syntax, adapting to the user’s conventions for JS. Deprecation: Guide users to new, nicer APIs * Deprecate js-jsx-mode (and js2-jsx-mode downstream, and have rjsx-mode derive from js2-mode). Both modes will still exist, but will be marked obsolete, and will simply make a file-local copy of js-syntax-extensions (see below) with 'jsx added to the front of the list, instead of binding indentation like they currently do. * Deprecate js-jsx-indent-line, marking it obsolete. It will still exist, but it will simply call through to js-indent-line, with a copy of js-syntax-extensions let-bound, with 'jsx added to the front of the list. New features: * js-syntax-extensions: A list that will include symbols representing syntax extensions for which we will enable indentation/fontification. o In theory, if we supported more syntax extensions, those would go here as well; also, if any of the extensions happened to conflict with each other in some ways but not others, we could use the order of the list to determine the “precedence” of the extensions. o Having one variable for this gives us an opportunity to document all the supported syntaxes in one place, so more people will discover what’s all available. o Add a Customize interface for adding to this list like a set, picking from a set of options. (Not sure which widget would be best to use for that.) * js-indent-line will switch to a JSX indentation strategy when called while js-syntax-extensions contains 'jsx. * js-mode will automatically detect whether a file uses JSX syntax using some heuristic. Suggestions: o 'jsx is in js-syntax-extensions via init file, mode hook, or in .dir-locals.el o buffer-file-name has extension “.jsx” (some people use this) o “^\(var\|let\|const\|import\) React” matches within the first 1000 lines after any keystroke. (Under typical compiler configuration, “React” must be a JavaScript variable in scope in order for the compiled JSX to function properly, and most people import or require() stuff at the beginning of their files. React is usually used in combination with a JavaScript compiler, implying the user probably isn’t using React as a global variable, because compilers make that practice obsolete, so I expect this check will be pretty reliable) * Add auto-mode-alist item mapping “.jsx” files to js-mode, enabling the buffer-file-name heuristic. * As syntax extensions are enabled/disabled, update the mode name; ex: o No syntax extensions: "JavaScript" o js-syntax-extensions = (jsx): "JavaScript[JSX]" o js-syntax-extensions = (flow jsx): "JavaScript[Flow,JSX]" * Fill in current gaps in indentation support for JSX. * Add font locking for JSX syntax, similar if not the same as sgml-mode’s font locking. It is enabled when js-syntax-extensions contains 'jsx. Would appreciate feedback on these feature propositions; thanks! Indentation: WIP As I mentioned, I’ve already started hacking on the indentation. WIP patches are attached for preliminary review of approach (please don’t merge these yet, I haven’t vetted them thoroughly outside make test cases). I started by reworking my “JSX detection” code to use sgml-get-context to figure out if we were inside JSX, which was a lot more reliable than the previous heuristic (JSX pretty much always had to be wrapped in parens). I then proceeded to mostly solve Bug#24896 by distinguishing “<” and “>” in JS and JSX, so the sgml-mode indentation code could parse the code without tripping over arrow functions (“=>”), thinking they were part of HTML “”. I disambiguated the symbols in a syntax-propertize-function by doing a quick parse near every “<” and “>” char to determine if the surrounding code could only be syntactically valid as JSX, and not as JS. This seems to have fixed a lot of the failing cases. However, it still fails in a weird case where JSX appears nested inside JS, where that JS is also nested inside an outer JSX, a situation that just isn’t relevant in SGML, and thus the indenter doesn’t seem to support it. // JS expressions should not break indentation // (https://github.com/mooz/js2-mode/issues/462). return ( (
nothing
)} />
) Maybe I could fix this recursive indentation issue by parsing the JSX further, and marking any “{”, “}” pairs therein with some syntax property that would get sgml-indent-line to treat it like some nested thing. However, that also might not work, and I don’t think I want to jump through any more hoops to make sgml-indent-line work inside js-mode. Also, since I’ve already started parsing JSX to disambiguate it from JS, I figure I may as well finish the parse (it’s pretty simple), since I could use that parse as an opportunity to mark characters for JSX-contextual font locking. And, since I’m adding these markings to buffer positions, I now have a great new way to figure out if I’m inside JSX when I’m indenting. So I’m thinking it may be time to graduate from sgml-indent-line delegation, and use the new data we now have to implement the JSX indentation logic in its entirety in js-mode. This way we could definitely support recursive JSX indentation, and also align all the indentation conventions with those used by the React community, and potentially maximize performance by avoiding a lot of checks that poor sgml-mode has to do for HTML’s loose and more complex semantics. Please let me know what you think of my current patches and the direction they’re going in, and if porting a simplified sgml-indent-line to js-mode is not too crazy of an idea! Thanks, Jackson