unofficial mirror of emacs-devel@gnu.org 
* [ELPA] New package: phps-mode
From: Christian Johansson @ 2019-07-12 17:15 UTC
  To: emacs-devel

Hi!

My name is Christian Johansson. I have completed the FSF paperwork and
would like to get my package phps-mode into GNU ELPA. The repository is here:
https://github.com/cjohansson/emacs-phps-mode

Package summary:

The phps-mode plug-in for Emacs provides a new major mode for PHP that
is not based on CC-mode. It has a built-in semantic lexer that should be
equivalent to the original PHP re2c lexer, supports the PSR-1 and PSR-2
coding styles, and uses lexer tokens for syntax coloring, imenu, and
indentation. Future goals are a semantic parser, eldoc support, and
HTML/CSS/JavaScript indentation for inline areas.


Regards
Christian






* Re: [ELPA] New package: phps-mode
From: Stephen Leake @ 2019-07-12 22:35 UTC
  To: emacs-devel

Christian Johansson <christian@cvj.se> writes:

> The phps-mode plug-in for Emacs provides a new major-mode for PHP that
> is not based on CC-mode. It has a built-in semantic lexer that should
> be equivalent to the original PHP re2c lexer, supports PSR-1 and PSR-2 
> coding styles and uses lexer tokens for syntax coloring, imenu and
> indentation. In the future the goals are to provide a semantic parser,
> eldoc, html/css/javascript indentation for inlines areas

My GNU ELPA package 'wisi' provides an interface to an external process
running the re2c lexer and an LR error-correcting parser. It is currently
used only by ada-mode, but I'd like to have other people try it.

The external process code is written in Ada, but what's one more
language? :). If your needs are simple enough, you don't need to write
any Ada code, just the lexer and grammar specification.

-- 
-- Stephe




* Re: [ELPA] New package: phps-mode
From: Lars Ingebrigtsen @ 2019-07-12 23:36 UTC
  To: Christian Johansson; +Cc: emacs-devel

Christian Johansson <christian@cvj.se> writes:

> The phps-mode plug-in for Emacs provides a new major-mode for PHP that
> is not based on CC-mode. It has a built-in semantic lexer that should
> be equivalent to the original PHP re2c lexer, supports PSR-1 and PSR-2 
> coding styles and uses lexer tokens for syntax coloring, imenu and
> indentation.

This sounds awesome.

Emacs doesn't have an in-tree PHP mode, and PHP is a major, major
language.  Would it make sense to include this in the Emacs tree?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: [ELPA] New package: phps-mode
From: Stefan Monnier @ 2019-07-13 13:14 UTC
  To: Lars Ingebrigtsen; +Cc: Christian Johansson, emacs-devel

> Emacs doesn't have an in-tree PHP mode, and PHP is a major, major
> language.  Would it make sense to include this in the Emacs tree?

As mentioned previously, I'd much rather we change our build system such
that the release tarball includes some of GNU ELPA's packages rather
than move these kinds of packages to emacs.git.


        Stefan





* Re: [ELPA] New package: phps-mode
From: Lars Ingebrigtsen @ 2019-07-13 13:28 UTC
  To: Stefan Monnier; +Cc: Christian Johansson, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Emacs doesn't have an in-tree PHP mode, and PHP is a major, major
>> language.  Would it make sense to include this in the Emacs tree?
>
> As mentioned previously, I'd much rather we change our build system such
> that the release tarball includes some of GNU ELPA's packages rather
> than move these kinds of packages to emacs.git.

Makes sense to me.

I have no idea how to put packages in ELPA; if somebody could take a
look at phps-mode and do the needful (as they say), that'd be very
nice.  The feature list sounds great, at least.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: [ELPA] New package: phps-mode
From: Mattias Engdegård @ 2019-07-16 10:00 UTC
  To: christian; +Cc: Emacs developers

Thank you for your contribution! A regexp scan on phps-mode, using relint, found some irregularities. Summary here, along with some things that relint didn't catch:

php-mode-lexer.el:160:
(defvar phps-mode-lexer-TOKENS "[][;\\:,\.()|^&+-/*=%!~\\$<>?@]"

The hyphen (-) is special and should be placed last to avoid being interpreted as a range.

The lone backslash in front of the dot has no effect, since backslashes must be doubled inside string literals.
On the regexp level, backslashes are not special inside [...] and only represent themselves, with no escaping power. This regexp includes the backslash several times, which was probably unintended.
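
A minimal sketch of the two points above, for anyone who wants to try them in *scratch*. The names `my-range-demo', `my-backslash-demo' and `my-tokens-re' are hypothetical, and the cleaned-up set assumes that the backslash was never meant to be a token; it is not necessarily the fix that went into phps-mode.

```elisp
;; Sketch only: all `my-' names are hypothetical, not part of phps-mode.

;; Inside [...], "+-/" is a range from ?+ (43) to ?/ (47), so the set
;; also contains ?, ?- and ?. whether or not they were listed.
(defun my-range-demo ()
  "Return match results for \",\" with the hyphen mid-set and last."
  (list (string-match-p "[+-/]" ",")    ; 0: "," lies inside the range
        (string-match-p "[+/-]" ",")))  ; nil: the final "-" is literal

;; A backslash inside [...] stands for itself, so the Lisp string
;; "[\\.]" (regexp [\.]) matches a backslash as well as a dot.
(defun my-backslash-demo ()
  "Return the match position of a lone backslash against [\\.]."
  (string-match-p "[\\.]" "\\"))        ; 0: the backslash is a set member

;; One possible cleaned-up set: "]" first, "-" last, no backslashes.
(defconst my-tokens-re "[][;:,.()|^&+/*=%!~$<>?@-]"
  "Hypothetical single-character token set with no accidental range.")
```

In the original set the accidental range happens to cover only characters that were listed anyway (`,', `-' and `.'), so moving the hyphen to the end changes nothing except making the intent explicit.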

php-mode-lexer.el:1367:
                   (if (looking-at "[^\\\\]\"")

No need to double the backslash; it's not special inside [...].

php-mode-lexer.el:151:
  "[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*"

Hex and octal escapes in the 128-255 range do not denote Unicode (Latin-1) characters but raw bytes, which you likely did not intend to match here. To match U+0080-U+00FF, write "\u0080-\u00FF". I don't know PHP's lexing rules, but if you want to match Unicode identifiers, you'd be better off using something like "[[:alpha:]_][[:alnum:]_]*", or syntax classes.

php-mode-functions.el:990:
            (when (looking-at-p " \\*\/")

Ineffective backslash before `/', which does not need escaping anyway.
The same pattern (`\/') occurs in several other places in this file.
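
A short sketch of why the backslash is ineffective: the Lisp reader drops it before the regexp engine ever sees the string. `my-same-regexp-p' and `my-comment-end-p' are hypothetical helpers written for this illustration only.

```elisp
;; Sketch only: both function names are hypothetical.

;; In a Lisp string, a backslash before a character that has no escape
;; meaning is dropped by the reader, so these two literals are equal.
(defun my-same-regexp-p ()
  "Return t if the escaped and unescaped spellings read identically."
  (string= " \\*\/" " \\*/"))   ; both are the 4 characters: space \ * /

(defun my-comment-end-p (line)
  "Return non-nil if LINE starts with a space followed by \"*/\"."
  (string-match-p "\\` \\*/" line))
```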





* Re: [ELPA] New package: phps-mode
From: Christian Johansson @ 2019-07-17  5:43 UTC
  To: Mattias Engdegård; +Cc: Emacs developers

Hi!

Thanks for your review. I believe I have now fixed all of those items
and pushed the changes to ELPA.

Best Regards
Christian





* Re: [ELPA] New package: phps-mode
From: Mattias Engdegård @ 2019-07-17  8:44 UTC
  To: Christian Johansson; +Cc: Emacs developers

On 17 July 2019 at 07:43, Christian Johansson <christian@cvj.se> wrote:
> 
> Thanks for your review, I should have fixed all those items now and pushed them to ELPA

>(defvar phps-mode-lexer-LABEL
>  "[a-zA-Z_\u0080-\u00FF][a-zA-Z0-9_\x80-\xff]*"

Unfinished?

It looks like PHP accepts any Unicode character above and including U+0080 in labels implicitly, by including 80-ff at the byte level and the implicit fact that most PHP code is in UTF-8. So your regexp would probably be something like

 "[A-Za-z_[:nonascii:]][0-9A-Za-z_[:nonascii:]]*"

You could always try and see if your code correctly treats $γνῶσις, say.
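
A minimal sketch of that check, using the regexp suggested above. `my-label-re' and `my-label-p' are hypothetical names, and the anchoring with \` and \' is added here only so that the whole string is tested; it says nothing about how phps-mode applies the pattern while lexing.

```elisp
;; Sketch only: `my-label-re' and `my-label-p' are hypothetical names.
(defconst my-label-re
  "[A-Za-z_[:nonascii:]][0-9A-Za-z_[:nonascii:]]*"
  "Label regexp as suggested in the message above.")

(defun my-label-p (string)
  "Return non-nil if STRING as a whole matches `my-label-re'."
  (string-match-p (concat "\\`" my-label-re "\\'") string))

(my-label-p "γνῶσις")   ; non-nil: every character is non-ASCII
(my-label-p "_tmp1")    ; non-nil
(my-label-p "1abc")     ; nil: a label cannot start with a digit
```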





* Re: [ELPA] New package: phps-mode
From: Christian Johansson @ 2019-07-17  9:36 UTC
  To: Mattias Engdegård; +Cc: Emacs developers


Ah, I missed that. The original re2c regex is
[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*

(from https://github.com/php/php-src/blob/master/Zend/zend_language_scanner.l#L1252)

I'm not sure what the equivalent is in Emacs Lisp, but I know PHP does not fully support UTF-8 yet.

Is the equivalent
"[a-zA-Z_\u0080-\u00FF][a-zA-Z0-9_\u0080-\u00FF]*"
?



* Re: [ELPA] New package: phps-mode
From: Stefan Monnier @ 2019-07-17 12:24 UTC
  To: Christian Johansson; +Cc: Mattias Engdegård, Emacs developers

> Ah I missed that, the original re2c regex is
> [a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*

I strongly suspect that this regexp is applied to a stream of *bytes*,
whereas in phps-mode you're dealing with a stream of *characters*.

IOW, for the re2c code, `λ` is the two-byte sequence \xCE \xBB
(assuming the file is using utf-8), whereas in phps-mode it's just the
single character `λ`, or \u03BB.

So the regexp above actually matches all the non-ascii Unicode chars,
assuming the file uses utf-8.
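
The byte-versus-character distinction can be sketched as follows. The helper names `my-utf8-bytes', `my-latin1-range-p' and `my-nonascii-p' are hypothetical and exist only for this illustration.

```elisp
;; Sketch only: all three helper names are hypothetical.

(defun my-utf8-bytes (string)
  "Return the UTF-8 encoding of STRING as a list of byte values."
  (string-to-list (encode-coding-string string 'utf-8)))

(defun my-latin1-range-p (string)
  "Return non-nil if STRING contains a character in U+0080..U+00FF."
  (string-match-p "[\u0080-\u00FF]" string))

(defun my-nonascii-p (string)
  "Return non-nil if STRING contains any non-ASCII character."
  (string-match-p "[[:nonascii:]]" string))

(my-utf8-bytes "λ")       ; (206 187), i.e. #xCE #xBB: both in 80-ff
(my-latin1-range-p "λ")   ; nil: as a character, U+03BB is above U+00FF
(my-nonascii-p "λ")       ; 0
```

So a character-level range of \u0080-\u00FF would accept `é' but reject `λ', while the byte-level re2c pattern accepts both once the file is read as UTF-8 bytes.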

> But I’m not sure about the equivalent in emacs-lisp but I know PHP does not fully support UTF-8 yet.

In that case, maybe

    [[:alpha:]_][[:alnum:]_]*

is the saner choice.


        Stefan





* Re: [ELPA] New package: phps-mode
From: Christian Johansson @ 2019-07-17 20:27 UTC
  To: Mattias Engdegård; +Cc: Emacs developers

Hi again!

Thanks, I have updated the regex. Labels containing UTF-8 characters
are now supported, just like in PHP 7.

Best Regards
Christian




