unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#44653: 28.0.50; sql-mode gets confused about string literals
@ 2020-11-15  6:37 Dale Sedivec
  2020-11-16 22:42 ` Lars Ingebrigtsen
  2021-01-27  3:57 ` Lars Ingebrigtsen
  0 siblings, 2 replies; 7+ messages in thread
From: Dale Sedivec @ 2020-11-15  6:37 UTC (permalink / raw)
  To: 44653

I think `syntax-ppss' has started returning incorrect information about
apostrophe-delimited strings in sql-mode in master.  I am actually on
native-comp with this afternoon's master merged in myself, but I am
fairly confident you can reproduce this on master, nothing to do with
native-comp.

Steps to reproduce:

1. emacs -Q

2. Evaluate the following in *scratch*:

      (let ((buf (generate-new-buffer "sql")))
        (switch-to-buffer buf)
        (sql-mode)
        (insert "select '''")
        (goto-char 1)
        (delete-region 1 8)
        (goto-char (point-max)))

   Point should now be at the end of an `sql-mode' buffer containing
   "'''" (three apostrophes).

3. Press backspace to erase the third apostrophe.

4. M-: (nth 3 (syntax-ppss)) RET

Expected result: fourth element of syntax-ppss, the delimiter character
for the current string, is nil, since we are no longer in a string

Observed result: fourth element is ?' (39), indicating that point is
still inside a string
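
The recipe above can also be wrapped in a function so it is easy to
re-run.  This is only a minimal sketch: the name `my-bug44653-repro'
is hypothetical, `delete-char' stands in for pressing backspace, and a
temporary buffer without font-lock or redisplay may well behave
differently from the interactive session described above.

```elisp
(require 'sql)

(defun my-bug44653-repro ()
  "Run the recipe from this report in a temporary buffer.
Return the string-delimiter slot of `syntax-ppss' at end of buffer.
Hypothetical helper; not part of Emacs."
  (with-temp-buffer
    (sql-mode)
    (insert "select '''")
    (delete-region 1 8)        ; buffer now holds three apostrophes
    (goto-char (point-max))
    (syntax-ppss)              ; populate the parse-state cache first
    (delete-char -1)           ; assumption: stands in for backspace
    (nth 3 (syntax-ppss))))
```

Evaluating (my-bug44653-repro) returns either nil (not in a string) or
the delimiter character, which can then be compared across Emacs
builds.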

My first guess is that this is related to commit 289d6b2265e and #40231.

I came across this while trying to get back something resembling the
behavior of `electric-pair-mode', and in particular
`electric-pair-skip-self', as it was prior to 289d6b2265e.  I'm almost
there, but I ran into the above bug and got stuck.

Kind regards,
Dale


In GNU Emacs 28.0.50 (build 1, x86_64-apple-darwin19.6.0, NS appkit-1894.60 Version 10.15.7 (Build 19H15))
of 2020-11-14 built on dale
Repository revision: 99cbb313a3fd037b55ad3700635f607f56b0fa3e
Repository branch: feature/native-comp
Windowing system distributor 'Apple', version 10.3.1894
System Description:  Mac OS X 10.15.7

Configured using:
'configure --without-x --with-modules --with-threads --with-xwidgets
--with-zlib --with-xml2 --with-json --with-cairo --with-gnutls
--with-xpm --with-jpeg --with-tiff --with-gif --with-png --with-rsvg
--with-nativecomp --with-ns --enable-ns-self-contained 'CFLAGS=-O2
-I/opt/local/include/gcc10' LDFLAGS=-L/opt/local/lib/gcc10'

Configured features:
PNG RSVG GLIB NOTIFY KQUEUE ACL GNUTLS LIBXML2 ZLIB TOOLKIT_SCROLL_BARS
XIM NS MODULES NATIVE_COMP THREADS XWIDGETS JSON PDUMPER LCMS2

Important settings:
  value of $LC_COLLATE: C
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: SQL[ANSI]

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message rmc puny dired dired-loaddefs
rfc822 mml mml-sec epa derived epg epg-config gnus-util rmail
rmail-loaddefs text-property-search time-date mm-decode mm-bodies
mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail
rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils sql auth-source
eieio eieio-core eieio-loaddefs password-cache json map view thingatpt
comint ansi-color ring comp warnings subr-x rx cl-seq cl-macs cl-extra
help-mode easymenu seq byte-opt gv cl-loaddefs cl-lib bytecomp
byte-compile cconv tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel term/ns-win ns-win ucs-normalize mule-util
term/common-win tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese composite charscript charprop case-table epa-hook
jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice button
loaddefs faces cus-face pcase macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote threads xwidget-internal kqueue cocoa
ns lcms2 multi-tty make-network-process nativecomp emacs)

Memory information:
((conses 16 83359 5743)
(symbols 48 8493 1)
(strings 32 23422 3672)
(string-bytes 1 888517)
(vectors 16 16669)
(vector-slots 8 316241 15044)
(floats 8 29 23)
(intervals 56 234 0)
(buffers 992 13))







* bug#44653: 28.0.50; sql-mode gets confused about string literals
  2020-11-15  6:37 bug#44653: 28.0.50; sql-mode gets confused about string literals Dale Sedivec
@ 2020-11-16 22:42 ` Lars Ingebrigtsen
  2021-01-27  3:57 ` Lars Ingebrigtsen
  1 sibling, 0 replies; 7+ messages in thread
From: Lars Ingebrigtsen @ 2020-11-16 22:42 UTC (permalink / raw)
  To: Florian v. Savigny; +Cc: 44653, Dale Sedivec

Dale Sedivec <dale@codefu.org> writes:

> 2. Evaluate the following in *scratch*:
>
>       (let ((buf (generate-new-buffer "sql")))
>         (switch-to-buffer buf)
>         (sql-mode)
>         (insert "select '''")
>         (goto-char 1)
>         (delete-region 1 8)
>         (goto-char (point-max)))
>
>    Point should now be at the end of an `sql-mode' buffer containing
>    "'''" (three apostrophes).
>
> 3. Press backspace to erase the third apostrophe.
>
> 4. M-: (nth 3 (syntax-ppss)) RET
>
> Expected result: fourth element of syntax-ppss, the delimiter character
> for the current string, is nil, since we are no longer in a string
>
> Observed result: fourth element is ?' (39), indicating that point is
> still inside a string
>
> My first guess is that this is related to commit 289d6b2265e and #40231.

Yes, sounds likely.  I've added Florian to the Cc's -- perhaps he has
some comments here.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#44653: 28.0.50; sql-mode gets confused about string literals
  2020-11-15  6:37 bug#44653: 28.0.50; sql-mode gets confused about string literals Dale Sedivec
  2020-11-16 22:42 ` Lars Ingebrigtsen
@ 2021-01-27  3:57 ` Lars Ingebrigtsen
  2021-01-27 18:09   ` Dale Sedivec
  1 sibling, 1 reply; 7+ messages in thread
From: Lars Ingebrigtsen @ 2021-01-27  3:57 UTC (permalink / raw)
  To: Dale Sedivec; +Cc: 44653, Stefan Monnier

Dale Sedivec <dale@codefu.org> writes:

> I think `syntax-ppss' has started returning incorrect information about
> apostrophe-delimited strings in sql-mode in master.  I am actually on
> native-comp with this afternoon's master merged in myself, but I am
> fairly confident you can reproduce this on master, nothing to do with
> native-comp.
>
> Steps to reproduce:
>
> 1. emacs -Q
>
> 2. Evaluate the following in *scratch*:
>
>       (let ((buf (generate-new-buffer "sql")))
>         (switch-to-buffer buf)
>         (sql-mode)
>         (insert "select '''")
>         (goto-char 1)
>         (delete-region 1 8)
>         (goto-char (point-max)))
>
>    Point should now be at the end of an `sql-mode' buffer containing
>    "'''" (three apostrophes).
>
> 3. Press backspace to erase the third apostrophe.
>
> 4. M-: (nth 3 (syntax-ppss)) RET
>
> Expected result: fourth element of syntax-ppss, the delimiter character
> for the current string, is nil, since we are no longer in a string
>
> Observed result: fourth element is ?' (39), indicating that point is
> still inside a string

I can reproduce this behaviour...  but if I then type, say, "a DEL",
then (nth 3 (syntax-ppss)) returns nil.

So it seems like syntax-ppss doesn't recompute the status until a new
character is inserted...  which I think makes sense?  Until you've typed
something more, Emacs doesn't really know whether we've entered a new
syntax state or not here.
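
One way to check whether a stale cache is involved is to compare what
`syntax-ppss' reports before and after its cache is flushed.  A minimal
sketch, assuming a hypothetical helper name; whether the two values
actually differ in the scenario above is not established here.

```elisp
(defun my-compare-ppss-string-state ()
  "Return (CACHED . FRESH) string-delimiter values at point.
CACHED is what `syntax-ppss' reports now; FRESH is what it reports
after its cache has been flushed from the start of the buffer.
Hypothetical helper; not part of Emacs."
  (let ((cached (nth 3 (syntax-ppss))))
    ;; Discard all cached parse state from the beginning of the buffer.
    (syntax-ppss-flush-cache (point-min))
    (cons cached (nth 3 (syntax-ppss)))))
```

Calling it with M-: (my-compare-ppss-string-state) RET right after the
backspace step would show whether the two answers disagree.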

(I'm also wondering what the actual bug you're experiencing is, since
I'm guessing you don't go typing M-: (nth 3 (syntax-ppss)) RET at random
just for fun.  :-))

I've added Stefan M to the CCs; perhaps he has some comments.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#44653: 28.0.50; sql-mode gets confused about string literals
  2021-01-27  3:57 ` Lars Ingebrigtsen
@ 2021-01-27 18:09   ` Dale Sedivec
  2021-01-28  4:35     ` Lars Ingebrigtsen
  2021-01-28  9:41     ` martin rudalics
  0 siblings, 2 replies; 7+ messages in thread
From: Dale Sedivec @ 2021-01-27 18:09 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 44653, Stefan Monnier

On Jan 26, 2021, at 21:57, Lars Ingebrigtsen <larsi@gnus.org> wrote:
> Dale Sedivec <dale@codefu.org> writes:
> 
>> I think `syntax-ppss' has started returning incorrect information about
>> apostrophe-delimited strings in sql-mode in master. [...]
>> 
>> Steps to reproduce:
>> 
>> 1. emacs -Q
>> 
>> 2. Evaluate the following in *scratch*:
>> 
>>      (let ((buf (generate-new-buffer "sql")))
>>        (switch-to-buffer buf)
>>        (sql-mode)
>>        (insert "select '''")
>>        (goto-char 1)
>>        (delete-region 1 8)
>>        (goto-char (point-max)))
>> 
>>   Point should now be at the end of an `sql-mode' buffer containing
>>   "'''" (three apostrophes).
>> 
>> 3. Press backspace to erase the third apostrophe.
>> 
>> 4. M-: (nth 3 (syntax-ppss)) RET
>> 
>> Expected result: fourth element of syntax-ppss, the delimiter character
>> for the current string, is nil, since we are no longer in a string
>> 
>> Observed result: fourth element is ?' (39), indicating that point is
>> still inside a string
> 
> I can reproduce this behaviour...  but if I then type, say, "a DEL",
> then (nth 3 (syntax-ppss)) returns nil.
> 
> So it seems like syntax-ppss doesn't recompute the status until a new
> character is inserted...  which I think makes sense?  Until you've typed
> something more, Emacs doesn't really know whether we've entered a new
> syntax state or not here.

Thanks for looking at this.

Do I correctly understand your statement to mean that parse state is
only updated when characters are added to a buffer, not when
characters are deleted?  If so, that would indeed seem to mean that
this is not a bug.  I tested something similar in a python-mode buffer,
and indeed syntax-ppss still thinks it's in a string when I delete an
apostrophe, just as in the above example.  It's possible I just missed
that explanation in the manual.

If things are working as expected, please feel free to close this.
It's probably inconvenient that syntax parsing can't be relied upon
after deleting characters, but I can imagine that changing this is not
simple.

> (I'm also wondering what the actual bug you're experiencing is, since
> I'm guessing you don't go typing M-: (nth 3 (syntax-ppss)) RET at random
> just for fun.  :-))
[...]

While acknowledging that I sometimes forget the names of my family
members: I *think* I was trying to change how apostrophe (') behaves in
sql-mode buffers with electric-pair-mode turned on.  In case it helps,
or you're just curious, specifics follow.

#40231 (https://debbugs.gnu.org/cgi/bugreport.cgi?bug=40231) made a
change to the syntax of apostrophe in an SQL string, which made
electric-pair-mode behave in a way I did not appreciate.  While writing
code to make ' behave in the way I expected, I ran into difficulties
due to the behavior of syntax-ppss I described in my original message.

To demonstrate what I was trying to change:

1. Create an sql-mode buffer with electric-pair-mode turned on

2. Type ', which results in '|' with point at |

3. Type ' again

Desired behavior, and pre-40231's patch behavior (IIRC): ''| with point at |

Post-40231 behavior: ''|'

I wanted the pre-40231 behavior.

Here's what I'm currently using to achieve the pre-40231 behavior (it
ain't pretty and I'm struggling to remember why the third cond clause
is there):

;; https://debbugs.gnu.org/cgi/bugreport.cgi?bug=40231
;;
;; In addition to test case in comment below, also try '''''' at BOB
;; and try inserting ' at BOB with '' in front of point.  Also make
;; sure apostrophes don't pair in comments.

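(defun my:sql-mode-electric-apostrophe (arg)
  "Insert an apostrophe, skipping or pairing it as before bug#40231.
With prefix ARG, or when `electric-pair-mode' is off, just self-insert."
  (interactive "P")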
  (cond
    ((or (not electric-pair-mode)
         arg)
     (call-interactively #'self-insert-command))
    ((and (eq (char-after) ?')
          (eq (nth 3 (syntax-ppss)) ?'))
     ;; We were already at a string, and the character after point is
     ;; an apostrophe.  Just move beyond it.  (Behavior changed after
     ;; changes from Emacs bug #40231.)
     (forward-char 1))
    ;; XXX I have no idea if any of this makes sense if/when
    ;; https://debbugs.gnu.org/cgi/bugreport.cgi?bug=44653 gets fixed.
    ;; Check back then.  I think `self-insert-command' should be
    ;; sufficient, but ISTR I had to avoid using it in this case
    ;; because elec-pair was confounding me?  I don't quite remember.
    ((and (eq (char-before) ?')
          (not (nth 3 (progn
                        ;; (syntax-ppss-flush-cache (- (point) 2))
                        (syntax-ppss)))))
     ;; (self-insert-command 1)
     (insert "'")
     (save-excursion (electric-pair--insert ?')))
    (t
     (self-insert-command 1))))

(with-eval-after-load 'sql
  (define-key sql-mode-map "'" #'my:sql-mode-electric-apostrophe))

Dale





* bug#44653: 28.0.50; sql-mode gets confused about string literals
  2021-01-27 18:09   ` Dale Sedivec
@ 2021-01-28  4:35     ` Lars Ingebrigtsen
  2021-01-28  4:48       ` Lars Ingebrigtsen
  2021-01-28  9:41     ` martin rudalics
  1 sibling, 1 reply; 7+ messages in thread
From: Lars Ingebrigtsen @ 2021-01-28  4:35 UTC (permalink / raw)
  To: Dale Sedivec; +Cc: 44653, Stefan Monnier

Dale Sedivec <dale@codefu.org> writes:

> 1. Create an sql-mode buffer with electric-pair-mode turned on
>
> 2. Type ', which results in '|' with point at |
>
> 3. Type ' again
>
> Desired behavior, and pre-40231's patch behavior (IIRC): ''| with point at |
>
> Post-40231 behavior: ''|'

Ah, yes, that's definitely a bug.

Poking away at this, it seems like the base issue here is that

(electric-pair-syntax-info ?')

inside

'foo'|'

in sql-mode now returns nil instead of 

(34 39 nil nil), as it did in Emacs 27.  (electric-pair-mode works by
having the ' inserted first, and then it starts asking for syntax info.)
At this point, the new syntax rule in sql-mode has decided that, in

'foo''

we're still in the string, so there's nothing for electric-pair-mode to
do.
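
To compare that return value across Emacs versions, the probe can be
scripted.  A minimal sketch, assuming a hypothetical helper name and a
temporary buffer; without font-lock or redisplay the result may not
match what an interactive session reports.

```elisp
(require 'elec-pair)
(require 'sql)

(defun my-electric-pair-info-probe ()
  "Return `electric-pair-syntax-info' for an apostrophe at 'foo'|'.
Hypothetical helper; not part of Emacs."
  (with-temp-buffer
    (sql-mode)
    (insert "'foo''")
    (backward-char 1)          ; point is now between the last two quotes
    (electric-pair-syntax-info ?\')))
```

The result is either nil or a list such as the Emacs 27 value quoted
above.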

I'm not sure how this should be fixed...  Hm...  At 

'foo''|

we're in a string, but

'foo'|'

we're probably not?  (| marks point.)  Perhaps that can be used...
somehow...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#44653: 28.0.50; sql-mode gets confused about string literals
  2021-01-28  4:35     ` Lars Ingebrigtsen
@ 2021-01-28  4:48       ` Lars Ingebrigtsen
  0 siblings, 0 replies; 7+ messages in thread
From: Lars Ingebrigtsen @ 2021-01-28  4:48 UTC (permalink / raw)
  To: Dale Sedivec; +Cc: 44653, Stefan Monnier

Lars Ingebrigtsen <larsi@gnus.org> writes:

> I'm not sure how this should be fixed...  Hm...  At 
>
> 'foo''|
>
> we're in a string, but
>
> 'foo'|'
>
> we're probably not?  (| marks point.)  Perhaps that can be used...
> somehow...

But we don't have access to that information in the rules, apparently --
syntax-propertize moves point around.

Anybody got any ideas here?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#44653: 28.0.50; sql-mode gets confused about string literals
  2021-01-27 18:09   ` Dale Sedivec
  2021-01-28  4:35     ` Lars Ingebrigtsen
@ 2021-01-28  9:41     ` martin rudalics
  1 sibling, 0 replies; 7+ messages in thread
From: martin rudalics @ 2021-01-28  9:41 UTC (permalink / raw)
  To: Dale Sedivec, Lars Ingebrigtsen; +Cc: 44653, Stefan Monnier

 > Do I correctly understand your statement to mean that parse state is
 > only updated when characters are added to a buffer, not when
 > characters are deleted?

The parse state is always updated after any buffer modification that
happened in the part before the position for which the state shall be
calculated.  The problem is that syntax properties may override the
parse state.  And combined delimiters are always a pain (IIRC there were
discussions here whether in the middle of the C comment delimiters "/*"
and "*/" we are already or still in the corresponding comment).
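
To see which characters carry such an overriding property, the
`syntax-table' text properties in a buffer can be listed.  A minimal
sketch with a hypothetical helper name; note that these properties only
affect parsing when `parse-sexp-lookup-properties' is non-nil.

```elisp
(defun my-syntax-overrides-in-buffer ()
  "Return an alist of (POSITION . VALUE) for `syntax-table' properties.
Hypothetical helper; not part of Emacs."
  ;; Make sure the mode's syntax-propertize rules have been applied.
  (syntax-propertize (point-max))
  (let ((pos (point-min))
        result)
    (while (< pos (point-max))
      (let ((value (get-text-property pos 'syntax-table)))
        (when value
          (push (cons pos value) result)))
      (setq pos (1+ pos)))
    (nreverse result)))
```

Running it in the sql-mode buffer before and after deleting the third
apostrophe would show whether a property survives the deletion.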

martin






Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
