From: Vladimir Kazanov
Newsgroups: gmane.emacs.devel
Subject: Re: Overlay mechanic improvements
Date: Sat, 20 Sep 2014 11:08:27 +0300
To: rms@gnu.org
Cc: emacs-devel@gnu.org
Xref: news.gmane.org gmane.emacs.devel:174575

The problem here is that tokens are not just lexemes (token text). To
be general and granular enough, one also has to remember dependencies
between tokens. So, a token is: ["lexeme", lookback, lookahead,
position]. Here

- "lexeme" is the string itself;
- "lookahead" is the number of chars *after* the lexeme used to define
  the lexeme (in the simplest case just the whitespace between symbol
  names, i.e. a single char);
- "lookback" is the number of previous tokens whose lookaheads span
  over the current token's zone of responsibility (this one can be
  recalculated on the fly);
- "position" is just the token's beginning/end.

Thus, we can define "a token" as the sum of a lexeme and the context
required to extract the lexeme.

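To make the record above concrete, here is a minimal sketch in Python
(not the Emacs Lisp the real implementation would use). The names
Token, reach, covers and recompute_lookback are made up for this
illustration, and reading "lookback" as a plain count of earlier
tokens whose lookahead passes this token's first character is my
assumption about the definition above.

```python
from dataclasses import dataclass


@dataclass
class Token:
    lexeme: str        # the token text itself
    start: int         # buffer position where the lexeme begins
    lookahead: int     # chars read past the lexeme to delimit it
    lookback: int = 0  # derived value; see recompute_lookback()

    @property
    def end(self):
        # Exclusive end of the lexeme.
        return self.start + len(self.lexeme)

    @property
    def reach(self):
        # Exclusive end of the lexeme+lookahead interval.
        return self.end + self.lookahead

    def covers(self, pos):
        # True if a change at `pos` falls inside lexeme+lookahead.
        return self.start <= pos < self.reach


def recompute_lookback(tokens):
    # For every token, count the earlier tokens whose lookahead
    # runs past this token's first character (assumed definition).
    for i, tok in enumerate(tokens):
        tok.lookback = sum(1 for prev in tokens[:i] if prev.reach > tok.start)


# Hypothetical token stream for the text "a+b", one char of lookahead
# for "a" and "+", none for the final "b".
tokens = [Token("a", 0, 1), Token("+", 1, 1), Token("b", 2, 0)]
recompute_lookback(tokens)
print([t.lookback for t in tokens])  # prints [0, 1, 1]
```

The point of keeping "reach" separate from "end" is that an edit just
past a lexeme can still invalidate it, which is exactly what the
lookahead field records.
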
Whenever something happens within a particular lexeme+lookahead
interval, the token has to be reparsed, and the n=lookback previous
tokens also have to be fixed.

More on this can be found in this paper:

http://harmonia.cs.berkeley.edu/papers/twagner-lexing.pdf

Quite a lengthy read, I must warn. What I like about it is that it
*proves* token stream consistency after fixing. My system will
hopefully be simplified compared to the original idea while keeping
the proved part.

> All I need
> is an ability to save position pairs, the positions should survive text
> insertion/deletion
>
> I see multiple meanings for that; could you clarify?

I mean something like the relative positioning of markers/overlays:
changes in unrelated parts of the buffer should shift the positions
accordingly.

> and there should be a way to find those pairs given a
> buffer point.
>
> Text properties are designed to be preserved through copying of text.
> Overlays are not. So it seems to me that you must use text properties.

Yes, properties do survive copying, but the token context might change
and the whole thing will have to be reparsed anyway. We can take a
piece of a comment and drop it into the middle of an expression. I
don't care whether the repositioned text was part of a token or not, I
just need to know that something changed within a particular interval
(lexeme+lookahead).

> For each token, you put a text property 'token' onto the characters in
> the token. The value of the property would say what token they are.
> The property would be eq for all the characters in one token.
>
> Then you can use 'next-single-char-property-change' and
> 'previous-single-char-property-change' to find the end and the
> beginning of the token.

I chose overlays mostly because they let me control *intervals of
text*, and the intervals can overlap. For example, I can add a
modification-hook to an overlay, and it won't be called for every
character, just for the overlapping interval(s).

Citing the "Special properties" documentation page for text
properties: "you can't predict how many times the function will be
called". One can imagine a workaround for this, but it would be just
too cumbersome.

> If you run into any difficulties using the existing interfaces
> for text properties, we should improve the interfaces to make your
> program easier to write.

Should I just do that, or try both (improving overlays on the way)?
After all, overlays do have to be fixed anyway.

-- 
Yours sincerely,

Vladimir Kazanov