From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.ciao.gmane.io!not-for-mail From: Eli Zaretskii Newsgroups: gmane.emacs.devel Subject: Re: Line wrap reconsidered Date: Thu, 28 May 2020 21:05:05 +0300 Message-ID: <83v9kgq6jy.fsf@gnu.org> References: <92FF4412-04FB-4521-B6CE-52B08526E4E5@gmail.com> <878shfsq35.fsf@gnus.org> <83imgivjak.fsf@gnu.org> <83lfletr03.fsf@gnu.org> <4895C6EE-5E1F-44BF-93C1-CC5F7C096F73@gmail.com> <9766BA3D-B8F9-456B-9F59-60D21B86E390@gmail.com> <83sgfls2ul.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Injection-Info: ciao.gmane.io; posting-host="ciao.gmane.io:159.69.161.202"; logging-data="50407"; mail-complaints-to="usenet@ciao.gmane.io" Cc: larsi@gnus.org, emacs-devel@gnu.org To: Yuan Fu Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Thu May 28 20:05:50 2020 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1jeMub-000D0y-LK for ged-emacs-devel@m.gmane-mx.org; Thu, 28 May 2020 20:05:49 +0200 Original-Received: from localhost ([::1]:41450 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jeMua-0003YV-Nj for ged-emacs-devel@m.gmane-mx.org; Thu, 28 May 2020 14:05:48 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:56406) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jeMu4-0002bI-TJ for emacs-devel@gnu.org; Thu, 28 May 2020 14:05:16 -0400 Original-Received: from fencepost.gnu.org ([2001:470:142:3::e]:38099) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jeMu4-0001YA-7P; Thu, 28 May 2020 14:05:16 -0400 Original-Received: from [176.228.60.248] (port=2552 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jeMu1-0000Zr-MW; Thu, 28 May 2020 14:05:15 -0400 In-Reply-To: (message from Yuan Fu on Thu, 28 May 2020 13:31:37 -0400) X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:251568 Archived-At: > From: Yuan Fu > Date: Thu, 28 May 2020 13:31:37 -0400 > Cc: Lars Ingebrigtsen , > emacs-devel@gnu.org > > Before answering your questions, this is my understanding of the word wrapping in redisplay: > On every iteration we check if current character allow wrapping after it, if so, we set may_wrap to true. Yes. > That basically means the _previous char_ allows wrapping after it. No, we never wrap at the character where we set may_wrap = true, we wrap _after_ it. In the current code, may_wrap is set when we see a SPC character, and when we wrap, that SPC is left on the line before the wrap. > While we are at the same iteration, we may want to wrapping point (wrap_it) if 1) the previous char allows wrapping after it (from may_wrap’s value set by _previous iteration_) and 2) the current character allows wrapping before it. Yes. But I don't understand what you mean by "at the same iteration" here: if may_wrap was set, then we will not try to save the wrap point until we process the next character. So I cannot call this "the same iteration", it's rather "the next iteration". > When we found ourselves at the end of a line You mean, when we reach the edge of the window, right? > we have two choices: continue to the next line (which I assume is what “continue” means in the comment), or instead go to a previously saved wrap point and break the line there. Which "continue" are you alluding to here? Do you mean "lines are continued"? because if lines are not being wrapped, we have only one choice: go back to the last wrap point we found, end the screen line (a.k.a. "break the line") there, and continue with the text on the next screen line. > If there is no previous wrap point, we continue, if there is a previous wrap point but we actually can wrap at this point (previous char can wrap after & this char can wrap before), we just continue, since this position is a valid wrap position. Otherwise we go back to previous wrap point and wrap there (goto back_to_wrap;). You lost me here. The logic is actually: if there is a wrap point, go back to it and break the line there; if there's no wrap point, break the line where we are now (i.e. at the last character that still fits inside the window). > Although the original code only has one checker (IS_WHITESPACE), it serves a dual purpose: it is used to determine if we can wrap after—whitespace and tab allow wrapping after; it is also used to determine if we can wrap before—they don’t allow wrapping before (otherwise you see whitespace and tabs on the beginning of the next line). That is true. I suggested to have a single function so that you could in that single function perform the same test, just using two different categories. then you could basically keep the rest of the logic intact. If that is somehow not possible, can you explain why? > Needless to say, I’m a newbie in Emacs C internals and redisplay, so my understanding from reading the original code and comments could be wrong. But the code seems to work right so I think the truth isn’t far away. The truth isn't far away, but we need to go all the way so that the result doesn't break some use cases (of which there are a gazillion in the display code). > > Do we really need two separate functions? And note that each one > > calls IT_DISPLAYING_WHITESPACE, so in some situations you will be > > testing that twice for the same character -- because you have 2 > > separate functions. > > IT_DISPLAYING_WHITESPACE would run twice, too: one time to check for warp after and one time for wrap before. My point was to bring the two tests together so that it could be just one test. Is that possible? If not, why not? > >> + if (may_wrap && char_can_wrap_before(it)) > >> { > >> /* We have reached a glyph that follows one or more > >> - whitespace characters. If the position is > >> - already found, we are done. */ > >> + whitespace characters (or a character that allows > >> + wrapping after it). If the position is already > >> + found, we are done. */ > > > > The code calls char_can_wrap_before, but the comment says we can wrap > > after it. Which one is right? > > The comment says this char _follows_ a char that allows wrapping after, we still need to check if _this_ char allows wrapping before. But the comment is after the char_can_wrap_before test, so we have already tested that, no? > >> - /* If we are at a whitespace character > >> - that barely fits on this screen line, > >> - but the next character is also > >> - whitespace, we cannot wrap here. */ > >> + /* If the previous character says we can > >> + wrap after it, but the current > >> + character says we can't wrap before > >> + it, then we can't wrap here. */ > > > > It sounds like your Emacs is set up to use only spaces for indentation > > in C source files, whereas our convention is to use tabs and spaces. > > I’m not sure what you mean, do you mean the indent style for the new code, or are you talking about the word wrapping? I mean the indentation: the original code used TABs and spaces, but your code uses only spaces. > > And finally, one more general comment/question: isn't your code assume > > implicitly that buffer text is always processed in the logical order, > > i.e. in the increasing order of buffer positions? I mean, the fact > > that you have the "before" and the "after" function seems to imply > > that you do assume that, and the logic of processing the categories is > > relying on that, expecting that when you see a wrap_after character, > > you can wrap on the next character. Is this so? because if it is, > > then this will break when processing RTL text, since we may process it > > in the reverse order of buffer positions. Please look into these > > situations and see that the code does TRT in them. > > I tested with some Alaric text made by google translate, and the wrapping seems not take effect. IIUC bidi.c reorders the line to RTL Yes. > and the redisplay iterator will still go through them LTR Not sure what you mean by "go through them LTR". The iterator can move forward or backward, or even jump to a far-away place. You cannot assume that the next character examined will be the next character in the buffer. > I wonder how does the original wrapping works in that case. Does the old code handle bidi text? Of course it does, you can easily see that if you run the unmodified Emacs. It would be a terrible misfeature if we didn't handle wrappng correctly in bidirectional scripts.