unofficial mirror of emacs-devel@gnu.org 
* fixup-whitespace for scripts with no inter-word space
@ 2013-11-05  5:55 Eric Abrahamsen
  2013-11-05 20:58 ` Stefan Monnier
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Abrahamsen @ 2013-11-05  5:55 UTC
  To: emacs-devel

A while ago I posted on emacs.help about improving inter-word boundary
behavior for Chinese prose editing. I was directed to post here, but
also given some good tips. I realized I'm looking at two separate
issues, and I'm posting about the simpler of the two first.

That problem is interword spaces -- Chinese doesn't put spaces between
words. I was pointed at fill-nospace-between-words-table, which sure
enough knows all about this:

(aref fill-nospace-between-words-table ?中) -> t

This is used in fill-paragraph (actually fill-delete-newlines) to good
effect. I realized what was actually annoying me was delete-indentation,
which calls fixup-whitespace, which (mostly) unconditionally adds a
space between joined lines.
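
The test that fill-delete-newlines applies to the characters on either
side of a newline can be sketched as a standalone predicate. This is
only an illustration: the name `my-nospace-between-p' is hypothetical
and not part of Emacs, and the body simply combines the line-breakable
category `|' with a lookup in fill-nospace-between-words-table.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-nospace-between-p (prev next)
  "Return non-nil if no space belongs between characters PREV and NEXT.
At least one of them must be line-breakable (category `|') and at
least one must be listed in `fill-nospace-between-words-table'."
  (and (or (aref (char-category-set next) ?|)
           (aref (char-category-set prev) ?|))
       (or (aref fill-nospace-between-words-table next)
           (aref fill-nospace-between-words-table prev))))

(my-nospace-between-p ?中 ?文)   ; non-nil: join without a space
(my-nospace-between-p ?a ?b)     ; nil: keep a space between words
```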

I did a quick grep of emacs' lisp directory and in the basic libraries,
at least, it seems fixup-whitespace is only called by
delete-indentation.

Would it be acceptable to patch fixup-whitespace so that it does what
fill-delete-newlines does? I.e., rewrite as such:

(defun fixup-whitespace ()
  "Fixup white space between objects around point.
Leave one space or none, according to the context."
  (interactive "*")
  (save-excursion
    (delete-horizontal-space)
    (if (or (looking-at "^\\|\\s)")
            (save-excursion (forward-char -1)
                            (looking-at "$\\|\\s(\\|\\s'"))
            ;; New: same test as `fill-delete-newlines'.  No space if
            ;; either neighbor is line-breakable (category `|') and
            ;; either belongs to a script written without spaces.
            (and enable-multibyte-characters
                 (let ((prev (preceding-char))
                       (next (following-char)))
                   (and (or (aref (char-category-set next) ?|)
                            (aref (char-category-set prev) ?|))
                        (or (aref fill-nospace-between-words-table next)
                            (aref fill-nospace-between-words-table prev))))))
        nil
      (insert ?\s))))

If this seems acceptable in principle I'll report a bug and provide a
patch.

Thanks,
Eric





* Re: fixup-whitespace for scripts with no inter-word space
  2013-11-05  5:55 fixup-whitespace for scripts with no inter-word space Eric Abrahamsen
@ 2013-11-05 20:58 ` Stefan Monnier
  2013-11-06  3:55   ` Eric Abrahamsen
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Monnier @ 2013-11-05 20:58 UTC
  To: Eric Abrahamsen; +Cc: emacs-devel

> Would it be acceptable to patch fixup-whitespace so that it does what
> fill-delete-newlines does? I.e., rewrite as such:

Sounds good, yes.  It should also insert 2 spaces after a full-stop,
according to sentence-end-double-space, while you're at it.
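
As a rough sketch of that idea, the choice between one and two spaces
could be isolated in a small helper. The name `my-join-spacing' is
hypothetical, and the punctuation check is a deliberate simplification:
it looks only at the single character before point and ignores closing
quotes or brackets after the full stop.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-join-spacing ()
  "Return the whitespace to insert at point when joining two lines.
Two spaces after `.', `?' or `!' when `sentence-end-double-space'
is non-nil, otherwise a single space."
  (if (and sentence-end-double-space
           (memq (preceding-char) '(?. ?? ?!)))
      "  "
    " "))

;; Point is right after the full stop, so with
;; `sentence-end-double-space' non-nil this returns two spaces.
(with-temp-buffer
  (insert "First sentence.")
  (my-join-spacing))
```

fixup-whitespace could then insert the returned string in place of its
fixed (insert ?\s).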


        Stefan




* Re: fixup-whitespace for scripts with no inter-word space
  2013-11-05 20:58 ` Stefan Monnier
@ 2013-11-06  3:55   ` Eric Abrahamsen
  2013-11-06  6:52     ` Stefan Monnier
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Abrahamsen @ 2013-11-06  3:55 UTC
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Would it be acceptable to patch fixup-whitespace so that it does what
>> fill-delete-newlines does? I.e., rewrite as such:
>
> Sounds good, yes.  It should also insert 2 spaces after a full-stop,
> according to sentence-end-double-space, while you're at it.

Okay, but it seems like this is starting to encroach on fill territory.
In fact, the more I messed with it, the more I felt like I was
replicating `fill-delete-newlines'.

There's probably an obvious reason why this is a bad idea, but it seems
like `fill-delete-newlines' already does everything that
`delete-indentation' ought to do. It handles sentence ends and multibyte
characters, and calls `canonically-space-region'.

The only additional thing `delete-indentation' does is kill the
fill-prefix, if there is one.

I'm probably getting over-enthusiastic, but doesn't it seem like
`delete-indentation' could just delete the fill-prefix and then call
`fill-delete-newlines' on the appropriate region?

Eric





* Re: fixup-whitespace for scripts with no inter-word space
  2013-11-06  3:55   ` Eric Abrahamsen
@ 2013-11-06  6:52     ` Stefan Monnier
  2013-11-06  7:09       ` Eric Abrahamsen
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Monnier @ 2013-11-06  6:52 UTC
  To: Eric Abrahamsen; +Cc: emacs-devel

> I'm probably getting over-enthusiastic, but doesn't it seem like
> `delete-indentation' could just delete the fill-prefix and then call
> `fill-delete-newlines' on the appropriate region?

They should definitely share more code.  But you can't just call
fill-delete-newlines because (for example) it will (and should) add a space
between "a" and ")" whereas fixup-whitespace won't (and shouldn't).
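
The fixup-whitespace half of that difference is easy to observe through
delete-indentation. The sketch below assumes a temporary buffer with the
standard syntax table, where ")" has close-parenthesis syntax;
`my-join-lines' is a hypothetical helper used only for this
demonstration.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-join-lines (text)
  "Insert TEXT into a temporary buffer and join its last line upward.
Return the resulting buffer contents."
  (with-temp-buffer
    (insert text)
    ;; Point is on the last line; join it to the previous one.
    (delete-indentation)
    (buffer-string)))

(my-join-lines "a\nb")   ; => "a b"  (one space between words)
(my-join-lines "a\n)")   ; => "a)"   (no space before a closing paren)
```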


        Stefan




* Re: fixup-whitespace for scripts with no inter-word space
  2013-11-06  6:52     ` Stefan Monnier
@ 2013-11-06  7:09       ` Eric Abrahamsen
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Abrahamsen @ 2013-11-06  7:09 UTC
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> I'm probably getting over-enthusiastic, but doesn't it seem like
>> `delete-indentation' could just delete the fill-prefix and then call
>> `fill-delete-newlines' on the appropriate region?
>
> They should definitely share more code.  But you can't just call
> fill-delete-newlines because (for example) it will (and should) add a space
> between "a" and ")" whereas fixup-whitespace won't (and shouldn't).

That's what I was afraid of. Let me have a stare at it, and I'll come up
with something that's more ambitious or less ambitious, depending on how
well I understand what's going on.

E



