unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
@ 2012-07-16 12:42 Reuben Thomas
  2012-07-16 16:05 ` Eli Zaretskii
  2022-04-22 12:47 ` Lars Ingebrigtsen
  0 siblings, 2 replies; 24+ messages in thread
From: Reuben Thomas @ 2012-07-16 12:42 UTC
  To: 11948

I noticed this when in visual-line-mode, and it failed to wrap at an em
space (U+2003), but of course there are lots of other breaking space
characters.


In GNU Emacs 24.1.50.2 (x86_64-unknown-linux-gnu, GTK+ Version 2.24.10)
 of 2012-07-14 on skwd
Bzr revision: 109087 cyd@gnu.org-20120714053223-jxkxt958pqg8tisb
Windowing system distributor `The X.Org Foundation', version 11.0.11103000
Important settings:
  value of $LC_MONETARY: en_GB.UTF-8
  value of $LC_NUMERIC: en_GB.UTF-8
  value of $LC_TIME: en_GB.UTF-8
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: Help

Minor modes in effect:
  shell-dirtrack-mode: t
  diff-auto-refine-mode: t
  recentf-mode: t
  show-paren-mode: t
  server-mode: t
  savehist-mode: t
  minibuffer-electric-default-mode: t
  iswitchb-mode: t
  icomplete-mode: t
  global-auto-revert-mode: t
  desktop-save-mode: t
  tooltip-mode: t
  mouse-wheel-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
<right> C-y C-d C-d C-d C-d C-d C-d C-d C-d C-d C-d 
C-d C-d <left> <left> <left> <left> <left> <left> <left> 
<left> <left> <left> <left> <left> <left> <right> <right> 
<right> <right> <right> <right> <right> <right> <right> 
<down-mouse-1> <mouse-1> <down-mouse-1> <mouse-1> <help-echo> 
<right> <right> <right> <right> <right> <right> <right> 
<right> <right> <right> <right> <right> <right> <right> 
<right> <right> <right> <right> <right> <right> <right> 
<right> <right> <right> <right> <right> <right> <right> 
<right> <right> <right> <right> <right> <left> <left> 
<left> <left> <left> <left> <down-mouse-3> <help-echo> 
C-g C-g C-h k C-c l C-n C-f C-f C-f C-f C-f C-f C-f 
C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f 
C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f <return> 
C-x 1 C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n 
C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n M-x 
c u s o t m i z e - <backspace> <backspace> <backspace> 
<backspace> <backspace> <backspace> <backspace> t o 
m i z e - g g r <backspace> <backspace> r o u p <return> 
v i s u <tab> <return> <help-echo> <help-echo> <help-echo> 
<down-mouse-1> <help-echo> <mouse-movement> <mouse-1> 
<help-echo> <down-mouse-1> <mouse-1> <help-echo> <help-echo> 
<down-mouse-1> <mouse-1> <help-echo> <help-echo> C-h 
f v i s u a l - l i n e - m o d e <return> C-n C-n 
C-n C-n C-n C-n C-n C-n C-n C-n C-f C-f C-f C-f C-f 
C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f 
C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f 
C-f C-f C-f C-f C-f C-f C-f <return> <help-echo> <help-echo> 
<down-mouse-1> <mouse-1> <help-echo> <help-echo> <down-mouse-1> 
<mouse-1> M-x r e p o r t - b <backspace> e m a c s 
- b u g <return>

Recent messages:
Type "q" to restore previous buffer.
uncompressing simple.el.gz...done
Note: file is write protected
Creating customization items...
Creating group...
Creating group entries...done
Creating customization items ...done
Resetting customization items...done
Creating customization setup...done
Type "q" to restore previous buffer. [2 times]

Load-path shadows:
/home/rrt/.emacs.d/elpa/dictionary-1.8.7/dictionary-init hides /usr/local/share/emacs/24.1.50/site-lisp/dictionary-el/dictionary-init
/home/rrt/.emacs.d/elpa/dictionary-1.8.7/dictionary hides /usr/local/share/emacs/24.1.50/site-lisp/dictionary-el/dictionary
/home/rrt/.emacs.d/elpa/dictionary-1.8.7/link hides /usr/local/share/emacs/24.1.50/site-lisp/dictionary-el/link
/home/rrt/.emacs.d/elpa/dictionary-1.8.7/connection hides /usr/local/share/emacs/24.1.50/site-lisp/dictionary-el/connection
/home/rrt/local/share/emacs/site-lisp/dict hides /usr/local/share/emacs/24.1.50/site-lisp/emacs-goodies-el/dict
/usr/local/share/emacs/24.1.50/site-lisp/auctex/tex-style hides /usr/share/emacs/site-lisp/auctex/tex-style
/usr/local/share/emacs/24.1.50/site-lisp/auctex/tex-mik hides /usr/share/emacs/site-lisp/auctex/tex-mik
/usr/local/share/emacs/24.1.50/site-lisp/auctex/multi-prompt hides /usr/share/emacs/site-lisp/auctex/multi-prompt
/usr/local/share/emacs/24.1.50/site-lisp/auctex/tex-jp hides /usr/share/emacs/site-lisp/auctex/tex-jp
/usr/local/share/emacs/24.1.50/site-lisp/auctex/tex-info hides /usr/share/emacs/site-lisp/auctex/tex-info
/usr/local/share/emacs/24.1.50/site-lisp/auctex/latex hides /usr/share/emacs/site-lisp/auctex/latex
/usr/local/share/emacs/24.1.50/site-lisp/auctex/tex hides /usr/share/emacs/site-lisp/auctex/tex
/usr/local/share/emacs/24.1.50/site-lisp/auctex/texmathp hides /usr/share/emacs/site-lisp/auctex/texmathp
/usr/local/share/emacs/24.1.50/site-lisp/auctex/context-nl hides /usr/share/emacs/site-lisp/auctex/context-nl
/usr/local/share/emacs/24.1.50/site-lisp/auctex/tex-font hides /usr/share/emacs/site-lisp/auctex/tex-font
/usr/local/share/emacs/24.1.50/site-lisp/auctex/toolbar-x hides /usr/share/emacs/site-lisp/auctex/toolbar-x
/usr/local/share/emacs/24.1.50/site-lisp/auctex/tex-buf hides /usr/share/emacs/site-lisp/auctex/tex-buf
/usr/local/share/emacs/24.1.50/site-lisp/auctex/tex-fptex hides /usr/share/emacs/site-lisp/auctex/tex-fptex
/usr/local/share/emacs/24.1.50/site-lisp/auctex/bib-cite hides /usr/share/emacs/site-lisp/auctex/bib-cite
/usr/local/share/emacs/24.1.50/site-lisp/auctex/context-en hides /usr/share/emacs/site-lisp/auctex/context-en
/usr/local/share/emacs/24.1.50/site-lisp/auctex/tex-fold hides /usr/share/emacs/site-lisp/auctex/tex-fold
/usr/local/share/emacs/24.1.50/site-lisp/auctex/tex-bar hides /usr/share/emacs/site-lisp/auctex/tex-bar
/usr/local/share/emacs/24.1.50/site-lisp/auctex/context hides /usr/share/emacs/site-lisp/auctex/context
/usr/local/share/emacs/24.1.50/site-lisp/auctex/font-latex hides /usr/share/emacs/site-lisp/auctex/font-latex

Features:
(shadow sort gnus-util mail-extr emacsbug message format-spec rfc822 mml
mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev
gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util
mail-prsvr mail-utils cus-edit find-func vc ediff-merg ediff-diff
ediff-wind ediff-help ediff-util ediff-mult ediff-init ediff
vc-dispatcher todoo shell pcomplete grep multi-isearch help-mode
jka-compr info etags nxml-uchnm rng-xsd xsd-regexp rng-cmpct image-mode
rng-nxml rng-valid rng-loc rng-uri rng-parse nxml-parse rng-match rng-dt
rng-util rng-pttrn nxml-ns nxml-mode nxml-outln nxml-rap nxml-util
nxml-glyph nxml-enc xmltok sgml-mode js byte-opt bytecomp byte-compile
cconv json imenu thingatpt inform-mode cc-langs cc-mode cc-fonts
cc-guess cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs
autoconf autoconf-mode make-mode noutline outline lua-mode diff-git
diff-mode cperl-mode flymake compile comint ansi-color ring vc-git
face-remap flyspell smart-quotes auto-dictionary-autoloads
c-eldoc-autoloads dictionary-autoloads diff-git-autoloads
dired-isearch-autoloads full-ack-autoloads guess-style-autoloads
kill-ring-search-autoloads magit-autoloads mv-shell-autoloads
tumble-autoloads http-post-simple-autoloads package completing-help
recentf tree-widget wid-edit uniquify paren server savehist
minibuf-eldef iswitchb icomplete autorevert desktop cus-start cus-load
ropemacs pymacs go-mode-load ispell advice advice-preload yasnippet
help-fns derived edmacro kmacro cl-macs gv easymenu assoc cl cl-lib
macroexp muse-autoloads emacs-goodies-el emacs-goodies-custom
emacs-goodies-loaddefs easy-mmode preview-latex tex-site auto-loads
user-site-loaddefs time-date tooltip ediff-hook vc-hooks lisp-float-type
mwheel x-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list newcomment lisp-mode register page menu-bar rfn-eshadow
timer select scroll-bar mouse jit-lock font-lock syntax facemenu
font-core frame cham georgian utf-8-lang misc-lang vietnamese tibetan
thai tai-viet lao korean japanese hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese case-table epa-hook
jka-cmpr-hook help simple abbrev minibuffer loaddefs button faces
cus-face files text-properties overlay sha1 md5 base64 format env
code-pages mule custom widget hashtable-print-readable backquote
make-network-process dbusbind dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty emacs)

-- 
http://rrt.sc3d.org/






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-16 12:42 bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab Reuben Thomas
@ 2012-07-16 16:05 ` Eli Zaretskii
  2012-07-16 18:21   ` Reuben Thomas
  2012-07-16 20:37   ` Stefan Monnier
  2022-04-22 12:47 ` Lars Ingebrigtsen
  1 sibling, 2 replies; 24+ messages in thread
From: Eli Zaretskii @ 2012-07-16 16:05 UTC
  To: Reuben Thomas; +Cc: 11948

> From: Reuben Thomas <rrt@sc3d.org>
> Date: Mon, 16 Jul 2012 13:42:18 +0100
> 
> I noticed this when in visual-line-mode, and it failed to wrap at an em
> space (U+2003), but of course there are lots of other breaking space
> characters.

A prerequisite for doing something about this is to decide which
characters should allow breaking the line.  Is there some guidance in
the Unicode standard or elsewhere about this?  If not, we will have to
decide on our own.






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-16 16:05 ` Eli Zaretskii
@ 2012-07-16 18:21   ` Reuben Thomas
  2012-07-16 19:47     ` Eli Zaretskii
  2012-07-16 20:37   ` Stefan Monnier
  1 sibling, 1 reply; 24+ messages in thread
From: Reuben Thomas @ 2012-07-16 18:21 UTC
  To: Eli Zaretskii; +Cc: 11948

On 16 July 2012 17:05, Eli Zaretskii <eliz@gnu.org> wrote:
>
> A prerequisite for doing something about this is to decide which
> characters should allow breaking the line.  Is there some guidance in
> the Unicode standard or elsewhere about this?  If not, we will have to
> decide on our own.

The Unicode line breaking algorithm is probably the place to go:

http://unicode.org/reports/tr14/

-- 
http://rrt.sc3d.org






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-16 18:21   ` Reuben Thomas
@ 2012-07-16 19:47     ` Eli Zaretskii
  2012-07-16 19:48       ` Reuben Thomas
  2012-07-17  9:49       ` martin rudalics
  0 siblings, 2 replies; 24+ messages in thread
From: Eli Zaretskii @ 2012-07-16 19:47 UTC
  To: Reuben Thomas; +Cc: 11948

> Date: Mon, 16 Jul 2012 19:21:00 +0100
> From: Reuben Thomas <rrt@sc3d.org>
> Cc: 11948@debbugs.gnu.org
> 
> On 16 July 2012 17:05, Eli Zaretskii <eliz@gnu.org> wrote:
> >
> > A prerequisite for doing something about this is to decide which
> > characters should allow breaking the line.  Is there some guidance in
> > the Unicode standard or elsewhere about this?  If not, we will have to
> > decide on our own.
> 
> The Unicode line breaking algorithm is probably the place to go:
> 
> http://unicode.org/reports/tr14/

Thanks, but that's not what I meant.  Implementing UAX#14 in full is
an effort similar (although smaller) to what was required for
implementing UAX#9, the Unicode Bidirectional Algorithm.  The main
problem is that, like with UAX#9, the algorithms in UAX#14 are
specified assuming that text is processed for display in batches.  By
contrast, the Emacs display engine, which implements word-wrap,
examines and processes characters one by one.  So one needs to
"serialize", so to speak, the UAX#14 algorithms so that their decisions
could be made on a character-by-character basis.

I think just supporting more characters from LineBreak.txt on which to
wrap should be a good start, and much easier than implementing UAX#14.
Even for that, we will need an efficient char-table for the related
properties, probably via the uniprop_table machinery, like what bidi.c
uses.  Otherwise, referencing the ordinary char-tables of character
properties for each character we display could slow down redisplay too
much.
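
As a rough Lisp-level illustration of the table-lookup idea (the display
engine itself would do this in C), a char-table can record which
characters permit a wrap, so the per-character test is a single lookup.
The names `my-wrap-char-table' and `my-wrap-char-p' are hypothetical, and
the three characters listed are only examples, not a proposed set.

```elisp
;; Sketch only: these names are hypothetical and not part of Emacs.
(defvar my-wrap-char-table
  (let ((table (make-char-table 'my-wrap-char-table)))
    ;; Mark a few example characters as wrap opportunities.
    (dolist (char '(?\s ?\t #x2003))
      (aset table char t))
    table)
  "Char-table whose value is t for characters at which a line may wrap.")

(defun my-wrap-char-p (char)
  "Return non-nil if CHAR is marked as a wrap opportunity."
  (aref my-wrap-char-table char))
```

A real implementation would fill the table from LineBreak.txt rather than
from a hand-written list.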






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-16 19:47     ` Eli Zaretskii
@ 2012-07-16 19:48       ` Reuben Thomas
  2012-07-17  2:48         ` Eli Zaretskii
  2012-07-17  9:49       ` martin rudalics
  1 sibling, 1 reply; 24+ messages in thread
From: Reuben Thomas @ 2012-07-16 19:48 UTC
  To: Eli Zaretskii; +Cc: 11948

On 16 July 2012 20:47, Eli Zaretskii <eliz@gnu.org> wrote:
>
> Thanks, but that's not what I meant.  Implementing UAX#14 in full is
> an effort similar (although smaller) to what was required for
> implementing UAX#9, the Unicode Bidirectional Algorithm.  The main
> problem is that, like with UAX#9, the algorithms in UAX#14 are
> specified assuming that text is processed for display in batches.

I wasn't suggesting you should implement the algorithm; I just assumed
it would contain a list of breaking space characters. You seem to have
found such a thing anyway!

-- 
http://rrt.sc3d.org






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-16 16:05 ` Eli Zaretskii
  2012-07-16 18:21   ` Reuben Thomas
@ 2012-07-16 20:37   ` Stefan Monnier
  2012-07-16 20:40     ` Reuben Thomas
  1 sibling, 1 reply; 24+ messages in thread
From: Stefan Monnier @ 2012-07-16 20:37 UTC
  To: Eli Zaretskii; +Cc: 11948, Reuben Thomas

>> I noticed this when in visual-line-mode, and it failed to wrap at an em
>> space (U+2003), but of course there are lots of other breaking space
>> characters.
> A prerequisite for doing something about this is to decide which
> characters should allow breaking the line.  Is there some guidance in
> the Unicode standard or elsewhere about this?  If not, we will have to
> decide on our own.

I think the issue here is whether we want to "render" the text, or
whether we want to show the file's content to the user.

For text-rendering, any space-like thingy that Unicode says isn't
unbreakable would probably be fine, but for the other case, it can
be important for the user to see the difference between a normal space
and some other space and wrapping the line can hide the difference.

Maybe we can rely on a variable such as nobreak-char-display (where
there's a similar issue).


        Stefan






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-16 20:37   ` Stefan Monnier
@ 2012-07-16 20:40     ` Reuben Thomas
  2012-07-16 21:16       ` Stefan Monnier
  0 siblings, 1 reply; 24+ messages in thread
From: Reuben Thomas @ 2012-07-16 20:40 UTC
  To: Stefan Monnier; +Cc: 11948

On 16 July 2012 21:37, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>
> I think the issue here is whether we want to "render" the text, or
> whether we want to show the file's content to the user.

Isn't whitespace-mode for showing the content, as far as spaces are concerned?

-- 
http://rrt.sc3d.org






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-16 20:40     ` Reuben Thomas
@ 2012-07-16 21:16       ` Stefan Monnier
  2012-07-16 21:18         ` Reuben Thomas
  0 siblings, 1 reply; 24+ messages in thread
From: Stefan Monnier @ 2012-07-16 21:16 UTC
  To: Reuben Thomas; +Cc: 11948

>> I think the issue here is whether we want to "render" the text, or
>> whether we want to show the file's content to the user.
> Isn't whitespace-mode for showing the content, as far as spaces
> are concerned?

whitespace mode is good at showing where you have things like
trailing-spaces, but other than anal retentive guys like us,
nobody cares.  OTOH many people (myself included) have wasted hours
tracking bugs where some chunk of code contained some weird char like
a NBSP that displayed exactly like a normal space but isn't parsed the
same way.

For similar reasons, we don't treat ~ in TeX as whitespace: while its
rendering will display as whitespace its meaning in the source code is
non-trivial.

I'm aware that neither ~ in TeX nor NBSP are quite the same as the
problem at hand, but there is still the same general issue of
distinguishing the specification from its rendering and when editing
a file in Emacs you often want to see what the file specifies more than
what it will render to (which you'll want to see maybe elsewhere such
as in a browser).


        Stefan






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-16 21:16       ` Stefan Monnier
@ 2012-07-16 21:18         ` Reuben Thomas
  0 siblings, 0 replies; 24+ messages in thread
From: Reuben Thomas @ 2012-07-16 21:18 UTC
  To: Stefan Monnier; +Cc: 11948

On 16 July 2012 22:16, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>>> I think the issue here is whether we want to "render" the text, or
>>> whether we want to show the file's content to the user.
>> Isn't whitespace-mode for showing the content, as far as spaces
>> are concerned?
>
> whitespace mode is good at showing where you have things like
> trailing-spaces, but other than anal retentive guys like us,
> nobody cares.

whitespace-mode is quite a bit more general than that: it allows one
to visualise various space characters, every time they appear. It
could presumably be extended to visualise more unicode space
characters.

-- 
http://rrt.sc3d.org






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-16 19:48       ` Reuben Thomas
@ 2012-07-17  2:48         ` Eli Zaretskii
  0 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2012-07-17  2:48 UTC
  To: Reuben Thomas; +Cc: 11948

> Date: Mon, 16 Jul 2012 20:48:52 +0100
> From: Reuben Thomas <rrt@sc3d.org>
> Cc: 11948@debbugs.gnu.org
> 
> On 16 July 2012 20:47, Eli Zaretskii <eliz@gnu.org> wrote:
> >
> > Thanks, but that's not what I meant.  Implementing UAX#14 in full is
> > an effort similar (although smaller) to what was required for
> > implementing UAX#9, the Unicode Bidirectional Algorithm.  The main
> > problem is that, like with UAX#9, the algorithms in UAX#14 are
> > specified assuming that text is processed for display in batches.
> 
> I wasn't suggesting you should implement the algorithm; I just assumed
> it would contain a list of breaking space characters. You seem to have
> found such a thing anyway!

Yes, with your help: the file was mentioned in UAX#14.

Thanks.






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-16 19:47     ` Eli Zaretskii
  2012-07-16 19:48       ` Reuben Thomas
@ 2012-07-17  9:49       ` martin rudalics
  2012-07-17 12:00         ` Lennart Borgman
  2012-07-19 19:47         ` Eli Zaretskii
  1 sibling, 2 replies; 24+ messages in thread
From: martin rudalics @ 2012-07-17  9:49 UTC
  To: Eli Zaretskii; +Cc: 11948, Reuben Thomas

 >> The Unicode line breaking algorithm is probably the place to go:
 >>
 >> http://unicode.org/reports/tr14/
 >
 > Thanks, but that's not what I meant.  Implementing UAX#14 in full is
 > an effort similar (although smaller) to what was required for
 > implementing UAX#9, the Unicode Bidirectional Algorithm.  The main
 > problem is that, like with UAX#9, the algorithms in UAX#14 are
 > specified assuming that text is processed for display in batches.  By
 > contrast, the Emacs display engine, which implements word-wrap,
 > examines and processes characters one by one.  So one needs to
 > "serialize", so to speak, the UAX#14 algorithms so that their decisions
 > could be made on a character-by-character basis.
 >
 > I think just supporting more characters from LineBreak.txt on which to
 > wrap should be a good start, and much easier than implementing UAX#14.
 > Even for that, we will need an efficient char-table for the related
 > properties, probably via the uniprop_table machinery, like what bidi.c
 > uses.  Otherwise, referencing the ordinary char-tables of character
 > properties for each character we display could slow down redisplay too
 > much.

While you're all there: If anybody has any idea how to support a
practical and simplified version of collation, see

http://www.unicode.org/reports/tr10/

in emacs, I'd be all ears.

martin






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-17  9:49       ` martin rudalics
@ 2012-07-17 12:00         ` Lennart Borgman
  2012-07-17 13:13           ` martin rudalics
  2012-07-19 19:47         ` Eli Zaretskii
  1 sibling, 1 reply; 24+ messages in thread
From: Lennart Borgman @ 2012-07-17 12:00 UTC
  To: martin rudalics; +Cc: 11948, Reuben Thomas

On Tue, Jul 17, 2012 at 11:49 AM, martin rudalics <rudalics@gmx.at> wrote:
>
> While you're all there: If anybody has any idea how to support a
> practical and simplified version of collation, see
>
> http://www.unicode.org/reports/tr10/
>
> in emacs, I'd be all ears.

How do operating systems support this?






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-17 12:00         ` Lennart Borgman
@ 2012-07-17 13:13           ` martin rudalics
  2012-07-17 15:58             ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: martin rudalics @ 2012-07-17 13:13 UTC
  To: Lennart Borgman; +Cc: 11948, Reuben Thomas

 > How do operating systems support this?

I suppose most of them support it in some locale dependent manner.
Sadly, `sort-lines' and `dired' don't support it at all.  I can't use
`dired' because it doesn't.

martin






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-17 13:13           ` martin rudalics
@ 2012-07-17 15:58             ` Eli Zaretskii
  2012-07-18 16:16               ` martin rudalics
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2012-07-17 15:58 UTC
  To: martin rudalics; +Cc: 11948, rrt

> Date: Tue, 17 Jul 2012 15:13:20 +0200
> From: martin rudalics <rudalics@gmx.at>
> CC: Eli Zaretskii <eliz@gnu.org>, 11948@debbugs.gnu.org, 
>  Reuben Thomas <rrt@sc3d.org>
> 
>  > How do operating systems support this?
> 
> I suppose most of them support it in some locale dependent manner.
> Sadly, `sort-lines' and `dired' don't support it at all.  I can't use
> `dired' because it doesn't.

I suggest filing a separate bug report, and please explain there why
you cannot run Dired because of this, because I don't see the
relation.






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-17 15:58             ` Eli Zaretskii
@ 2012-07-18 16:16               ` martin rudalics
  0 siblings, 0 replies; 24+ messages in thread
From: martin rudalics @ 2012-07-18 16:16 UTC
  To: Eli Zaretskii; +Cc: 11948, rrt

 >> Sadly, `sort-lines' and `dired' don't support it at all.  I can't use
 >> `dired' because it doesn't.
 >
 > I suggest filing a separate bug report, and please explain there why
 > you cannot run Dired because of this, because I don't see the
 > relation.

I didn't say that I cannot "run" it.  I said that I cannot "use" it.
And I can't use it because ls/dired and I have different understandings
of what "alphabetically" means.

martin






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-17  9:49       ` martin rudalics
  2012-07-17 12:00         ` Lennart Borgman
@ 2012-07-19 19:47         ` Eli Zaretskii
  2012-07-21 11:02           ` martin rudalics
  2012-07-22  9:41           ` Stefan Monnier
  1 sibling, 2 replies; 24+ messages in thread
From: Eli Zaretskii @ 2012-07-19 19:47 UTC
  To: martin rudalics; +Cc: 11948, rrt

> Date: Tue, 17 Jul 2012 11:49:56 +0200
> From: martin rudalics <rudalics@gmx.at>
> CC: Reuben Thomas <rrt@sc3d.org>, 11948@debbugs.gnu.org
> 
> While you're all there: If anybody has any idea how to support a
> practical and simplified version of collation, see
> 
> http://www.unicode.org/reports/tr10/
> 
> in emacs, I'd be all ears.

We could provide a function suitable to be a PREDICATE argument for
'sort', which would call 'strcoll' in the underlying C library, can't
we?
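
For reference, a sketch of how such a predicate is used with `sort'.  It
assumes Emacs 25.1 or later, where `string-collate-lessp' is available as
a built-in; `my-sort-names' is a hypothetical helper name used only for
this illustration.

```elisp
;; Assumes Emacs 25.1 or later, where `string-collate-lessp' is built in.
;; `my-sort-names' is a hypothetical helper, not part of Emacs.
(defun my-sort-names (names &optional locale)
  "Return a sorted copy of NAMES using the C library's collation rules.
LOCALE, if non-nil, names the collation locale (for example
\"en_US.UTF-8\"); otherwise the current locale is used."
  ;; `sort' is destructive on lists, so sort a copy.
  (sort (copy-sequence names)
        (lambda (a b) (string-collate-lessp a b locale))))

;; The resulting order depends on the locale and on the C library.
(my-sort-names '("banana" "Apple" "cherry"))
```

The optional LOCALE argument lets the caller name a specific locale
instead of relying on the process-wide one.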






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-19 19:47         ` Eli Zaretskii
@ 2012-07-21 11:02           ` martin rudalics
  2012-07-21 12:42             ` Eli Zaretskii
  2012-07-22  9:41           ` Stefan Monnier
  1 sibling, 1 reply; 24+ messages in thread
From: martin rudalics @ 2012-07-21 11:02 UTC
  To: Eli Zaretskii; +Cc: 11948, rrt

 > We could provide a function suitable to be a PREDICATE argument for
 > 'sort', which would call 'strcoll' in the underlying C library, can't
 > we?

That would be awesome.  Can you try doing that?

martin






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-21 11:02           ` martin rudalics
@ 2012-07-21 12:42             ` Eli Zaretskii
  0 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2012-07-21 12:42 UTC
  To: martin rudalics; +Cc: 11948, rrt

> Date: Sat, 21 Jul 2012 13:02:10 +0200
> From: martin rudalics <rudalics@gmx.at>
> CC: rrt@sc3d.org, 11948@debbugs.gnu.org
> 
>  > We could provide a function suitable to be a PREDICATE argument for
>  > 'sort', which would call 'strcoll' in the underlying C library, can't
>  > we?
> 
> That would be awesome.  Can you try doing that?

I can try, but would you please file a separate bug report for this?






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-19 19:47         ` Eli Zaretskii
  2012-07-21 11:02           ` martin rudalics
@ 2012-07-22  9:41           ` Stefan Monnier
  1 sibling, 0 replies; 24+ messages in thread
From: Stefan Monnier @ 2012-07-22  9:41 UTC
  To: Eli Zaretskii; +Cc: 11948, rrt

>> While you're all there: If anybody has any idea how to support a
>> practical and simplified version of collation, see
>> http://www.unicode.org/reports/tr10/
>> in emacs, I'd be all ears.
> We could provide a function suitable to be a PREDICATE argument for
> 'sort', which would call 'strcoll' in the underlying C library, can't
> we?

We'd need to select a utf-8 locale before doing that, right?


        Stefan






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2012-07-16 12:42 bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab Reuben Thomas
  2012-07-16 16:05 ` Eli Zaretskii
@ 2022-04-22 12:47 ` Lars Ingebrigtsen
  2022-04-22 12:56   ` Eli Zaretskii
  1 sibling, 1 reply; 24+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-22 12:47 UTC
  To: Reuben Thomas; +Cc: 11948

Reuben Thomas <rrt@sc3d.org> writes:

> I noticed this when in visual-line-mode, and it failed to wrap at an em
> space (U+2003), but of course there are lots of other breaking space
> characters.

Eli, now that we have word-wrap-by-category, wouldn't this be easy to
implement?  I.e., do 

(modify-category-entry #x2003 ?|)

for all characters of general-category Zs in character.el?  Or would
that have other negative consequences?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2022-04-22 12:47 ` Lars Ingebrigtsen
@ 2022-04-22 12:56   ` Eli Zaretskii
  2022-04-23 11:32     ` Lars Ingebrigtsen
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2022-04-22 12:56 UTC
  To: Lars Ingebrigtsen; +Cc: 11948, rrt

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: 11948@debbugs.gnu.org, Eli Zaretskii <eliz@gnu.org>
> Date: Fri, 22 Apr 2022 14:47:23 +0200
> 
> Reuben Thomas <rrt@sc3d.org> writes:
> 
> > I noticed this when in visual-line-mode, and it failed to wrap at an em
> > space (U+2003), but of course there are lots of other breaking space
> > characters.
> 
> Eli, now that we have word-wrap-by-category, wouldn't this be easy to
> implement?  I.e., do 
> 
> (modify-category-entry #x2003 ?|)
> 
> for all characters of general-category Zs in character.el?  Or would
> that have other negative consequences?

Yes, now people who want what the OP wanted should be able to have
that easily.  But I would hesitate to make that the default, instead
leaving it to user customizations.  We could have a minor mode to do
that, though, so that users who want this won't need to customize
each character's category set individually.

Of course, the harder part here is to decide which of the Zs
characters will allow word-wrap on them.  I don't think all of them
should.






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2022-04-22 12:56   ` Eli Zaretskii
@ 2022-04-23 11:32     ` Lars Ingebrigtsen
  2022-04-23 11:49       ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-23 11:32 UTC
  To: Eli Zaretskii; +Cc: 11948, rrt

Eli Zaretskii <eliz@gnu.org> writes:

> Yes, now people who want what the OP wanted should be able to have
> that easily.  But I would hesitate to make that the default, instead
> leaving it to user customizations.  We could have a minor mode to do
> that, though, so that users who want this won't need to customize
> each character's category set individually.

Sure, a minor mode would work well here.

> Of course, the harder part here is to decide which of the Zs
> characters will allow word-wrap on them.  I don't think all of them
> should.

Looking over these:

17 matches for "Zs" in buffer: UnicodeData.txt
     33:0020;SPACE;Zs;0;WS;;;;;N;;;;;
    161:00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;
   5187:1680;OGHAM SPACE MARK;Zs;0;WS;;;;;N;;;;;
   7354:2000;EN QUAD;Zs;0;WS;2002;;;;N;;;;;
   7355:2001;EM QUAD;Zs;0;WS;2003;;;;N;;;;;
   7356:2002;EN SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
   7357:2003;EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
   7358:2004;THREE-PER-EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
   7359:2005;FOUR-PER-EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
   7360:2006;SIX-PER-EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
   7361:2007;FIGURE SPACE;Zs;0;WS;<noBreak> 0020;;;;N;;;;;
   7362:2008;PUNCTUATION SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
   7363:2009;THIN SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
   7364:200A;HAIR SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
   7401:202F;NARROW NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;;;;;
   7449:205F;MEDIUM MATHEMATICAL SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
  11232:3000;IDEOGRAPHIC SPACE;Zs;0;WS;<wide> 0020;;;;N;;;;;

I think only the no-break ones shouldn't trigger wrapping?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2022-04-23 11:32     ` Lars Ingebrigtsen
@ 2022-04-23 11:49       ` Eli Zaretskii
  2022-04-23 12:13         ` Lars Ingebrigtsen
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2022-04-23 11:49 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 11948, rrt

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: rrt@sc3d.org,  11948@debbugs.gnu.org
> Date: Sat, 23 Apr 2022 13:32:48 +0200
> 
> > Of course, the harder part here is to decide which of the Zs
> > characters will allow word-wrap on them.  I don't think all of them
> > should.
> 
> Looking over these:
> 
> 17 matches for "Zs" in buffer: UnicodeData.txt
>      33:0020;SPACE;Zs;0;WS;;;;;N;;;;;
>     161:00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;
>    5187:1680;OGHAM SPACE MARK;Zs;0;WS;;;;;N;;;;;
>    7354:2000;EN QUAD;Zs;0;WS;2002;;;;N;;;;;
>    7355:2001;EM QUAD;Zs;0;WS;2003;;;;N;;;;;
>    7356:2002;EN SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
>    7357:2003;EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
>    7358:2004;THREE-PER-EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
>    7359:2005;FOUR-PER-EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
>    7360:2006;SIX-PER-EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
>    7361:2007;FIGURE SPACE;Zs;0;WS;<noBreak> 0020;;;;N;;;;;
>    7362:2008;PUNCTUATION SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
>    7363:2009;THIN SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
>    7364:200A;HAIR SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
>    7401:202F;NARROW NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;;;;;
>    7449:205F;MEDIUM MATHEMATICAL SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
>   11232:3000;IDEOGRAPHIC SPACE;Zs;0;WS;<wide> 0020;;;;N;;;;;
> 
> I think only the no-break ones shouldn't trigger wrapping?

Those marked with "<noBreak>", you mean?  Yes.  But I think we should
add U+200B ZERO WIDTH SPACE to the list, although it's not Zs.
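
The selection rule agreed on here (Zs characters without a <noBreak>
decomposition, plus U+200B) could be sketched roughly as follows.  The
function name is hypothetical and not part of Emacs, and the scan stops at
U+3000 on the assumption that the Zs list quoted above is complete.

```elisp
;; Hypothetical helper; `my/breaking-space-characters' is not an Emacs function.
(defun my/breaking-space-characters ()
  "Return Zs characters lacking a <noBreak> decomposition, plus U+200B."
  ;; ZERO WIDTH SPACE is not Zs, so it is added by hand.
  (let ((result (list #x200B)))
    ;; Every Zs character quoted in the thread lies at or below U+3000.
    (dotimes (char (1+ #x3000))
      (when (and (eq (get-char-code-property char 'general-category) 'Zs)
                 ;; A compatibility tag such as `noBreak', if present, is
                 ;; the first element of the decomposition list.
                 (not (eq (car (get-char-code-property char 'decomposition))
                          'noBreak)))
        (push char result)))
    (nreverse result)))
```

Deriving the list from the character properties rather than hard-coding it
keeps the sketch in line with whatever Unicode data the running Emacs ships.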






* bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab
  2022-04-23 11:49       ` Eli Zaretskii
@ 2022-04-23 12:13         ` Lars Ingebrigtsen
  0 siblings, 0 replies; 24+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-23 12:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 11948, rrt

Eli Zaretskii <eliz@gnu.org> writes:

> Those marked with "<noBreak>", you mean?  Yes.  But I think we should
> add U+200B ZERO WIDTH SPACE to the list, although it's not Zs.

Now added as word-wrap-whitespace-mode in Emacs 29, but if you have a
better name, feel free to change it (and to tweak the list of space
characters further).
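
A minimal way to try the new mode from an init file might look like this.
Hooking it into text-mode is only an assumption made for the example; the
guard keeps the snippet harmless on Emacs versions that lack the mode.

```elisp
;; Sketch: enable the mode in text buffers when this Emacs provides it.
(when (fboundp 'word-wrap-whitespace-mode)
  (add-hook 'text-mode-hook #'word-wrap-whitespace-mode))
```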

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






end of thread, other threads:[~2022-04-23 12:13 UTC | newest]

Thread overview: 24+ messages
2012-07-16 12:42 bug#11948: 24.1.50; word-wrap should allow wrapping at all breaking space characters, not just space and tab Reuben Thomas
2012-07-16 16:05 ` Eli Zaretskii
2012-07-16 18:21   ` Reuben Thomas
2012-07-16 19:47     ` Eli Zaretskii
2012-07-16 19:48       ` Reuben Thomas
2012-07-17  2:48         ` Eli Zaretskii
2012-07-17  9:49       ` martin rudalics
2012-07-17 12:00         ` Lennart Borgman
2012-07-17 13:13           ` martin rudalics
2012-07-17 15:58             ` Eli Zaretskii
2012-07-18 16:16               ` martin rudalics
2012-07-19 19:47         ` Eli Zaretskii
2012-07-21 11:02           ` martin rudalics
2012-07-21 12:42             ` Eli Zaretskii
2012-07-22  9:41           ` Stefan Monnier
2012-07-16 20:37   ` Stefan Monnier
2012-07-16 20:40     ` Reuben Thomas
2012-07-16 21:16       ` Stefan Monnier
2012-07-16 21:18         ` Reuben Thomas
2022-04-22 12:47 ` Lars Ingebrigtsen
2022-04-22 12:56   ` Eli Zaretskii
2022-04-23 11:32     ` Lars Ingebrigtsen
2022-04-23 11:49       ` Eli Zaretskii
2022-04-23 12:13         ` Lars Ingebrigtsen
