From: nisse@lysator.liu.se (Niels Möller)
To: Eli Zaretskii <eliz@gnu.org>
Cc: 15984@debbugs.gnu.org
Subject: bug#15984: 24.3; Problem with combining characters in attachment filename
Date: Fri, 29 Nov 2013 13:41:01 +0100
Message-ID: <nnsiufmlcy.fsf@bacon.lysator.liu.se>
In-Reply-To: <83r49z78jp.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 29 Nov 2013 13:26:50 +0200")
Eli Zaretskii <eliz@gnu.org> writes:
> However, we do want to give the user a way to
> delete only one or more of the combining characters, so forcing the
> entire combination to be a single indivisible entity would not be TRT
> for users.
Good question, how to handle this.
Today, to remove the dots from an "ä" character, I have to delete the
complete "ä" character and insert a new "a" character. Or similarly for
the reverse edit. I think this "atomic" handling is the desired
behaviour in many cases. And I don't think it should behave differently
depending on the representation of "ä" in the original file. But if you
have a complex sequence of Unicode combining characters, I agree there's
some need to be able to edit it. Maybe put point on the character and
invoke edit-char to enter some special mode which explodes the usually
"atomic" character into smaller pieces.
And such a character edit mode might be useful for more things than
Unicode combining characters, e.g., manipulating the different sub-parts
of a Chinese character. Anyway, this user interface is not intimately
tied to the internal character representation; its overall effect on the
buffer will be the same as replacing any substring.
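
To make the point about the two representations of "ä" concrete, here is a small sketch in Python (not Emacs Lisp, and not how Emacs handles it internally). The helper names same_text and strip_marks are made up for illustration. It shows that the precomposed and decomposed forms differ in code point count but normalize to the same text, and that "removing the dots" gives the same result whichever form the file used.

```python
import unicodedata

precomposed = "\u00e4"   # "ä" as a single code point
decomposed = "a\u0308"   # "a" followed by COMBINING DIAERESIS


def same_text(x, y):
    """Compare two strings after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", x) == unicodedata.normalize("NFC", y)


def strip_marks(s):
    """Decompose, then drop every code point in a Mark (M*) category."""
    return "".join(
        c
        for c in unicodedata.normalize("NFD", s)
        if not unicodedata.category(c).startswith("M")
    )


if __name__ == "__main__":
    print(len(precomposed), len(decomposed))
    print(same_text(precomposed, decomposed))
    print(strip_marks(precomposed) == strip_marks(decomposed))
```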
>> When reading text files, the character boundaries may be configurable.
>
> The important question is what to do by default,
I'm pretty sure the default should be that a sequence of one Unicode
base char and all following Unicode combining chars is interned as a
single "emacs character". (I think the detailed rules for this are
spelled out in the Unicode book.) With some arbitrary limit to prevent a
GByte file containing only Unicode combining characters from being read
as a single emacs character; say at most 10 combining characters.
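
A rough sketch of that default, in Python rather than Emacs Lisp, and only an illustration of the idea, not a proposal for the actual implementation. It assumes "combining char" means any code point in a Unicode Mark category (Mn, Mc, Me), which is simpler than the full segmentation rules in the Unicode standard. The names split_units and MAX_COMBINING are made up here.

```python
import unicodedata

MAX_COMBINING = 10  # the arbitrary cap suggested above


def is_combining(ch):
    """Assumption: treat every Mark-category code point as combining."""
    return unicodedata.category(ch).startswith("M")


def split_units(text, max_combining=MAX_COMBINING):
    """Group each base char with the combining chars that follow it.

    At most max_combining combining chars end up in one unit, so a
    long run of combining chars is split into several units.
    """
    units = []
    current = ""
    n_comb = 0
    for ch in text:
        comb = is_combining(ch)
        if current and comb and n_comb < max_combining:
            # Attach to the unit being built.
            current += ch
            n_comb += 1
        else:
            # Start a new unit (base char, or cap reached, or no base yet).
            if current:
                units.append(current)
            current = ch
            n_comb = 1 if comb else 0
    if current:
        units.append(current)
    return units


if __name__ == "__main__":
    for unit in split_units("a\u0308bc\u0301\u0323"):
        print([f"U+{ord(c):04X}" for c in unit])
```

With this sketch, a stray combining char with no base in front of it simply starts its own unit, which seems the least surprising choice.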
> You are mixing display issues with editing issues and with how
> characters are represented internally in an Emacs buffer.
I think it's confusing for users if the units of text which forward-char
skips over do not correspond to the units matched by "." in
isearch-forward-regexp.
My suggested internal representation seems to be a natural way to get
this correspondence right, at the cost of some memory (or lots of
complexity in reducing memory usage). I'm sure there are other ways, and
maybe also a lot better ways, to implement the same thing.
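
As an analogy only: the same kind of mismatch is easy to see in Python's re module, where "." matches one code point. This says nothing about what the Emacs regexp engine does. The helper name dot_matches is made up for the example.

```python
import re
import unicodedata


def dot_matches(text):
    """Return the substrings matched by "." (one code point each)."""
    return re.findall(".", text, flags=re.DOTALL)


decomposed = "a\u0308"                                 # a + COMBINING DIAERESIS
composed = unicodedata.normalize("NFC", decomposed)    # single code point "ä"

if __name__ == "__main__":
    # The user sees one "ä" in both cases, but "." counts differently.
    print(len(dot_matches(decomposed)))
    print(len(dot_matches(composed)))
```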
> Thanks, I will try that.
Now I've also reproduced it on the same machine, without my normal Gnus
setup getting in the way. I start emacs with
$ rm -rf ~/tmp/home/ && mkdir ~/tmp/home/ && HOME=$HOME/tmp/home emacs -nw -Q -l bug.el
where bug.el contains
(setq gnus-init-file nil)
(setq gnus-nntp-server nil)
(gnus-no-server)
Then create the group with G d, pointing out the spool-like directory,
enter the group (RET), view the message (RET), try to write out the
attachment ("o" on the attachment button). Still crashes for me.
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.