From: "James K. Lowden" <jklowden@speakeasy.net>
To: help-gnu-emacs@gnu.org
Subject: Re: Is there a way to "asciify" a string?
Date: Thu, 31 May 2018 19:23:48 -0400
Message-ID: <20180531192348.22baa2917129486248557378@speakeasy.net>
In-Reply-To: mailman.871.1527781438.1292.help-gnu-emacs@gnu.org
On Thu, 31 May 2018 17:42:33 +0200
Marcin Borkowski <mbork@mbork.pl> wrote:
> > I really strongly recommend you try to solve this problem by doing
> > nothing: keep the name in its full glory. Nowadays users *should*
> > expect this to work.
>
> It's tempting, but no: these files will eventually be sent to
> e.g. people on Windows XP and the like. I don't want to take risks of
> unreadable filenames.
It's good advice, though treacherous. If you use any encoding other
than ASCII, you'll need to indicate the encoding used, and put up with
recipients who don't know what "encoding" is, or can't re-encode the
names to their machine's preferred encoding.
For instance, if you send UTF-8, you can expect befuddlement from
Windows users, whose system implicitly recognizes UTF-16LE.
I can hardly blame you for not wanting to do that.
If Windows's filename rules were the actual constraint, the set of
characters allowed in a Windows filename is well defined. The
prohibited characters could be URL-encoded or similar. That would
yield a recognizable, unique name, and the original could be recovered
by reversing the process.
If I were solving your problem, I'd look for something similar to what
you describe, but wholly reversible. I'd use ascii//TRANSLIT or similar
to get the "unaccented" version of the character, and insert a
URL-style escape after each one representing the original
Unicode character in hex. So,
Jönköping
becomes
Jo%F6nko%F6ping
If you escape literal percent signs, too, ("%" becomes "%%25") then
the reversal rule is simply "for every /%[:xdigit:]{2}/, replace the
previous character with the indicated codepoint".
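To make the round trip concrete, here is a minimal sketch in Python. The
function names asciify and unasciify are made up for illustration, and
Unicode NFD decomposition stands in for ascii//TRANSLIT, which is an
assumption: iconv's transliteration may choose different base characters.
Because the escape is only two hex digits, the sketch handles characters
up to U+00FF and rejects anything else rather than guessing.

```python
import re
import unicodedata


def asciify(name: str) -> str:
    """Encode name as ASCII: base letter followed by %XX of the original."""
    out = []
    for ch in name:
        if ch == "%":
            # Literal percent sign: keep it, then escape it as U+0025.
            out.append("%%25")
        elif ord(ch) < 0x80:
            out.append(ch)
        else:
            # Assumption: the first code point of the NFD form is the
            # "unaccented" character (stand-in for ascii//TRANSLIT).
            base = unicodedata.normalize("NFD", ch)[0]
            if ord(ch) > 0xFF or ord(base) >= 0x80:
                raise ValueError(f"cannot encode {ch!r} with a 2-digit escape")
            out.append(f"{base}%{ord(ch):02X}")
    return "".join(out)


# One character, then '%' and exactly two hex digits.
_ESCAPE = re.compile(r".%([0-9A-Fa-f]{2})", re.DOTALL)


def unasciify(encoded: str) -> str:
    """Reverse asciify: replace the character before each %XX escape."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), encoded)
```

With these definitions, asciify("Jönköping") gives "Jo%F6nko%F6ping" and
unasciify reverses it. Characters with no single-letter ASCII base (ß, ø,
anything above U+00FF) raise ValueError here; a real tool would need a
wider escape or a placeholder for them.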
This approach preserves uniqueness in the filename, so you can dispense
with "uniquifying" it with a meaningless integer.
--jkl