From: Boruch Baum <boruch_baum@gmx.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: emacs-devel@gnu.org
Subject: Re: fixing url-unhex-string for unicode/multi-byte charsets
Date: Fri, 6 Nov 2020 07:28:46 -0500
Message-ID: <20201106122846.unoizvad53blgncf@E15-2016.optimum.net>
In-Reply-To: <83pn4q8zdz.fsf@gnu.org>
On 2020-11-06 14:04, Eli Zaretskii wrote:
> > Date: Fri, 6 Nov 2020 05:27:56 -0500
> > From: Boruch Baum <boruch_baum@gmx.com>
> > Cc: emacs-devel@gnu.org
> I can't, not in full: I don't have a Freedesktop trash anywhere I have
> access to. I did try the 2 file names you posted, including the one
> with Hebrew characters, and it did work for me, on the assumption that
> file-name-coding-system is UTF-8.
>
> > To reproduce, touch and then trash a file named some two Hebrew
> > words delimited by a space. Navigate to the trash directory's 'info'
> > sub-directory and extract the 'path' value from the file's meta-data
> > .info file. That's the string we need to decode. Apply the string to
> > your solution and see that you do not get the space-delimited two
> > Hebrew words.
>
> A stand-alone test case, which doesn't require an actual trash, would
> be appreciated, so I could see which part doesn't work, and how to
> fix it.
That would be the two file names that I previously posted. You say that
they succeeded for you, but they didn't for me. The result I got was
good for the first case (two English words), and garbage for the second
case (two Hebrew words).
> Alternatively, maybe you could explain why you needed to insert the
> text into a temporary buffer and then extract it from there? AFAIK,
> we have the same primitives that work on decoding strings as we have
> for decoding buffer text.
I don't need to. It's the implementation used in emacs-w3m. I also pointed
out that eww does it differently. I think the need in emacs-w3m is to
mix ASCII characters with selected binary output, which can't be done
with, say, replace-regexp-in-string. So what they do is use a temporary
buffer, make it unibyte by calling `set-buffer-multibyte' with nil, and
build the result in that buffer instead of using
replace-regexp-in-string.
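
For illustration only, here is a rough sketch in Python (not the
emacs-w3m code, and not a proposed patch) of the idea described above:
collect the plain ASCII characters and the %XX escapes as raw bytes
first, and decode the whole byte sequence as UTF-8 only at the end. The
function names and the sample string are hypothetical; the sample is an
arbitrary pair of Hebrew words, not one of the file names posted earlier
in the thread.

```python
import re

HEX_PAIR = re.compile(r"[0-9A-Fa-f]{2}")


def unhex_to_bytes(text: str) -> bytes:
    """Turn %XX escapes into raw bytes, keeping other characters as-is."""
    out = bytearray()
    i = 0
    while i < len(text):
        pair = text[i + 1:i + 3]
        if text[i] == "%" and HEX_PAIR.fullmatch(pair):
            out.append(int(pair, 16))          # one raw byte per escape
            i += 3
        else:
            out.extend(text[i].encode("utf-8"))  # literal character
            i += 1
    return bytes(out)


def decode_path(text: str, coding: str = "utf-8") -> str:
    """Decode the collected bytes in one step, after all escapes are unhexed."""
    return unhex_to_bytes(text).decode(coding)


# Hypothetical sample: two Hebrew words separated by an escaped space.
encoded = "%D7%A9%D7%9C%D7%95%D7%9D%20%D7%A2%D7%95%D7%9C%D7%9D"
decoded = decode_path(encoded)

# Treating each byte as its own character (here via Latin-1) is one way
# to end up with garbage instead of the Hebrew text.
garbled = unhex_to_bytes(encoded).decode("latin-1")
```

The point of the sketch is only the ordering: a multi-byte character
spans several escapes, so the bytes have to be gathered before any
decoding is attempted.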
--
hkp://keys.gnupg.net
CA45 09B5 5351 7C11 A9D1 7286 0036 9E45 1595 8BC0