unofficial mirror of bug-gnu-emacs@gnu.org 
From: Lars Ingebrigtsen <larsi@gnus.org>
To: "Johannes Grødem" <fjas@grdm.no>
Cc: 54591@debbugs.gnu.org
Subject: bug#54591: 29.0.50; sqlite-select returns blob result as multibyte string
Date: Sat, 02 Apr 2022 14:59:21 +0200
Message-ID: <87v8vrljyu.fsf@gnus.org>
In-Reply-To: <878rso7iuu.fsf@flokut.localdomain> ("Johannes Grødem"'s message of "Fri, 01 Apr 2022 20:34:49 +0200")

Johannes Grødem <fjas@grdm.no> writes:

> I might be misunderstanding the issue, but SQLite column types are more
> like documentation than actual rules to be enforced, unless STRICT
> tables are enabled.

Yeah, you can put anything you want into TEXT and BLOB columns.  What
I'd like to see happening is that the Emacs interface here is
predictable and convenient, and that makes my brain hurt a bit here.
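
To illustrate the point about column types outside Emacs, here is a minimal sketch using Python's standard sqlite3 module (the table name `demo` and its column names are made up for the example). It stores bytes in a TEXT column and a string in a BLOB column, then asks SQLite which storage class each value actually got:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one column declared TEXT, one declared BLOB.
conn.execute("CREATE TABLE demo (t TEXT, b BLOB)")

# Deliberately "wrong" way round: bytes into TEXT, a str into BLOB.
# Python's sqlite3 binds bytes as a BLOB value and str as a TEXT value.
conn.execute(
    "INSERT INTO demo VALUES (?, ?)",
    (b"\x66\xc3\xb3\x6f", "f\u00f3o"),
)

# typeof() reports the storage class of each stored value,
# not the declared type of the column.
stored_types = conn.execute("SELECT typeof(t), typeof(b) FROM demo").fetchone()
print(stored_types)
conn.close()
```

In a non-STRICT table the declared type does not stop either insert, so the storage class follows the value that was bound rather than the column declaration.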

Let's take a TEXT column first.  Currently, if you have the multibyte
string "fóo" and insert with "insert into ... (?)", we encode to utf-8
and put the bytes #x66#xc3#xb3#x6f into the database.  Selecting from
the database, we get the bytes #x66#xc3#xb3#x6f back, decode and return
the string "fóo".

If you have a unibyte string containing the bytes #x66#xc3#xb3#x6f, we
don't do anything with that, but insert the bytes as is.  When
selecting, we decode and return "fóo", which is not what the user
inserted.  In this case, it would be nice to signal an error, but we
can't, because we don't know that it's a TEXT column in the first place.
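
The ambiguity can be seen at the byte level. The following sketch is plain Python rather than Emacs Lisp, and only demonstrates the UTF-8 arithmetic, not the Emacs code path: the text "fóo" and the four raw bytes produce identical data, so nothing in the stored value says which one the caller started with.

```python
# "fóo" written with an explicit escape for U+00F3 to avoid editor encoding issues.
text = "f\u00f3o"

# Encoding the text to UTF-8 gives the four bytes mentioned above.
utf8_bytes = text.encode("utf-8")
hex_bytes = " ".join(f"#x{b:02x}" for b in utf8_bytes)

# A caller who started from raw bytes would hand over exactly the same data.
raw = bytes([0x66, 0xC3, 0xB3, 0x6F])
same_on_disk = raw == utf8_bytes

# Decoding on the way out turns both cases into the text "fóo".
decoded = raw.decode("utf-8")
print(hex_bytes, same_on_disk, decoded == text)
```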

Conversely, with BLOB columns, we would prefer to signal an error on
multibyte strings, but we can't, because we don't know that it's a BLOB
column.  But we do the right thing with unibyte strings -- if you give
it #x66#xc3#xb3#x6f, it'll put those bytes into the BLOB column, and
when selecting, we do know that it's a BLOB column, so we could return
the unibyte string #x66#xc3#xb3#x6f, and everything's fine.  However, if
the user wanted to insert the string "fóo", they'll be getting
#x66#xc3#xb3#x6f back and will probably be sad.

Today, the semantics are at least predictable: We insert everything
encoded to utf-8 (no matter whether using bound parameters or inside the
query string), and if the user wanted something binary in the BLOB they
selected, they just have to call `(encode-coding-string BLOB-RESULT
'utf-8)' to get the binary data.

Which I understand is confusing, because it's very confusing indeed.
But it's consistent, at least.

If we knew the type of the column we were inserting into, we could
be more helpful in the interface, but there doesn't seem to be a way to
get at that information?
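
One possible avenue, offered only as a sketch and not as something this thread establishes for the Emacs interface: SQLite exposes the declared column types of a table through `PRAGMA table_info`. The example below uses Python's standard sqlite3 module with a hypothetical table `foo`. Note that the declared type is still only advisory in non-STRICT tables, and that mapping a bound parameter in an arbitrary INSERT statement back to its column is a separate problem this does not solve.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table for the example.
conn.execute("CREATE TABLE foo (name TEXT, payload BLOB)")

# Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk);
# index 1 is the column name, index 2 the declared type as written in the schema.
declared = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(foo)")}
print(declared)
conn.close()
```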

> By the way, if you want to insert BLOBs in the query itself you can do
> it like this, but I guess this doesn't need Emacs support, except maybe
> a helper function for the conversion:
>
>   INSERT INTO foo VALUES (X'deadcafe');

Yes, but that leaves the issue to the caller, and the issue about what
to do when selecting is still unclear.
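
For reference, a conversion helper of the kind mentioned in the quoted text could look roughly like this. It is a sketch in Python using the standard sqlite3 module; `blob_literal` is a hypothetical name, not an existing Emacs or SQLite function.

```python
import sqlite3


def blob_literal(data: bytes) -> str:
    """Return an SQLite hex BLOB literal such as X'deadcafe' (hypothetical helper)."""
    # bytes.hex() emits only [0-9a-f], so the result is safe to splice into SQL.
    return "X'" + data.hex() + "'"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (payload)")

literal = blob_literal(bytes.fromhex("deadcafe"))
conn.execute(f"INSERT INTO foo VALUES ({literal})")

# The value is stored with storage class 'blob' and comes back as raw bytes.
row = conn.execute("SELECT typeof(payload), payload FROM foo").fetchone()
print(literal, row)
conn.close()
```

This covers only the insert side. What a select should return for such a value is the open question above.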

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no