Eli Zaretskii <eliz@gnu.org> wrote on Fri, 29 Sep 2017 at 21:56:

> > From: Philipp Stephani <p.stephani2@gmail.com>
> > Date: Thu, 28 Sep 2017 21:19:00 +0000
> > Cc: emacs-devel@gnu.org
> >
> > IIUC Jansson only accepts UTF-8 strings (i.e. it will generate an error
> > if some input is not a UTF-8 string), and will only return UTF-8 strings
> > as well. Therefore I think that direct conversion between Lisp strings
> > and C strings (using SDATA etc.) is always correct because the internal
> > Emacs encoding is a superset of UTF-8.
> > Also build_string should always be correct because it will generate a
> > correct multibyte string for a UTF-8 string with non-ASCII characters,
> > and a correct unibyte string for an ASCII string, right?
>
> I don't think it's a good idea to write code which has such
> assumptions embedded in it.  We don't do that in other cases, although
> UTF-8 based systems are widespread nowadays.  Instead, we make sure
> that encoding and decoding UTF-8 byte stream is implemented
> efficiently, and when possible simply reuses the same string data.
>
> Besides, these assumptions are not always true, for example:
>
>   . The Emacs internal representation could include raw bytes, whose
>     representations (both of them) are not valid UTF-8;
>   . Strings we receive from the library could be invalid UTF-8, in
>     which case putting them into a buffer or string without decoding
>     will mean trouble for programs that will try to process them.
>
> So I think decoding and encoding any string passed to/from Jansson is
> better for stability and future maintenance.  If you worry about
> performance, you shouldn't: we convert UTF-8 into our internal
> representation as efficiently as possible.
>
> >  > + /* LISP now must be a vector or hashtable. */
> >  > + if (++lisp_eval_depth > max_lisp_eval_depth)
> >  > + xsignal0 (Qjson_object_too_deep);
> >
> >  This error could mislead: the problem could be in the nesting of
> >  surrounding Lisp being too deep, and the JSON part could be just fine.
> >
> > Agreed, but I think it's better to use lisp_eval_depth here because it's
> > the total nesting depth that could cause stack overflows.
>
> Well, at least the error message should not point exclusively to a
> JSON problem, it should mention the possibility of a Lisp eval depth
> overflow as well.
>

OK, I've attached a new patch that incorporates most of these changes.
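
To make the second bullet above concrete, here is a minimal sketch of a
well-formedness check on bytes coming back from a library. The name
utf8_valid_p is hypothetical; it is not an existing Emacs or Jansson
function, and the sketch only rejects malformed input, it does not decode
anything into the Emacs internal representation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of Emacs or Jansson): return true if
   the LEN bytes at S are well-formed UTF-8, i.e. no overlong forms,
   no surrogates, and nothing above U+10FFFF.  */
static bool
utf8_valid_p (const unsigned char *s, size_t len)
{
  assert (s != NULL || len == 0);
  size_t i = 0;
  while (i < len)
    {
      unsigned char c = s[i];
      size_t need;        /* Number of continuation bytes expected.  */
      unsigned long min;  /* Smallest code point allowed at this length.  */
      unsigned long cp;   /* Code point being assembled.  */

      if (c < 0x80)
        {
          i++;            /* Plain ASCII byte.  */
          continue;
        }
      else if ((c & 0xE0) == 0xC0)
        need = 1, min = 0x80, cp = c & 0x1F;
      else if ((c & 0xF0) == 0xE0)
        need = 2, min = 0x800, cp = c & 0x0F;
      else if ((c & 0xF8) == 0xF0)
        need = 3, min = 0x10000, cp = c & 0x07;
      else
        return false;     /* Stray continuation or invalid lead byte.  */

      if (len - i <= need)
        return false;     /* Sequence is truncated.  */
      for (size_t k = 1; k <= need; k++)
        {
          if ((s[i + k] & 0xC0) != 0x80)
            return false; /* Not a continuation byte.  */
          cp = (cp << 6) | (s[i + k] & 0x3F);
        }
      if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return false;     /* Overlong, out of range, or surrogate.  */
      i += need + 1;
    }
  return true;
}
```

A caller would run this kind of check (or, better, a real decoding step)
before turning the bytes into a Lisp string, and signal an error on
failure instead of storing the bytes unchanged.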


>
> >  > + Lisp_Object string
> >  > + = make_string (buffer_and_size->buffer, buffer_and_size->size);
> >
> >  This is arbitrary text, so I'm not sure make_string is appropriate.
> >  Could the text be a byte stream, i.e. not human-readable text? If so,
> >  do we want to create a unibyte string or a multibyte string here?
> >
> > It should always be UTF-8.
>
> How does JSON express byte streams, then?  Doesn't it support data (as
> opposed to text)?
>

Usually using base64.
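
As an illustration of what that means in practice: the binary payload is
turned into ASCII text before it goes into a JSON string. A minimal sketch
follows, assuming the standard RFC 4648 alphabet with padding; the name
base64_encode is hypothetical and not an API of Jansson or Emacs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: encode LEN bytes at DATA as a NUL-terminated
   base64 string (RFC 4648 alphabet, with '=' padding).  Return a
   malloc'd string that the caller must free, or NULL if allocation
   fails.  Overflow of the size computation for huge LEN is not
   handled in this sketch.  */
static char *
base64_encode (const unsigned char *data, size_t len)
{
  static const char tbl[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  assert (data != NULL || len == 0);

  size_t out_len = 4 * ((len + 2) / 3);
  char *out = malloc (out_len + 1);
  if (out == NULL)
    return NULL;

  size_t i = 0, o = 0;
  /* Full groups: 3 input bytes become 4 output characters.  */
  for (; i + 2 < len; i += 3)
    {
      unsigned long v = ((unsigned long) data[i] << 16)
                        | ((unsigned long) data[i + 1] << 8)
                        | data[i + 2];
      out[o++] = tbl[(v >> 18) & 63];
      out[o++] = tbl[(v >> 12) & 63];
      out[o++] = tbl[(v >> 6) & 63];
      out[o++] = tbl[v & 63];
    }
  /* Trailing 1 or 2 bytes: pad the group with '='.  */
  if (len - i == 1)
    {
      unsigned long v = (unsigned long) data[i] << 16;
      out[o++] = tbl[(v >> 18) & 63];
      out[o++] = tbl[(v >> 12) & 63];
      out[o++] = '=';
      out[o++] = '=';
    }
  else if (len - i == 2)
    {
      unsigned long v = ((unsigned long) data[i] << 16)
                        | ((unsigned long) data[i + 1] << 8);
      out[o++] = tbl[(v >> 18) & 63];
      out[o++] = tbl[(v >> 12) & 63];
      out[o++] = tbl[(v >> 6) & 63];
      out[o++] = '=';
    }
  out[o] = '\0';
  return out;
}
```

The result is pure ASCII, so it is valid UTF-8 regardless of what the
original bytes were.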


>
> >  > + {
> >  > + bool overflow = INT_ADD_WRAPV (BUFFER_CEILING_OF (point), 1, &end);
> >  > + eassert (!overflow);
> >  > + }
> >  > + size_t count;
> >  > + {
> >  > + bool overflow = INT_SUBTRACT_WRAPV (end, point, &count);
> >  > + eassert (!overflow);
> >  > + }
> >
> >  Why did you need these blocks in braces?
> >
> > To be able to reuse the "overflow" name.
>
> Why can't you reuse it without the braces?
>
>
Then I'd have to assign to the existing variable instead of declaring a
fresh one in each block. Not a big deal, just personal style.
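
For comparison, a sketch of the two styles side by side. This is not the
code from the patch: it assumes GCC or Clang and substitutes
__builtin_add_overflow and __builtin_sub_overflow for the
INT_ADD_WRAPV and INT_SUBTRACT_WRAPV macros, plain assert for eassert,
and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Style from the patch: each check lives in its own block, so every
   "overflow" is a fresh variable initialized at its declaration.  */
static size_t
span_scoped (ptrdiff_t point, ptrdiff_t ceiling)
{
  ptrdiff_t end;
  {
    bool overflow = __builtin_add_overflow (ceiling, 1, &end);
    assert (!overflow);
  }
  size_t count;
  {
    bool overflow = __builtin_sub_overflow (end, point, &count);
    assert (!overflow);
  }
  return count;
}

/* Alternative without braces: one variable, assigned a second time.  */
static size_t
span_reused (ptrdiff_t point, ptrdiff_t ceiling)
{
  ptrdiff_t end;
  bool overflow = __builtin_add_overflow (ceiling, 1, &end);
  assert (!overflow);
  size_t count;
  overflow = __builtin_sub_overflow (end, point, &count);
  assert (!overflow);
  return count;
}
```

Both compute the same value; the only difference is whether the second
check declares a new variable or assigns to the old one.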