Eli Zaretskii wrote on Fri, 29 Sep 2017 at 21:56:

> > From: Philipp Stephani
> > Date: Thu, 28 Sep 2017 21:19:00 +0000
> > Cc: emacs-devel@gnu.org
> >
> > IIUC Jansson only accepts UTF-8 strings (i.e. it will generate an error
> > if some input is not a UTF-8 string), and will only return UTF-8
> > strings as well. Therefore I think that direct conversion between Lisp
> > strings and C strings (using SDATA etc.) is always correct because the
> > internal Emacs encoding is a superset of UTF-8. Also build_string
> > should always be correct because it will generate a correct multibyte
> > string for a UTF-8 string with non-ASCII characters, and a correct
> > unibyte string for an ASCII string, right?
>
> I don't think it's a good idea to write code which has such
> assumptions embedded in it. We don't do that in other cases, although
> UTF-8 based systems are widespread nowadays. Instead, we make sure
> that encoding and decoding a UTF-8 byte stream is implemented
> efficiently, and when possible simply reuses the same string data.
>
> Besides, these assumptions are not always true, for example:
>
>  . The Emacs internal representation could include raw bytes, whose
>    representations (both of them) are not valid UTF-8;
>  . Strings we receive from the library could be invalid UTF-8, in
>    which case putting them into a buffer or string without decoding
>    will mean trouble for programs that will try to process them.
>
> So I think decoding and encoding any string passed to/from Jansson is
> better for stability and future maintenance. If you worry about
> performance, you shouldn't: we convert UTF-8 into our internal
> representation as efficiently as possible.
>
> > > +  /* LISP now must be a vector or hashtable.  */
> > > +  if (++lisp_eval_depth > max_lisp_eval_depth)
> > > +    xsignal0 (Qjson_object_too_deep);
> >
> > This error could mislead: the problem could be in the nesting of
> > surrounding Lisp being too deep, and the JSON part could be just fine.
> >
> > Agreed, but I think it's better to use lisp_eval_depth here because
> > it's the total nesting depth that could cause stack overflows.
>
> Well, at least the error message should not point exclusively to a
> JSON problem, it should mention the possibility of a Lisp eval depth
> overflow as well.

OK, I've attached a new patch that incorporates most of these changes.

> > > +  Lisp_Object string
> > > +    = make_string (buffer_and_size->buffer, buffer_and_size->size);
> >
> > This is arbitrary text, so I'm not sure make_string is appropriate.
> > Could the text be a byte stream, i.e. not human-readable text? If so,
> > do we want to create a unibyte string or a multibyte string here?
> >
> > It should always be UTF-8.
>
> How does JSON express byte streams, then? Doesn't it support data (as
> opposed to text)?

Usually using base64.

> > > +  {
> > > +    bool overflow = INT_ADD_WRAPV (BUFFER_CEILING_OF (point), 1, &end);
> > > +    eassert (!overflow);
> > > +  }
> > > +  size_t count;
> > > +  {
> > > +    bool overflow = INT_SUBTRACT_WRAPV (end, point, &count);
> > > +    eassert (!overflow);
> > > +  }
> >
> > Why did you need these blocks in braces?
> >
> > To be able to reuse the "overflow" name.
>
> Why can't you reuse it without the braces?

Then I'd need to reuse the variable. Not a big deal, just personal style.
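
To make the difference between the two styles concrete, here is a rough standalone sketch. It is not the patch code: span_scoped and span_reused are hypothetical names, the GCC/Clang builtins __builtin_add_overflow and __builtin_sub_overflow stand in for INT_ADD_WRAPV and INT_SUBTRACT_WRAPV, plain assert stands in for eassert, and the ptrdiff_t types and the inclusive ceiling are assumptions suggested by the "+ 1" in the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not from the patch: number of bytes from POINT
   up to and including CEILING.  Each check declares its own OVERFLOW
   inside a braced block, so the name can be declared twice.  */
size_t
span_scoped (ptrdiff_t point, ptrdiff_t ceiling)
{
  ptrdiff_t end;
  {
    bool overflow = __builtin_add_overflow (ceiling, 1, &end);
    assert (!overflow);
  }
  size_t count;
  {
    bool overflow = __builtin_sub_overflow (end, point, &count);
    assert (!overflow);
  }
  return count;
}

/* Same computation without the braces: OVERFLOW is declared once and
   then assigned a second time, i.e. the variable itself is reused.  */
size_t
span_reused (ptrdiff_t point, ptrdiff_t ceiling)
{
  ptrdiff_t end;
  bool overflow = __builtin_add_overflow (ceiling, 1, &end);
  assert (!overflow);
  size_t count;
  overflow = __builtin_sub_overflow (end, point, &count);
  assert (!overflow);
  return count;
}
```

Both functions compute the same value. The braces only keep each flag local to its own check, so a later check cannot accidentally read a stale result.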