Eli Zaretskii schrieb am Di., 19. Sep. 2017 um 21:10 Uhr: > > From: Philipp Stephani > > Date: Tue, 19 Sep 2017 08:18:14 +0000 > > > > Here's a newer version of the patch. The only significant difference is > that now the Lisp values for JSON null > > and false are :null and :false, respectively. Using a dedicated symbol > for :null reduces the mental overhead of > > the triple meaning of nil (null, false, empty list), and is more > future-proof, should we ever want to support lists. > > Thanks, a few comments below. > Thanks for the review. Most of the comments are about converting between C and Lisp strings, so let me summarize my questions here. IIUC Jansson only accepts UTF-8 strings (i.e. it will generate an error some input is not an UTF-8 string), and will only return UTF-8 strings as well. Therefore I think that direct conversion between Lisp strings and C strings (using SDATA etc.) is always correct because the internal Emacs encoding is a superset of UTF-8. Also build_string should always be correct because it will generate a correct multibyte string for an UTF-8 string with non-ASCII characters, and a correct unibyte string for an ASCII string, right? > > > +static _Noreturn void > > +json_parse_error (const json_error_t *error) > > +{ > > + xsignal (Qjson_parse_error, > > + list5 (build_string (error->text), build_string > (error->source), > > + make_natnum (error->line), make_natnum > (error->column), > > + make_natnum (error->position))); > > +} > > I think error->source could include non-ASCII characters, in which > case you need to use make_specified_string with its last argument > non-zero, not build_string, which has its own ideas about when to > produce a multibyte string. > > > +static _GL_ARG_NONNULL ((2)) Lisp_Object > > +lisp_to_json_1 (Lisp_Object lisp, json_t **json) > > +{ > > + if (VECTORP (lisp)) > > + { > > + ptrdiff_t size = ASIZE (lisp); > > + eassert (size >= 0); > > + if (size > SIZE_MAX) > > + xsignal1 (Qoverflow_error, build_pure_c_string ("vector is too > long")); > > I don't think you can allocate pure storage at run time, only at dump > time. (There are more of this elsewhere in the patch.) > OK, will be fixed in the next version. > > > + /* LISP now must be a vector or hashtable. */ > > + if (++lisp_eval_depth > max_lisp_eval_depth) > > + xsignal0 (Qjson_object_too_deep); > > This error could mislead: the problem could be in the nesting of > surrounding Lisp being too deep, and the JSON part could be just fine. > Agreed, but I think it's better to use lisp_eval_depth here because it's the total nesting depth that could cause stack overflows. > > > + Lisp_Object string > > + = make_string (buffer_and_size->buffer, buffer_and_size->size); > > This is arbitrary text, so I'm not sure make_string is appropriate. > Could the text be a byte stream, i.e. not human-readable text? If so, > do we want to create a unibyte string or a multibyte string here? > It should always be UTF-8. > > > + insert_from_string (string, 0, 0, SCHARS (string), SBYTES (string), > false); > > Hmmm... if you want to insert the text into the buffer, you need to > make sure it has the right representation. What kind of text is this? > It probably should be decoded. > > In any case, going through a string sounds gross. You should insert > the text directly into the gap, like we do in a couple of places > already. See insert_from_gap and its users, and maybe also > decode_coding_gap. > OK, I'll have to check that, but it sounds doable. > > > +DEFUN ("json-parse-string", Fjson_parse_string, Sjson_parse_string, 1, > 1, NULL, > > + doc: /* Parse the JSON STRING into a Lisp object. > > +This is essentially the reverse operation of `json-serialize', which > > +see. The returned object will be a vector or hashtable. Its elements > > +will be `:null', `:false', t, numbers, strings, or further vectors and > > +hashtables. If there are duplicate keys in an object, all but the > > +last one are ignored. If STRING doesn't contain a valid JSON object, > > +an error of type `json-parse-error' is signaled. */) > > + (Lisp_Object string) > > +{ > > + ptrdiff_t count = SPECPDL_INDEX (); > > + check_string_without_embedded_nulls (string); > > + > > + json_error_t error; > > + json_t *object = json_loads (SSDATA (string), 0, &error); > > Doesn't json_loads require the string to be encoded in some particular > encoding? If so, passing it our internal representation might not be > TRT. > > > + /* First, parse from point to the gap or the end of the accessible > > + portion, whatever is closer. */ > > + ptrdiff_t point = d->point; > > + ptrdiff_t end; > > + { > > + bool overflow = INT_ADD_WRAPV (BUFFER_CEILING_OF (point), 1, &end); > > + eassert (!overflow); > > + } > > + size_t count; > > + { > > + bool overflow = INT_SUBTRACT_WRAPV (end, point, &count); > > + eassert (!overflow); > > + } > > Why did you need these blocks in braces? > To be able to reuse the "overflow" name/ > > > +(provide 'json-tests) > > +;;; json-tests.el ends here > > IMO, it would be good to test also non-ASCII text in JSON objects. > > Yes, once the patch is in acceptable shape, I plan to add many more tests.