From: Paul Eggert <eggert@cs.ucla.edu>
To: Philipp Stephani <p.stephani2@gmail.com>, Eli Zaretskii <eliz@gnu.org>
Cc: larsi@gnus.org, johnw@gnu.org, emacs-devel@gnu.org
Subject: Re: Character literals for Unicode (control) characters
Date: Mon, 14 Mar 2016 13:03:38 -0700
Message-ID: <56E7191A.60507@cs.ucla.edu>
In-Reply-To: <CAArVCkTjD+09yPfY50Xb7TbxC2hW_Fuf03pQZy0p0bfAU0hafQ@mail.gmail.com>
Thanks, here's a detailed low level review.
> Subject: [PATCH 4/4] Use `ucs-names'.
Summary lines like "Use `ucs-names'." should not end with "." and should
be as informative as possible within a 50-char limit.
> +#include <stdnoreturn.h>
This include reportedly doesn't work well with Microsoft compilers. Omit
it and use _Noreturn instead of noreturn.
> +/* Signals an `invalid-read-syntax' error indicating that the
> + character name in an \N{...} literal is invalid. */
Use active voice "Signal an" rather than a non-sentence. Don't use grave
quoting in comments (no quoting needed here anyway).
> +static noreturn void invalid_character_name (Lisp_Object name)
Put "static _Noreturn void" on the first line, and the rest on the next
line; that's the usual GNU style.
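A minimal sketch of the declaration style suggested above, written as standalone C11 rather than Emacs code. The jump buffer error_handler and the const char * argument are hypothetical stand-ins for Emacs's error signaling and Lisp_Object; only the placement of "static _Noreturn void" and the comment wording are the point.

```c
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical stand-in for Emacs's error signaling: a jump buffer
   that the caller sets up before running code that may fail.  */
static jmp_buf error_handler;

/* Signal an error indicating that NAME is not a valid character
   name.  This function never returns to its caller.  */
static _Noreturn void
invalid_character_name (const char *name)
{
  fprintf (stderr, "invalid character name: %s\n", name);
  longjmp (error_handler, 1);
}
```

The _Noreturn keyword comes from the C11 language itself, so no <stdnoreturn.h> include is needed.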
> +/* Checks that CODE is a valid Unicode scalar value, and returns its
> + value. CODE should be parsed from the character name given by
> + NAME. NAME is used for error messages. */
Active voice: "Checks" -> "Check".
> +static int check_scalar_value (Lisp_Object code, Lisp_Object name)
"static int" in a separate line.
> +{
> + if (! RANGED_INTEGERP (0, code, MAX_UNICODE_CHAR) ||
> + /* Don't allow surrogates. */
> + RANGED_INTEGERP (0xD800, code, 0xDFFF))
> + invalid_character_name (name);
> + return XINT (code);
> +}
RANGED_INTEGERP implies two tests for integer. Better would be an
explicit INTEGERP check, followed by an XINT, followed by C-language
range checks. Just use <= or < in range checks (not >= or >).
Also, don't put operators like || at the end of a line; put them at the
start of the next line instead.
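The suggestions above might look roughly like the following standalone sketch. It is not the Emacs implementation: the is_integer flag is a hypothetical stand-in for a Lisp type check, MAX_UNICODE mirrors Emacs's MAX_UNICODE_CHAR, and an invalid value yields -1 instead of signaling an error.

```c
#include <stdbool.h>

/* Largest Unicode code point (MAX_UNICODE_CHAR in Emacs).  */
enum { MAX_UNICODE = 0x10FFFF };

/* Return CODE if it is a valid Unicode scalar value, -1 otherwise.
   IS_INTEGER says whether the parsed value was an integer at all;
   it is a hypothetical stand-in for a Lisp type check.  */
static int
check_scalar_value (bool is_integer, long code)
{
  /* Test the type once, up front.  */
  if (! is_integer)
    return -1;
  /* Range checks use only <=, with || at the start of the line.  */
  if (! (0 <= code && code <= MAX_UNICODE)
      /* Don't allow surrogates.  */
      || (0xD800 <= code && code <= 0xDFFF))
    return -1;
  return code;
}
```

Writing every comparison with <= lets each range read left to right like a number line.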
> +/* If NAME starts with PREFIX, interpret the rest as a hexadecimal
> + number and return its value. Raises `invalid-read-syntax' if the
> + number is not a valid scalar value. Returns -1 if NAME doesn't
> + start with PREFIX. */
Active voice. No need for grave quoting.
> +static int
> +parse_code_after_prefix (Lisp_Object name, const char* prefix)
"char* x" -> "char *x" in GNU style.
> + if (name_len > prefix_len && name_len <= prefix_len + 8
Just use < or <= for range checks.
> + Lisp_Object code = string_to_number (SDATA (name) + prefix_len,
> + 16, false);
> + if (! NILP (code))
> + return check_scalar_value (code, name);
Why is nil treated differently from other invalid values (e.g.,
floating-point numbers)? They're all invalid character names, right?
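One way to treat every unparsable suffix the same is sketched below in standalone C. This is an assumption-laden illustration, not the patch: it works on C strings rather than Lisp strings, and the NO_PREFIX and INVALID_NAME return codes are hypothetical replacements for "return -1" and for signaling an error.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result codes for this sketch.  */
enum { NO_PREFIX = -1, INVALID_NAME = -2 };

/* If NAME starts with PREFIX followed by 1 to 8 characters, interpret
   the rest as a hexadecimal number and return its value.  Return
   INVALID_NAME if the rest is not a valid scalar value, whatever the
   reason.  Return NO_PREFIX if NAME doesn't have that shape.  */
static int
parse_code_after_prefix (char const *name, char const *prefix)
{
  size_t name_len = strlen (name);
  size_t prefix_len = strlen (prefix);
  if (! (prefix_len < name_len && name_len <= prefix_len + 8
         && strncmp (name, prefix, prefix_len) == 0))
    return NO_PREFIX;

  /* Reject anything other than hex digits, so that signs, "0x",
     whitespace and the like all fail in the same way.  */
  for (size_t i = prefix_len; i < name_len; i++)
    if (! isxdigit ((unsigned char) name[i]))
      return INVALID_NAME;

  long code = strtol (name + prefix_len, NULL, 16);
  if (! (0 <= code && code <= 0x10FFFF)
      || (0xD800 <= code && code <= 0xDFFF))
    return INVALID_NAME;
  return code;
}
```

With a hypothetical prefix such as "U+", the name "U+0041" yields 0x41, while "U+12G4" and "U+D800" both yield INVALID_NAME.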
>
> + /* Various ranges of CJK characters; see UnicodeData.txt. */
> + if ((code >= 0x3400 && code <= 0x4DB5) ||
> + (code >= 0x4E00 && code <= 0x9FD5) ||
> + (code >= 0x20000 && code <= 0x2A6D6) ||
> + (code >= 0x2A700 && code <= 0x2B734) ||
> + (code >= 0x2B740 && code <= 0x2B81D) ||
> + (code >= 0x2B820 && code <= 0x2CEA1))
> + return code;
Use only <= here, and put || at the start of lines. What's the
likelihood that the numbers in the above test will change?
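If the numbers do change between Unicode versions, one possible way to keep them easy to update is a small table. This is a table-driven alternative, not what the comment above asks for; struct code_range, cjk_ranges and in_cjk_range are hypothetical names, and the values are copied from the quoted patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* An inclusive range of code points.  */
struct code_range
{
  int first;
  int last;
};

/* Ranges copied from the patch under review; they reflect one
   version of UnicodeData.txt and may need updating for later ones.  */
static struct code_range const cjk_ranges[] =
  {
    { 0x3400, 0x4DB5 },
    { 0x4E00, 0x9FD5 },
    { 0x20000, 0x2A6D6 },
    { 0x2A700, 0x2B734 },
    { 0x2B740, 0x2B81D },
    { 0x2B820, 0x2CEA1 },
  };

/* Return true if CODE lies in one of the CJK ranges.  */
static bool
in_cjk_range (int code)
{
  for (size_t i = 0; i < sizeof cjk_ranges / sizeof cjk_ranges[0]; i++)
    if (cjk_ranges[i].first <= code && code <= cjk_ranges[i].last)
      return true;
  return false;
}
```

The comparison still uses only <=, and a new Unicode version would touch the table alone.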
>
> + if (! CONSP (names))
> + invalid_syntax ("Unicode character name database not loaded");
This test is not needed, as ucs-names always returns a cons, and anyway
even if it didn't then Fassoc would do the right thing.
> + /* 200 characters is hopefully long enough. Increase if
> + not. */
> + char name[200];
Give a name to this constant, e.g.,

   /* Bound on the length of a Unicode character name.
      As of Unicode 9.0.0 the maximum is 83, so this should be safe.  */
   enum { UNICODE_CHARACTER_NAME_LENGTH_BOUND = 199 };
   ...
   char name[UNICODE_CHARACTER_NAME_LENGTH_BOUND + 1];