From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Eli Zaretskii Newsgroups: gmane.emacs.devel Subject: Re: Dynamic loading progress Date: Thu, 19 Nov 2015 17:26:59 +0200 Message-ID: <83y4du80xo.fsf@gnu.org> References: <83k2ptq5t3.fsf@gnu.org> <87h9kxx60e.fsf@lifelogs.com> <877flswse5.fsf@lifelogs.com> <8737wgw7kf.fsf@lifelogs.com> <87io5bv1it.fsf@lifelogs.com> <87egfzuwca.fsf@lifelogs.com> <876118u6f2.fsf@lifelogs.com> <8737w3qero.fsf@lifelogs.com> <831tbn9g9j.fsf@gnu.org> <878u5upw7o.fsf@lifelogs.com> <83ziya8xph.fsf@gnu.org> Reply-To: Eli Zaretskii NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Trace: ger.gmane.org 1447946929 5636 80.91.229.3 (19 Nov 2015 15:28:49 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Thu, 19 Nov 2015 15:28:49 +0000 (UTC) Cc: emacs-devel@gnu.org To: Ted Zlatanov , =?iso-8859-1?Q?Aur=E9lien?= Aptel Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Thu Nov 19 16:28:40 2015 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1ZzR8Y-0006BS-N3 for ged-emacs-devel@m.gmane.org; Thu, 19 Nov 2015 16:28:38 +0100 Original-Received: from localhost ([::1]:42465 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZzR8Y-0004VI-1u for ged-emacs-devel@m.gmane.org; Thu, 19 Nov 2015 10:28:38 -0500 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:34725) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZzR7K-0003FD-Gq for emacs-devel@gnu.org; Thu, 19 Nov 2015 10:27:24 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZzR7F-0002SA-DB for emacs-devel@gnu.org; Thu, 19 Nov 2015 10:27:22 -0500 Original-Received: from mtaout24.012.net.il ([80.179.55.180]:45127) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZzR7E-0002Rl-RH for emacs-devel@gnu.org; Thu, 19 Nov 2015 10:27:17 -0500 Original-Received: from conversion-daemon.mtaout24.012.net.il by mtaout24.012.net.il (HyperSendmail v2007.08) id <0NY200J00IF66600@mtaout24.012.net.il> for emacs-devel@gnu.org; Thu, 19 Nov 2015 17:20:10 +0200 (IST) Original-Received: from HOME-C4E4A596F7 ([84.94.185.246]) by mtaout24.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0NY200GPMILL3J30@mtaout24.012.net.il>; Thu, 19 Nov 2015 17:20:10 +0200 (IST) In-reply-to: <83ziya8xph.fsf@gnu.org> X-012-Sender: halo1@inter.net.il X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 80.179.55.180 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.devel:194789 Archived-At: Thanks for making this possible. Please allow me a few questions/comments about the dynamic modules code, based on some initial reading: Why module.c, but confusingly emacs_module.h? Are there any reasons not to use the same base name, like we do for other C sources? This comment in module.c: /* If checking is enabled, abort if the current thread is not the Emacs main thread. */ static void check_main_thread (void); confused me, because a later comment says: void module_init (void) { /* It is not guaranteed that dynamic initializers run in the main thread, therefore we detect the main thread here. */ If dynamic initializers might run in a thread different from the Emacs main thread, then the code in module_init will record that other thread, not the Emacs main thread, right? Also, another comment: /* On Windows, we store both a handle to the main thread and the thread ID because the latter can be reused when a thread terminates. */ seems to imply that 'main_thread' here is not the Emacs's main thread, because that thread never terminates as long as the Emacs session is alive. So what's the deal here? what does this thread checking supposed to detect? In this code from module.c: Lisp_Object value = HASH_VALUE (h, i); eassert (NATNUMP (value)); const EMACS_UINT refcount = XFASTINT (value); if (refcount >= MOST_POSITIVE_FIXNUM) { module_non_local_exit_signal_1 (env, Qoverflow_error, Qnil); return NULL; } how can the 'if' clause ever be true? refcount is an Emacs integer, as you have just verified, no? And if this somehow can happen, then why isn't there a similar check in the other functions? Re this fragment from module.c: Lisp_Object ret = list4 (Qlambda, list2 (Qand_rest, Qargs), documentation ? build_string (documentation) : Qnil, list3 (module_call_func, envobj, Qargs)); Thou shalt not use build_string, except when you _know_ the argument will always be a pure-ASCII string. Practically, this means the argument must be a constant ASCII string. See these messages (and the preceding discussion, if you are interested) for the gory details: http://lists.gnu.org/archive/html/bug-gnu-emacs/2013-10/msg00955.html http://lists.gnu.org/archive/html/bug-gnu-emacs/2013-10/msg00976.html http://lists.gnu.org/archive/html/bug-gnu-emacs/2013-10/msg00979.html The above should call make_multibyte_string instead. Again in module.c: /* * Emacs internal encoding is more-or-less UTF8, let's assume utf8 * encoded emacs string are the same byte size. */ if (!buffer || length == 0 || *length-1 < raw_size) { *length = raw_size + 1; return false; } I don't understand why you need this assumption. You are going to encode the string in a moment, so why not test 'length' against the size you actually obtain there? (The above test will misfire when the original string includes characters above the Unicode codespace, which require 5 bytes internally, but their encoding maps them to Unicode codepoints which cannot take more than 4 bytes. So you might reject perfectly valid calls.) In module_make_string you have: /* Assume STR is utf8 encoded */ return lisp_to_value (env, make_string (str, length)); The discussion I pointed to above concluded that make_string is a bug. So please use make_multibyte_string here instead. It looks like XUSER_PTR is used both as an lvalue and an rvalue. This is different from any other object, where we have separate Xfoo and XSETfoo macros. Suggest to follow suit. static void module_set_user_finalizer (emacs_env *env, emacs_value uptr, emacs_finalizer_function fin) { check_main_thread (); eassert (module_non_local_exit_check (env) == emacs_funcall_exit_return); const Lisp_Object lisp = value_to_lisp (uptr); if (! USER_PTRP (lisp)) module_wrong_type (env, Quser_ptr, lisp); XUSER_PTR (lisp)->finalizer = fin; <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< } No validity checks on 'fin'? In module_vec_get: /* Prevent error-prone comparison between types of different signedness. */ const size_t size = ASIZE (lvec); eassert (size >= 0); How can the assertion be ever violated? In module-load: CHECK_STRING (file); handle = dynlib_open (SDATA (file)); Every Lisp primitive that accepts file arguments _must_ call expand-file-name on the file name, before using it. Otherwise, relative file names will produce subtle and hard-to-debug problems when the Lisp program calling them involves changing the current directory of the Emacs process. The other mandatory thing absent from the above is ENCODE_FILE. You cannot pass unencoded file names to C runtime functions. struct { struct emacs_runtime pub; struct emacs_runtime_private priv; } runtime = { .pub = { .size = sizeof runtime.pub, .get_environment = module_get_environment, .private_members = &runtime.priv } }; Is this portable enough? int r = module_init (&runtime.pub); I think calling this function "module_init" is sub-optimal: you have a void function by the same name in this source file. How about renaming it to something else? Also, it seems that once a module is loaded, it cannot be unloaded (i.e., no unload-feature equivalent is provided). Is that on purpose? An Emacs process can live for a very long time, so keeping all of the modules open in it at all times is not necessarily TRT. E.g., how do I update to a newer version of a module? Shouldn't module-init call dynlib_close before it returns? Otherwise we are leaking descriptors here, no? In module-call: const EMACS_INT len = XINT (Flength (arglist)); eassert (len >= 0); if (len > MOST_POSITIVE_FIXNUM) xsignal0 (Qoverflow_error); How can the 'if' clause ever be true? XINT by definition cannot produce anything but a valid EMACS_INT, can it? if (len > INT_MAX || len < envptr->min_arity || (envptr->max_arity >= 0 && len > envptr->max_arity)) xsignal2 (Qwrong_number_of_arguments, module_format_fun_env (envptr), make_number (len)); Why the test against INT_MAX? EMACS_INT can legitimately be a 64-bit data type with values far exceeding INT_MAX. Why impose this limitation here? allocate_emacs_value calls malloc; shouldn't it call xmalloc instead, or at least conform to the XMALLOC_BLOCK_INPUT_CHECK protocol? In module_format_fun_env, you produce a unibyte string, and then use that in calls to functions like xsignal1, which expect Lisp strings in their internal multibyte representation. You should instead decode the unibyte string (using UTF-8) before you return it. Btw, I wonder whether we should provide a more friendly capabilities for retrieving the function name and its module. dladdr is not portable enough, and we have all the information at our fingertips in the module's init function, we just need to keep it instead of throwing it away. I envision debugging module-related problems will not be a very rare situation, so we need any help we can get. WDYT? In syms_of_module: /* Unintern `module-environments' because it is only used internally. */ Funintern (Qmodule_environments, Qnil); What if some Lisp defines a (interned) symbol by that name? Won't they clash? The Windows-specific parts of dynlib.c need work, e.g. you cannot pass UTF-8 encoded file names to Windows APIs. And there are some other issues. I'll take care of that. About the tests: The Makefile in mod-test is Unix-specific: it uses a literal .so extension. I also think the Python script should be rewritten in Emacs Lisp, so that Python installation is not required. Finally, all of the module tests and associated files should be moved into test/, preferably even test/automated/ and made part of the "make check" run.