From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: "Brennan Vincent"
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] Add a mechanism for passing unibyte strings from lisp to modules.
Date: Wed, 26 Jun 2024 23:36:16 -0400
Message-ID: <87wmmbgiq7.fsf@taipei.mail-host-address-is-not-set>
References: <225D336D-933E-4CA3-B245-89992D7E6C41@umanwizard.com>
 <86frt5jwtc.fsf@gnu.org>
 <87o77t6lyn.fsf@taipei.mail-host-address-is-not-set>
 <867cehgdn1.fsf@gnu.org>
 <861q4md0o1.fsf@gnu.org>
 <87jzickjq8.fsf@taipei.mail-host-address-is-not-set>
 <86wmmb99f7.fsf@gnu.org>
 <86ikxv96sd.fsf@gnu.org>
 <867ceb90qj.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
 logging-data="32725"; mail-complaints-to="usenet@ciao.gmane.io"
Cc: acorallo@gnu.org, stefankangas@gmail.com, emacs-devel@gnu.org
To: Eli Zaretskii , tomas@tuxteam.de
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Thu Jun 27 05:37:28 2024
Return-path:
Envelope-to: ged-emacs-devel@m.gmane-mx.org
Original-Received: from lists.gnu.org ([209.51.188.17])
 by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.92) (envelope-from ) id 1sMfwu-0008KE-HW
 for ged-emacs-devel@m.gmane-mx.org; Thu, 27 Jun 2024 05:37:28 +0200
Original-Received: from localhost ([::1] helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from )
 id 1sMfw6-0000Cp-Ew; Wed, 26 Jun 2024 23:36:38 -0400
Original-Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from ) id 1sMfw4-0000Ca-Gt
 for emacs-devel@gnu.org; Wed, 26 Jun 2024 23:36:36 -0400
Original-Received: from smtp.umanwizard.com ([54.203.248.109])
 by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from )
 id 1sMfw2-0001oQ-T2 for emacs-devel@gnu.org; Wed, 26 Jun 2024 23:36:36 -0400
Original-Received: from localhost
 ([108.6.22.48]) by smtp.umanwizard.com ; 27 Jun 2024 03:36:18 +0000
X-Fes-Received-For: emacs-devel@gnu.org
X-Fes-Received-From:
In-Reply-To: <867ceb90qj.fsf@gnu.org>
X-Fes-Encrypted: true
X-Fes-Ehlo-Domain: localhost
Received-SPF: pass client-ip=54.203.248.109;
 envelope-from=brennan@umanwizard.com; helo=smtp.umanwizard.com
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_PASS=-0.001,
 SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no
X-Spam_action: no action
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Emacs development discussions."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org
Xref: news.gmane.io gmane.emacs.devel:320739
Archived-At:

Eli Zaretskii writes:

>> Date: Wed, 26 Jun 2024 15:33:09 +0200
>> From: tomas@tuxteam.de
>> Cc: brennan@umanwizard.com, acorallo@gnu.org, stefankangas@gmail.com,
>>  emacs-devel@gnu.org
>>
>> > > > How will it be different from the Lisp vectors we already have?
>> > >
>> > > The box around every byte.
>> >
>> > What box?  Please tell more, as I don't think I follow.
>>
>> Maybe I'm all wrong, but AFAIU, a vector can contain arbitrary Lisp
>> values. That makes 64bits/8bits plus boxing/unboxing (which is, I
>> assume, quick, but nonzero).
>>
>> Having a specialized "array of bytes" (as there is one for bools)
>> might be beneficial for big arrays, and perhaps avoid big data moving
>> operations over the C/LISP fence.
>
> If you are saying that using 64-bit values there incurs a run-time
> performance penalty, then accessing bytes does that as well.  Someone
> should profile this and present evidence wrt the relative performance
> of these, then we can discuss whether the penalty is real and whether
> it is worth adding yet another data type to Emacs.

Sure, I wrote a quick benchmark that passes a 10MB buffer to a module,
which just sums the bytes and returns an integer.  It is about 200x
faster using a unibyte string (with my original patch) than a vector.

C code:

// Compile with gcc -O3 -fPIC -shared -o test-module.so test.c
#include <emacs-module.h>
#include <stdlib.h>

int plugin_is_GPL_compatible;

/* Sum the elements of a Lisp vector, one module API call pair
   (vec_get + extract_integer) per element.  */
static emacs_value
Fcall_test(emacs_env *env, ptrdiff_t nargs, emacs_value args[], void *data) EMACS_NOEXCEPT
{
  unsigned char sum = 0;
  emacs_value vec = args[0];
  ptrdiff_t sz = env->vec_size(env, vec);
  for (ptrdiff_t i = 0; i < sz; ++i)
    sum += env->extract_integer(env, env->vec_get(env, vec, i));
  return env->make_integer(env, sum);
}

/* Sum the bytes of a unibyte string after copying them out in one
   call.  copy_unibyte_string_contents comes from my original patch.  */
static emacs_value
Fcall_test2(emacs_env *env, ptrdiff_t nargs, emacs_value args[], void *data) EMACS_NOEXCEPT
{
  unsigned char sum = 0;
  emacs_value arr = args[0];
  char *buf;
  ptrdiff_t sz = 0;
  /* First call with a NULL buffer only queries the required size.  */
  env->copy_unibyte_string_contents(env, arr, NULL, &sz);
  buf = malloc(sz);
  env->copy_unibyte_string_contents(env, arr, buf, &sz);
  for (ptrdiff_t i = 0; i < sz - 1; ++i)
    sum += buf[i];
  free(buf);
  return env->make_integer(env, sum);
}

/* bind c_func (native) to e_func (elisp) */
static void
bind(emacs_env *env,
     emacs_value (*c_func) (emacs_env *env, ptrdiff_t nargs,
                            emacs_value args[], void *) EMACS_NOEXCEPT,
     const char *e_func, ptrdiff_t min_arity, ptrdiff_t max_arity,
     const char *doc, void *data)
{
  emacs_value fset_args[2];
  fset_args[0] = env->intern(env, e_func);
  fset_args[1] = env->make_function(env, min_arity, max_arity, c_func, doc, data);
  env->funcall(env, env->intern(env, "fset"), 2, fset_args);
}

int
emacs_module_init(struct emacs_runtime *ert)
{
  emacs_env *env = ert->get_environment(ert);
  bind(env, Fcall_test, "btv--test", 1, 1, "test using vector", NULL);
  bind(env, Fcall_test2, "btv--test2", 1, 1, "test using byte array", NULL);
  emacs_value provide_arg = env->intern(env, "test-module");
  env->funcall(env, env->intern(env, "provide"), 1, &provide_arg);
  return 0;
}

Elisp code:

(require 'test-module)
(require 'benchmark)
(setq v (make-vector 10000001 37))
(setq v2 (make-string 10000001 37))
`(,(benchmark-elapse (btv--test v))
  ,(benchmark-elapse (btv--test2 v2)))

Result of evaluating elisp code:

(0.17861138 0.000805208)
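
For anyone who wants to see the layout difference without building a module, here is a rough standalone sketch. It is only a model under stated assumptions: `boxed_value`, `box_fixnum`, `unbox_fixnum` and the 2-bit tag are hypothetical stand-ins, not the real Emacs object representation, and it leaves out the per-element `vec_get`/`extract_integer` calls that the module API adds on top. It only shows that one layout reads 8 bytes and unboxes per element while the other reads 1 byte, and that both give the same sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical boxed value: a 64-bit word whose low bits are a type tag.
   This is an illustration, not the actual Emacs Lisp_Object layout.  */
typedef uint64_t boxed_value;

enum { TAG_BITS = 2, TAG_FIXNUM = 2 };

/* Box a small non-negative integer.  */
static boxed_value box_fixnum(uint64_t n)
{
  return (n << TAG_BITS) | TAG_FIXNUM;
}

/* Recover the integer by dropping the tag bits.  */
static uint64_t unbox_fixnum(boxed_value v)
{
  return v >> TAG_BITS;
}

/* "Vector" layout: 8 bytes read and one unboxing step per element.  */
static unsigned char sum_boxed(const boxed_value *vec, size_t n)
{
  unsigned char sum = 0;
  for (size_t i = 0; i < n; ++i)
    sum += (unsigned char) unbox_fixnum(vec[i]);
  return sum;
}

/* "Unibyte string" layout: 1 byte read per element, no unboxing.  */
static unsigned char sum_bytes(const unsigned char *buf, size_t n)
{
  unsigned char sum = 0;
  for (size_t i = 0; i < n; ++i)
    sum += buf[i];
  return sum;
}

/* Fill both layouts with the same byte (n must be > 0) and report
   whether the two sums, taken mod 256, agree.  */
static int layouts_agree(size_t n, unsigned char byte)
{
  boxed_value *vec = malloc(n * sizeof *vec);
  unsigned char *buf = malloc(n);
  assert(vec != NULL && buf != NULL);
  for (size_t i = 0; i < n; ++i)
    vec[i] = box_fixnum(byte);
  memset(buf, byte, n);
  int same = sum_boxed(vec, n) == sum_bytes(buf, n);
  free(vec);
  free(buf);
  return same;
}
```

Calling `layouts_agree(10000001, 37)` mirrors the sizes used in the benchmark; timing the two sum functions separately would give a lower bound on the gap, since the real vector path also pays for two indirect API calls per element.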