From: Eli Zaretskii
To: Alexis
Cc: emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: "args-out-of-range" error when using data from external process on Windows
Date: Thu, 18 Apr 2024 09:01:38 +0300
Message-ID: <86msprfbul.fsf@gnu.org>
References: <87bk671b7l.fsf@gmail.com>
In-Reply-To: <87bk671b7l.fsf@gmail.com> (message from Alexis on Thu, 18 Apr 2024 15:39:10 +1000)

> From: Alexis
> Date: Thu, 18 Apr 2024 15:39:10 +1000
>
> [Not currently subscribed to the list, so please cc me on
> replies.]
>
> Hi all,
>
> A user of my `Ebuku` package has reported an
> "args-out-of-range" error that i'm out of my depth trying to
> diagnose. Here's the GitHub issue:
>
> https://github.com/flexibeast/ebuku/issues/32
>
> i can't reproduce the issue on my own system:
>
> * Gentoo + Emacs 29.3.
> * LANG=en_AU.UTF-8
> * The only set LC_* variables are:
>   LC_MESSAGES=C
>   LC_TIME=en_AU.UTF-8
> * current-language-environment = "English"
> * locale-coding-system = utf-8-unix
>
> Their system:
>
> * Windows 11, using Emacs 29.2 obtained via Scoop package
>   manager; not using WSL
> * LANG=zh_CN.UTF-8, LC_ALL=zh_CN.UTF-8
> * current-language-environment: UTF-8
> * locale-coding-system = cp936
> * default-process-coding-system = '(utf-8-dos . utf-8-unix)
>
> `Ebuku` uses `call-process` to call the Python-based `buku`
> bookmark database manager and present the resulting output in
> Emacs. buku stores data in an SQLite database.
>
> https://github.com/jarun/buku/
>
> The link:
>
> https://google.github.io/comprehensive-rust/
>
> in the buku database results in:
>
> ```
> Debugger entered--Lisp error: (args-out-of-range "1884. Welcome to Comprehensive Rust 🦀 - Comprehens..." 15862 15893)
>   match-string(1 "1884. Welcome to Comprehensive Rust 🦀 - Comprehensive Rust 🦀")
>   ebuku--search-helper("--print" "[all]" "-1000" "")
>   ebuku-show-all()
>   ebuku()
>   funcall-interactively(ebuku)
>   command-execute(ebuku record)
>   execute-extended-command(nil "ebuku" "ebuku")
>   funcall-interactively(execute-extended-command nil "ebuku" "ebuku")
>   command-execute(execute-extended-command)
> ```
>
> Once the Unicode CRAB emoji is removed, there's no issue.
>
> The link:
>
> https://coredumped.dev/2021/05/26/taking-org-roam-everywhere-with-logseq/
>
> in the buku database results in:
>
> ```
> Debugger entered--Lisp error: (args-out-of-range "2027. Taking org-roam everywhere with logseq • Core Dumped" 32318 32355)
>   match-string(1 "2027. Taking org-roam everywhere with logseq • Cor...")
>   (setq tags (match-string 1 line))
>   (progn (string-match "^\\s-*[#] \\(.*\\)$" line) (setq tags (match-string 1 line)))
>   [snip rest of traceback]
> ```
>
> The user has confirmed that the buku database is UTF-8.
>
> Does anyone have any suggestions about what might be happening?

Crystal ball says the package assumes UTF-8 encoding of the text from
the sub-process, which is generally not what happens on Windows. Or
maybe the package assumes that UTF-8 text from a sub-process will
necessarily be decoded as UTF-8, which again can fail if the default
coding-systems are not UTF-8 (which happens on Windows). The upshot
is that the Lisp code expects some number of characters, but gets a
different number of characters instead.

But this is all basically stabbing in the dark, since I have no idea
what that package does and what the program whose output it reads
does.
Suggest that you ask the user who reported that to show the actual output of the sub-process (e.g., by running the same command outside of Emacs and redirecting output to a file), and if the output looks correct, examine the Lisp code which processes that output, with an eye on how the text is decoded. For example, if the text from the sub-process is supposed to be UTF-8 encoded, your Lisp code should bind coding-system-for-read to 'utf-8', to make sure it is decoded correctly.
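
A minimal sketch of that last suggestion, assuming the package collects the sub-process output in a temporary buffer before parsing it. The helper name `my/call-process-utf-8` is hypothetical and is not taken from Ebuku's source; only `call-process`, `with-temp-buffer` and `coding-system-for-read` are standard Emacs facilities:

```elisp
;; Hypothetical helper for illustration only; not Ebuku's actual code.
(defun my/call-process-utf-8 (program &rest args)
  "Run PROGRAM with ARGS and return its output as a string.
The output is decoded as UTF-8 regardless of the default
process coding systems in effect on the current platform."
  (with-temp-buffer
    ;; Bind the variable only around the call, so the override
    ;; does not leak into unrelated process or file operations.
    (let ((coding-system-for-read 'utf-8))
      ;; INFILE is nil, DESTINATION is t (the current buffer),
      ;; DISPLAY is nil; ARGS are passed through to PROGRAM.
      (apply #'call-process program nil t nil args))
    (buffer-string)))
```

It would be called with the same program name and arguments the package already passes to `call-process`. Note that this only helps if the sub-process really writes UTF-8; if it emits text in the system codepage instead, the output captured outside of Emacs should show that.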