From: Alexis
To: emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: "args-out-of-range" error when using data from external process on Windows
Date: Thu, 18 Apr 2024 15:39:10 +1000
Message-ID: <87bk671b7l.fsf@gmail.com>
User-Agent: mu4e 1.12.4; emacs 29.3

[Not currently subscribed to the list, so please cc me on replies.]

Hi all,

A user of my `Ebuku` package has reported an "args-out-of-range" error that i'm out of my depth trying to diagnose. Here's the GitHub issue:

https://github.com/flexibeast/ebuku/issues/32

i can't reproduce the issue on my own system:

* Gentoo + Emacs 29.3.
* LANG=en_AU.UTF-8
* The only set LC_* variables are: LC_MESSAGES=C LC_TIME=en_AU.UTF-8
* current-language-environment = "English"
* locale-coding-system = utf-8-unix

Their system:

* Windows 11, using Emacs 29.2 obtained via Scoop package manager; not using WSL
* LANG=zh_CN.UTF-8, LC_ALL=zh_CN.UTF-8
* current-language-environment: UTF-8
* locale-coding-system = cp936
* default-process-coding-system = '(utf-8-dos . utf-8-unix)

`Ebuku` uses `call-process` to call the Python-based `buku` bookmark database manager and present the resulting output in Emacs. buku stores data in an SQLite database.

https://github.com/jarun/buku/

The link:

https://google.github.io/comprehensive-rust/

in the buku database results in:

```
Debugger entered--Lisp error: (args-out-of-range "1884. Welcome to Comprehensive Rust 🦀 - Comprehens..." 15862 15893)
  match-string(1 "1884. Welcome to Comprehensive Rust 🦀 - Comprehensive Rust 🦀")
  ebuku--search-helper("--print" "[all]" "-1000" "")
  ebuku-show-all()
  ebuku()
  funcall-interactively(ebuku)
  command-execute(ebuku record)
  execute-extended-command(nil "ebuku" "ebuku")
  funcall-interactively(execute-extended-command nil "ebuku" "ebuku")
  command-execute(execute-extended-command)
```

Once the Unicode CRAB emoji is removed, there's no issue.

The link:

https://coredumped.dev/2021/05/26/taking-org-roam-everywhere-with-logseq/

in the buku database results in:

```
Debugger entered--Lisp error: (args-out-of-range "2027. Taking org-roam everywhere with logseq • Core Dumped" 32318 32355)
  match-string(1 "2027. Taking org-roam everywhere with logseq • Cor...")
  (setq tags (match-string 1 line))
  (progn (string-match "^\\s-*[#] \\(.*\\)$" line) (setq tags (match-string 1 line)))
  [snip rest of traceback]
```

The user has confirmed that the buku database is UTF-8.

Does anyone have any suggestions about what might be happening? i presume my code is making some incorrect assumptions, or not doing some encoding stuff that it should be. i really want to get encoding and language support right, so even outside of this specific issue, general comments about things i need to fix in this regard would be most welcome. :-)


Alexis.
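
For reference, the `progn` form in the second traceback calls `match-string` without checking whether `string-match` succeeded, and the reported positions (32318 and 32355) are far beyond the length of the string, which would be consistent with match data left over from some earlier search. Whether that is the actual cause here is not established. The following is only a minimal sketch of a guarded version; the function name `my-extract-tags` is hypothetical and not part of Ebuku, and only the regexp is taken from the traceback:

```elisp
;;; -*- lexical-binding: t; -*-

;; Hypothetical helper for illustration only; not part of Ebuku.
(defun my-extract-tags (line)
  "Return the text after \"# \" in LINE, or nil if LINE is not a tags line."
  ;; `string-match' returns nil when it fails, and the match data
  ;; should not be relied on after a failed search.  Calling
  ;; `match-string' anyway may apply positions from an earlier,
  ;; unrelated match to LINE, which can signal `args-out-of-range'.
  (when (string-match "^\\s-*[#] \\(.*\\)$" line)
    (match-string 1 line)))

(my-extract-tags "   # rust,learning")
;; => "rust,learning"
(my-extract-tags "1884. Welcome to Comprehensive Rust 🦀")
;; => nil
```

With a guard like this, a line that does not match the pattern yields nil instead of an error, so the caller can decide what to do with an entry whose output is not in the expected shape.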