From: Ruijie Yu via "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Newsgroups: gmane.emacs.bugs
Subject: bug#63125: 30.0.50; [BUG] last argument of libxml-parse-html-region has no effect?
Date: Sat, 29 Apr 2023 08:58:03 +0800
References: <83h6t1s16p.fsf@gnu.org> <83ttx0qm3z.fsf@gnu.org>
To: Eli Zaretskii
Cc: Lars Ingebrigtsen, 63125@debbugs.gnu.org

Eli Zaretskii writes:

>> From: Ruijie Yu
>> Cc: Eli Zaretskii, 63125@debbugs.gnu.org
>> Date: Fri, 28 Apr 2023 18:40:35 +0800
>>
>> > I have filed an issue [1] in libxml2.  We'll see what they say about it.
>> >
>> > FTR, [2] is the documentation of the libxml2's htmlReadMemory()
>> > function -- though it does not say much.
>> >
>> > [1]: https://gitlab.gnome.org/GNOME/libxml2/-/issues/525
>> > [2]:
>> > https://gnome.pages.gitlab.gnome.org/libxml2/devhelp/libxml2-HTMLparser.html#htmlReadMemory.
>>
>> I just got a response from one of libxml2's maintainers.
>>
>> It seems that the docstring for `libxml-parse-html-region' is wrong:
>> this argument has never served the purpose of resolving relative URLs.
>> It was only used for error messages.  So I suggest that we modify the
>> docstring of this function and `libxml-parse-xml-region' to reflect this
>> fact.
>
> The response doesn't say much.  What is this "base URL" argument used
> for, and why is it named "base URL"?  What does it mean "used for error
> messages"?  And where is the up-to-date and accurate documentation of
> this function, which explains what is this argument for?
>
> Without knowing all that, we cannot fix our documentation, let alone
> code.

The "base-url" is an argument to the Elisp function
`libxml-parse-html-region'.  I added Lars to the CC: according to
git-blame he originally introduced this function, so he may have a
better idea.

The following are my impressions, but I'm happy to pass any questions
you still have to the libxml2 devs if you want (or you can comment
directly in the linked issue on GNOME's GitLab instance).

-----

As you pointed out, the arguments of the Elisp function are passed, with
minimal transformation, to the libxml2 function `htmlReadMemory()'.
This C function takes an argument `URL', which receives the string
`base-url', or the empty string if `base-url' is nil.

According to Nick (the libxml2 maintainer) and my interpretation, the
`URL' parameter of the libxml2 function is simply stored in the `URL'
field of an `xmlDoc' struct, to be used when an error message needs to
be displayed.

So, the `URL' parameter does practically nothing for us, since we
disable all libxml2-level warnings and errors when calling
`htmlReadMemory()'.
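
A minimal sketch of how this could be checked from Elisp, assuming an
Emacs built with libxml2.  The helper `my/parse-html-with-base', the
sample HTML, and the example.com URL are made up for illustration; the
sketch parses the same fragment with and without a base URL and compares
the resulting DOMs.

```elisp
;; Sketch only: compare the DOM returned with and without BASE-URL.
;; Assumes libxml2 support; the HTML and URL below are illustrative.
(require 'dom)

(defun my/parse-html-with-base (html base-url)
  "Parse the string HTML, passing BASE-URL to `libxml-parse-html-region'.
This is a hypothetical helper, not part of Emacs."
  (with-temp-buffer
    (insert html)
    (libxml-parse-html-region (point-min) (point-max) base-url)))

(when (libxml-available-p)
  (let* ((html "<a href=\"page.html\">link</a>")
         ;; Same input, once without and once with a base URL.
         (dom-nil (my/parse-html-with-base html nil))
         (dom-url (my/parse-html-with-base html "https://example.com/dir/")))
    ;; Report the href seen in each DOM and whether the DOMs are equal.
    (message "hrefs: %S vs %S; identical DOM: %S"
             (dom-attr (car (dom-by-tag dom-nil 'a)) 'href)
             (dom-attr (car (dom-by-tag dom-url 'a)) 'href)
             (equal dom-nil dom-url))))
```

If the two DOMs compare equal and the href stays relative, that would be
consistent with the argument not being used to resolve relative URLs.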

I put this URL [1] in the issue, assuming that it is the documentation,
and Nick did not comment on it.  So this is probably the up-to-date,
albeit not very elaborate, documentation for the function.

[1]: https://gnome.pages.gitlab.gnome.org/libxml2/devhelp/libxml2-HTMLparser.html#htmlReadMemory

-- 
Best,

RY

[Please note that this mail might go to spam due to some
misconfiguration in my mail server -- still investigating.]