From: Eli Zaretskii
To: Yuri Khan
Cc: look@strawberrytea.xyz, emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] ffap.el: Exclude angle brackets from file names in XML
Date: Sun, 10 Mar 2024 14:43:04 +0200
Message-ID: <861q8iw947.fsf@gnu.org>
In-Reply-To: (message from Yuri Khan on Sun, 10 Mar 2024 19:17:27 +0700)

> From: Yuri Khan
> Date: Sun, 10 Mar 2024 19:17:27 +0700
> Cc: look@strawberrytea.xyz, emacs-devel@gnu.org
>
> On Sun, 10 Mar 2024 at 17:46, Eli Zaretskii wrote:
>
> > I was surprised that you didn't (and AFAIU still
> > don't) suggest that we interpret the tags according to the schema.
>
> I think I explained that: because there is nothing in the XML Schema
> meta-schema that talks specifically of files and file names.

Then how do programs that process such XML know where the file names
are?

> It might be possible to consider element content or attributes typed
> as ‘anyURI’ to contain file names. It will incur false positives (when
> an anyURI attribute actually contains a non-file URI) and false
> negatives (when the schema of a document is not known to Emacs, or
> when it does not specifically designate an element or attribute as
> ‘anyURI’ even though it contains file names in practice).

Is that what programs which process XML do?

> > AFAIR, we do have an XML mode which can parse the schema, so it seems
> > reasonable to expect that we could recognize file names using the
> > information in the schema without any guesswork. If this is
> > impossible, please explain why not, but please be specific: refer to
> > what Emacs can know about an XML document using the existing XML
> > mode(s), and how that affects our ability to know which tags specify
> > file names.
>
> I assume you’re referring to ‘rnc-validate-mode’ minor mode and the
> few schemas it knows about out of the box?
>
> The OP is working with an unspecified, possibly proprietary, XML-based
> format. Likely, its schema is not included in Emacs, and possibly not
> even readily available to the OP. If available, it might not be in the
> RELAX NG Compact format, requiring conversion. If not available, the
> OP would have to derive a schema from observed document instances.

So how does rnc-validate-mode somehow always know how to indicate
whether a given XML file is valid or not?

> Meanwhile, the generic approach “the longest sequence of characters
> allowed in file names, subject to restrictions of your buffer’s
> syntax, starting from the point where you type M-RET and extending
> both ways, is potentially a file name”, is easy to implement and
> usable without any preliminary setup, although probably biased towards
> false positives (attempting to produce a file name when the element or
> attribute is not intended as such).

If that's the best (not necessarily the easiest) we can do, I'm okay
with doing that. But in general, IMO ffap.el is full of problematic
heuristics that are known to fail from time to time, so whenever we can
avoid the guesswork, we should, even if it is not the easiest
solution.

Thanks.
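
For illustration only, the generic approach quoted above ("the longest
sequence of characters allowed in file names ... extending both ways")
could be sketched roughly as follows. This is Python rather than Emacs
Lisp, and it is not the code in ffap.el or in the proposed patch. The
function name filename_candidate_at and the FILENAME_CHARS set are
assumptions made up for this sketch; the only point is that angle
brackets are left out of the allowed set, so the scan stops at tag
boundaries.

```python
import string

# Assumption for this sketch: the characters treated as "allowed in file
# names".  Angle brackets and quotes are deliberately left out, so the
# scan stops at XML tag and attribute boundaries.
FILENAME_CHARS = frozenset(string.ascii_letters + string.digits + "-_.~/+@:")


def filename_candidate_at(text, pos, allowed=FILENAME_CHARS):
    """Return the longest run of allowed characters around index POS.

    The run is extended backward and forward from POS.  Returns None
    when POS is not on or adjacent to an allowed character.
    """
    start = pos
    while start > 0 and text[start - 1] in allowed:
        start -= 1  # extend backward
    end = pos
    while end < len(text) and text[end] in allowed:
        end += 1  # extend forward
    return text[start:end] or None


line = "<include>src/main.c</include>"
print(filename_candidate_at(line, line.index("main")))
```

The same function also returns "hello" for a position inside
"<title>hello</title>", which is the kind of false positive discussed
above: nothing in the scan knows whether the element is meant to hold a
file name.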