From: Yuri Khan
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] ffap.el: Exclude angle brackets from file names in XML
Date: Sun, 10 Mar 2024 19:17:27 +0700
To: Eli Zaretskii
Cc: look@strawberrytea.xyz, emacs-devel@gnu.org

On Sun, 10 Mar 2024 at 17:46, Eli Zaretskii wrote:

> > > > there is no way for either a
> > > > doctype definition or an XML Schema to say “elements named X
> > > > definitely contain file names”. The closest is XML Schema’s anyURI
> > > > datatype, but it might refer not necessarily to a local file but
> > > > possibly to any other URI-addressed resource.

> I was surprised that you didn't (and AFAIU still
> don't) suggest that we interpret the tags according to the schema.

I think I explained that: because there is nothing in the XML Schema
meta-schema that talks specifically of files and file names.

It might be possible to consider element content or attributes typed
as ‘anyURI’ to contain file names. It will incur false positives
(when an anyURI attribute actually contains a non-file URI) and false
negatives (when the schema of a document is not known to Emacs, or
when it does not specifically designate an element or attribute as
‘anyURI’ even though it contains file names in practice).
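
To make the false positive and false negative concrete, here is a minimal Python sketch (not Emacs code). The schema, the document, and the helper names `any_uri_names` and `classify` are all invented for illustration, and the sketch assumes that any `type` attribute whose local part is `anyURI` refers to the XML Schema datatype.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: two anyURI elements and one plain string element.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="project">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="source" type="xs:anyURI"/>
        <xs:element name="homepage" type="xs:anyURI"/>
        <xs:element name="output" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Hypothetical document instance conforming to the schema above.
DOCUMENT = """\
<project>
  <source>src/main.c</source>
  <homepage>https://example.org/</homepage>
  <output>build/main.o</output>
</project>
"""


def any_uri_names(schema_text):
    """Collect names of elements and attributes declared with type anyURI."""
    root = ET.fromstring(schema_text)
    names = set()
    for kind in ("element", "attribute"):
        for node in root.iter(f"{{{XS}}}{kind}"):
            # Assumption: compare only the local part of the type name.
            local_type = node.get("type", "").split(":")[-1]
            if local_type == "anyURI" and node.get("name"):
                names.add(node.get("name"))
    return names


def classify(document_text, uri_names):
    """Label the content of every anyURI-typed element in the document."""
    result = {}
    for node in ET.fromstring(document_text).iter():
        if node.tag in uri_names:
            scheme = urlparse((node.text or "").strip()).scheme
            result[node.tag] = "file-like" if scheme in ("", "file") else "non-file URI"
    return result


names = any_uri_names(SCHEMA)
labels = classify(DOCUMENT, names)
```

Here `homepage` is typed anyURI but holds a non-file URI (the false positive), while `output` holds a file name but is typed as a plain string, so a schema-driven check never considers it (the false negative).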

> AFAIR, we do have an XML mode which can parse the schema, so it seems
> reasonable to expect that we could recognize file names using the
> information in the schema without any guesswork. If this is
> impossible, please explain why not, but please be specific: refer to
> what Emacs can know about an XML document using the existing XML
> mode(s), and how that affects our ability to know which tags specify
> file names.

I assume you’re referring to the ‘rng-validate-mode’ minor mode and the
few schemas it knows about out of the box?

The OP is working with an unspecified, possibly proprietary, XML-based
format. Likely, its schema is not included in Emacs, and possibly not
even readily available to the OP. If available, it might not be in the
RELAX NG Compact format, requiring conversion. If not available, the
OP would have to derive a schema from observed document instances.

Meanwhile, the generic approach “the longest sequence of characters
allowed in file names, subject to restrictions of your buffer’s
syntax, starting from the point where you type M-RET and extending
both ways, is potentially a file name” is easy to implement and usable
without any preliminary setup, although probably biased towards false
positives (attempting to produce a file name when the element or
attribute is not intended as such).
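
As a rough illustration of that generic approach, here is a minimal Python sketch rather than the actual ffap.el code. The character set `FILE_NAME_CHARS` and the function `file_name_at_point` are hypothetical; the only point carried over from the patch under discussion is that angle brackets are not in the allowed set.

```python
import string

# Assumed set of characters allowed in file names; angle brackets are
# deliberately left out, so XML tag delimiters end the scan.
FILE_NAME_CHARS = set(string.ascii_letters + string.digits + "/._-~+")


def file_name_at_point(text, point, allowed=FILE_NAME_CHARS):
    """Return the longest run of allowed characters around index `point`.

    Returns None when `point` is out of range or not on an allowed character.
    """
    if not (0 <= point < len(text)) or text[point] not in allowed:
        return None
    start = point
    while start > 0 and text[start - 1] in allowed:
        start -= 1
    end = point + 1
    while end < len(text) and text[end] in allowed:
        end += 1
    return text[start:end]


line = "<file>src/ffap.el</file>"
on_content = file_name_at_point(line, line.index("ffap"))
on_tag_name = file_name_at_point(line, 1)
on_bracket = file_name_at_point(line, 0)
```

With point inside the element content the scan yields `src/ffap.el`. With point on the tag name it yields `file`, which is the kind of false positive described above: a candidate is produced even though the text is not meant as a file name.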