From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yuri Khan
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] ffap.el: Exclude angle brackets from file names in XML
Date: Sun, 10 Mar 2024 20:15:14 +0700
References: <874jdfkrno.fsf@strawberrytea.xyz> <86ttlewsb8.fsf@gnu.org> <01249b55-1cdf-40c6-92b8-16b1521baf33@app.fastmail.com> <86frwywpw7.fsf@gnu.org> <86cys2wmmr.fsf@gnu.org> <865xxuwej7.fsf@gnu.org> <861q8iw947.fsf@gnu.org>
To: Eli Zaretskii
Cc: look@strawberrytea.xyz, emacs-devel@gnu.org
In-Reply-To: <861q8iw947.fsf@gnu.org>
Xref: news.gmane.io gmane.emacs.devel:316971

On Sun, 10 Mar 2024 at 19:43, Eli Zaretskii wrote:

> > I think I explained that: because there is nothing in the XML Schema
> > meta-schema that talks specifically of files and file names.
>
> Then how do programs that process such XML know where are file names?

By inside knowledge that is more nuanced than can be expressed in a
schema. Or by virtue of the fact that a relative file name is also
valid as a relative URI reference, and resolving it against a base URI
that names a file yields a URI also naming a file.

> > It might be possible to consider element content or attributes typed
> > as ‘anyURI’ to contain file names. It will incur false positives (when
> > an anyURI attribute actually contains a non-file URI) and false
> > negatives (when the schema of a document is not known to Emacs, or
> > when it does not specifically designate an element or attribute as
> > ‘anyURI’ even though it contains file names in practice).
>
> Is that what programs which process XML do?

I cannot, off the top of my head, name any program which processes XML
in a generic way (i.e.
not specialized to a particular subtype, guided only by schemas) *and
also* handles the files referenced.

> So how does rnc-validate-mode somehow always knows to indicate whether
> a given XML file is valid or not?

It doesn’t always. It attempts to heuristically detect the file’s
schema by file name extension, or by the namespace or local name of
its root element. If that fails, it applies what it calls a “vacuous
schema”, one by which any well-formed XML is valid, and by which none
of the elements or attributes are known to be anyURI.

> If that's the best (not necessarily the easiest) we can do, I'm okay
> with doing that. But in general, IMO ffap.el is full of problematic
> heuristics that is known to fail from time to time, so whenever we can
> avoid the guesswork, we should, even if it is not the easiest
> solution.

No argument from me here, but as long as we have heuristics, might as
well apply improvements to them.
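
To make the point about relative references concrete, here is a
minimal Python sketch, not anything ffap.el or nXML actually does. It
resolves a reference against a base URI with the standard library's
urllib.parse.urljoin and checks whether the result uses the file
scheme. The helper names, the base path and the example.org URL are
all made up for illustration.

```python
from urllib.parse import urljoin, urlsplit


def resolve_reference(base_uri: str, reference: str) -> str:
    """Resolve a (possibly relative) URI reference against a base URI."""
    return urljoin(base_uri, reference)


def names_a_file(uri: str) -> bool:
    """Return True when the URI uses the 'file' scheme."""
    return urlsplit(uri).scheme == "file"


# Hypothetical location of the XML document being edited.
base = "file:///home/user/project/doc.xml"

# A relative file name is also a valid relative URI reference;
# resolved against a file base, it still names a file.
relative = resolve_reference(base, "images/logo.png")

# An absolute non-file URI in the same anyURI slot would be a false
# positive if every anyURI value were treated as a file name.
remote = resolve_reference(base, "https://example.org/schema.xsd")

print(relative, names_a_file(relative))
print(remote, names_a_file(remote))
```

The second call is the false-positive case mentioned above: the value
is a perfectly good anyURI but does not name a local file.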