From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] ffap.el: Exclude angle brackets from file names in XML
Date: Sun, 10 Mar 2024 12:46:04 +0200
Message-ID: <865xxuwej7.fsf@gnu.org>
References: <874jdfkrno.fsf@strawberrytea.xyz> <86ttlewsb8.fsf@gnu.org> <01249b55-1cdf-40c6-92b8-16b1521baf33@app.fastmail.com> <86frwywpw7.fsf@gnu.org> <86cys2wmmr.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: look@strawberrytea.xyz, emacs-devel@gnu.org
To: Yuri Khan
In-Reply-To: (message from Yuri Khan on Sun, 10 Mar 2024 17:30:50 +0700)
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:316968

> From: Yuri Khan
> Date: Sun, 10 Mar 2024 17:30:50 +0700
> Cc: look@strawberrytea.xyz, emacs-devel@gnu.org
>
> On Sun, 10 Mar 2024 at 14:51, Eli Zaretskii wrote:
>
> > > Tags do not inherently mean anything in XML. Only a specific XML-based
> > > format definition gives them meaning, and there is no way for either a
> > > doctype definition or an XML Schema to say “elements named X
> > > definitely contain file names”. The closest is XML Schema’s anyURI
> > > datatype, but it might refer not necessarily to a local file but
> > > possibly to any other URI-addressed resource.
> >
> > Are you saying that applications using XML hard-code the meaning of
> > each tag that they need to process? IOW, how does a program using
> > such an XML know that the value is a file name?
>
> XML is a generic framework for designing formats. Each application
> designs a specific schema that defines which tags mean what.
>
> For example, XHTML is an XML subtype in which the element can
> have a src="…" attribute which contains an URI. In particular, it
> might contain a file name.
>
> For another example, SVG is an XML subtype that does not define an
> element; is invalid if used in an SVG document. On the
> other hand, SVG has an element whose href="…" attribute
> contains an URI.
>
> A program typically does not work with arbitrary XML. It works with a
> specific XML subtype, like OpenOffice with ODF (which is a ZIP archive
> containing, among others, some XML files) and Inkscape with SVG. In
> case an application is capable of understanding multiple XML-based
> formats, it must know beforehand which particular format a document is
> an instance of.
>
> For example, when Firefox receives an HTTP response with a
> ‘Content-Type: application/xhtml+xml’ header, it interprets the
> response body as XHTML page; but with a ‘Content-Type: image/svg+xml’,
> it processes the body as an SVG image.
>
> (I did not expect you’d need a lecture on this.)

I don't, not the general lecture you gave above, anyway. I know about
the XML schema. I was surprised that you didn't (and AFAIU still
don't) suggest that we interpret the tags according to the schema.
AFAIR, we do have an XML mode which can parse the schema, so it seems
reasonable to expect that we could recognize file names using the
information in the schema without any guesswork.

If this is impossible, please explain why not, but please be specific:
refer to what Emacs can know about an XML document using the existing
XML mode(s), and how that affects our ability to know which tags
specify file names.