From: "Jostein Kjønigsen" <jostein@secure.kjonigsen.net>
To: "Ergus via Emacs development discussions." <emacs-devel@gnu.org>
Subject: Adding new schemas to nxml-mode. Am I doing it right?
Date: Sun, 18 Feb 2024 21:12:51 +0100
Message-ID: <A320D6EF-4B01-476D-8D5B-9CD09E030AA5@secure.kjonigsen.net>

Hey everyone!

I recently discovered that nxml-mode in Emacs supports validating XML content against XML schemas, which I had never noticed before. It turns out that none of the files I've edited using nxml-mode had a supported schema.

Looking into this I learned a few things about nxml-mode:

* nxml-mode does not validate XML against XSD schemas, which are the most common schema format today.
* Emacs relies on RELAX NG Compact (RNC) schemas [1], which are less common and have less tooling, but are simpler.
* Emacs cannot automatically fetch the schemas which are declared in XML documents.

The last point is the most impactful in terms of usability, but it might require more effort to fix than we can currently manage.

As an end-user, effectively only the RNC schemas shipped with Emacs are available. To make matters worse, this list of supported schemas seemingly has not changed since 2007. Someone correct me if I'm wrong.

I would like Emacs to support XSD and automatically fetch schemas at runtime, but I also want a better nxml-mode experience for the file formats I use daily, today.

To address this, I have created RNC schemas for the formats I depend on. I have attached an abbreviated diff of the changes. The main changes are:

* Updating schemas.xml with typeId entries and the rules for applying them, in the correct order.
* Generating the new RNC schemas which those typeId entries refer to.
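
For illustration, here is a minimal Python sketch of how ordered locating rules like these resolve a file to a schema. It is a simplified model, not the actual nxml-mode lookup code: it assumes the first matching rule wins, handles only single-segment `uri` patterns and `documentElement` rules, and the helper names (`pattern_matches`, `locate_schema`) are made up for this example.

```python
import re
import xml.etree.ElementTree as ET

# Namespace used by nxml-mode schema locating files.
NS = "{http://thaiopensource.com/ns/locating-rules/1.0}"

# A trimmed-down locating file reusing two entries from the attached patch.
RULES = """\
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <uri pattern="*.nuspec" typeId="Nuget Spec"/>
  <typeId id="Nuget Spec" uri="nuspec.rnc"/>
  <uri pattern="*.*proj" typeId="MSBuild"/>
  <documentElement localName="Project" typeId="MSBuild"/>
  <typeId id="MSBuild" uri="msbuild.rnc"/>
</locatingRules>
"""


def pattern_matches(pattern, file_name):
    """Match a single-segment pattern where '*' is any run of non-'/' characters."""
    regex = "[^/]*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, file_name) is not None


def locate_schema(rules_xml, file_name, root_name=None):
    """Return the schema URI chosen by the first matching rule, or None."""
    root = ET.fromstring(rules_xml)
    # typeId entries map a symbolic name to a schema file.
    type_ids = {e.get("id"): e.get("uri") for e in root.findall(NS + "typeId")}
    for rule in root:  # rules are tried in document order
        if rule.tag == NS + "uri":
            matched = pattern_matches(rule.get("pattern", ""), file_name)
        elif rule.tag == NS + "documentElement":
            matched = root_name is not None and rule.get("localName") == root_name
        else:
            matched = False  # typeId entries never match a document themselves
        if matched:
            return type_ids.get(rule.get("typeId"))
    return None
```

With this model, `locate_schema(RULES, "Example.nuspec")` returns `"nuspec.rnc"`, and a file with an unrelated name but a `Project` root element falls through to the `documentElement` rule and gets `"msbuild.rnc"`.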

I tried to find tooling to convert XSD schemas to RNC, but I couldn't find anything which actually worked. After a few days of getting nowhere, I instead decided to use a tool called "jing-trang" [2] to infer schemas from existing documents in my possession (200+ software projects, 50+ GB).

While this method doesn't guarantee accurate schemas, the schemas are inferred from a large number of files, so the most common elements and attributes are present and specified. The result may not be as rigorous as an actual XSD-to-RNC translation, but it works for my purposes.
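
As a rough sketch of what such an inference run can look like, the Python snippet below wraps Trang (the schema conversion and inference half of jing-trang). It is not necessarily the exact invocation used here: the jar location and file names are placeholders, and it assumes `java` is on the PATH. Trang's `-I xml` option treats the inputs as example documents, and `-O rnc` selects RELAX NG Compact output.

```python
import subprocess


def build_trang_command(trang_jar, xml_files, output_rnc):
    """Build the argv for a Trang run that infers an RNC schema from XML files."""
    if not xml_files:
        raise ValueError("at least one example document is required")
    # -I xml: infer from example documents; -O rnc: write RELAX NG Compact.
    return ["java", "-jar", trang_jar, "-I", "xml", "-O", "rnc", *xml_files, output_rnc]


def infer_schema(trang_jar, xml_files, output_rnc):
    """Run Trang; raises CalledProcessError if the inference fails."""
    subprocess.run(build_trang_command(trang_jar, xml_files, output_rnc), check=True)
```

Calling `infer_schema("trang.jar", ["a.csproj", "b.csproj"], "msbuild.rnc")` would then write one schema covering every element and attribute seen in the sample files, which is also why anything absent from the samples is absent from the schema.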

Accurate schema support is a small yet significant feature that can make a noticeable difference when working with XML. Ideally, Emacs should have schemas for all XML-based file-types commonly used.

As such, I would like to contribute these patches to core Emacs to help improve the current situation, but I want to make sure I'm doing it correctly.

* How can I create accurate, quality schemas in RNC with the current tooling?
* What are the criteria for accepting new schemas in Emacs core? Are there any?

I would appreciate any comments or feedback on this matter.

Thanks!

[1] https://relaxng.org/
[2] https://github.com/relaxng/jing-trang


--
Kind Regards
Jostein Kjønigsen



[-- Attachment #2.2: schemas.patch --]
[-- Type: application/octet-stream, Size: 1533 bytes --]

diff --git a/etc/schema/schemas.xml b/etc/schema/schemas.xml
index f04bba849b4..dd1e23a5a8e 100644
--- a/etc/schema/schemas.xml
+++ b/etc/schema/schemas.xml
@@ -66,4 +66,30 @@
   <typeId id="LibreOffice" uri="OpenDocument-schema-v1.3+libreoffice.rnc"/>
   <typeId id="OpenDocument Manifest" uri="od-manifest-schema-v1.2-os.rnc"/>
 
+  <!-- .net development related schemas -->
+  <uri pattern="nuget.config" typeId="Nuget Config" />
+  <typeId id="Nuget Config" uri="nuget.rnc" />
+
+  <uri pattern="*.nuspec" typeId="Nuget Spec" />
+  <namespace ns="http://schemas.microsoft.com/packaging/2011/08/nuspec.xsd" typeId="Nuget Spec" />
+  <typeId id="Nuget Spec" uri="nuspec.rnc" />
+
+  <uri pattern="web.config" typeId="Dotnet App Config" />
+  <uri pattern="app.config" typeId="Dotnet App Config" />
+  <namespace ns="http://schemas.microsoft.com/.NetConfiguration/v2.0" typeId="Dotnet App Config" />
+  <typeId id="Dotnet App Config" uri="dotnet-appconfig.rnc" />
+
+  <uri pattern="Directory.Build.props" typeId="Dotnet Build Props" />
+  <typeId id="Dotnet Build Props" uri="dotnet-build-props.rnc" />
+
+  <uri pattern="Directory.Packages.props" typeId="Dotnet Packages Props" />
+  <typeId id="Dotnet Packages Props" uri="dotnet-packages-props.rnc" />
+
+  <uri pattern="*.resx" typeId="Dotnet Resx" />
+  <typeId id="Dotnet Resx" uri="dotnet-resx.rnc" />
+
+  <uri pattern="*.*proj" typeId="MSBuild" />
+  <documentElement localName="Project" typeId="MSBuild"/>
+  <typeId id="MSBuild" uri="msbuild.rnc" />
+
 </locatingRules>

Thread overview: 7+ messages
2024-02-18 20:12 Jostein Kjønigsen [this message]
2024-02-19 22:20 ` Adding new schemas to nxml-mode. Am I doing it right? Stefan Kangas
2024-02-20 20:02   ` [PATCH] Adding new schemas to nxml-mode Jostein Kjønigsen
2024-02-21  3:01     ` Stefan Kangas
2024-02-22 15:13       ` Jostein Kjønigsen
2024-02-23 14:25         ` Jostein Kjønigsen
2024-05-18 19:48           ` Stefan Kangas
