unofficial mirror of emacs-devel@gnu.org 
* Adding new schemas to nxml-mode. Am I doing it right?
@ 2024-02-18 20:12 Jostein Kjønigsen
  2024-02-19 22:20 ` Stefan Kangas
From: Jostein Kjønigsen @ 2024-02-18 20:12 UTC
  To: Ergus via Emacs development discussions.

Hey everyone!

I recently discovered that nxml-mode in Emacs supports validating XML content against XML schemas, which I had never noticed before. It turns out that none of the files I've edited using nxml-mode had a supported schema.

Looking into this I learned a few things about nxml-mode:

- XML is not validated against XSD schemas, which are the most common schema format today.
- Emacs relies on RELAX NG Compact (RNC) schemas[1], which are less common and have fewer tools, but are simpler.
- Emacs cannot automatically obtain the schemas which are declared in XML documents.
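
To illustrate the third point: XSD-based documents commonly declare their schema through an xsi:schemaLocation (or xsi:noNamespaceSchemaLocation) attribute on the root element. The Python sketch below is not something nxml-mode does; it only shows what information an automatic fetcher would have to read. The sample document, its namespace, and its URL are invented for illustration.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def declared_schemas(xml_text):
    """Return (namespace, location) pairs declared on the root element."""
    root = ET.fromstring(xml_text)
    # xsi:schemaLocation holds whitespace-separated namespace/location pairs.
    tokens = root.get(f"{{{XSI}}}schemaLocation", "").split()
    pairs = list(zip(tokens[0::2], tokens[1::2]))
    # xsi:noNamespaceSchemaLocation holds a single location without a namespace.
    no_ns = root.get(f"{{{XSI}}}noNamespaceSchemaLocation")
    if no_ns:
        pairs.append((None, no_ns))
    return pairs


# Hypothetical document; the namespace and URL are made up.
SAMPLE = """<package xmlns="http://example.org/ns"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://example.org/ns http://example.org/schema.xsd"/>"""

# Usage: declared_schemas(SAMPLE) returns the pairs found on <package>.
```

Fetching and then validating against such a location would still require XSD support, which is the first point in the list.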

The last point is the most impactful in terms of usability, but it might require more effort to fix than we can currently manage.

As an end user, I effectively have access only to the RNC schemas shipped with Emacs. To make matters worse, this list of supported schemas seemingly has not changed since 2007. Someone correct me if I'm wrong.

I would like Emacs to support XSD and automatically fetch schemas at runtime, but I also want a better nxml-mode experience for the file formats I use daily, today.

To address this, I have created RNC schemas for the formats I depend on. I have attached an abbreviated diff of the changes. The main changes are:

- Updating schemas.xml with typeIds and conditions for applying them in the correct order.
- Generating new RNC schemas based on those typeIds.
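
Why the order of the conditions matters can be sketched with a rough model in Python. It assumes that rules are tried in file order and that the first match wins, and it only covers the file-name (uri pattern) rules from the attached schemas.patch. Matching on the base name with shell-style wildcards is a simplification, and the function name locate_schema is made up for this example; the real lookup in nxml-mode also handles namespace and documentElement rules.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# File-name rules from the attached schemas.patch, in file order.
URI_RULES = [
    ("nuget.config", "Nuget Config"),
    ("*.nuspec", "Nuget Spec"),
    ("web.config", "Dotnet App Config"),
    ("app.config", "Dotnet App Config"),
    ("Directory.Build.props", "Dotnet Build Props"),
    ("Directory.Packages.props", "Dotnet Packages Props"),
    ("*.resx", "Dotnet Resx"),
    ("*.*proj", "MSBuild"),
]

# typeId -> schema file, as declared by the <typeId> elements in the patch.
TYPE_IDS = {
    "Nuget Config": "nuget.rnc",
    "Nuget Spec": "nuspec.rnc",
    "Dotnet App Config": "dotnet-appconfig.rnc",
    "Dotnet Build Props": "dotnet-build-props.rnc",
    "Dotnet Packages Props": "dotnet-packages-props.rnc",
    "Dotnet Resx": "dotnet-resx.rnc",
    "MSBuild": "msbuild.rnc",
}


def locate_schema(file_name):
    """Return the schema for file_name, or None (hypothetical helper)."""
    base = PurePosixPath(file_name).name
    for pattern, type_id in URI_RULES:
        # Assumption: the first matching rule decides; later rules are ignored.
        if fnmatchcase(base, pattern):
            return TYPE_IDS[type_id]
    return None


# Usage: locate_schema("src/App/App.csproj") picks the MSBuild schema.
```

Under this model, a broad pattern such as "*.*proj" placed before a more specific rule would shadow it, which is why specific file names are listed first.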

I tried to find tooling to convert XSD schemas to RNC, but I couldn't find anything which actually worked. After a few days of getting nowhere, I instead decided to use a tool called "jing-trang"[2] to infer the XML schema based on existing documents in my possession (200+ software projects, 50+ GB).

While this method doesn't guarantee the accuracy of the schemas, they are based on a large number of files, so most common elements and attributes should be present and specified. It may not be as rigorous as an actual XSD-to-RNC translation would be, but it works for my purposes.
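
The inference step could be scripted roughly as follows. This is a sketch, not the exact procedure used for the attached schemas: it assumes Trang (from the jing-trang project[2]) is available as a local trang.jar and uses its -I xml / -O rnc options to infer an RNC schema from example documents. The jar location, directory name, and helper names are placeholders.

```python
import subprocess
from pathlib import Path


def build_trang_command(xml_files, output_rnc, trang_jar="trang.jar"):
    """Build the argument list for inferring an RNC schema from XML samples."""
    return [
        "java", "-jar", str(trang_jar),
        "-I", "xml",  # input: example XML documents
        "-O", "rnc",  # output: RELAX NG compact syntax
        *[str(p) for p in xml_files],
        str(output_rnc),
    ]


def infer_schema(root_dir, glob_pattern, output_rnc, trang_jar="trang.jar"):
    """Collect matching files under root_dir and run Trang on all of them."""
    xml_files = sorted(Path(root_dir).rglob(glob_pattern))
    if not xml_files:
        raise ValueError(f"no files matching {glob_pattern!r} under {root_dir}")
    # check=True raises CalledProcessError if Trang exits with an error.
    subprocess.run(build_trang_command(xml_files, output_rnc, trang_jar), check=True)


# Usage (placeholder paths): infer_schema("projects", "*.nuspec", "nuspec.rnc")
```

With a very large corpus, the argument list may exceed the operating system's command-line limit, so sampling a subset of files per format may be necessary.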

Accurate schema support is a small yet significant feature that can make a noticeable difference when working with XML. Ideally, Emacs should have schemas for all commonly used XML-based file types.

As such, I would like to contribute these patches to core Emacs to help improve the current situation, but I want to make sure I'm doing it correctly.

How can I create accurate, quality schemas in RNC with the current tooling?
What are the criteria for accepting new schemas in Emacs core? Are there any?

I would appreciate any comments or feedback on this matter.

Thanks!

[1] https://relaxng.org/
[2] https://github.com/relaxng/jing-trang


--
Kind Regards
Jostein Kjønigsen



[-- Attachment #2.2: schemas.patch --]
[-- Type: application/octet-stream, Size: 1533 bytes --]

diff --git a/etc/schema/schemas.xml b/etc/schema/schemas.xml
index f04bba849b4..dd1e23a5a8e 100644
--- a/etc/schema/schemas.xml
+++ b/etc/schema/schemas.xml
@@ -66,4 +66,30 @@
   <typeId id="LibreOffice" uri="OpenDocument-schema-v1.3+libreoffice.rnc"/>
   <typeId id="OpenDocument Manifest" uri="od-manifest-schema-v1.2-os.rnc"/>
 
+  <!-- .net development related schemas -->
+  <uri pattern="nuget.config" typeId="Nuget Config" />
+  <typeId id="Nuget Config" uri="nuget.rnc" />
+
+  <uri pattern="*.nuspec" typeId="Nuget Spec" />
+  <namespace ns="http://schemas.microsoft.com/packaging/2011/08/nuspec.xsd" typeId="Nuget Spec" />
+  <typeId id="Nuget Spec" uri="nuspec.rnc" />
+
+  <uri pattern="web.config" typeId="Dotnet App Config" />
+  <uri pattern="app.config" typeId="Dotnet App Config" />
+  <namespace ns="http://schemas.microsoft.com/.NetConfiguration/v2.0" typeId="Dotnet App Config" />
+  <typeId id="Dotnet App Config" uri="dotnet-appconfig.rnc" />
+
+  <uri pattern="Directory.Build.props" typeId="Dotnet Build Props" />
+  <typeId id="Dotnet Build Props" uri="dotnet-build-props.rnc" />
+
+  <uri pattern="Directory.Packages.props" typeId="Dotnet Packages Props" />
+  <typeId id="Dotnet Packages Props" uri="dotnet-packages-props.rnc" />
+
+  <uri pattern="*.resx" typeId="Dotnet Resx" />
+  <typeId id="Dotnet Resx" uri="dotnet-resx.rnc" />
+
+  <uri pattern="*.*proj" typeId="MSBuild" />
+  <documentElement localName="Project" typeId="MSBuild"/>
+  <typeId id="MSBuild" uri="msbuild.rnc" />
+
 </locatingRules>

Thread overview: 7+ messages
2024-02-18 20:12 Adding new schemas to nxml-mode. Am I doing it right? Jostein Kjønigsen
2024-02-19 22:20 ` Stefan Kangas
2024-02-20 20:02   ` [PATCH] Adding new schemas to nxml-mode Jostein Kjønigsen
2024-02-21  3:01     ` Stefan Kangas
2024-02-22 15:13       ` Jostein Kjønigsen
2024-02-23 14:25         ` Jostein Kjønigsen
2024-05-18 19:48           ` Stefan Kangas
