It would probably be a large amount of difficult work.
Over the years, multiple groups have tried this approach using less
complicated versions of the web (pre-JavaScript, pre-CSS, pre-XHTML,
etc.) and less complicated editing systems (more `notepad' than
`emacs'), and each effort has collapsed under its own weight.
I hate to be gloomy, but I'm pretty sure that's the reason. Someday,
another epoch or Gerd Moellmann-style effort will arise and implement
this idea or something like it, but, believe it or not, XEmbed is
probably an easier path.