Eli Zaretskii wrote on Sun, 22 Nov 2015 at 20:51:

> > > No matter what we expect or tolerate, we need to state that.
> >
> > No, we don't. When the callers violate the contract, they cannot
> > expect to know in detail what will happen. If they want to know, they
> > will have to read the source.
> >
> > So you want this to be unspecified or undefined behavior? That might
> > be OK (we already have that in several places), but we still need to
> > state what the contract is.
>
> You can call it "undefined behavior" if you want. Personally, I don't
> think that's accurate: "undefined" means anything can happen, whereas
> Emacs at least promises to output the original bytes unchanged, as
> long as the text modifications didn't touch them.

"Unspecified" would fit the bill better. Actually, for most interesting
inputs (UTF-8 strings) the behavior is well-defined anyway.

> > > An Emacs string is a sequence of integers.
> >
> > No, it's a sequence of bytes.
> >
> > From
> > https://www.gnu.org/software/emacs/manual/html_node/elisp/String-Basics.html:
> > "In Emacs Lisp, characters are simply integers ... A string is a fixed
> > sequence of characters"
>
> That's the _User_ manual, it simplifies things to avoid too much
> complexity.

So where's the programmer's manual then? The source code? ;-)

> > How a string is represented internally shouldn't be the concern of
> > module authors.
>
> Indeed. But it does concern us, the developers of Emacs internals.
>
> > No, I will definitely fix it.
>
> Thank you.

Attached is a patch that uses make_multibyte_string directly.