>>>>> "Eli" == Eli Zaretskii writes:

>> From: Chen Bin
>> Cc: emacs-devel@gnu.org
>> Date: Sun, 15 Apr 2018 02:40:18 +1000
>>
>> Correct me if I'm wrong.
>>
>> I read the code and found the definition of Lisp_String:
>>
>> struct GCALIGNED Lisp_String
>> {
>>   ptrdiff_t size;
>>   ptrdiff_t size_byte;
>>   INTERVAL intervals;   /* Text properties in this string. */
>>   unsigned char *data;
>> };
>>
>> I understand string text is encoded in UTF-8 format and is stored
>> in 'Lisp_String::data'. There is actually no difference between
>> unibyte and multibyte text, since UTF-8 is compatible with ASCII
>> and we only deal with the 'data' field.

Eli> No, that's incorrect. The difference does exist; it just all
Eli> but disappears for unibyte strings encoded in UTF-8. But if you
Eli> encode a string in some other encoding, like Latin-1, you will
Eli> see a very different stream of bytes.

>> I attached the latest patch.

Eli> Thanks.

>> +  ;; string containing unicode character (Hanzi)
>> +  (should (equal 6 (string-distance "ab" "ab我她")))
>> +  (should (equal 3 (string-distance "我" "她"))))

Eli> Should the distance be measured in bytes or in characters? I
Eli> think it's the latter, in which case the implementation should
Eli> work in characters, not bytes.

--
Best Regards,
Chen Bin

--
Help me, help you
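
To make the bytes-versus-characters question concrete, here is a minimal standalone sketch in plain C. It is not the patch and not Emacs internals: the names `distance_in_bytes` and `distance_in_chars` are hypothetical, the input is assumed to be well-formed, NUL-terminated UTF-8, and a textbook single-row Levenshtein distance is assumed as the metric.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Decode well-formed UTF-8 into code points and return how many were
   written.  Assumption: no validation is done, so malformed input is
   not handled.  OUT must have room for strlen (S) elements.  */
static size_t
utf8_decode (const char *s, uint32_t *out)
{
  const unsigned char *p = (const unsigned char *) s;
  size_t n = 0;
  while (*p)
    {
      uint32_t c;
      int extra;
      if (*p < 0x80)                { c = *p;        extra = 0; }
      else if ((*p & 0xE0) == 0xC0) { c = *p & 0x1F; extra = 1; }
      else if ((*p & 0xF0) == 0xE0) { c = *p & 0x0F; extra = 2; }
      else                          { c = *p & 0x07; extra = 3; }
      p++;
      while (extra-- > 0)
        c = (c << 6) | (*p++ & 0x3F);
      out[n++] = c;
    }
  return n;
}

/* Levenshtein distance between two arrays of units, keeping only one
   row of the dynamic-programming table.  */
static size_t
levenshtein (const uint32_t *a, size_t la, const uint32_t *b, size_t lb)
{
  size_t *row = malloc ((lb + 1) * sizeof *row);
  assert (row != NULL);
  for (size_t j = 0; j <= lb; j++)
    row[j] = j;
  for (size_t i = 1; i <= la; i++)
    {
      size_t diag = row[0];     /* previous row, column j - 1 */
      row[0] = i;
      for (size_t j = 1; j <= lb; j++)
        {
          size_t up = row[j];   /* previous row, column j */
          size_t best = diag + (a[i - 1] != b[j - 1]);
          if (up + 1 < best)
            best = up + 1;
          if (row[j - 1] + 1 < best)
            best = row[j - 1] + 1;
          row[j] = best;
          diag = up;
        }
    }
  size_t d = row[lb];
  free (row);
  return d;
}

/* Distance where every byte of the encoded text is one unit.  */
size_t
distance_in_bytes (const char *a, const char *b)
{
  size_t la = strlen (a), lb = strlen (b);
  uint32_t *ua = malloc ((la + 1) * sizeof *ua);
  uint32_t *ub = malloc ((lb + 1) * sizeof *ub);
  assert (ua != NULL && ub != NULL);
  for (size_t i = 0; i < la; i++)
    ua[i] = (unsigned char) a[i];
  for (size_t i = 0; i < lb; i++)
    ub[i] = (unsigned char) b[i];
  size_t d = levenshtein (ua, la, ub, lb);
  free (ua);
  free (ub);
  return d;
}

/* Distance where every decoded character (code point) is one unit.  */
size_t
distance_in_chars (const char *a, const char *b)
{
  uint32_t *ua = malloc ((strlen (a) + 1) * sizeof *ua);
  uint32_t *ub = malloc ((strlen (b) + 1) * sizeof *ub);
  assert (ua != NULL && ub != NULL);
  size_t la = utf8_decode (a, ua);
  size_t lb = utf8_decode (b, ub);
  size_t d = levenshtein (ua, la, ub, lb);
  free (ua);
  free (ub);
  return d;
}
```

Under these assumptions, "我" (bytes E6 88 91) and "她" (bytes E5 A5 B9) share no bytes, so `distance_in_bytes` gives 3 while `distance_in_chars` gives 1. Likewise "ab" versus "ab我她" gives 6 in bytes and 2 in characters. The values 6 and 3 expected by the quoted tests therefore correspond to a byte-wise comparison.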