Hi, sorry for the late response.  I've just noticed that my reply mail
didn't go out successfully.  I'm trying to re-send it.

I wrote:

> In article <871t2dz22d.fsf@gmail.com>, ynyaaa@gmail.com writes:
> > If there are unencodable characters, encodable characters may be broken.
> > In this example, the second ?\x4E00 character disappears.
> >     (set-language-environment 'Chinese-GB)
> >     (decode-coding-string (encode-coding-string "\x4E00\x00B7\x4E00" 'hz) 'hz)
> >>> "\x4E00\e\x3048\x6070\x70B3\x11213D\300\273"
> 
> How to treat unencodable characters on encoding is a difficult problem.
> As HZ is designed for 7-bit environment, I think it's important to keep
> 7-bit on encoding.  So, the new code uses \uXXXX for those characters.
> Another way is to use UTF-8 sequence for them, then we can decode it
> back.  Which, do yo think, is better?
> 
> > To avoid this behavior, there are some solutions.
> > (a) While decoding, replace "~{...~}" with "\e$A...\e(B"
> >     and decode with iso-2022-7bit.
> > (b) Like (a), replace "~{...~}" with "\e$A...\e(B" while decoding
> >     and insert "\e$)A" at the beginning of the temp buffer
> >     and decode with iso-2022-8bit-ss2.
> >     (8bit data are decoded as euc-cn.)
> > (c) While encoding, use euc-cn instead of iso-2022-7bit
> >     and translate each consecutive 8bit data to 7bit data
> >     prefixed by "~{" and postfixed by "~}".
> 
> I adopted the (a) method for decoding, and fix bugs encoding code.
> 
> > By the way, RFC1843 describes:
> >     The escape sequence '~\n' is a line-continuation marker to be
> >     consumed with no output produced.
> 
> The variable decode-hz-line-continuation controls this feature.  I don't
> remember why the default is nil (i.e. do not decode ~\n), perhaps some
> Chinese people I was discussing with on implementing HZ support
> suggested that.
> 
> Attched is the full china-util.el (not a diff).
> 
> ---
> K. Handa
> handa@gnu.org