From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Li Ian-Xue
Newsgroups: gmane.emacs.bugs
Subject: bug#12051: 24.1; rcirc-send-message doesn't take multibyte into account.
Date: Thu, 26 Jul 2012 00:18:29 +0800
Message-ID: <87boj3byqy.fsf@acerpad.localdomain>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
To: 12051@debbugs.gnu.org
X-GNU-PR-Message: report 12051
X-GNU-PR-Package: emacs
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:62391

--=-=-=
Content-Type: text/plain

Hello developers,

I recently discovered that the IRC client `rcirc', although it defines
`rcirc-max-message-length', simply uses (length str) to measure outgoing
messages. That counts characters, not bytes, which is a problem for
multibyte users because our characters usually encode to more than one
byte. As a result the client can send more bytes than the standard
allows (512 bytes, to my understanding). The limit is easily reached,
since Chinese characters are usually encoded with 3 bytes per character.
If the server then truncates the message simply by bytes, the string is
known to come out entirely scrambled in xchat.

I'm attaching a patch that performs a binary search on multibyte strings
to find the split position. It should carry no penalty for existing
ASCII users, since it first calls (multibyte-string-p) to decide which
method to use.

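To make the character/byte mismatch concrete, here is a minimal sketch. It is not the attached patch: `my-chars-within-bytes' is a hypothetical helper name, it uses a plain linear scan rather than a binary search, and it assumes the message goes out encoded as utf-8. It measures the encoded bytes of each character and returns how many leading characters fit within a byte limit.

```elisp
;; Illustration only: `my-chars-within-bytes' is a hypothetical helper,
;; not part of rcirc.el or of the attached patch.
(defun my-chars-within-bytes (str limit &optional coding)
  "Return how many leading characters of STR fit in LIMIT encoded bytes.
CODING is the coding system used for encoding (assumed default: utf-8)."
  (let ((coding (or coding 'utf-8))
        (n 0)                           ; characters accepted so far
        (used 0)                        ; encoded bytes accepted so far
        (len (length str))
        (full nil))
    (while (and (< n len) (not full))
      ;; Byte width of the next character once it is encoded.
      (let ((width (length (encode-coding-string
                            (substring str n (1+ n)) coding))))
        (if (> (+ used width) limit)
            (setq full t)
          (setq used (+ used width)
                n (1+ n)))))
    n))

(length "你好世界")                    ; => 4 (characters)
(string-bytes "你好世界")              ; => 12 (bytes)
(my-chars-within-bytes "你好世界" 7)   ; => 2
(my-chars-within-bytes "hello" 3)      ; => 3
```

Splitting at the returned character index, e.g. (substring message 0 n), never cuts through the middle of a character. The linear scan is only for readability; a binary search over the prefix length gives the same kind of answer with fewer encoding calls on long messages.
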
--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=rcirc-fix-multibyte-overflow.patch

--- rcirc.el	2012-07-25 23:52:41.813226461 +0800
+++ rcirc-1.el	2012-07-25 23:55:20.813220626 +0800
@@ -792,21 +792,40 @@
 (defvar rcirc-max-message-length 420
   "Messages longer than this value will be split.")
 
+(defun rcirc-multibyte-position-at-byte (str bytes)
+  (if (multibyte-string-p str)
+      (rcirc-multibyte-position-at-byte-1 str bytes 0 0)
+    bytes))
+
+(defun rcirc-multibyte-position-at-byte-1 (str bytes now-chars now-bytes)
+  (let ((len (length str)))
+    (if (<= len 1)
+        now-chars
+      (let* ((half-len (/ len 2))
+             (lstr (substring str 0 half-len))
+             (rstr (substring str half-len len))
+             (now-bytes-1 (+ now-bytes (string-bytes lstr))))
+        (if (> now-bytes-1 bytes)
+            (rcirc-multibyte-position-at-byte-1 lstr bytes now-chars now-bytes)
+          (rcirc-multibyte-position-at-byte-1 rstr bytes (+ half-len now-chars) now-bytes-1))))))
+
 (defun rcirc-send-message (process target message &optional noticep silent)
   "Send TARGET associated with PROCESS a privmsg with text MESSAGE.
 If NOTICEP is non-nil, send a notice instead of privmsg.
 If SILENT is non-nil, do not print the message in any irc buffer."
   ;; max message length is 512 including CRLF
   (let* ((response (if noticep "NOTICE" "PRIVMSG"))
-         (oversize (> (length message) rcirc-max-message-length))
+         (oversize (> (string-bytes message) rcirc-max-message-length))
+         (adjusted-pos (if oversize
+                           (rcirc-multibyte-position-at-byte message rcirc-max-message-length)))
          (text (if oversize
-                   (substring message 0 rcirc-max-message-length)
+                   (substring message 0 adjusted-pos)
                  message))
          (text (if (string= text "") " " text))
          (more (if oversize
-                   (substring message rcirc-max-message-length))))
+                   (substring message adjusted-pos))))
     (rcirc-get-buffer-create process target)
     (rcirc-send-string process (concat response " " target " :" text))
     (unless silent

--=-=-=--