From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Kai Tetzlaff"
Newsgroups: gmane.emacs.bugs
Subject: bug#54154: 29.0.50; [PATCH] `sieve-manage-getscript' fails if script contains multibyte characters
Date: Fri, 25 Feb 2022 10:04:47 +0100
Message-ID: <87wnhj5nbk.fsf@tetzco.de>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
To: 54154@debbugs.gnu.org
X-GNU-PR-Message: report 54154
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords: patch
X-Debbugs-Original-To: bug-gnu-emacs@gnu.org
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.io gmane.emacs.bugs:227621

--=-=-=
Content-Type: text/plain

The sieve-manage package uses the managesieve protocol (see
https://datatracker.ietf.org/doc/html/draft-ietf-sieve-managesieve) to
communicate with a sieve server process.

When the sieve-manage client retrieves a script from the server, it uses
the `sieve-manage-getscript' function to send a `GETSCRIPT` command with
the quoted script name to the server and then tries to parse the
response. If the downloaded sieve script contains multibyte characters,
the attempt to parse the response results in an infloop (in
`sieve-manage-parse-okno').

To reproduce, save the following code

--=-=-=
Content-Type: application/emacs-lisp; charset=utf-8
Content-Disposition: inline; filename=sieve-manage-getscript-minimal-example.el
Content-Transfer-Encoding: 8bit
Content-Description: minimal example

(require 'sieve-manage)
(require 'cl)  ; for flet below

(let* ((script-name "test.sieve")
       ;; variables `sieve-manage-server' and `sieve-manage-port' are
       ;; used in `sieve-manage-make-process-buffer'
       (sieve-manage-server)
       (sieve-manage-port "sieve")
       (sieve-buffer (sieve-manage-make-process-buffer))
       (output-buffer (generate-new-buffer script-name)))
  (with-current-buffer sieve-buffer
    (goto-char (point-min))
    ;; simulate managesieve response-getscript with a single multibyte
    ;; character: `ä`
    (insert "{32}\r\nif body :matches \"ä\" { stop; }\n\r\nOK \"Getscript completed.\"\r\n"))
  ;; use flet to mock some functions in call chain of sieve-manage-getscript
  (flet ((sieve-manage-send (_) nil)
         (accept-process-output (&optional _ _ _ _) nil)
         (get-buffer-process (_) nil))
    ;; watch `sieve-manage-getscript' infloop
    (sieve-manage-getscript script-name output-buffer sieve-buffer)
    (kill-buffer sieve-buffer)))

;; Local Variables:
;; coding: utf-8-unix
;; End:

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

to a file and run:

  emacs -Q -l <file>

* Detailed analysis:

The example code sets up a response buffer for a successful managesieve
`response-getscript`, which is defined as:

  response-getscript = (sieve-script CRLF response-ok)

Here's the buffer content:

```
1: {32}
2: if body :matches "ä" { stop; }
3:
4: OK "Getscript completed."
```

It comprises:

1. lines 1-2 (`sieve-script`): encoded as a managesieve `literal-s2c`
   string which:
   a. starts with a length in braces (i.e. '{32}')
   b. is followed by the string data (i.e. the actual script:
      'if body :matches "ä" { stop; }') using UTF-8 encoding
2. line 3 (`CRLF`)
3. line 4 (`response-ok`): 'OK' SP "Getscript completed."
   (the latter is an optional `quoted` string which can be shown to
   the user)

The sieve-manage code parses the length into an integer and uses it to
skip over `sieve-script` to get to the start of line 3 (empty, just the
CRLF), which is then also skipped to get to line 4 in order to parse the
result ('OK').

Now the problem: Since sieve-manage explicitly enables multibyte support
in the response buffer (by calling '(mm-enable-multibyte)' in
`sieve-manage-make-process-buffer`) and uses `goto-char' to jump over
`sieve-script`, the byte length sent by the server is applied as a
character count. Each multibyte character in `sieve-script` therefore
causes the jump to go 1 (2, 3) character(s) too far, depending on
whether it occupies 2 (3, 4) bytes. In the example above there's only a
single 2-byte character (`ä`), so instead of skipping to the beginning
of line 3, we land in the middle of its CRLF. This causes the following
attempt to parse the result code (i.e. the 'OK "Getscript completed."'
line) to infloop in `sieve-manage-parse-okno'.

* An attempt at a fix:

As far as I can tell, the attached patch fixes the issue for the
GETSCRIPT command.

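The off-by-one described in the analysis can be sketched in isolation,
without sieve-manage. This is only an illustration and not part of the
patch: `demo-response' and `demo-text-after-literal' are hypothetical
names made up here, and the scratch buffer merely imitates the process
buffer. The same simulated response is inserted, the `{32}' line is
skipped, and then `goto-char' jumps 32 positions, once in a multibyte
and once in a unibyte buffer:

```elisp
(defconst demo-response
  (concat "{32}\r\n"
          "if body :matches \"\u00e4\" { stop; }\n"   ; 31 chars, 32 UTF-8 bytes
          "\r\n"
          "OK \"Getscript completed.\"\r\n")
  "Simulated GETSCRIPT response (hypothetical, for illustration only).")

(defun demo-text-after-literal (multibyte)
  "Return the 4 positions of text found after jumping over the literal.
If MULTIBYTE is non-nil the scratch buffer counts characters,
otherwise it holds the UTF-8 encoded bytes and counts bytes."
  (with-temp-buffer
    (set-buffer-multibyte multibyte)
    (insert (if multibyte
                demo-response
              (encode-coding-string demo-response 'utf-8)))
    (goto-char (point-min))
    ;; Skip the "{32}" length prefix and its CRLF.
    (search-forward "}\r\n")
    ;; Jump by the advertised length, as if it were a position count.
    (goto-char (+ (point) 32))
    (buffer-substring (point) (min (point-max) (+ (point) 4)))))

(message "multibyte: %S  unibyte: %S"
         (demo-text-after-literal t)
         (demo-text-after-literal nil))
```

Under these assumptions the multibyte call should return "\nOK " (point
ended up between CR and LF), while the unibyte call should return
"\r\nOK" (point is at the start of line 3, as the parser expects). This
appears to be the effect the patch relies on when it switches the
buffer to unibyte around `sieve-manage-parse-string'.
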
--=-=-=
Content-Type: text/x-diff
Content-Disposition: inline; filename=sieve-manage-getscript-multibyte-fix.patch
Content-Description: Fix for multibyte issue in `sieve-manage-getscript'

diff --git a/lisp/net/sieve-manage.el b/lisp/net/sieve-manage.el
index 50342b9105..8020e6fdca 100644
--- a/lisp/net/sieve-manage.el
+++ b/lisp/net/sieve-manage.el
@@ -449,10 +449,19 @@ sieve-manage-deletescript
 (defun sieve-manage-getscript (name output-buffer &optional buffer)
   (with-current-buffer (or buffer (current-buffer))
     (sieve-manage-send (format "GETSCRIPT \"%s\"" name))
+    (set-buffer-multibyte nil)
     (let ((script (sieve-manage-parse-string)))
+      (set-buffer-multibyte t)
       (sieve-manage-parse-crlf)
       (with-current-buffer output-buffer
-        (insert script))
+        (insert (decode-coding-string
+                 script
+                 ;; not sure if using `buffer-file-coding-system' is
+                 ;; the right approach, it might be better to hardcode
+                 ;; it to utf-8-* (managesieve requires UTF-8
+                 ;; encoding) but in that case, which variant of
+                 ;; utf-8-unix/dos/... is to be used?
+                 buffer-file-coding-system t)))
       (sieve-manage-parse-okno))))
 
 (defun sieve-manage-setactive (name &optional buffer)

--=-=-=
Content-Type: text/plain

* Additional remarks:

There might be more problems. E.g. `sieve-manage-putscript' contains the
following comment:

    ;; Here we assume that the coding-system will
    ;; replace each char with a single byte.
    ;; This is always the case if `content' is
    ;; a unibyte string.

which seems to indicate that it might also have an issue with multibyte
content (even though I have not experienced any uploading issues). I
will try to do some more testing to check that.

In general, it is also not clear to me why the response (or process)
buffer needs to be multibyte enabled at all as it should only be used
for the line/byte oriented protocol data.
But the commit message of 8e16fb987df9b which introduced the multibyte
handling states:

  commit 8e16fb987df9b80b8328e9dbf80351a5f9d85bbb
  Author: Albert Krewinkel
  Date:   2013-06-11 07:32:25 +0000

  ...
  * Enable Multibyte for SieveManage buffers: The parser won't properly
    handle umlauts and line endings unless multibyte is turned on in the
    process buffer.
  ...

so this was obviously done on purpose. I contacted Albert about this but
he couldn't remember the details (it's been nearly 10 years).


In GNU Emacs 29.0.50 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.31, cairo version 1.16.0)
 of 2022-02-18 built on moka
Repository revision: 51e51ce2df46fc0c6e17a97e74b00366bb9c09d8
Repository branch: master
System Description: Debian GNU/Linux bookworm/sid

Configured using:
 'configure --with-pgtk --with-native-compilation'

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBOTF LIBSELINUX LIBXML2 MODULES NATIVE_COMP NOTIFY INOTIFY
PDUMPER PGTK PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF
TOOLKIT_SCROLL_BARS XIM GTK3 ZLIB

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

--=-=-=--