From: Ricardo Cañuelo
To: meta@public-inbox.org
Subject: [BUG] Unescaped '&' ampersands in atom header links
Date: Mon, 27 Nov 2023 09:17:11 +0100
Message-ID: <87o7ff4nlk.fsf@collabora.com>

Hi,

When parsing outputs from lore.kernel.org with Python3 xml.dom.minidom I
noticed that, for queries that contain '&' characters, they aren't
escaped in the href attributes of the title tags in atom feed headers.

So, for example, for this request:

  https://lore.kernel.org/all/?x=A&q=driver+core%3A+Fix+wait_for_device_probe%28%29+%26+deferred_probe_timeout+interaction

The atom header in the output contains:

  driver core: Fix wait_for_device_probe() & deferred_probe_timeout interaction - search results

where the '&' character is escaped in the text of the tag but not in the
href attributes.

Shouldn't these be escaped as well? If so, the fix should be most likely
located in WwwAtomStream.pm:atom_header().

Cheers,
Ricardo
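
A minimal reproduction sketch of the report above, in Python3 with xml.dom.minidom. The `href` value and the `<link>` element here are shortened, hypothetical stand-ins for illustration, not the exact feed output, and `parses()` is a helper invented for this example. It only shows that a raw '&' in an attribute value is rejected by the parser, while the same value escaped as '&amp;' is accepted:

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import quoteattr

# Hypothetical, shortened href for illustration (assumption: not the real feed output).
href = "//lore.kernel.org/all/?x=A&q=driver+core"


def parses(xml_text):
    """Return True if minidom accepts xml_text as well-formed XML."""
    try:
        xml.dom.minidom.parseString(xml_text)
        return True
    except ExpatError:
        return False


# Raw '&' inside the attribute value: not well-formed XML.
raw_link = '<link rel="self" href="%s"/>' % href

# quoteattr() escapes '&' as '&amp;' and adds the surrounding quotes itself.
escaped_link = '<link rel="self" href=%s/>' % quoteattr(href)

print(parses(raw_link))      # False
print(parses(escaped_link))  # True
```

Once the attribute is escaped, the parser hands back the original URL unchanged, so consumers of the feed would see the same link either way.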