From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: Occasional web view corruption (extra html escapes)
From: Filip Hejsek
To: Konstantin Ryabitsev, Eric Wong, meta@public-inbox.org
Date: Sat, 07 Sep 2024 00:20:58 +0200
In-Reply-To: <20240903-brainy-lionfish-of-saturation-71ae1a@lemur>
References: <20240903-brainy-lionfish-of-saturation-71ae1a@lemur>

Hello,

I have figured out why this happens.

I have reproduced the bug by doing the following:

  1. set up an instance of public-inbox and import some message into it
  2. create an extindex named all and add it to the config
  3. start public-inbox-httpd
  4. directly open http:///all//

The server will enter the broken state if this is the first page
loaded from the server.

The issue occurs because of the following sequence of events:

  1. WWW->call is called
  2. after matching the URL, msg_page is called
  3. msg_page validates the inbox name by calling invalid_inbox_mid
  4. invalid_inbox_mid calls invalid_inbox
  5. the name is looked up with lookup_name
  6. because there is no inbox with that name, undef is returned
  7. the name is looked up with lookup_ei
  8. the lookup succeeds
  9. get_mid_html is called to generate the HTML page
 10. addr2urlmap is used to construct a regex of known addresses
 11. because no inbox has been instantiated, $cfg->{-by_addr} is empty
 12. because of that, $re will be an empty string
 13. so the final regex is /\b()\b/, which matches every word boundary

- Filip Hejsek
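
For illustration, steps 11 to 13 can be sketched in Python. This is not the public-inbox code (which is Perl): the helper name build_addr_regex and the sample strings are hypothetical, and the sketch only assumes that the regex is built by joining the known addresses with "|" inside \b(...)\b.

```python
import re


def build_addr_regex(addresses):
    """Hypothetical stand-in for the regex built from known addresses."""
    # Escape each address and join with '|'; an empty list yields ''.
    alternation = "|".join(re.escape(a) for a in addresses)
    return re.compile(r"\b(" + alternation + r")\b")


text = "hello world"

# No inboxes instantiated yet: the pattern degenerates to \b()\b.
empty_re = build_addr_regex([])
empty_positions = [m.start() for m in empty_re.finditer(text)]

# With one known address, only that address matches.
known_re = build_addr_regex(["meta@public-inbox.org"])
known_hits = [
    m.group(1) for m in known_re.finditer("mail meta@public-inbox.org now")
]

print(empty_re.pattern)
print(empty_positions)
print(known_hits)
```

With an empty address list the pattern is \b()\b and it produces an empty match at every word boundary of "hello world" (offsets 0, 5, 6 and 11), whereas the populated pattern matches only the listed address.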