From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mp12.migadu.com ([2001:41d0:2:bcc0::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms5.migadu.com with LMTPS id WPhlK3U18WLEKwEAbAwnHQ (envelope-from ) for ; Mon, 08 Aug 2022 18:10:29 +0200
Received: from aspmx1.migadu.com ([2001:41d0:2:bcc0::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp12.migadu.com with LMTPS id GOBPK3U18WLD1wAAauVa8A (envelope-from ) for ; Mon, 08 Aug 2022 18:10:29 +0200
Received: from mail.notmuchmail.org (yantan.tethera.net [135.181.149.255]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id 8A6E8FB63 for ; Mon, 8 Aug 2022 18:10:29 +0200 (CEST)
Received: from yantan.tethera.net (localhost [127.0.0.1]) by mail.notmuchmail.org (Postfix) with ESMTP id E25825F370; Mon, 8 Aug 2022 16:10:26 +0000 (UTC)
Received: from fethera.tethera.net (fethera.tethera.net [IPv6:2607:5300:60:c5::1]) by mail.notmuchmail.org (Postfix) with ESMTP id EE92B5E545 for ; Mon, 8 Aug 2022 16:10:24 +0000 (UTC)
Received: by fethera.tethera.net (Postfix, from userid 1001) id 3940F5FBC0; Mon, 8 Aug 2022 12:10:24 -0400 (EDT)
Received: (nullmailer pid 1643894 invoked by uid 1000); Mon, 08 Aug 2022 16:10:23 -0000
From: David Bremner
To: Bence Ferdinandy
Subject: Re: matching both accented and non-accented character for non-accented characters?
In-Reply-To:
References: <87wnbikcjs.fsf@tethera.net>
Date: Mon, 08 Aug 2022 13:10:23 -0300
Message-ID: <87r11qk974.fsf@tethera.net>
MIME-Version: 1.0
Message-ID-Hash: HOJFYMR6HF455I4KG3BGKMZSMM755PAD
X-Message-ID-Hash: HOJFYMR6HF455I4KG3BGKMZSMM755PAD
X-MailFrom: david@tethera.net
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; header-match-notmuch.notmuchmail.org-0; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header
CC: notmuch@notmuchmail.org
X-Mailman-Version: 3.3.3
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Help:
List-Owner:
List-Post:
List-Subscribe:
List-Unsubscribe:
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Migadu-Flow: FLOW_IN
X-Migadu-To: larch@yhetil.org
X-Migadu-Country: DE
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org; s=key1; t=1659975029; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:list-id:list-help: list-owner:list-unsubscribe:list-subscribe:list-post; bh=5SMN6IKoabfiKoOIBXFbfeA5utOg4mfWu59YMA3RnrI=; b=miZZmdb5gURHigOGM9wk+RXPFf3BOpV0FSDJpdMrNVq9OYQzwLIhlY5w+hz+IlXqP+gcex GGeQ3wnZDieFb3hTVYrFTiAfMLGgfN/u10FrsI8ODkkPeadlD67NLH154Lc4lmy+DFs4R3 9dcyS9LpfpUmgVVqmDMOpe+Lxv42rO0h4YOWJE246G3enMClpawUF9h7S3RIpyceLYyTiU o+nkqfbi2jirutFR1GEp4v6pnWivhHAZM9OoY4bmnp69Yt3bckWXkgsB62UwAiILP1Z1Uc ik25ceFuw+OKqaLYNTfCqRcgMN8abyvhmgA4qMEJE5Sao8gs16RnljLSGszokQ==
ARC-Seal: i=1; s=key1; d=yhetil.org; t=1659975029; a=rsa-sha256; cv=none; b=XDalPyXdKYZGZTZ9xYU2vf8b6vyO0JdEb1N2/RZFvuY4pusajAT90r0kNPqTsLjzUCWyXl UPZrPOl9tJuvnTRZIPDnTuwHY0ydDJ30PIad1vikY/ghazNLaCIXbEe6vFxM7IPvmEW+O7
 z8kVZUTe49YlYHf5bsI4Bqi5G3SK3SlAhrIJzwu9F0YwWmZYrIwRi4GXnrFLqqq+Gk4jws cs6zbz4+sYCxOr1F7rvIoZwCDdB0hBhx5n4yiM3yX2Wr8zui4V1fm6CRGFTe3ECUiq+Cg3 s6gaCIVtkiiDpgiGMaPixU6bx5G1pkl8goFgm4dqTY/mTQEA4q3yWvo0gV+XQg==
ARC-Authentication-Results: i=1; aspmx1.migadu.com; dkim=none; dmarc=none; spf=pass (aspmx1.migadu.com: domain of notmuch-bounces@notmuchmail.org designates 135.181.149.255 as permitted sender) smtp.mailfrom=notmuch-bounces@notmuchmail.org
X-Migadu-Spam-Score: -1.30
Authentication-Results: aspmx1.migadu.com; dkim=none; dmarc=none; spf=pass (aspmx1.migadu.com: domain of notmuch-bounces@notmuchmail.org designates 135.181.149.255 as permitted sender) smtp.mailfrom=notmuch-bounces@notmuchmail.org
X-Migadu-Queue-Id: 8A6E8FB63
X-Spam-Score: -1.30
X-Migadu-Scanner: scn0.migadu.com
X-TUID: yOnCHm5gMpQT

Bence Ferdinandy writes:

> Thanks! I didn't know unicode equivalence existed, but it seems to be the
> feature I want, so at least now I have a name for it :) And yes, actually
> setting the stemmer would also be cool, I saw that Xapian has a Hungarian
> stemmer but I kind of assumed all stemmers are applied somehow (although it
> makes sense they're not). Is stemming done during search or would it affect
> the database as well? Just to have a notion of how complicated a settable
> stemmer feature would be.
>

Stemming happens both during search (unless turned off for a given term)
and at indexing time. So yeah, changing the stemming algorithm will
change the database (and require a re-index).
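
As a rough illustration of why the index and the query have to agree on
the stemmer, here is a toy Python sketch. Everything in it is an
assumption made for the example: the "stemmers" are made-up suffix
strippers (not Xapian's actual algorithms), the index is a plain dict,
and none of the function names come from notmuch or Xapian. A real
Xapian database is laid out differently, so treat this only as a model
of the mismatch.

```python
# Toy illustration only: these "stemmers" are hypothetical suffix
# strippers, not Xapian's stemming algorithms, and the index is a dict.

def stem_english_toy(word):
    """Hypothetical stemmer: strip a few English suffixes."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def stem_none(word):
    """No stemming: terms are stored and searched as-is."""
    return word

def build_index(docs, stem):
    """Map each stemmed term to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(stem(word), set()).add(doc_id)
    return index

def search(index, query, stem):
    """Stem the query word, then look the resulting term up in the index."""
    return index.get(stem(query.lower()), set())

docs = {1: "indexing messages", 2: "message indexed"}

# Index and query use the same stemmer: both documents match.
index = build_index(docs, stem_english_toy)
same = search(index, "messages", stem_english_toy)

# Query-side stemmer changed without re-indexing: the terms no longer line up.
mismatched = search(index, "messages", stem_none)

# Re-indexing with the new stemmer makes index and query consistent again.
reindexed = build_index(docs, stem_none)
after_reindex = search(reindexed, "messages", stem_none)

print(same, mismatched, after_reindex)
```

In this model the stored terms depend on the stemmer used at indexing
time, so switching stemmers only on the search side leaves the query
term pointing at nothing until the documents are indexed again.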