From: Matt Armstrong
To: David Bremner, Daniel Kahn Gillmor, Xu Wang, notmuch@notmuchmail.org
Subject: Re: find threads where I and Jian participated but not Dave
In-Reply-To: <8737b1rojw.fsf@tethera.net>
References: <87bmprtqgo.fsf@tethera.net> <87fuf1nnl5.fsf@fifthhorseman.net> <8737b1rojw.fsf@tethera.net>
Date: Thu, 15 Jun 2017 18:07:53 -0700
List-Id: "Use and development of the notmuch mail system."

David Bremner writes:

> Daniel Kahn Gillmor writes:
>
>>
>> One of my long-standing wishes is to be able to say "show me mails in my
>> inbox from people who have replied to messages i've sent them".
>>
>> This could be re-framed as "show me threads in which i've participated,
>> where there are some messages flagged with 'inbox'". but generating a
>> huge list of all threads in which i've participated, just to be able to
>> do an intersection operation with a (much smaller) list of all threads
>> that have a message with the inbox flag seems like a pretty gross
>> inefficiency.
>
> At the moment the best we could do is essentially the same algorithm,
> but in C instead of shell / python. Threads are not documents in the
> database, so they can't efficiently be searched for. Of course we could
> change that, but those kind of changes take a fair amount of effort, and
> some careful design work.

Even if the C level does the same algorithm, it may be able to do some
optimizations on behalf of the "scripting layer" queries.

I suspect that a separate "thread based" query language may be an
interesting area of investigation.

Take Daniel's last example: "show me mails in my inbox from people who
have replied to messages I've sent them". That isn't even an entirely
unambiguous query specification. What is *actually* desired:

 a) show me messages from X that are part of threads where at least one
    message is in the inbox and for which at least one message is from
    me.

or,

 b) same as (a) but the "message from X" must be in the inbox (not just
    any other message in the thread)

or,

 c) same as (a) or (b) but the "message from X" is a reply to (e.g.
    dated after, or in-reply-to) a message from me.

or,

 d) same as (c) but "message from X" is "unread", etc.

Like David's 'comm -12 A B' solution, these pretty quickly start looking
like multi-pass, or structured/nested, queries. They are a lot more like
relational database queries (SQL) than the single-pass, flat (NoSQL)
queries we typically use with notmuch.
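
To make the multi-pass shape concrete, here is a rough sketch of variants (a) and (b) over a toy in-memory message list. It does not use notmuch or its bindings at all: the Message class, the threads_matching helper, the sample data, and the sender names "me", "x" and "y" are all made up for illustration. Each call to threads_matching stands in for one flat query that yields thread ids, and the set intersection plays the role of 'comm -12 A B'.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for a mail message; not a notmuch type."""
    msg_id: str
    thread_id: str
    sender: str
    tags: frozenset = frozenset()


def threads_matching(messages, predicate):
    """One flat pass: ids of threads with at least one matching message."""
    return {m.thread_id for m in messages if predicate(m)}


def query_a(messages, me, x):
    """Variant (a): messages from x in threads that contain a message
    from me and at least one message tagged inbox."""
    mine = threads_matching(messages, lambda m: m.sender == me)
    inbox = threads_matching(messages, lambda m: "inbox" in m.tags)
    wanted = mine & inbox  # same role as `comm -12 A B`
    # Final pass: filter individual messages by the combined thread set.
    return [m.msg_id for m in messages
            if m.sender == x and m.thread_id in wanted]


def query_b(messages, me, x):
    """Variant (b): as (a), but the message from x must itself be tagged
    inbox. That already implies the thread has an inbox message, so only
    the 'threads I participated in' set is needed."""
    mine = threads_matching(messages, lambda m: m.sender == me)
    return [m.msg_id for m in messages
            if m.sender == x and "inbox" in m.tags and m.thread_id in mine]


# Made-up sample data: three threads, t3 has no message from "me".
MESSAGES = [
    Message("m1", "t1", "me"),
    Message("m2", "t1", "x", frozenset({"inbox"})),
    Message("m3", "t2", "me"),
    Message("m4", "t2", "x"),
    Message("m5", "t2", "y", frozenset({"inbox"})),
    Message("m6", "t3", "x", frozenset({"inbox"})),
]

if __name__ == "__main__":
    print(query_a(MESSAGES, "me", "x"))
    print(query_b(MESSAGES, "me", "x"))
```

On this sample data, (a) picks up m4 because another message in its thread is in the inbox, while (b) does not. Variants (c) and (d) would need a further pass over reply structure or dates, which is where this starts to look like a join rather than a flat search.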