From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by arlo.cworth.org (Postfix) with ESMTP id 935336DE1A97 for ; Sat, 25 Feb 2017 12:39:46 -0800 (PST) X-Virus-Scanned: Debian amavisd-new at cworth.org X-Spam-Flag: NO X-Spam-Score: -0.182 X-Spam-Level: X-Spam-Status: No, score=-0.182 tagged_above=-999 required=5 tests=[AWL=-0.162, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01] autolearn=disabled Received: from arlo.cworth.org ([127.0.0.1]) by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id c-ltW3PnSS3W for ; Sat, 25 Feb 2017 12:39:45 -0800 (PST) Received: from mail-lf0-f47.google.com (mail-lf0-f47.google.com [209.85.215.47]) by arlo.cworth.org (Postfix) with ESMTPS id C80A76DE1A6B for ; Sat, 25 Feb 2017 12:39:44 -0800 (PST) Received: by mail-lf0-f47.google.com with SMTP id k202so6802358lfe.1 for ; Sat, 25 Feb 2017 12:39:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nikula-org.20150623.gappssmtp.com; s=20150623; h=from:to:subject:in-reply-to:references:date:message-id:mime-version; bh=idg/Ma5FjKgxyu2sNRvkRnPYH6u9qRsxDHH2IOmlM5g=; b=CdI+UbiYft/oGA99c1R79mgtfCPuj6CE2dBxpWj/Fl55yj7KNy7mzCOKtDwmpEUv8m /5vm1XsiCXG8Pcq+bHwv+A0VE/YkdigHoJSuHFFz1C+DxB7Yu36sAP70DqyeDe3NDmTe v2ByR+D+yUuS2WLu/wg6si+6tbjgXtrGhjRKe/lyHBZQO0tK/XwTeSGKGs5YBk6eFNhH LmN331yf0BNIploRl5q1y2z1MIdH6rwdlqZbxbW76xRNDHhFJ4LKakAxC+T+PiLsPK0q cI6Xrl62KJjzdvBBuXY34Pp+ljqtCgti0/tW/Ft8gdLw64M71Or2zhhz7NInMerGJFG3 MtuQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:in-reply-to:references:date :message-id:mime-version; bh=idg/Ma5FjKgxyu2sNRvkRnPYH6u9qRsxDHH2IOmlM5g=; b=tpBZiEy5m5lfcC1tEpxvPP1or4ti+bRWm+jP5ygkTXUtIfJqXMpEDa+Z7KFO2E17VA 7eGft5s4iJMTXR4Uy6bM0+crqy004DJyTy8tadWkCs4ELU9JeMRXDK4i2wWGFoPcfWOt gEztlEYdj1eLuoEgsAWVM4udD8gO+bpk2JSAeSxNwpBW02ueUAXYRstaCDpQO4iJiKyb lcbRTdWvmH/CaIU4nuATdgqr0lSoAH4wc+ImzwwGewmKrd+fSwNDbDHOdbJnpmRu9eOf 0CCx3/6HEalyueS5L4twFvH/JL2zuEf5oSxoD1ygB4in8/d5//3SDaShjjQMx/Yn1RDx AebQ== X-Gm-Message-State: AMke39lV8rrn8fyOoVUh2C5bdNel6teQmxHv2C4WkThYmknV1x0T9pi0kBeeMjejpLZSgA== X-Received: by 10.25.216.28 with SMTP id p28mr1165311lfg.164.1488055183064; Sat, 25 Feb 2017 12:39:43 -0800 (PST) Received: from localhost (mobile-access-bcee80-14.dhcp.inet.fi. [188.238.128.14]) by smtp.gmail.com with ESMTPSA id h41sm9654435lji.59.2017.02.25.12.39.42 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 25 Feb 2017 12:39:42 -0800 (PST) From: Jani Nikula To: David Bremner , notmuch@notmuchmail.org, eg@gaute.vetsj.com Subject: Re: [PATCH 3/4] lib: handle DatabaseModifiedError in _n_message_ensure_metadata In-Reply-To: <8737f210p6.fsf@tethera.net> References: <20170225034513.19427-1-david@tethera.net> <20170225034513.19427-4-david@tethera.net> <87poi66rjj.fsf@nikula.org> <8737f210p6.fsf@tethera.net> Date: Sat, 25 Feb 2017 22:39:41 +0200 Message-ID: <87efym6m42.fsf@nikula.org> MIME-Version: 1.0 Content-Type: text/plain X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Feb 2017 20:39:46 -0000 On Sat, 25 Feb 2017, David Bremner wrote: > Jani Nikula writes: > >> On Fri, 24 Feb 2017, David Bremner wrote: >>> The retries are hardcoded to a small number, and error handling aborts >>> than propagating errors from notmuch_database_reopen. These are both >>> somewhat justified by the assumption that most things that can go >>> wrong in Xapian::Database::reopen are rare and fatal (like running out >>> of memory or disk corruption). >> >> I think the sanity of the implementation hinges on that assumption. It >> makes sense if you're right, but I really have no idea either way... > > That was my conclusion from talking to Olly (xapian upstream). > > 24-02-2017 08:12:57 < bremner> any intuition about how likely > Xapian::Database::reopen is to fail? I'm catching a > DatabaseModifiedError somewhere where handling any further errors is > tricky, and wondering about treating a failed reopen as as "the > impossible happened, stopping" > > 24-02-2017 16:22:34 < olly> bremner: there should not be much scope for > failure - stuff like out of memory or disk errors, which are probably a > good enough excuse to stop > > I could add that to the commit message? Yes, please, it'll come in handy when the memory of the discussion has faded! (Like two weeks from now... ;) > >>> + >>> + /* all the way without an exception */ >>> + success = TRUE; >> >> Nitpick, if you don't intend to use that variable to return status from >> the function, you can just break here, and get rid of the variable. But >> no big deal. >> > > I think I have some kind of mental block about break and continue. But > it could even be a goto, those I understand ;). Heh, as I said, no big deal. > >> >>> + } catch (const Xapian::DatabaseModifiedError &error) { >>> + notmuch_status_t status = notmuch_database_reopen (message->notmuch); >>> + if (status != NOTMUCH_STATUS_SUCCESS) >>> + INTERNAL_ERROR ("unhandled error from notmuch_database_reopen: %s\n", >>> + notmuch_status_to_string (status)); >>> + success = FALSE; >>> + } catch (const Xapian::Error &error) { >>> + INTERNAL_ERROR ("A Xapian exception occurred fetching message metadata: %s\n", >>> + error.get_msg().c_str()); >>> + } >> >> If the assumption is that these really are rare cases (read: shouldn't >> happen), INTERNAL_ERROR is an improvement over leaking the >> exception. Otherwise, I think we'd need to propagate the status all the >> way to the API, which would really be annoying. >> >> I guess I think this is a worthwhile improvement no matter what. > > Yeah, I had a go at that in the previous longer series, but I was not > very happy with the (incomplete) results Err, I'm sorry for not being clear, I meant that this patch *series* is a worthwhile improvement as-is, even if reopen did have common and unrecoverable failure modes (which it shouldn't). This series catches a previously uncaught exception, tries to do something sensible about it, and failing at that, causes an INTERNAL_ERROR. That's a huge improvement. If we end up seeing those errors later, we can reconsider the error propagation. BR, Jani.