From: Texas Cyberthal
Date: Mon, 23 Nov 2020 17:50:13 +0800
Subject: Re: One vs many directories
To: "Dr. Arne Babenhauserheide"
Cc: "emacs-orgmode@gnu.org"
List-Id: "General discussions about Org-mode."

Hi Dr. Arne,

> The only part that hits performance limits is the agenda.

Well, IIRC your Org Textmind is much smaller than mine.

> My current guess is that the agenda is slow because it has to parse
> all my 7500 clock entries, and it has to check the TODO states of
> around 1200 headings.

Ouch. I'd rather keep a "ramble log" so I can reconstruct an exactly
honest time accounting, with discounts for partial attention, without
worrying about fiddly clock-ins and clock-outs, at least when working
from home. Clocking in at a work site is different, because one can
reasonably bill for the entire time, with minimal clock toggling.

> Did you check against filesystem limits?
> At 10k entries in a directory typical filesystems start becoming slow.
> That’s the main reason I see for adding hierarchies.

10k entries in a directory sounds inhumanly unergonomic. I guess my
biggest flat name directory might eventually reach that size? In that
case I could just split it in the middle of the alphabet, or use a
similar solution. Right now it's only 600. If I guess a generous
growth rate of 2 per day, times 30 years, that would be an additional
22k. Sounds manageable.

Remember there are ways to consolidate entries even in a flat "solid
names" directory. It's advantageous to do so to facilitate isearch
matching. For example, everyone with the same last name is one
directory. Ditto everything that starts with the same word or even
prefix. For example I have a directory called ~Wiki-~ and another
called ~Tru-~ which contains truth, Trudeau and Trump.

Most adults know 20-35k words. That's not the same as "solid names"
known, but gives a ballpark on human memory size for a similar name
type.

I suspect computers will advance faster than anyone's Textmind reaches
the Dired lag limit. No, if we are talking about scaling limits, then
limits such as buffer size and Agenda search speed are orders of
magnitude more relevant. Those are problems that deep tree nesting
fixes.

A 10k entry directory is getting into enterprise territory, and I'm
sure enterprise has tech tricks that become worthwhile at that scale.

> There are scaling problems in every direction: Too many files per
> directory, too large files, too much content per heading, too many
> headings.

There are scaling problems from too much deep tree nesting, namely too
much fiddly ambiguous manual refiling. The solution is flat "solid
name" directories just below the feasible 10 Bins. They work fine.

> I would have to build lots of additional tooling to make that work as
> well.
> Many of the tools in Emacs work better on large files than on many
> files; I will switch to more files when performance on large files
> reaches its limits.

Nah, my 100 MB (non-archived) Textmind works fine. I just separated
Agenda metadata from bulk prose.

I am curious how many headings I have. How would I count that
recursively?

On Sun, Nov 22, 2020 at 8:04 PM Dr. Arne Babenhauserheide wrote:
>
>
> Texas Cyberthal writes:
>
> >> I need instant search in the knowledge database and quick filing of
> >> tasks. Also I need the agenda to create a clocktable (that’s on the
> >> limit of being too slow) and the calendar and tasks of the week.
> >
> >> Also I need quick filing of notes and quotes (in specific files, not
> >> part of the agenda) and of long-form articles, one file per article
> >> (using journal.el, also outside the agenda, searched using M-x deft),
> >> and quick creation of website entries for a given category within
> >> the site (i.e. M-x draketo-software).
> >
> > So your Org usage style quickly hits critical performance problems at
> > scale.
>
> The only part that hits performance limits is the agenda. All the rest
> scales nicely. My current guess is that the agenda is slow because it
> has to parse all my 7500 clock entries, and it has to check the TODO
> states of around 1200 headings. Having multiple files would only add to
> that.
>
> > I don't have these problems. Treefactor refiling is immune to scale.
>
> Did you check against filesystem limits? At 10k entries in a directory
> typical filesystems start becoming slow. That’s the main reason I see
> for adding hierarchies.
>
> > Org's many tools and tricks are still handy in niche cases, but they
> > don't cause scaling problems because they don't handle bulk info
> > management. For example Org's refile tools are useful when writing
> > advanced documentation with large single-file outlines. Most info
> > doesn't require that much organization.
> > It works fine as flat lists of headings in a detailed directory tree.
>
> Or as sub-headings in a large outline.
>
> There are scaling problems in every direction: Too many files per
> directory, too large files, too much content per heading, too many
> headings.
>
> I would have to build lots of additional tooling to make that work as
> well. Many of the tools in Emacs work better on large files than on many
> files; I will switch to more files when performance on large files
> reaches its limits.
>
> I have one file where I’m reaching the limit. That’s my 7.3 MiB
> emacs-remember-mode.org file where I throw long-form articles for
> full-text search. I am considering switching to a multi-file approach
> for that and then using deft to retrieve articles.
>
> Best wishes,
> Arne
> --
> Being unpolitical
> means being political
> without noticing it
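
On the question above of counting headings recursively, here is a minimal sketch in Python rather than Emacs Lisp. It assumes the Textmind is a directory tree of files ending in .org, and that a heading is any line starting at column 0 with one or more asterisks followed by a space. The function name count_org_headings is hypothetical, not part of Org or Treefactor, and archived or non-.org files are simply skipped.

```python
import os
import re

# Assumption: an Org heading is a line that starts at column 0 with
# one or more asterisks followed by a space.
HEADING_RE = re.compile(r"^\*+ ")


def count_org_headings(root):
    """Return (file_count, heading_count) for all .org files under root.

    Hypothetical helper for illustration; walks the tree recursively.
    """
    files = 0
    headings = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".org"):
                continue  # skip everything that is not an Org file
            files += 1
            path = os.path.join(dirpath, name)
            # errors="replace" keeps odd encodings from aborting the count
            with open(path, encoding="utf-8", errors="replace") as fh:
                headings += sum(1 for line in fh if HEADING_RE.match(line))
    return files, headings
```

Calling count_org_headings on the top-level Textmind directory returns both the number of .org files and the total number of headings, which gives a rough sense of how the 1200-heading agenda case compares with a many-file layout.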