From: Philip McGrath <philip@philipmcgrath.com>
To: "Ludovic Courtès" <ludo@gnu.org>, guix-sysadmin <guix-sysadmin@gnu.org>
Cc: guix-devel@gnu.org, morganlemmerwebber@gmail.com
Subject: Re: Git-LFS or Git Annex?
Date: Fri, 26 Jan 2024 23:31:24 -0500
Message-ID: <0452e9d6-7954-42e0-836f-83dcc9487b73@philipmcgrath.com>
In-Reply-To: <87mssuu57m.fsf@inria.fr>

Hi,

On 1/24/24 10:22, Ludovic Courtès wrote:
>
> The question boils down to: Git-LFS or Git Annex?
>
> [...]
>
> What’s your experience? What would you suggest?
>
I have a few times had a problem for which I thought Git LFS might be a
solution, and each time I have ended up ripping out Git LFS in
frustration before long.

I have not used Git Annex. I have looked into it a few times, but each
time I decided it was too complex or not quite suitable for my use-case
in some way. On the other hand, I have heard good things about it from
people who have used it: in particular, I believe Morgan Lemmer-Webber
(CC'ed) used it to manage a large set of art history images.

The main thing in this context that still isn't clear to me from my
reading so far is how sharing lists of remotes works with Git Annex. In
plain Git, remotes are part of the local state of a particular clone,
not distributed as part of the repository. For the objectives here,
though, a lot of the benefit would seem to be having many copies in
synchronized, possibly "special" remotes so that anyone trying to get
the videos would have plenty of ways to get them. I'm not sure to what
extent Git Annex does that out of the box.

I did see that Git Annex can use Git LFS as a "special remote".

There are also two other approaches I think would be worth at least
considering:

1. Just use Git

While the limitations of Git for storing large media files are well
known, I have found it to be good enough for several use-cases, and it
has the strong advantage of not requiring additional tools. My
impression is that a significant factor in people using Git LFS, in
particular, is the limit on repository size imposed by the popular
hosting providers. There are strategies within Git to avoid having to
download unwanted artifacts, including creating branches with unrelated
histories, shallow clones (e.g. --depth=1 --single-branch), partial
clones [1][2][3] (e.g. --filter=blob:none), and sparse checkouts [4][5],
with the latter two being fairly new features.

[1]: https://git-scm.com/docs/partial-clone
[2]: https://git-scm.com/docs/git-clone#Documentation/git-clone.txt---filterltfilter-specgt
[3]: https://git-scm.com/docs/git-rev-list#Documentation/git-rev-list.txt---filterltfilter-specgt
[4]: https://git-scm.com/docs/git-sparse-checkout
[5]: https://git-scm.com/docs/git-clone#Documentation/git-clone.txt---sparse
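
As a rough sketch of the shallow-clone, partial-clone, and sparse-checkout
strategies: the throwaway "upstream" repository, its talks/ layout, and the
clone directory names below are made up for illustration, and the two
uploadpack settings are only there so that a local file:// repository will
serve filtered clones. A real hosting setup may need different configuration.

```shell
set -eu
work=$(mktemp -d)

# Throwaway "upstream" repository (hypothetical layout) so the sketch runs offline.
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "Example"
git config user.email "example@example.org"
mkdir -p talks/2023 talks/2024
echo "stand-in for a large video" > talks/2023/a.webm
echo "stand-in for a large video" > talks/2024/b.webm
git add talks
git commit -q -m "Add talks"
echo "re-encoded" >> talks/2024/b.webm
git commit -q -am "Re-encode the 2024 talk"
# Assumption: the serving side must allow filters and on-demand object fetches.
git config uploadpack.allowFilter true
git config uploadpack.allowAnySHA1InWant true
cd "$work"
url="file://$work/upstream"

# Shallow clone: only the most recent commit of one branch.
git clone -q --depth=1 --single-branch "$url" shallow

# Partial clone: full history, but file contents are fetched only when needed.
git clone -q --filter=blob:none "$url" partial

# Sparse checkout on top of a partial clone: only talks/2024 is materialized.
git clone -q --filter=blob:none --sparse "$url" sparse
git -C sparse sparse-checkout set talks/2024
```

In the last clone, the working tree should contain talks/2024 but not
talks/2023, and the 2023 video's contents should never have been downloaded.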
2. Mirror URLs

Another approach would be just to make each video available at a few
URLs and have Guix origins with the list. If one of the available URLs
were the Internet Archive, it would have a high degree of assurance of
long-term preservation. I think the biggest downside is that this might
not help much with managing the collection of videos.

Philip