unofficial mirror of bug-guix@gnu.org 
* bug#37207: guix.gnu.org returns Last-Modified = Epoch
@ 2019-08-28  9:52 Ludovic Courtès
  2019-08-28 10:40 ` bug#37207: guix.gnu.org Last Modified at epoch Gábor Boskovits
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Ludovic Courtès @ 2019-08-28  9:52 UTC (permalink / raw)
  To: bug-Guix

Hello Guix,

Since the use of the ‘static-web-site’ service, which puts web site
files in the store, nginx returns a ‘Last-Modified’ header that can
trick clients into caching things forever:

--8<---------------cut here---------------start------------->8---
$ wget --debug -O /dev/null   https://guix.gnu.org/packages.json 2>&1 | grep Last
Last-Modified: Thu, 01 Jan 1970 00:00:01 GMT
--8<---------------cut here---------------end--------------->8---

We should tell nginx not to emit ‘Last-Modified’, or to take the
state from the /srv/guix.gnu.org symlink, if possible.
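
To illustrate where the epoch date comes from: the header above is what an HTTP date looks like when the file's modification time is 1 second after the epoch. That files in the store carry such an mtime is an assumption here, consistent with the output shown. A minimal Python sketch (not nginx's own code):

```python
from email.utils import formatdate


def http_date(mtime: float) -> str:
    """Format a Unix timestamp as an HTTP date (always in GMT)."""
    return formatdate(mtime, usegmt=True)


# Assumed mtime of a file in the store: 1 second after the epoch.
print("Last-Modified:", http_date(1))
```

A client that remembers this date has no way to tell a new deployment from the old one, since the value never changes.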

Ludo’.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org Last Modified at epoch
  2019-08-28  9:52 bug#37207: guix.gnu.org returns Last-Modified = Epoch Ludovic Courtès
@ 2019-08-28 10:40 ` Gábor Boskovits
  2019-08-28 14:37   ` Tobias Geerinckx-Rice via Bug reports for GNU Guix
  2019-08-28 15:05   ` Danny Milosavljevic
  2020-03-26 23:06 ` bug#37207: nginx serving files from the store returns Last-Modified = Epoch Vincent Legoll
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 22+ messages in thread
From: Gábor Boskovits @ 2019-08-28 10:40 UTC (permalink / raw)
  To: 37207

[-- Attachment #1: Type: text/plain, Size: 262 bytes --]

Hello,

Suppressing the Last-Modified header is just an
add_header Last-Modified "";
away.

Getting the info from the symlink seems to be much trickier; I would do it with
either embedded Perl or embedded Lua. I am not sure if we should bother
with it, though. Wdyt?

[-- Attachment #2: Type: text/html, Size: 431 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org Last Modified at epoch
  2019-08-28 10:40 ` bug#37207: guix.gnu.org Last Modified at epoch Gábor Boskovits
@ 2019-08-28 14:37   ` Tobias Geerinckx-Rice via Bug reports for GNU Guix
  2019-08-28 19:42     ` Gábor Boskovits
  2019-09-26  8:39     ` Ludovic Courtès
  2019-08-28 15:05   ` Danny Milosavljevic
  1 sibling, 2 replies; 22+ messages in thread
From: Tobias Geerinckx-Rice via Bug reports for GNU Guix @ 2019-08-28 14:37 UTC (permalink / raw)
  To: 37207

[-- Attachment #1: Type: text/plain, Size: 767 bytes --]

Gábor, Ludo',

Gábor Boskovits wrote:
> Suppressing the last modified header is just an
> add_header Last-Modified "";
> away.

You'll also need:

# Don't honour client If-Modified-Since constraints.
if_modified_since off;
# Nginx's etags are hashes of file timestamp & file length.
etag off;

Turning these off will of course prevent all caching.  I don't 
know if that would add measurable load to guix.gnu.org (it would 
be more problematic if we used a CDN, but it might still make a 
difference).

Nix does something both interesting and icky — as always: patch[0] 
nginx to look up the realpath() instead, so clients can still 
cache using If-None-Match.
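
As a rough illustration of the realpath() idea (a sketch only, not the actual Nix patch): if the validator is derived from the fully resolved path rather than from the mtime, it changes whenever the symlink is switched to a different store item. The directory names and the `realpath_etag` helper below are hypothetical.

```python
import hashlib
import os
import tempfile


def realpath_etag(path: str) -> str:
    """Derive a validator from the resolved path instead of the mtime."""
    resolved = os.path.realpath(path)
    digest = hashlib.sha256(resolved.encode("utf-8")).hexdigest()[:16]
    return f'"{digest}"'


def demo() -> tuple:
    """Switch a symlink between two hypothetical store items."""
    with tempfile.TemporaryDirectory() as root:
        old = os.path.join(root, "aaaa-site")  # stand-ins for store items
        new = os.path.join(root, "bbbb-site")
        for directory in (old, new):
            os.mkdir(directory)
            with open(os.path.join(directory, "packages.json"), "w") as f:
                f.write("{}")
        link = os.path.join(root, "srv")
        os.symlink(old, link)
        before = realpath_etag(os.path.join(link, "packages.json"))
        os.remove(link)  # removes the symlink itself, not its target
        os.symlink(new, link)
        after = realpath_etag(os.path.join(link, "packages.json"))
        return before, after
```

Calling `demo()` returns two different validators even though both files have identical size and content, so a client using If-None-Match would revalidate after each deployment.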

Kind regards,

T G-R

[0]: https://github.com/NixOS/nixpkgs/pull/48337

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org Last Modified at epoch
  2019-08-28 10:40 ` bug#37207: guix.gnu.org Last Modified at epoch Gábor Boskovits
  2019-08-28 14:37   ` Tobias Geerinckx-Rice via Bug reports for GNU Guix
@ 2019-08-28 15:05   ` Danny Milosavljevic
  2019-08-28 18:59     ` Gábor Boskovits
  1 sibling, 1 reply; 22+ messages in thread
From: Danny Milosavljevic @ 2019-08-28 15:05 UTC (permalink / raw)
  To: Gábor Boskovits; +Cc: 37207

[-- Attachment #1: Type: text/plain, Size: 601 bytes --]

Hi Gabor,

On Wed, 28 Aug 2019 12:40:37 +0200
Gábor Boskovits <boskovits@gmail.com> wrote:

> Suppressing the last modified header is just an
> add_header Last-Modified "";
> away.
> 
> To get the info from the symlink seems to be much trickier, I would do it with
> either embedded Perl or embedded Lua. I am not sure if we should bother
> with it, though. Wdyt?

Since we already emit ETag, I don't think we need to bother with Last-Modified.

Why is the ETag so short, though?

>wget --debug -O /dev/null   https://guix.gnu.org/packages.json 2>&1 | grep -i etag
>ETag: "1-2f38b1"


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org Last Modified at epoch
  2019-08-28 15:05   ` Danny Milosavljevic
@ 2019-08-28 18:59     ` Gábor Boskovits
  0 siblings, 0 replies; 22+ messages in thread
From: Gábor Boskovits @ 2019-08-28 18:59 UTC (permalink / raw)
  To: Danny Milosavljevic; +Cc: 37207

[-- Attachment #1: Type: text/plain, Size: 1000 bytes --]

Hello Danny,

Danny Milosavljevic <dannym@scratchpost.org> wrote (on Wed, 28 Aug 2019,
17:05):

> Hi Gabor,
>
> On Wed, 28 Aug 2019 12:40:37 +0200
> Gábor Boskovits <boskovits@gmail.com> wrote:
>
> > Suppressing the last modified header is just an
> > add_header Last-Modified "";
> > away.
> >
> > To get the info from the symlink seems to be much trickier, I would do it with
> > either embedded Perl or embedded Lua. I am not sure if we should bother
> > with it, though. Wdyt?
>
> Since we already emit ETag, I don't think we need to bother with
> Last-Modified.
>
> Why is the ETag so short, though?
>
>
The ETag we emit is also bad. Nginx calculates it from the mtime and the
content length, so in our case it effectively reflects only the content length.
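
To make this concrete, here is a small Python sketch that assumes the ETag is the mtime and the content length, both in hexadecimal, joined by a dash. That format is an assumption, but it matches the value reported above: `1` would be the store mtime and `2f38b1` is 3094705 in hexadecimal.

```python
def nginx_style_etag(mtime: int, size: int) -> str:
    """Build an ETag as "<hex mtime>-<hex size>" (assumed format)."""
    return f'"{mtime:x}-{size:x}"'


# With every mtime fixed at 1, only the size distinguishes two versions.
print(nginx_style_etag(1, 3094705))
```

Two versions of a file that happen to have the same length would get the same ETag under this scheme, which is why it is unreliable for files in the store.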


> >wget --debug -O /dev/null   https://guix.gnu.org/packages.json 2>&1 | grep -i etag
> >ETag: "1-2f38b1"
>
>
Best regards,
g_bor

-- 
OpenPGP Key Fingerprint: 7988:3B9F:7D6A:4DBF:3719:0367:2506:A96C:CF63:0B21

[-- Attachment #2: Type: text/html, Size: 1985 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org Last Modified at epoch
  2019-08-28 14:37   ` Tobias Geerinckx-Rice via Bug reports for GNU Guix
@ 2019-08-28 19:42     ` Gábor Boskovits
  2019-08-28 20:32       ` Ludovic Courtès
  2019-09-26  8:39     ` Ludovic Courtès
  1 sibling, 1 reply; 22+ messages in thread
From: Gábor Boskovits @ 2019-08-28 19:42 UTC (permalink / raw)
  To: Tobias Geerinckx-Rice; +Cc: 37207

[-- Attachment #1: Type: text/plain, Size: 1556 bytes --]

Hello Tobias,

Tobias Geerinckx-Rice via Bug reports for GNU Guix <bug-guix@gnu.org>
wrote (on Wed, 28 Aug 2019, 16:38):

> Gábor, Ludo',
>
> Gábor Boskovits wrote:
> > Suppressing the last modified header is just an
> > add_header Last-Modified "";
> > away.
>
> You'll also need:
>
> # Don't honour client If-Modified-Since constraints.
> if_modified_since off;
> # Nginx's etags are hashes of file timestamp & file length.
> etag off;
>
>
You really have a point here.

Based on my research, I came up with the following:

we need
etag off;

we should create a file with the git last modification time of the files,
updated when there is a new commit in the repo => last-modified
we should create a file with some hash of the files, updated when there is
a new commit in the repo => etag
we could restrict these operations to the files modified since the last
checkout.

Retrieve these with embedded Perl.
Wdyt?


> Turning these off will of course prevent all caching.  I don't
> know if that would add measurable load to guix.gnu.org (it would
> be more problematic if we used a CDN, but it might still make a
> difference).
>
> Nix does something both interesting and icky — as always: patch[0]
> nginx to look up the realpath() instead, so clients can still
> cache using If-None-Match.
>
> Kind regards,
>
> T G-R
>
> [0]: https://github.com/NixOS/nixpkgs/pull/48337
>

Best regards,
g_bor

-- 
OpenPGP Key Fingerprint: 7988:3B9F:7D6A:4DBF:3719:0367:2506:A96C:CF63:0B21

[-- Attachment #2: Type: text/html, Size: 2568 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org Last Modified at epoch
  2019-08-28 19:42     ` Gábor Boskovits
@ 2019-08-28 20:32       ` Ludovic Courtès
  2019-08-29  6:11         ` Gábor Boskovits
  0 siblings, 1 reply; 22+ messages in thread
From: Ludovic Courtès @ 2019-08-28 20:32 UTC (permalink / raw)
  To: Gábor Boskovits; +Cc: 37207

Hello,

Gábor Boskovits <boskovits@gmail.com> writes:

> we should create a file with the git last modification time of the files,
> updated when there is a new commit in the repo => last-modified
> we should create a file with some hash of the files, updated when there is
> a new commit in the repo => etag
> we could restrict these operations to the files modified since the last
> checkout.
>
> Retrieve these with embedded Perl.
> Wdyt?

What would the config look like?  AFAICS our ‘nginx’ package doesn’t
embed Perl, and I think it’s better this way.  :-)  Can we do that with
pure nginx directives?

We create /srv/guix.gnu.org (as a symlink) with the correct mtime¹.  If
we can tell nginx to use it as the ‘Last-Modified’ date, that’s perfect.
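
For illustration, reading the symlink's own mtime (as opposed to that of its target in the store) is a matter of using lstat instead of stat. The sketch below is in Python, not nginx configuration; the paths it creates are temporary stand-ins and `last_modified_from_symlink` is a hypothetical helper.

```python
import os
import tempfile
from email.utils import formatdate


def last_modified_from_symlink(link: str) -> str:
    """Return an HTTP date from the symlink's own mtime (lstat, not stat)."""
    return formatdate(os.lstat(link).st_mtime, usegmt=True)


def demo() -> str:
    """Build a store-like target with mtime 1 and a symlink with a real date."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "store-item")
        os.mkdir(target)
        os.utime(target, (1, 1))  # the target looks like a store item
        link = os.path.join(root, "guix.gnu.org")
        os.symlink(target, link)
        # Give the symlink itself a meaningful deployment time.
        os.utime(link, (1566985920, 1566985920), follow_symlinks=False)
        return last_modified_from_symlink(link)
```

Here `demo()` yields the date set on the symlink, while `os.stat` on the same path would have followed the link and reported the epoch again.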

Ludo’.

¹ https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/berlin.scm#n212

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org Last Modified at epoch
  2019-08-28 20:32       ` Ludovic Courtès
@ 2019-08-29  6:11         ` Gábor Boskovits
  2019-08-29 12:40           ` Ludovic Courtès
  0 siblings, 1 reply; 22+ messages in thread
From: Gábor Boskovits @ 2019-08-29  6:11 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 37207

[-- Attachment #1: Type: text/plain, Size: 1619 bytes --]

Hello Ludo,

Ludovic Courtès <ludo@gnu.org> wrote (on Wed, 28 Aug 2019, 22:32):

> Hello,
>
> Gábor Boskovits <boskovits@gmail.com> writes:
>
> > we should create a file with the git last modification time of the files,
> > updated when there is a new commit in the repo => last-modified
> > we should create a file with some hash of the files, updated when there is
> > a new commit in the repo => etag
> > we could restrict these operations to the files modified since the last
> > checkout.
> >
> > Retrieve these with embedded Perl.
> > Wdyt?
>
> What would the config look like?  AFAICS our ‘nginx’ package doesn’t
> embed Perl, and I think it’s better this way.  :-)  Can we do that with
> pure nginx directives?
>
> We create /srv/guix.gnu.org (as a symlink) with the correct mtime¹.  If
> we can tell nginx to use it as the ‘Last-Modified’ date, that’s perfect.
>
>
I was thinking about this. Yes, we can solve that with pure nginx. There is
an issue, however: it invalidates all cached entries on update, so files that
were not modified will also need to be downloaded again.

The easiest way to do that would be to simply generate an nginx config
snippet at a configurable location, setting up the mtime and ETag
variables, and include that from the main config.

If this would be ok, then I will have a look at implementing this.
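
One possible reading of this proposal, sketched in Python: a deployment step renders a small file that the main nginx configuration would `include`. The `render_snippet` helper and the idea of emitting a single site-wide `add_header` line are assumptions for illustration; how nginx combines this with its own conditional-request handling would need to be checked.

```python
from email.utils import formatdate


def render_snippet(deploy_time: float) -> str:
    """Render a config fragment carrying one site-wide Last-Modified date."""
    date = formatdate(deploy_time, usegmt=True)
    return (
        "# Generated at deployment time; meant to be included from the\n"
        "# main configuration (hypothetical layout).\n"
        f'add_header Last-Modified "{date}";\n'
    )


print(render_snippet(1566985920), end="")
```

Because the date is the same for every file, each deployment changes it for the whole site, which is the cache invalidation cost mentioned above.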

> Ludo’.
>
> ¹ https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/berlin.scm#n212
>

Best regards,
g_bor

-- 
OpenPGP Key Fingerprint: 7988:3B9F:7D6A:4DBF:3719:0367:2506:A96C:CF63:0B21

[-- Attachment #2: Type: text/html, Size: 2623 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org Last Modified at epoch
  2019-08-29  6:11         ` Gábor Boskovits
@ 2019-08-29 12:40           ` Ludovic Courtès
  2019-09-05 20:47             ` Ludovic Courtès
  0 siblings, 1 reply; 22+ messages in thread
From: Ludovic Courtès @ 2019-08-29 12:40 UTC (permalink / raw)
  To: Gábor Boskovits; +Cc: 37207

Hi Gábor,

Gábor Boskovits <boskovits@gmail.com> writes:

> Ludovic Courtès <ludo@gnu.org> wrote (on Wed, 28 Aug 2019, 22:32):
>
>> Hello,
>>
>> Gábor Boskovits <boskovits@gmail.com> writes:
>>
>> > we should create a file with the git last modification time of the files,
>> > updated when there is a new commit in the repo => last-modified
>> > we should create a file with some hash of the files, updated when there is
>> > a new commit in the repo => etag
>> > we could restrict these operations to the files modified since the last
>> > checkout.
>> >
>> > Retrieve these with embedded Perl.
>> > Wdyt?
>>
>> What would the config look like?  AFAICS our ‘nginx’ package doesn’t
>> embed Perl, and I think it’s better this way.  :-)  Can we do that with
>> pure nginx directives?
>>
>> We create /srv/guix.gnu.org (as a symlink) with the correct mtime¹.  If
>> we can tell nginx to use it as the ‘Last-Modified’ date, that’s perfect.
>>
>>
> I was thinking about this. Yes, we can solve that with pure nginx. There is
> an issue however.
> It invalidates all cached entries on update, so files not modified will
> also need to be downloaded again.
>
> The easiest way to do that would be to simply generate an nginx config
> snippet at a configurable location,
> setting up the mtime and etags variable, and include that from the main
> config.
>
> If this would be ok, then I will have a look at implementing this.

I’m not sure I fully understand, but yes, if you could send a prototype
as a diff against maintenance.git, that’d be great!

Thank you,
Ludo’.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org Last Modified at epoch
  2019-08-29 12:40           ` Ludovic Courtès
@ 2019-09-05 20:47             ` Ludovic Courtès
  0 siblings, 0 replies; 22+ messages in thread
From: Ludovic Courtès @ 2019-09-05 20:47 UTC (permalink / raw)
  To: Gábor Boskovits; +Cc: 37207

Hello!

Did one of you have a chance to come up with a trick to emit the right
‘Last-Modified’?  We seemed to be close to having something.  :-)

Thanks,
Ludo’.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org Last Modified at epoch
  2019-08-28 14:37   ` Tobias Geerinckx-Rice via Bug reports for GNU Guix
  2019-08-28 19:42     ` Gábor Boskovits
@ 2019-09-26  8:39     ` Ludovic Courtès
  1 sibling, 0 replies; 22+ messages in thread
From: Ludovic Courtès @ 2019-09-26  8:39 UTC (permalink / raw)
  To: Tobias Geerinckx-Rice; +Cc: 37207

Hi Tobias,

Tobias Geerinckx-Rice <me@tobias.gr> writes:

> Turning these off will of course prevent all caching.  I don't know if
> that would add measurable load to guix.gnu.org (it would be more
> problematic if we used a CDN, but it might still make a difference).
>
> Nix does something both interesting and icky — as always: patch[0]
> nginx to look up the realpath() instead, so clients can still cache
> using If-None-Match.

> [0]: https://github.com/NixOS/nixpkgs/pull/48337

(See
<https://raw.githubusercontent.com/NixOS/nixpkgs/9bc23f31d29138f09db6af52708a9b8b64deec64/pkgs/servers/http/nginx/nix-etag-1.15.4.patch>.)

I had overlooked this patch but it looks like the right approach
overall.  Calling ‘realpath’ each time seems a bit expensive as it
creates an ‘lstat’ storm, but I can’t think of a better solution.

I also found this post whose main interest is in showing how to write a
plugin to generate custom etags:

  https://mikewest.org/2008/11/generating-etags-for-static-content-using-nginx/

Thoughts?

Ludo’.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: nginx serving files from the store returns Last-Modified = Epoch
  2019-08-28  9:52 bug#37207: guix.gnu.org returns Last-Modified = Epoch Ludovic Courtès
  2019-08-28 10:40 ` bug#37207: guix.gnu.org Last Modified at epoch Gábor Boskovits
@ 2020-03-26 23:06 ` Vincent Legoll
  2020-03-29  9:50   ` Gábor Boskovits
  2020-03-26 23:30 ` bug#37207: Repology Vincent Legoll
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Vincent Legoll @ 2020-03-26 23:06 UTC (permalink / raw)
  To: 37207

This bug prevents Repology [1] from showing
the latest versions of packages in Guix,
as it relies on 'Last-Modified' for
https://guix.gnu.org/packages.json
changing in a meaningful way...

[1] https://repology.org/

-- 
Vincent Legoll

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: Repology
  2019-08-28  9:52 bug#37207: guix.gnu.org returns Last-Modified = Epoch Ludovic Courtès
  2019-08-28 10:40 ` bug#37207: guix.gnu.org Last Modified at epoch Gábor Boskovits
  2020-03-26 23:06 ` bug#37207: nginx serving files from the store returns Last-Modified = Epoch Vincent Legoll
@ 2020-03-26 23:30 ` Vincent Legoll
  2020-05-09 22:07 ` bug#37207: guix.gnu.org returns Last-Modified = Epoch Christopher Baines
  2020-05-15 21:12 ` bug#37207: nginx serving files from the store returns Last-Modified = Epoch anadon via web
  4 siblings, 0 replies; 22+ messages in thread
From: Vincent Legoll @ 2020-03-26 23:30 UTC (permalink / raw)
  To: 37207

It also paints us as a fairly outdated distro,
despite our efforts to keep pace and
update to the latest versions of packages.

We may even get into the top ten, which
may give us a bit of attention and attract
some distrohoppers^Wusers.

-- 
Vincent Legoll

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: nginx serving files from the store returns Last-Modified = Epoch
  2020-03-26 23:06 ` bug#37207: nginx serving files from the store returns Last-Modified = Epoch Vincent Legoll
@ 2020-03-29  9:50   ` Gábor Boskovits
  2020-03-30 11:53     ` Vincent Legoll
  0 siblings, 1 reply; 22+ messages in thread
From: Gábor Boskovits @ 2020-03-29  9:50 UTC (permalink / raw)
  To: Vincent Legoll; +Cc: 37207

Hello Vincent,

Vincent Legoll <vincent.legoll@gmail.com> wrote (on Fri, 27 Mar 2020, 0:07):
>
> This bug prevents Repology [1] from showing
> the latest versions of packages in Guix,
> as it relies on 'Last-Modified' for:
> https://guix.gnu.org/packages.json
> changing in a meaningful way...
>

Does it also use etags, or just last-modified?

I ask this because we already have a bug similar to this, and it would
be interesting to know whether meaningful ETag generation would be
enough, or whether we still have to deal with Last-Modified.

> [1] https://repology.org/
>
> --
> Vincent Legoll
>
>
>

Best regards,
g_bor
-- 
OpenPGP Key Fingerprint: 7988:3B9F:7D6A:4DBF:3719:0367:2506:A96C:CF63:0B21

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: nginx serving files from the store returns Last-Modified = Epoch
  2020-03-29  9:50   ` Gábor Boskovits
@ 2020-03-30 11:53     ` Vincent Legoll
  0 siblings, 0 replies; 22+ messages in thread
From: Vincent Legoll @ 2020-03-30 11:53 UTC (permalink / raw)
  To: Gábor Boskovits; +Cc: 37207

Hello,

On Sun, Mar 29, 2020 at 11:50 AM Gábor Boskovits <boskovits@gmail.com> wrote:
> Does it also use etags, or just last-modified?

From the email exchange I had with the maintainer of the site,
I think it only uses last-modified.

> I ask this because we already have a bug similar to this, and it would
> be interesting to know if
> it would be enough to have a meaningful etags generation, or we
> still have to deal with last-modified.

Are ETags easier for us to handle?

-- 
Vincent Legoll

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org returns Last-Modified = Epoch
  2019-08-28  9:52 bug#37207: guix.gnu.org returns Last-Modified = Epoch Ludovic Courtès
                   ` (2 preceding siblings ...)
  2020-03-26 23:30 ` bug#37207: Repology Vincent Legoll
@ 2020-05-09 22:07 ` Christopher Baines
  2020-05-10 10:11   ` Ludovic Courtès
  2020-05-15 21:12 ` bug#37207: nginx serving files from the store returns Last-Modified = Epoch anadon via web
  4 siblings, 1 reply; 22+ messages in thread
From: Christopher Baines @ 2020-05-09 22:07 UTC (permalink / raw)
  To: 37207

[-- Attachment #1: Type: text/plain, Size: 1329 bytes --]


Ludovic Courtès <ludo@gnu.org> writes:

> Since the use of the ‘static-web-site’ service, which puts web site
> files in the store, nginx returns a ‘Last-Modified’ header that can
> trick clients into caching things forever:
>
> --8<---------------cut here---------------start------------->8---
> $ wget --debug -O /dev/null   https://guix.gnu.org/packages.json 2>&1 | grep Last
> Last-Modified: Thu, 01 Jan 1970 00:00:01 GMT
> --8<---------------cut here---------------end--------------->8---
>
> We should tell nginx not to emit ‘Last-Modified’, or to take the
> state from the /srv/guix.gnu.org symlink, if possible.

I ended up looking at this again in relation to Repology [1].

1: https://github.com/repology/repology-updater/issues/218#issuecomment-525905704

Going back to that comment, given that the Last-Modified header (and the
ETag) is wrong, it's probably sensible to remove them. That might even
fix the issue with Repology fetching the packages.json file.

Alternatively (or in addition), we could run a really simple Guile web
server that just serves the packages.json file with the right
Last-Modified value, and have nginx proxy requests to that server. This
would be pretty easy to set up, I believe, and would allow providing a
correct value.
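
To give an idea of how small such a server could be, here is a sketch using Python's standard library as a stand-in for the Guile web server suggested above. The paths, the port, and the choice of the symlink's mtime as the Last-Modified source are assumptions for illustration.

```python
import os
from email.utils import formatdate, parsedate_to_datetime
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical locations; adjust to the real deployment.
SITE_LINK = "/srv/guix.gnu.org"
JSON_FILE = os.path.join(SITE_LINK, "packages.json")


def not_modified(if_modified_since, mtime):
    """True when the client's copy is at least as new as `mtime`."""
    if not if_modified_since:
        return False
    try:
        since = parsedate_to_datetime(if_modified_since).timestamp()
    except (TypeError, ValueError):
        return False  # unparsable date: send the full response
    return int(mtime) <= int(since)


class PackagesHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/packages.json":
            self.send_error(404)
            return
        mtime = os.lstat(SITE_LINK).st_mtime  # the symlink's own mtime
        if not_modified(self.headers.get("If-Modified-Since"), mtime):
            self.send_response(304)
            self.end_headers()
            return
        with open(JSON_FILE, "rb") as f:
            body = f.read()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.send_header("Last-Modified", formatdate(mtime, usegmt=True))
        self.end_headers()
        self.wfile.write(body)


def main(port=8080):
    """Serve until interrupted; nginx would proxy /packages.json here."""
    HTTPServer(("127.0.0.1", port), PackagesHandler).serve_forever()
```

Nothing runs until `main()` is called. The `not_modified` helper is kept separate so the conditional-request logic can be checked without starting a server.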

Thoughts?

Chris

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 962 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org returns Last-Modified = Epoch
  2020-05-09 22:07 ` bug#37207: guix.gnu.org returns Last-Modified = Epoch Christopher Baines
@ 2020-05-10 10:11   ` Ludovic Courtès
  2020-05-11 10:32     ` Christopher Baines
  0 siblings, 1 reply; 22+ messages in thread
From: Ludovic Courtès @ 2020-05-10 10:11 UTC (permalink / raw)
  To: Christopher Baines; +Cc: 37207

Howdy!

Christopher Baines <mail@cbaines.net> writes:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Since the use of the ‘static-web-site’ service, which puts web site
>> files in the store, nginx returns a ‘Last-Modified’ header that can
>> trick clients into caching things forever:
>>
>> --8<---------------cut here---------------start------------->8---
>> $ wget --debug -O /dev/null   https://guix.gnu.org/packages.json 2>&1 | grep Last
>> Last-Modified: Thu, 01 Jan 1970 00:00:01 GMT
>> --8<---------------cut here---------------end--------------->8---
>>
>> We should tell nginx not to emit ‘Last-Modified’, or to take the
>> state from the /srv/guix.gnu.org symlink, if possible.
>
> I ended up looking at this again in relation to Repology [1].
>
> 1: https://github.com/repology/repology-updater/issues/218#issuecomment-525905704
>
> Going back to that comment, given that the Last-Modified header (and the
> ETag) is wrong, it's probably sensible to remove them. That might even
> fix the issue with Repology fetching the packages.json file.
>
> Alternatively (or in addition), we could run a really simple Guile web
> server that just serves the packages.json file with the right
> Last-Modified value, and have NGinx proxy requests to that server. This
> would be pretty easy to setup I believe, and would allow providing a
> correct value.
>
> Thoughts?

I think it wouldn’t really help because the Last-Modified issue is
pervasive.  It shows for instance when accessing the web site: one often
has to force the browser to reload pages to get the latest version.

So I’m all for one of the solutions that were proposed earlier.

WDYT?

Ludo’.




^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org returns Last-Modified = Epoch
  2020-05-10 10:11   ` Ludovic Courtès
@ 2020-05-11 10:32     ` Christopher Baines
  2020-05-11 12:47       ` Ludovic Courtès
  2020-05-25  8:20       ` bug#37207: [PATCH] nginx: berlin: Work around Last-Modified issues for guix.gnu.org Christopher Baines
  0 siblings, 2 replies; 22+ messages in thread
From: Christopher Baines @ 2020-05-11 10:32 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 37207

[-- Attachment #1: Type: text/plain, Size: 2918 bytes --]


Ludovic Courtès <ludo@gnu.org> writes:

> Howdy!
>
> Christopher Baines <mail@cbaines.net> writes:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>>
>>> Since the use of the ‘static-web-site’ service, which puts web site
>>> files in the store, nginx returns a ‘Last-Modified’ header that can
>>> trick clients into caching things forever:
>>>
>>> --8<---------------cut here---------------start------------->8---
>>> $ wget --debug -O /dev/null   https://guix.gnu.org/packages.json 2>&1 | grep Last
>>> Last-Modified: Thu, 01 Jan 1970 00:00:01 GMT
>>> --8<---------------cut here---------------end--------------->8---
>>>
>>> We should tell nginx not to emit ‘Last-Modified’, or to take the
>>> state from the /srv/guix.gnu.org symlink, if possible.
>>
>> I ended up looking at this again in relation to Repology [1].
>>
>> 1: https://github.com/repology/repology-updater/issues/218#issuecomment-525905704
>>
>> Going back to that comment, given that the Last-Modified header (and the
>> ETag) is wrong, it's probably sensible to remove them. That might even
>> fix the issue with Repology fetching the packages.json file.
>>
>> Alternatively (or in addition), we could run a really simple Guile web
>> server that just serves the packages.json file with the right
>> Last-Modified value, and have NGinx proxy requests to that server. This
>> would be pretty easy to setup I believe, and would allow providing a
>> correct value.
>>
>> Thoughts?
>
> I think it wouldn’t really help because the Last-Modified issue is
> pervasive.  It shows for instance when accessing the web site: one often
> has to force the browser to reload pages to get the latest version.
>
> So I’m all for one of the solutions that were proposed earlier.
>
> WDYT?

So I think removing the Last-Modified header from the responses will fix
the issue with the Repology fetcher: it will stop thinking it has already
fetched the file (since it was last modified in 1970) and will instead
always process the file.
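
The reasoning can be sketched with a hypothetical fetcher (this is not Repology's actual code, whose details are not shown in this thread) that skips processing when the Last-Modified value matches the one it remembers:

```python
def should_process(cached_last_modified, response_headers):
    """Decide whether a fetcher re-processes a downloaded file."""
    current = response_headers.get("Last-Modified")
    if current is None:
        return True  # no validator in the response: always process
    return current != cached_last_modified


epoch = "Thu, 01 Jan 1970 00:00:01 GMT"
print(should_process(epoch, {"Last-Modified": epoch}))  # header stuck at epoch
print(should_process(epoch, {}))                        # header removed
```

With the header stuck at the epoch the file is never processed again; once the header is gone, it is processed on every fetch, at the cost of losing the caching benefit.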

Removing the Last-Modified header, and maybe the ETag as well from
responses should avoid the issue with web browsers using a cached
version of the page when they probably shouldn't.

I realise what I described with using a Guile web server to serve the
packages.json file wouldn't help with other pages (unless they're served
as well, which is a possibility), but that was just an optimisation over
removing the header entirely, as having the Last-Modified header, with a
correct value would help the Repology fetcher cache the file.

Does that make sense? It still seems to me that a small change to the
NGinx config (I think these lines somewhere in the config would do it
[1]) would help with the Repology fetcher issue, and the issue you
describe with web browsers.

1:

add_header Last-Modified "";
if_modified_since off;
etag off;

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 962 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org returns Last-Modified = Epoch
  2020-05-11 10:32     ` Christopher Baines
@ 2020-05-11 12:47       ` Ludovic Courtès
  2020-05-25  8:24         ` Christopher Baines
  2020-05-25  8:20       ` bug#37207: [PATCH] nginx: berlin: Work around Last-Modified issues for guix.gnu.org Christopher Baines
  1 sibling, 1 reply; 22+ messages in thread
From: Ludovic Courtès @ 2020-05-11 12:47 UTC (permalink / raw)
  To: Christopher Baines; +Cc: 37207

Hi,

Christopher Baines <mail@cbaines.net> writes:

> So I think removing the Last-Modified header from the responses will fix
> the issue with the Repology fetcher (as it will stop thinking it's
> already fetched the file, since it was last modified in 1970), instead it
> will just always process the file.
>
> Removing the Last-Modified header, and maybe the ETag as well from
> responses should avoid the issue with web browsers using a cached
> version of the page when they probably shouldn't.

It would prevent client-side caching altogether.  So perhaps we can do
that as a stopgap (and it has the advantage of requiring only a tiny
config change).

Eventually, it’d be nice to have something better, like one of the Etag
patches discussed upthread.

Thanks,
Ludo’.




^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: nginx serving files from the store returns Last-Modified = Epoch
  2019-08-28  9:52 bug#37207: guix.gnu.org returns Last-Modified = Epoch Ludovic Courtès
                   ` (3 preceding siblings ...)
  2020-05-09 22:07 ` bug#37207: guix.gnu.org returns Last-Modified = Epoch Christopher Baines
@ 2020-05-15 21:12 ` anadon via web
  4 siblings, 0 replies; 22+ messages in thread
From: anadon via web @ 2020-05-15 21:12 UTC (permalink / raw)
  To: 37207

 Any movement on this?





^ permalink raw reply	[flat|nested] 22+ messages in thread

* bug#37207: [PATCH] nginx: berlin: Work around Last-Modified issues for guix.gnu.org.
  2020-05-11 10:32     ` Christopher Baines
  2020-05-11 12:47       ` Ludovic Courtès
@ 2020-05-25  8:20       ` Christopher Baines
  1 sibling, 0 replies; 22+ messages in thread
From: Christopher Baines @ 2020-05-25  8:20 UTC (permalink / raw)
  To: 37207

* hydra/nginx/berlin.scm (%berlin-servers): Add some config to the
nginx-server-configurations for guix.gnu.org.
---
 hydra/nginx/berlin.scm | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/hydra/nginx/berlin.scm b/hydra/nginx/berlin.scm
index 303fd35..8c90eb1 100644
--- a/hydra/nginx/berlin.scm
+++ b/hydra/nginx/berlin.scm
@@ -514,6 +514,13 @@ PUBLISH-URL."
     (locations guix.gnu.org-locations)
     (raw-content
      (list
+      ;; TODO This works around NGinx using the epoch for the
+      ;; Last-Modified date, as well as the etag.
+      ;; See http://issues.guix.info/issue/37207
+      "add_header Last-Modified \"\";"
+      "if_modified_since off;"
+      "etag off;"
+
       "access_log /var/log/nginx/guix-info.access.log;")))
 
    (nginx-server-configuration
@@ -634,6 +641,13 @@ PUBLISH-URL."
      (append
       %tls-settings
       (list
+       ;; TODO This works around NGinx using the epoch for the
+       ;; Last-Modified date, as well as the etag.
+       ;; See http://issues.guix.info/issue/37207
+       "add_header Last-Modified \"\";"
+       "if_modified_since off;"
+       "etag off;"
+
        "access_log /var/log/nginx/guix-gnu-org.https.access.log;"))))
 
    (nginx-server-configuration
-- 
2.26.2





^ permalink raw reply related	[flat|nested] 22+ messages in thread

* bug#37207: guix.gnu.org returns Last-Modified = Epoch
  2020-05-11 12:47       ` Ludovic Courtès
@ 2020-05-25  8:24         ` Christopher Baines
  0 siblings, 0 replies; 22+ messages in thread
From: Christopher Baines @ 2020-05-25  8:24 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 37207

[-- Attachment #1: Type: text/plain, Size: 805 bytes --]


Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Christopher Baines <mail@cbaines.net> writes:
>
>> So I think removing the Last-Modified header from the responses will fix
>> the issue with the Repology fetcher (as it will stop thinking it's
>> already fetched the file, since it was last modified in 1970), instead it
>> will just always process the file.
>>
>> Removing the Last-Modified header, and maybe the ETag as well from
>> responses should avoid the issue with web browsers using a cached
>> version of the page when they probably shouldn't.
>
> It would prevent client-side caching altogether.  So perhaps we can do
> that as a stopgap (and it has the advantage of requiring only a tiny
> config change).

Great, I've finally got around to sending a patch for this now.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 962 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2020-05-25  8:26 UTC | newest]

Thread overview: 22+ messages
2019-08-28  9:52 bug#37207: guix.gnu.org returns Last-Modified = Epoch Ludovic Courtès
2019-08-28 10:40 ` bug#37207: guix.gnu.org Last Modified at epoch Gábor Boskovits
2019-08-28 14:37   ` Tobias Geerinckx-Rice via Bug reports for GNU Guix
2019-08-28 19:42     ` Gábor Boskovits
2019-08-28 20:32       ` Ludovic Courtès
2019-08-29  6:11         ` Gábor Boskovits
2019-08-29 12:40           ` Ludovic Courtès
2019-09-05 20:47             ` Ludovic Courtès
2019-09-26  8:39     ` Ludovic Courtès
2019-08-28 15:05   ` Danny Milosavljevic
2019-08-28 18:59     ` Gábor Boskovits
2020-03-26 23:06 ` bug#37207: nginx serving files from the store returns Last-Modified = Epoch Vincent Legoll
2020-03-29  9:50   ` Gábor Boskovits
2020-03-30 11:53     ` Vincent Legoll
2020-03-26 23:30 ` bug#37207: Repology Vincent Legoll
2020-05-09 22:07 ` bug#37207: guix.gnu.org returns Last-Modified = Epoch Christopher Baines
2020-05-10 10:11   ` Ludovic Courtès
2020-05-11 10:32     ` Christopher Baines
2020-05-11 12:47       ` Ludovic Courtès
2020-05-25  8:24         ` Christopher Baines
2020-05-25  8:20       ` bug#37207: [PATCH] nginx: berlin: Work around Last-Modified issues for guix.gnu.org Christopher Baines
2020-05-15 21:12 ` bug#37207: nginx serving files from the store returns Last-Modified = Epoch anadon via web
