unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Mysterious gzipped images
@ 2013-08-06 22:20 Lars Magne Ingebrigtsen
  2013-08-06 22:30 ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-06 22:20 UTC (permalink / raw)
  To: emacs-devel

I was looking at duckduckgo in eww, and I noticed that none of the
images displayed correctly in the search results.  And then I noticed
the following error message:

ImageMagick error: no decode delegate for this image format `' @ error/constitute.c/ReadImage/544 [3 times]

So.  Downloading one of the images:

https://icons.duckduckgo.com/i/a-z-animals.com.ico

[larsi@stories /tmp]$ curl -O https://icons.duckduckgo.com/i/a-z-animals.com.ico
[larsi@stories /tmp]$ file a-z-animals.com.ico 
a-z-animals.com.ico: gzip compressed data, max compression
[larsi@stories /tmp]$ mv a-z-animals.com.ico a-z-animals.com.ico.gz
[larsi@stories /tmp]$ gunzip a-z-animals.com.ico.gz 
[larsi@stories /tmp]$ file a-z-animals.com.ico   
a-z-animals.com.ico: MS Windows icon resource - 1 icon

So it's a gzipped image!  After unzipping Emacs understands it
perfectly.

Firefox does too, so I think Emacs should understand this weird method
of encoding images.  But how?  Should `create-image' see whether it
looks like a gzipped file before passing it on?  No, that's too grody.
Totally.

Hm...  perhaps this is something that url.el should handle?  The HTTP
headers from the server says:

Content-Type: image/x-icon
Content-Encoding: gzip

So...  uhm...  should url undo the Content-Encoding before returning the
buffer?

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html




^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-06 22:20 Mysterious gzipped images Lars Magne Ingebrigtsen
@ 2013-08-06 22:30 ` Lars Magne Ingebrigtsen
  2013-08-06 23:05   ` Andreas Schwab
  0 siblings, 1 reply; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-06 22:30 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Content-Type: image/x-icon
> Content-Encoding: gzip
>
> So...  uhm...  should url undo the Content-Encoding before returning the
> buffer?

I guess that would be the appropriate thing to do.  And I guess what
DuckDuckGo is, strictly speaking, wrong, since url.el doesn't say that
it accepts gzip in the Accept-Encoding header.

But I think url.el should do that, because it'd make downloading stuff a
lot faster.  And, of course, unzip stuff on receipt.

Emacs doesn't happen to have gunzip built in, by any chance?  >"?

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-06 22:30 ` Lars Magne Ingebrigtsen
@ 2013-08-06 23:05   ` Andreas Schwab
  2013-08-06 23:38     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 46+ messages in thread
From: Andreas Schwab @ 2013-08-06 23:05 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Emacs doesn't happen to have gunzip built in, by any chance?  >"?

Indirectly.  (libpng links against libz.)

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-06 23:05   ` Andreas Schwab
@ 2013-08-06 23:38     ` Lars Magne Ingebrigtsen
  2013-08-07  1:08       ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-06 23:38 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: emacs-devel

Andreas Schwab <schwab@linux-m68k.org> writes:

>> Emacs doesn't happen to have gunzip built in, by any chance?  >"?
>
> Indirectly.  (libpng links against libz.)

I see.  So we'd have get it "for free" in normal builds.  But we'd have
to add autoconf checks for it anyway, I guess?

I think it would be useful to have built-in libz support in Emacs.  When
fetching lots of small image files (which is quite common), I think that
calling inflate() would be a lot faster than calling the gunzip command
over pipes.  Probably.  Should I go ahead and implement this?

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-06 23:38     ` Lars Magne Ingebrigtsen
@ 2013-08-07  1:08       ` Lars Magne Ingebrigtsen
  2013-08-07  8:51         ` Julien Danjou
                           ` (2 more replies)
  0 siblings, 3 replies; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-07  1:08 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> I think it would be useful to have built-in libz support in Emacs.  When
> fetching lots of small image files (which is quite common), I think that
> calling inflate() would be a lot faster than calling the gunzip command
> over pipes.  Probably.  Should I go ahead and implement this?

Erm, I went ahead and implemented it.  Should I just check in?  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-07  1:08       ` Lars Magne Ingebrigtsen
@ 2013-08-07  8:51         ` Julien Danjou
  2013-08-07 14:35         ` Stefan Monnier
  2013-08-07 18:25         ` Rüdiger Sonderfeld
  2 siblings, 0 replies; 46+ messages in thread
From: Julien Danjou @ 2013-08-07  8:51 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 589 bytes --]

On Wed, Aug 07 2013, Lars Magne Ingebrigtsen wrote:

> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>
>> I think it would be useful to have built-in libz support in Emacs.  When
>> fetching lots of small image files (which is quite common), I think that
>> calling inflate() would be a lot faster than calling the gunzip command
>> over pipes.  Probably.  Should I go ahead and implement this?
>
> Erm, I went ahead and implemented it.  Should I just check in?  :-)

+1

-- 
Julien Danjou
/* Free Software hacker * freelance consultant
   http://julien.danjou.info */

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 835 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-07  1:08       ` Lars Magne Ingebrigtsen
  2013-08-07  8:51         ` Julien Danjou
@ 2013-08-07 14:35         ` Stefan Monnier
  2013-08-07 16:47           ` chad
  2013-08-08 10:35           ` Lars Magne Ingebrigtsen
  2013-08-07 18:25         ` Rüdiger Sonderfeld
  2 siblings, 2 replies; 46+ messages in thread
From: Stefan Monnier @ 2013-08-07 14:35 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

> [larsi@stories /tmp]$ curl -O https://icons.duckduckgo.com/i/a-z-animals.com.ico
> [larsi@stories /tmp]$ file a-z-animals.com.ico 
> a-z-animals.com.ico: gzip compressed data, max compression
[...]
> So it's a gzipped image!  After unzipping Emacs understands it
> perfectly.

So curl doesn't implement Accept/Content-Encoding either?  That's weird!

>> I think it would be useful to have built-in libz support in Emacs.  When
>> fetching lots of small image files (which is quite common), I think that
>> calling inflate() would be a lot faster than calling the gunzip command
>> over pipes.  Probably.  Should I go ahead and implement this?
> Erm, I went ahead and implemented it.  Should I just check in?  :-)

Sounds good, tho I'd prefer to see the patch first.


        Stefan "Wondering if that could be used to decompress the index
                at the end of PDF files as well"


PS: Here's my previous hack for URL.


=== modified file 'lisp/url/url-http.el'
--- lisp/url/url-http.el	2013-07-22 04:06:21 +0000
+++ lisp/url/url-http.el	2013-07-22 15:45:14 +0000
@@ -313,10 +313,8 @@
              (if url-personal-mail-address
                  (concat
                   "From: " url-personal-mail-address "\r\n"))
-             ;; Encodings we understand
-             (if url-mime-encoding-string
-                 (concat
-                  "Accept-encoding: " url-mime-encoding-string "\r\n"))
+             ;; Encodings we understand.  FIXME: Only use gzip if installed.
+             "Accept-encoding: gzip\r\n"
              (if url-mime-charset-string
                  (concat
                   "Accept-charset: " url-mime-charset-string "\r\n"))
@@ -544,7 +542,16 @@
 	  ;; mark it as successful.
 	  (widen)
 	  (if (and url-automatic-caching (equal url-http-method "GET"))
-	      (url-store-in-cache buffer))))
+	      (url-store-in-cache buffer))
+          ;; Decompress, if needed.  Do it after storing the result in
+          ;; the cache, so the cache keeps the compressed data.
+          (when (mail-fetch-field "Content-Encoding")
+            ;; The only encoding we support.
+            (cl-assert (equal "gzip" (mail-fetch-field "Content-Encoding")))
+            (save-excursion
+              (goto-char (point-min))
+              (re-search-forward "^\n")
+              (call-process-region (point) (point-max) "gzip" t t nil "-d")))))
        (setq success t))
       (3				; Redirection
        ;; 300 Multiple choices




^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-07 14:35         ` Stefan Monnier
@ 2013-08-07 16:47           ` chad
  2013-08-08 10:35           ` Lars Magne Ingebrigtsen
  1 sibling, 0 replies; 46+ messages in thread
From: chad @ 2013-08-07 16:47 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Lars Magne Ingebrigtsen, emacs-devel


On 07 Aug 2013, at 07:35, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

>> [larsi@stories /tmp]$ curl -O https://icons.duckduckgo.com/i/a-z-animals.com.ico
>> [larsi@stories /tmp]$ file a-z-animals.com.ico 
>> a-z-animals.com.ico: gzip compressed data, max compression
> [...]
>> So it's a gzipped image!  After unzipping Emacs understands it
>> perfectly.
> 
> So curl doesn't implement Accept/Content-Encoding either?  That's weird!

I suspect that curl isn't decoding the object because it didn't
send an Accept-Encoding with the request. If you ask it to do so
(add --compress to the command line), then it decodes the object.
I'm not sure where Lars got that URL in the first place, but my
guess is that it's gzipped without an indicator due to a quirk in
how favicons are implemented.

Hope that helps,
~Chad




^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-07  1:08       ` Lars Magne Ingebrigtsen
  2013-08-07  8:51         ` Julien Danjou
  2013-08-07 14:35         ` Stefan Monnier
@ 2013-08-07 18:25         ` Rüdiger Sonderfeld
  2013-08-07 21:06           ` FFI (was: Mysterious gzipped images) Stefan Monnier
  2 siblings, 1 reply; 46+ messages in thread
From: Rüdiger Sonderfeld @ 2013-08-07 18:25 UTC (permalink / raw)
  To: emacs-devel; +Cc: Lars Magne Ingebrigtsen

On Wednesday 07 August 2013 03:08:45 Lars Magne Ingebrigtsen wrote:
> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> > I think it would be useful to have built-in libz support in Emacs.  When
> > fetching lots of small image files (which is quite common), I think that
> > calling inflate() would be a lot faster than calling the gunzip command
> > over pipes.  Probably.  Should I go ahead and implement this?
> 
> Erm, I went ahead and implemented it.  Should I just check in?  :-)

+1

This would probably allow weechat.el to add compression support as well!

https://github.com/the-kenny/weechat.el

Regards,
Rüdiger




^ permalink raw reply	[flat|nested] 46+ messages in thread

* FFI (was: Mysterious gzipped images)
  2013-08-07 18:25         ` Rüdiger Sonderfeld
@ 2013-08-07 21:06           ` Stefan Monnier
  2013-08-08  0:50             ` Stephen J. Turnbull
  0 siblings, 1 reply; 46+ messages in thread
From: Stefan Monnier @ 2013-08-07 21:06 UTC (permalink / raw)
  To: Rüdiger Sonderfeld; +Cc: Lars Magne Ingebrigtsen, emacs-devel

>> Erm, I went ahead and implemented it.  Should I just check in?  :-)

Of course, the better way would be to add FFI support first.


        Stefan



^ permalink raw reply	[flat|nested] 46+ messages in thread

* FFI (was: Mysterious gzipped images)
  2013-08-07 21:06           ` FFI (was: Mysterious gzipped images) Stefan Monnier
@ 2013-08-08  0:50             ` Stephen J. Turnbull
  0 siblings, 0 replies; 46+ messages in thread
From: Stephen J. Turnbull @ 2013-08-08  0:50 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Rüdiger Sonderfeld, Lars Magne Ingebrigtsen, emacs-devel

Stefan Monnier writes:

 > Of course, the better way would be to add FFI support first.

Not really.  FFI guarantees that you can crash Emacs (if not the whole
system) from Lisp.

FFI has its advantages, of course, but it's hard to see how it can be
considered "better" except in very specific use cases.  "Useful
option" is about as far as I would be willing to go.




^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-07 14:35         ` Stefan Monnier
  2013-08-07 16:47           ` chad
@ 2013-08-08 10:35           ` Lars Magne Ingebrigtsen
  2013-08-08 10:57             ` Andreas Schwab
                               ` (3 more replies)
  1 sibling, 4 replies; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-08 10:35 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 678 bytes --]

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> So curl doesn't implement Accept/Content-Encoding either?  That's weird!

It does, but you have to say "curl --compressed".  That makes curl both
say that it accept gzipped content and decompresses it if it gets it.

Although it's "wrong" for the server to output compressed data if the
client doesn't say that it accepts it, I think it's pretty buggy not to
decode the compressed data, anyway.

> Sounds good, tho I'd prefer to see the patch first.

I've included the new .c file below.  The attendant code change to
lisp.h/emacs.c is what you'd expect.

Now I just have to write the configure.ac code to define HAVE_ZLIB.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: decompress.c --]
[-- Type: text/x-csrc, Size: 3198 bytes --]

/* Interface to zlib.
   Copyright (C) 2013 Free Software Foundation, Inc.

This file is part of GNU Emacs.

GNU Emacs is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

GNU Emacs is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.  */

#include <config.h>

#ifdef HAVE_PNG

#include <zlib.h>

#include "lisp.h"
#include "character.h"
#include "buffer.h"

\f
#define BUFFER_SIZE 16384

DEFUN ("decompress-gzipped-region", Fdecompress_gzipped_region,
       Sdecompress_gzipped_region,
       2, 2, 0,
       doc: /* Decompress a gzip-compressed region.
The text in the region will be replaced by the decompressed data.
On failure, nil is returned and the data is left in place.
This function can only be called in unibyte buffers.*/)
  (Lisp_Object start, Lisp_Object end)
{
  ptrdiff_t istart, iend, point = PT;
  z_stream stream;
  int decompressed;
  unsigned char *out;
  
  validate_region (&start, &end);
  move_gap_both (iend, iend);
  
  if (! NILP (BVAR (current_buffer, enable_multibyte_characters)))
    error ("This function can only be called in unibyte buffers");

  /* This is a unibyte buffer, so character positions and bytes are
     the same. */
  istart = XINT (start);  
  iend = XINT (end);

  stream.zalloc = Z_NULL;
  stream.zfree = Z_NULL;
  stream.opaque = Z_NULL;
  stream.avail_in = 0;
  stream.next_in = Z_NULL;

  /* This magic number apparently means "this is gzip". */
  if (inflateInit2 (&stream, 16 + MAX_WBITS) != Z_OK)
    return Qnil;

  out = (char *) malloc (BUFFER_SIZE);

  /* We're inserting the decompressed data at the end of the
     compressed data. */
  SET_PT (iend);

  stream.avail_in = iend - istart;
  stream.next_in = (char *) BYTE_POS_ADDR (istart);

  /* Run inflate() on input until the output buffer isn't full. */
  do {
    stream.avail_out = BUFFER_SIZE;
    stream.next_out = out;
    switch (inflate (&stream, Z_NO_FLUSH)) {
    case Z_STREAM_ERROR:
    case Z_NEED_DICT:
    case Z_DATA_ERROR:
    case Z_MEM_ERROR:
      inflateEnd (&stream);
      /* Delete any uncompressed data already inserted and restore
	 point. */
      del_range (iend, PT);
      SET_PT (point);
      free (out);
      return Qnil;
    }

    decompressed = BUFFER_SIZE - stream.avail_out;
    insert_1_both (out, decompressed, decompressed, 0, 0, 0);
  } while (stream.avail_out == 0);

  inflateEnd (&stream);
  free (out);

  /* Delete the compressed data. */
  del_range (istart, iend);

  return Qt;
}

\f
/***********************************************************************
			    Initialization
 ***********************************************************************/
void
syms_of_decompress (void)
{
  defsubr (&Sdecompress_gzipped_region);
}

#endif /* HAVE_PNG */

[-- Attachment #3: Type: text/plain, Size: 191 bytes --]


-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 10:35           ` Lars Magne Ingebrigtsen
@ 2013-08-08 10:57             ` Andreas Schwab
  2013-08-11 19:44               ` Lars Magne Ingebrigtsen
  2013-08-08 13:22             ` Paul Eggert
                               ` (2 subsequent siblings)
  3 siblings, 1 reply; 46+ messages in thread
From: Andreas Schwab @ 2013-08-08 10:57 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: Stefan Monnier, emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

>   out = (char *) malloc (BUFFER_SIZE);

Please use xmalloc.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 10:35           ` Lars Magne Ingebrigtsen
  2013-08-08 10:57             ` Andreas Schwab
@ 2013-08-08 13:22             ` Paul Eggert
  2013-08-08 14:06               ` Lars Magne Ingebrigtsen
  2013-08-08 13:24             ` Romain Francoise
  2013-08-08 14:47             ` Stefan Monnier
  3 siblings, 1 reply; 46+ messages in thread
From: Paul Eggert @ 2013-08-08 13:22 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

On 08/08/2013 03:35 AM, Lars Magne Ingebrigtsen wrote:

>   out = (char *) malloc (BUFFER_SIZE);

BUFFER_SIZE is so small that you should just put the buffer on the
stack "char out[BUFFER_SIZE];".  That way, you don't need to worry
about freeing it later (e.g., on error).

Or better yet, why not use the gap as the output buffer?
That should avoid some unnecessary copying.

>   stream.avail_in = iend - istart;

On 64-bit platforms, this won't work on buffers larger than 4 GiB.
I suggest 'stream.avail_in = min (iend - istart, UINT_MAX);'
and then put the whole thing inside a loop that repeats
until the input buffer is exhausted.

Each time through the inner loop, it should QUIT so that the user can
interrupt.  You need a record_unwind_protect around the whole thing;
the unwind-protect should be the code that deletes any uncompressed
data already inserted and restores point.

>     case Z_STREAM_ERROR:
>     case Z_NEED_DICT:
>     case Z_DATA_ERROR:
>     case Z_MEM_ERROR:

Shouldn't an error be signaled for this sort of thing?
That would fit in with the unwind-protect.





^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 10:35           ` Lars Magne Ingebrigtsen
  2013-08-08 10:57             ` Andreas Schwab
  2013-08-08 13:22             ` Paul Eggert
@ 2013-08-08 13:24             ` Romain Francoise
  2013-08-08 13:28               ` Lars Magne Ingebrigtsen
  2013-08-08 14:47             ` Stefan Monnier
  3 siblings, 1 reply; 46+ messages in thread
From: Romain Francoise @ 2013-08-08 13:24 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: Stefan Monnier, emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> DEFUN ("decompress-gzipped-region", Fdecompress_gzipped_region,
>        Sdecompress_gzipped_region,

Wouldn't it be better to have a single `decompress-region' function to
which you specify the decompression algorithm? It's almost certain that
if we add gzip, down the road someone will request bzip2, or xz, or...

Then you can add a companion `compress-region' function which also takes
the compression level, etc.



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 13:24             ` Romain Francoise
@ 2013-08-08 13:28               ` Lars Magne Ingebrigtsen
  2013-08-08 13:36                 ` Christopher Schmidt
  2013-08-08 15:43                 ` Romain Francoise
  0 siblings, 2 replies; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-08 13:28 UTC (permalink / raw)
  To: Romain Francoise; +Cc: Stefan Monnier, emacs-devel

Romain Francoise <romain@orebokech.com> writes:

> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>
>> DEFUN ("decompress-gzipped-region", Fdecompress_gzipped_region,
>>        Sdecompress_gzipped_region,
>
> Wouldn't it be better to have a single `decompress-region' function to
> which you specify the decompression algorithm? It's almost certain that
> if we add gzip, down the road someone will request bzip2, or xz, or...

Well, yes, that will probably be requested, but would require extra
libraries.  gzip is "free", since it's already built into 99% of Emacs
builds already.  So it smacks of premature overengineering to me.  >"?

> Then you can add a companion `compress-region' function which also takes
> the compression level, etc.

I think we can rely on external programs for compression.  It's not
something that Emacs needs to do efficiently.

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 13:28               ` Lars Magne Ingebrigtsen
@ 2013-08-08 13:36                 ` Christopher Schmidt
  2013-08-08 13:55                   ` Lars Magne Ingebrigtsen
  2013-08-08 15:43                 ` Romain Francoise
  1 sibling, 1 reply; 46+ messages in thread
From: Christopher Schmidt @ 2013-08-08 13:36 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Well, yes, that will probably be requested, but would require extra
> libraries.  gzip is "free", since it's already built into 99% of Emacs
> builds already.  So it smacks of premature overengineering to me.  >"?

Once the interface found its way into the release, it will not be
removed.  There is no such thing as overengineering here.

        Christopher



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 13:36                 ` Christopher Schmidt
@ 2013-08-08 13:55                   ` Lars Magne Ingebrigtsen
  2013-08-08 14:06                     ` Christopher Schmidt
  0 siblings, 1 reply; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-08 13:55 UTC (permalink / raw)
  To: emacs-devel

Christopher Schmidt <christopher@ch.ristopher.com> writes:

> Once the interface found its way into the release, it will not be
> removed.  There is no such thing as overengineering here.

The basic `decompress-gzipped-region' function will always remain as it
is.  If we later add a `decompress-region' function that takes different
compressors as parameters, then that function will call the basic
underlying functions.

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 13:22             ` Paul Eggert
@ 2013-08-08 14:06               ` Lars Magne Ingebrigtsen
  2013-08-08 15:31                 ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-08 14:06 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

Paul Eggert <eggert@cs.ucla.edu> writes:

>>   out = (char *) malloc (BUFFER_SIZE);
>
> BUFFER_SIZE is so small that you should just put the buffer on the
> stack "char out[BUFFER_SIZE];".  That way, you don't need to worry
> about freeing it later (e.g., on error).

Ah, I thought having (largish) arrays on the stack was frowned upon, but
perhaps 16K isn't big these days?

> Or better yet, why not use the gap as the output buffer?
> That should avoid some unnecessary copying.

The buffer gap?  Ah, I see.  That's very clever.  I'll rework the code
to do that.

>>   stream.avail_in = iend - istart;
>
> On 64-bit platforms, this won't work on buffers larger than 4 GiB.
> I suggest 'stream.avail_in = min (iend - istart, UINT_MAX);'
> and then put the whole thing inside a loop that repeats
> until the input buffer is exhausted.

Right.

> Each time through the inner loop, it should QUIT so that the user can
> interrupt.  You need a record_unwind_protect around the whole thing;
> the unwind-protect should be the code that deletes any uncompressed
> data already inserted and restores point.

Hm...  is that necessary?  gunzipping is quite fast.  And if you have a
8GB compressed file somehow in your buffer, I would assume that we're in
the year 2025, and your machine is fast enough to make that not matter. >"?

On the other hand, if your machine starts thrashing because it's running
out of memory, having `C-g' work would be nice.  In which case, having
the outer loop be smaller might be a good idea.  Say, only decompress a
few hundred megs at a time, max, for instance.

>>     case Z_STREAM_ERROR:
>>     case Z_NEED_DICT:
>>     case Z_DATA_ERROR:
>>     case Z_MEM_ERROR:
>
> Shouldn't an error be signaled for this sort of thing?
> That would fit in with the unwind-protect.

Yeah, I debated with myself whether signalling an error or returning nil
would be the best interface here.  We have lots of decoding functions
that won't signal errors on invalid inputs (like
`rfc2047-decode-encoded-words'), because we're typically calling them in
tight loops, and slapping `condition-case' around those calls are
annoying.

I.e., they're not anywhere close to the user level, so making them
signal an error is almost never what we want.

Decompressing the region, on the other hand, won't usually be called
that way (in a loop), but it's not a user-level command, either.  So I
don't know.

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 13:55                   ` Lars Magne Ingebrigtsen
@ 2013-08-08 14:06                     ` Christopher Schmidt
  2013-08-08 14:09                       ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 46+ messages in thread
From: Christopher Schmidt @ 2013-08-08 14:06 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> The basic `decompress-gzipped-region' function will always remain as
> it is.  If we later add a `decompress-region' function that takes
> different compressors as parameters, then that function will call the
> basic underlying functions.

As you said before, adding new algorithms based on
shell-command-on-region in not hard.  Make {,de}compress-gzipped-region
an implementation detail and add {,de}compress-region right away.  Why
wait?  ;)

        Christopher



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 14:06                     ` Christopher Schmidt
@ 2013-08-08 14:09                       ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-08 14:09 UTC (permalink / raw)
  To: emacs-devel

Christopher Schmidt <christopher@ch.ristopher.com> writes:

> As you said before, adding new algorithms based on
> shell-command-on-region in not hard.  Make {,de}compress-gzipped-region
> an implementation detail and add {,de}compress-region right away.  Why
> wait?  ;)

If somebody wants to do that -- be my guest.  I'm not particularly
interested, myself, in adding theoretically useful features that have no
current use cases.

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 10:35           ` Lars Magne Ingebrigtsen
                               ` (2 preceding siblings ...)
  2013-08-08 13:24             ` Romain Francoise
@ 2013-08-08 14:47             ` Stefan Monnier
  2013-08-11 19:47               ` Lars Magne Ingebrigtsen
  3 siblings, 1 reply; 46+ messages in thread
From: Stefan Monnier @ 2013-08-08 14:47 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

>> Sounds good, tho I'd prefer to see the patch first.
> I've included the new .c file below.  The attendant code change to
> lisp.h/emacs.c is what you'd expect.

Could you try and use a "package prefix".  E.g. "libz-" or "zlib-"?


        Stefan "Hmm... looks like this doesn't include a function to
                decompress PDF data yet :-("



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 14:06               ` Lars Magne Ingebrigtsen
@ 2013-08-08 15:31                 ` Lars Magne Ingebrigtsen
  2013-08-11 22:05                   ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-08 15:31 UTC (permalink / raw)
  To: emacs-devel

Or perhaps the C-g check should be in the output part.  A compressecd
small file can expand to a huge uncompressed file, so it should check
for C-g ever few mb out data output.
-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html




^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 13:28               ` Lars Magne Ingebrigtsen
  2013-08-08 13:36                 ` Christopher Schmidt
@ 2013-08-08 15:43                 ` Romain Francoise
  2013-08-08 15:52                   ` Lars Magne Ingebrigtsen
  2013-08-11 19:51                   ` Lars Magne Ingebrigtsen
  1 sibling, 2 replies; 46+ messages in thread
From: Romain Francoise @ 2013-08-08 15:43 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: Stefan Monnier, emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> gzip is "free", since it's already built into 99% of Emacs builds
> already.

This argument applies just as well to other compression types, e.g. in
Debian all variants of Emacs 24 are built with libxml support, and
libxml is linked with liblzma so xz would also be "free".

> I think we can rely on external programs for compression.  It's not
> something that Emacs needs to do efficiently.

Relying on external programs doesn't necessarily work on all platforms.
If we can have a built-in version of anything, we should prefer that to
using external programs.



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 15:43                 ` Romain Francoise
@ 2013-08-08 15:52                   ` Lars Magne Ingebrigtsen
  2013-08-11 19:51                   ` Lars Magne Ingebrigtsen
  1 sibling, 0 replies; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-08 15:52 UTC (permalink / raw)
  To: Romain Francoise; +Cc: Stefan Monnier, emacs-devel

Romain Francoise <romain@orebokech.com> writes:

>> gzip is "free", since it's already built into 99% of Emacs builds
>> already.
>
> This argument applies just as well to other compression types, e.g. in
> Debian all variants of Emacs 24 are built with libxml support, and
> libxml is linked with liblzma so xz would also be "free".

Sure.  But is it uzeful?

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 10:57             ` Andreas Schwab
@ 2013-08-11 19:44               ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-11 19:44 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Stefan Monnier, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

> Please use xmalloc.

I put the array on the stack as Paul suggested.

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 14:47             ` Stefan Monnier
@ 2013-08-11 19:47               ` Lars Magne Ingebrigtsen
  2013-08-12  0:54                 ` Stefan Monnier
  2013-08-12 12:13                 ` Eli Zaretskii
  0 siblings, 2 replies; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-11 19:47 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Could you try and use a "package prefix".  E.g. "libz-" or "zlib-"?

I've checked in what I had now so that other people can fix my code.
:-)

Sure, renaming from `decompress-gzipped-region' to something more
prefixey would make sense.  Do you prefer zlib (which is what the zlib
people call it), or libz (which is what all other people call their
libraries)?

I think that perhaps "libz" sounds a bit...  uhm...  odd.  A misspelling
of "libs".  >"?

>         Stefan "Hmm... looks like this doesn't include a function to
>                 decompress PDF data yet :-("

Does PDF need decompression?

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 15:43                 ` Romain Francoise
  2013-08-08 15:52                   ` Lars Magne Ingebrigtsen
@ 2013-08-11 19:51                   ` Lars Magne Ingebrigtsen
  1 sibling, 0 replies; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-11 19:51 UTC (permalink / raw)
  To: Romain Francoise; +Cc: Stefan Monnier, emacs-devel

Romain Francoise <romain@orebokech.com> writes:

> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>
>> gzip is "free", since it's already built into 99% of Emacs builds
>> already.
>
> This argument applies just as well to other compression types, e.g. in
> Debian all variants of Emacs 24 are built with libxml support, and
> libxml is linked with liblzma so xz would also be "free".

Sure.  But we have no real xz use cases.  Including all other "free"
libraries into Emacs "just because" makes no sense.

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-08 15:31                 ` Lars Magne Ingebrigtsen
@ 2013-08-11 22:05                   ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-11 22:05 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Or perhaps the C-g check should be in the output part.  A compressecd
> small file can expand to a huge uncompressed file, so it should check
> for C-g ever few mb out data output.

And perhaps it should just bail if it makes the buffer bigger than the
"maximum buffer size"?  Hm.  I can't find the variable that says what a
"big file" is at the moment...

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-11 19:47               ` Lars Magne Ingebrigtsen
@ 2013-08-12  0:54                 ` Stefan Monnier
  2013-08-12 14:05                   ` Lars Magne Ingebrigtsen
  2013-08-12 12:13                 ` Eli Zaretskii
  1 sibling, 1 reply; 46+ messages in thread
From: Stefan Monnier @ 2013-08-12  0:54 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

> Sure, renaming from `decompress-gzipped-region' to something more
> prefixey would make sense.  Do you prefer zlib (which is what the zlib
> people call it), or libz (which is what all other people call their
> libraries)?

If (random 2) return 1, then "libz-" is the clear winner, but if it
returns 0 then "zlib-" is way better.

>> Stefan "Hmm... looks like this doesn't include a function to
>> decompress PDF data yet :-("
> Does PDF need decompression?

I have Elisp code that reads a PDF file and returns the number of pages
it has (I use it in doc-view to determine the last page without having
to render the whole file), but it doesn't work in recent PDFs because
the index is now compressed using some kind of gzip but without the
usual file header, kind of like the gzip compression used in ssh.

This PDF-reading code is only in my local changes because without
support for current PDFs it's pretty useless.  But its absence in trunk
also makes it harder (read: not installed yet) to provide good support
for lazy-rendering (i.e. only render those pages you look at), which is
a feature I use a lot and that some users have requested.

Hence, it'd be nice if zlib could provide the function I need to
decompress the index at the end of PDF files.


        Stefan



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-11 19:47               ` Lars Magne Ingebrigtsen
  2013-08-12  0:54                 ` Stefan Monnier
@ 2013-08-12 12:13                 ` Eli Zaretskii
  2013-08-12 12:21                   ` Eli Zaretskii
  1 sibling, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2013-08-12 12:13 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Date: Sun, 11 Aug 2013 21:47:33 +0200
> Cc: emacs-devel@gnu.org
> 
> Stefan Monnier <monnier@iro.umontreal.ca> writes:
> 
> > Could you try and use a "package prefix".  E.g. "libz-" or "zlib-"?
> 
> I've checked in what I had now so that other people can fix my code.
> :-)

It breaks the Windows build:

  gcc  -std=gnu99 -Demacs  -I. -I/d/gnu/bzr/emacs/trunk/src  -I../lib -I/d/gnu/bzr/emacs/trunk/src/../lib   -mtune=pentium4        -Id:/usr/include/libxml2      -MMD -MF deps/.d -MP  -Id:/usr/include -Id:/usr/include/p11-kit-1       -O0 -gdwarf-2 -g3  -Wl,-stack,0x00800000 -Wl,-heap,0x00100000 -Wl,-image-base,0x01000000 -Wl,-entry,__start -Wl,-Map,./temacs.map  \
    -o temacs firstfile.o vm-limit.o dispnew.o frame.o scroll.o xdisp.o menu.o  window.o  charset.o coding.o category.o ccl.o character.o chartab.o bidi.o   term.o terminal.o xfaces.o     emacs.o keyboard.o macros.o keymap.o sysdep.o  buffer.o filelock.o insdel.o marker.o  minibuf.o fileio.o dired.o  cmds.o casetab.o casefiddle.o indent.o search.o regex.o undo.o  alloc.o data.o doc.o editfns.o callint.o  eval.o floatfns.o fns.o font.o print.o lread.o  syntax.o unexw32.o bytecode.o  process.o gnutls.o callproc.o  region-cache.o sound.o atimer.o  doprnt.o intervals.o textprop.o composite.o xml.o w32notify.o  profiler.o decompress.o   w32fns.o w32menu.o w32reg.o w32font.o w32term.o w32xfns.o w32select.o w32uniscribe.o w32.o w32console.o w32heap.o w32inevt.o w32proc.o fontset.o fringe.o image.
 o   tparam.o  gmalloc.o ralloc.o  lastfile.o   ../lib/libgnu.a emacs.res  -lwinmm -lgdi32 -lcomdlg32 -lmpr -lwinspool -lole32 -lcomctl32 -lusp10

  decompress.o(.text+0x16): In function `unwind_decompress':
  d:\gnu\bzr\emacs\trunk\src/decompress.c:40: undefined reference to `inflateEnd'
  decompress.o(.text+0xf7): In function `Fdecompress_gzipped_region':
  d:\gnu\bzr\emacs\trunk\src/decompress.c:83: undefined reference to `inflateInit2_'
  decompress.o(.text+0x242):d:\gnu\bzr\emacs\trunk\src/decompress.c:114: undefined reference to `inflate'
  collect2: ld returned 1 exit status
  Makefile:677: recipe for target `temacs.exe' failed
  make[1]: *** [temacs.exe] Error 1

Looks like -lz is missing from the link command line.

The configure script did discover (correctly) that zlib is installed:

  Does Emacs directly use zlib?                           yes



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-12 12:13                 ` Eli Zaretskii
@ 2013-08-12 12:21                   ` Eli Zaretskii
  2013-08-12 13:06                     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2013-08-12 12:21 UTC (permalink / raw)
  To: larsi; +Cc: emacs-devel

> Date: Mon, 12 Aug 2013 15:13:49 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
>   decompress.o(.text+0x16): In function `unwind_decompress':
>   d:\gnu\bzr\emacs\trunk\src/decompress.c:40: undefined reference to `inflateEnd'
>   decompress.o(.text+0xf7): In function `Fdecompress_gzipped_region':
>   d:\gnu\bzr\emacs\trunk\src/decompress.c:83: undefined reference to `inflateInit2_'
>   decompress.o(.text+0x242):d:\gnu\bzr\emacs\trunk\src/decompress.c:114: undefined reference to `inflate'
>   collect2: ld returned 1 exit status
>   Makefile:677: recipe for target `temacs.exe' failed
>   make[1]: *** [temacs.exe] Error 1
> 
> Looks like -lz is missing from the link command line.

I see the problem in configure.ac:

  HAVE_ZLIB=no
  LIBZ=
  if test "${with_zlib}" != "no"; then
    if test "${HAVE_PNG}" = "yes"; then
      ### PNG depends on zlib, so if we have PNG, we have zlib.
      HAVE_ZLIB=yes               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    else
      ### No PNG, so check zlib ourselves.

Your assumption that if PNG is being compiled in, the link command
line will necessarily include -lz is false.  In particular, on Windows
libpng is loaded dynamically, so no -lFOO link arguments are used
during the link, even though the PNG support is being compiled.

(I don't really understand why you needed to take this shortcut
anyway: what's wrong with having two -lz switches passed to the
linker?)



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-12 12:21                   ` Eli Zaretskii
@ 2013-08-12 13:06                     ` Lars Magne Ingebrigtsen
  2013-08-12 13:29                       ` Eli Zaretskii
  2013-08-13 23:40                       ` Lars Magne Ingebrigtsen
  0 siblings, 2 replies; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-12 13:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> (I don't really understand why you needed to take this shortcut
> anyway: what's wrong with having two -lz switches passed to the
> linker?)

I just assumed that that was something we didn't want.  But if it makes
no difference, then that part of the .ac code should be adjusted to just
add -lz anyway.

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-12 13:06                     ` Lars Magne Ingebrigtsen
@ 2013-08-12 13:29                       ` Eli Zaretskii
  2013-08-13 23:40                       ` Lars Magne Ingebrigtsen
  1 sibling, 0 replies; 46+ messages in thread
From: Eli Zaretskii @ 2013-08-12 13:29 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Cc: emacs-devel@gnu.org
> Date: Mon, 12 Aug 2013 15:06:35 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > (I don't really understand why you needed to take this shortcut
> > anyway: what's wrong with having two -lz switches passed to the
> > linker?)
> 
> I just assumed that that was something we didn't want.  But if it makes
> no difference, then that part of the .ac code should be adjusted to just
> add -lz anyway.

For now, I fixed the Windows build without making any changes to
configury (except that I enhanced the comment about why not having -lz
is OK on Windows).  I will let more knowledgeable people decide
whether to keep the current logic.



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-12  0:54                 ` Stefan Monnier
@ 2013-08-12 14:05                   ` Lars Magne Ingebrigtsen
  2013-08-12 15:09                     ` Stefan Monnier
  2013-08-12 15:20                     ` Eli Zaretskii
  0 siblings, 2 replies; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-12 14:05 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Sure, renaming from `decompress-gzipped-region' to something more
>> prefixey would make sense.  Do you prefer zlib (which is what the zlib
>> people call it), or libz (which is what all other people call their
>> libraries)?
>
> If (random 2) return 1, then "libz-" is the clear winner, but if it
> returns 0 then "zlib-" is way better.

`zlib' won.  I've now renamed the function.

> I have Elisp code that reads a PDF file and returns the number of pages
> it has (I use it in doc-view to determine the last page without having
> to render the whole file), but it doesn't work in recent PDFs because
> the index is now compressed using some kind of gzip but without the
> usual file header, kind of like the gzip compression used in ssh.

Hm...  could it be a DEFLATE format thing?  Uhm...  RFC1951, apparently,
according to the manual:

http://www.zlib.net/manual.html

zlib should support decompressing that, too -- I think the only thing
that should need twiddling is the inflateInit2 call, which takes magical
badly documented parameters to tell it what to do...

I think.  The real documentation is in /usr/include/zlib.h.  Read the
bit about inflateInit2 and scratch your head...

> This PDF-reading code is only in my local changes because without
> support for current PDFs it's pretty useless.  But its absence in trunk
> also makes it harder (read: not installed yet) to provide good support
> for lazy-rendering (i.e. only render those pages you look at), which is
> a feature I use a lot and that some users have requested.

If you have some test files, I can try to see if I can get zlib to do
the right thing.

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-12 14:05                   ` Lars Magne Ingebrigtsen
@ 2013-08-12 15:09                     ` Stefan Monnier
  2013-08-12 15:20                     ` Eli Zaretskii
  1 sibling, 0 replies; 46+ messages in thread
From: Stefan Monnier @ 2013-08-12 15:09 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

>> If (random 2) return 1, then "libz-" is the clear winner, but if it
>> returns 0 then "zlib-" is way better.
> `zlib' won.  I've now renamed the function.

I can't believe you even considered "libz-".  Such a loser!

> If you have some test files, I can try to see if I can get zlib to do
> the right thing.

That'll have to wait until I get back to doc-view-mode.


        Stefan



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-12 14:05                   ` Lars Magne Ingebrigtsen
  2013-08-12 15:09                     ` Stefan Monnier
@ 2013-08-12 15:20                     ` Eli Zaretskii
  2013-08-12 16:27                       ` Lars Magne Ingebrigtsen
  1 sibling, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2013-08-12 15:20 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: monnier, emacs-devel

> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Date: Mon, 12 Aug 2013 16:05:35 +0200
> Cc: emacs-devel@gnu.org
> 
> Stefan Monnier <monnier@iro.umontreal.ca> writes:
> 
> >> Sure, renaming from `decompress-gzipped-region' to something more
> >> prefixey would make sense.  Do you prefer zlib (which is what the zlib
> >> people call it), or libz (which is what all other people call their
> >> libraries)?
> >
> > If (random 2) return 1, then "libz-" is the clear winner, but if it
> > returns 0 then "zlib-" is way better.
> 
> `zlib' won.  I've now renamed the function.

But having both "zlib" and "gzipped" in the name is redundant, I
think.  How about zlib-decompress-region?



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-12 15:20                     ` Eli Zaretskii
@ 2013-08-12 16:27                       ` Lars Magne Ingebrigtsen
  2013-08-12 16:42                         ` Eli Zaretskii
  0 siblings, 1 reply; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-12 16:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> But having both "zlib" and "gzipped" in the name is redundant, I
> think.  How about zlib-decompress-region?

zlib supports both decompressing gzip-format data and the older format
which is just called "zlib".  It's a bit confusing.

So if we choose to offer decompressing the older format in the future,
then it would be called `zlib-decompress-zlibbed-region'.  Or
something.  >"?

And as Stefan said, there's possibly a "headerless gzip" format used for
streaming and pdfs, which would (possibly) need a different calling
sequence.

I'm not sure how much code would be shared between these functions, but
if the main loop is identical, then it would make sense to rename this
function to `zlib-decompress-region' and have the format be a third
parameter (`gzip', `zlib', `gzip-without-a-header')...

Hm...

----
windowBits can also be greater than 15 for optional gzip decoding. Add
32 to windowBits to enable zlib and gzip decoding with automatic header
detection, or add 16 to decode only the gzip format (the zlib format
will return a Z_DATA_ERROR). If a gzip stream is being decoded,
strm->adler is a crc32 instead of an adler32.
----

Oh, we can actually get automatic detection of the zlib and gzip format
by using a different magical constant.  If we did that, then it would
totally make sense to rename the function.

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-12 16:27                       ` Lars Magne Ingebrigtsen
@ 2013-08-12 16:42                         ` Eli Zaretskii
  0 siblings, 0 replies; 46+ messages in thread
From: Eli Zaretskii @ 2013-08-12 16:42 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: monnier, emacs-devel

> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Cc: monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Mon, 12 Aug 2013 18:27:28 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > But having both "zlib" and "gzipped" in the name is redundant, I
> > think.  How about zlib-decompress-region?
> 
> zlib supports both decompressing gzip-format data and the older format
> which is just called "zlib".  It's a bit confusing.

Additional formats could be handled via an optional argument METHOD or
something.



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-12 13:06                     ` Lars Magne Ingebrigtsen
  2013-08-12 13:29                       ` Eli Zaretskii
@ 2013-08-13 23:40                       ` Lars Magne Ingebrigtsen
  2013-08-14 13:09                         ` Lars Magne Ingebrigtsen
  1 sibling, 1 reply; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-13 23:40 UTC (permalink / raw)
  To: emacs-devel

I was tracking down a parsing problem, and the cause was that
zlib-decompress-region moves point around.  If it left point "where it
was" (for some value of), then things would be OK.  And that seems like
a nicer behaviour, anyway.

So I tried the following, but it doesn't help.  Point is stubbornly at
the end of the buffer here:

 (url-retrieve "https://icons.duckduckgo.com/i/a-z-animals.com.ico" (lambda (&rest ignore) (pop-to-buffer (current-buffer))))

I'm probably doing something stupid, and I'm too tired to see what it
is.  gdb claims that old_point is what is should be -- the start of the
body.
 
=== modified file 'src/decompress.c'
--- src/decompress.c	2013-08-13 21:17:09 +0000
+++ src/decompress.c	2013-08-13 23:28:30 +0000
@@ -95,12 +95,14 @@
   struct decompress_unwind_data *data = ddata;
   fn_inflateEnd (data->stream);
 
-  /* Delete any uncompressed data already inserted and restore point.  */
+  /* Delete any uncompressed data already inserted on error.  */
   if (data->start)
-    {
-      del_range (data->start, PT);
-      SET_PT (data->old_point);
-    }
+    del_range (data->start, PT);
+
+  /* Put point where it was, or if the buffer has shrunk because the
+     compressed data is bigger than the uncompressed, at
+     point-max.  */
+  SET_PT (min (data->old_point, ZV));
 }
 
 DEFUN ("zlib-available-p", Fzlib_available_p, Szlib_available_p, 0, 0, 0,


-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html




^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-13 23:40                       ` Lars Magne Ingebrigtsen
@ 2013-08-14 13:09                         ` Lars Magne Ingebrigtsen
  2013-08-14 15:12                           ` Eli Zaretskii
  0 siblings, 1 reply; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-14 13:09 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> I'm probably doing something stupid, and I'm too tired to see what it
> is.  gdb claims that old_point is what is should be -- the start of the
> body.

It turned out to have nothing to do with that, but was a combination of
two other ... bugs?

I've fixed one of them (making `url-retrieve' put the point where it
should be), but I still don't quite understand this behaviour.

Eval the following:

(url-retrieve "https://icons.duckduckgo.com/i/a-z-animals.com.ico"
              (lambda (&rest ignore)
                (message "%s" (point))
                (switch-to-buffer (current-buffer))))

I get a message saying "1", but after switching to the buffer, point is
visibly at the end of the buffer.

What's up with that?
                
-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-14 13:09                         ` Lars Magne Ingebrigtsen
@ 2013-08-14 15:12                           ` Eli Zaretskii
  2013-08-17 15:01                             ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2013-08-14 15:12 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Date: Wed, 14 Aug 2013 15:09:24 +0200
> 
> Eval the following:
> 
> (url-retrieve "https://icons.duckduckgo.com/i/a-z-animals.com.ico"
>               (lambda (&rest ignore)
>                 (message "%s" (point))
>                 (switch-to-buffer (current-buffer))))
> 
> I get a message saying "1", but after switching to the buffer, point is
> visibly at the end of the buffer.
> 
> What's up with that?

Did you look at the source of switch-to-buffer?  This portion sounds
relevant:

      (let* ((entry (assq buffer (window-prev-buffers)))
	     (displayed (and (eq switch-to-buffer-preserve-window-point
				 'already-displayed)
			     (get-buffer-window buffer 0))))
	(set-window-buffer nil buffer)
	(when (and entry
		   (or (eq switch-to-buffer-preserve-window-point t)
		       displayed))
	  ;; Try to restore start and point of buffer in the selected
	  ;; window (Bug#4041).
	  (set-window-start (selected-window) (nth 1 entry) t)
	  (set-window-point nil (nth 2 entry))))))




^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-14 15:12                           ` Eli Zaretskii
@ 2013-08-17 15:01                             ` Lars Magne Ingebrigtsen
  2013-08-17 15:27                               ` Eli Zaretskii
  0 siblings, 1 reply; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-17 15:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> (url-retrieve "https://icons.duckduckgo.com/i/a-z-animals.com.ico"
>>               (lambda (&rest ignore)
>>                 (message "%s" (point))
>>                 (switch-to-buffer (current-buffer))))

[...]

> Did you look at the source of switch-to-buffer?  This portion sounds
> relevant:
>
>       (let* ((entry (assq buffer (window-prev-buffers)))
> 	     (displayed (and (eq switch-to-buffer-preserve-window-point
> 				 'already-displayed)
> 			     (get-buffer-window buffer 0))))
> 	(set-window-buffer nil buffer)
> 	(when (and entry
> 		   (or (eq switch-to-buffer-preserve-window-point t)
> 		       displayed))
> 	  ;; Try to restore start and point of buffer in the selected
> 	  ;; window (Bug#4041).
> 	  (set-window-start (selected-window) (nth 1 entry) t)
> 	  (set-window-point nil (nth 2 entry))))))

switch-to-buffer-preserve-window-point is a variable defined in `window.el'.
Its value is nil

Documentation:
If non-nil, `switch-to-buffer' tries to preserve `window-point'.
If this is nil, `switch-to-buffer' displays the buffer at that
buffer's `point'.

So point here was at the beginning of the buffer.  But switching to this
buffer, which isn't displayed, puts point at the end of the buffer.

Surely that's not the intended result?

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-17 15:01                             ` Lars Magne Ingebrigtsen
@ 2013-08-17 15:27                               ` Eli Zaretskii
  2013-08-17 17:04                                 ` Eli Zaretskii
  0 siblings, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2013-08-17 15:27 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Date: Sat, 17 Aug 2013 17:01:09 +0200
> Cc: emacs-devel@gnu.org
> 
> > Did you look at the source of switch-to-buffer?  This portion sounds
> > relevant:
> >
> >       (let* ((entry (assq buffer (window-prev-buffers)))
> > 	     (displayed (and (eq switch-to-buffer-preserve-window-point
> > 				 'already-displayed)
> > 			     (get-buffer-window buffer 0))))
> > 	(set-window-buffer nil buffer)
> > 	(when (and entry
> > 		   (or (eq switch-to-buffer-preserve-window-point t)
> > 		       displayed))
> > 	  ;; Try to restore start and point of buffer in the selected
> > 	  ;; window (Bug#4041).
> > 	  (set-window-start (selected-window) (nth 1 entry) t)
> > 	  (set-window-point nil (nth 2 entry))))))
> 
> switch-to-buffer-preserve-window-point is a variable defined in `window.el'.
> Its value is nil

I couldn't have possibly known that.

> So point here was at the beginning of the buffer.  But switching to this
> buffer, which isn't displayed, puts point at the end of the buffer.

You may wish to step into url-retrieve and the functions it calls, and
see what happens there.



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-17 15:27                               ` Eli Zaretskii
@ 2013-08-17 17:04                                 ` Eli Zaretskii
  2013-08-18 17:06                                   ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2013-08-17 17:04 UTC (permalink / raw)
  To: larsi; +Cc: emacs-devel

> Date: Sat, 17 Aug 2013 18:27:21 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> > So point here was at the beginning of the buffer.  But switching to this
> > buffer, which isn't displayed, puts point at the end of the buffer.
> 
> You may wish to step into url-retrieve and the functions it calls, and
> see what happens there.

It's not switch-to-buffer that moves point, it's
url-http-wait-for-headers-change-function, which is called by
url-http-generic-filter.  Here's the relevant fragment:

    ;; We are still at the beginning of the buffer... must just be
    ;; waiting for a response.
    (url-http-debug "Spinning waiting for headers...")
    (when (eq process-buffer (current-buffer))
      (goto-char (point-max)))))  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Mysterious gzipped images
  2013-08-17 17:04                                 ` Eli Zaretskii
@ 2013-08-18 17:06                                   ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-08-18 17:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> It's not switch-to-buffer that moves point, it's
> url-http-wait-for-headers-change-function, which is called by
> url-http-generic-filter.  Here's the relevant fragment:
>
>     ;; We are still at the beginning of the buffer... must just be
>     ;; waiting for a response.
>     (url-http-debug "Spinning waiting for headers...")
>     (when (eq process-buffer (current-buffer))
>       (goto-char (point-max)))))  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Ah, good catch.

I wonder why it's doing that even after the process has exited.  It's
calling that after the callback has been called, and the callback will
kill the buffer 99% of the time, so I wonder whether it should check
whether the buffer is live before going to point-max.  But I'm not
familiar enough with that code to say...

-- 
(domestic pets only, the antidote for overdose, milk.)
  No Gnus T-Shirt for sale: http://ingebrigtsen.no/no.php
  and http://lars.ingebrigtsen.no/2013/08/twenty-years-of-september.html



^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2013-08-18 17:06 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-06 22:20 Mysterious gzipped images Lars Magne Ingebrigtsen
2013-08-06 22:30 ` Lars Magne Ingebrigtsen
2013-08-06 23:05   ` Andreas Schwab
2013-08-06 23:38     ` Lars Magne Ingebrigtsen
2013-08-07  1:08       ` Lars Magne Ingebrigtsen
2013-08-07  8:51         ` Julien Danjou
2013-08-07 14:35         ` Stefan Monnier
2013-08-07 16:47           ` chad
2013-08-08 10:35           ` Lars Magne Ingebrigtsen
2013-08-08 10:57             ` Andreas Schwab
2013-08-11 19:44               ` Lars Magne Ingebrigtsen
2013-08-08 13:22             ` Paul Eggert
2013-08-08 14:06               ` Lars Magne Ingebrigtsen
2013-08-08 15:31                 ` Lars Magne Ingebrigtsen
2013-08-11 22:05                   ` Lars Magne Ingebrigtsen
2013-08-08 13:24             ` Romain Francoise
2013-08-08 13:28               ` Lars Magne Ingebrigtsen
2013-08-08 13:36                 ` Christopher Schmidt
2013-08-08 13:55                   ` Lars Magne Ingebrigtsen
2013-08-08 14:06                     ` Christopher Schmidt
2013-08-08 14:09                       ` Lars Magne Ingebrigtsen
2013-08-08 15:43                 ` Romain Francoise
2013-08-08 15:52                   ` Lars Magne Ingebrigtsen
2013-08-11 19:51                   ` Lars Magne Ingebrigtsen
2013-08-08 14:47             ` Stefan Monnier
2013-08-11 19:47               ` Lars Magne Ingebrigtsen
2013-08-12  0:54                 ` Stefan Monnier
2013-08-12 14:05                   ` Lars Magne Ingebrigtsen
2013-08-12 15:09                     ` Stefan Monnier
2013-08-12 15:20                     ` Eli Zaretskii
2013-08-12 16:27                       ` Lars Magne Ingebrigtsen
2013-08-12 16:42                         ` Eli Zaretskii
2013-08-12 12:13                 ` Eli Zaretskii
2013-08-12 12:21                   ` Eli Zaretskii
2013-08-12 13:06                     ` Lars Magne Ingebrigtsen
2013-08-12 13:29                       ` Eli Zaretskii
2013-08-13 23:40                       ` Lars Magne Ingebrigtsen
2013-08-14 13:09                         ` Lars Magne Ingebrigtsen
2013-08-14 15:12                           ` Eli Zaretskii
2013-08-17 15:01                             ` Lars Magne Ingebrigtsen
2013-08-17 15:27                               ` Eli Zaretskii
2013-08-17 17:04                                 ` Eli Zaretskii
2013-08-18 17:06                                   ` Lars Magne Ingebrigtsen
2013-08-07 18:25         ` Rüdiger Sonderfeld
2013-08-07 21:06           ` FFI (was: Mysterious gzipped images) Stefan Monnier
2013-08-08  0:50             ` Stephen J. Turnbull

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).