unofficial mirror of guix-devel@gnu.org 
From: Eric Bavier <ericbavier@openmailbox.org>
To: Leo Famulari <leo@famulari.name>
Cc: guix-devel@gnu.org
Subject: Re: [PATCH 3/4] gnu: Add Swish-e.
Date: Tue, 23 Aug 2016 17:34:45 -0500
Message-ID: <20160823173445.0364aeaa@openmailbox.org>
In-Reply-To: <20160823204651.GA23118@jasmine>

On Tue, 23 Aug 2016 16:46:51 -0400
Leo Famulari <leo@famulari.name> wrote:

> On Tue, Aug 23, 2016 at 01:49:12AM -0500, Eric Bavier wrote:
> > * gnu/packages/search.scm (swish-e): New variable.
> > * gnu/packages/patches/swish-e-search.patch,
> > gnu/packages/patches/swish-e-format-security.patch: New patches.
> > * gnu/local.mk (dist_patch_DATA): Add them.  
> 
> It would be ideal to present these patches to the upstream maintainers,
> but their site is offline. Do we know if the project is still active?

The last active maintainer stepped out a while ago, and it seems no one
else has stepped up:
https://web.archive.org/web/20150908004634/http://www.swish-e.org/archive/2014-04/13214.html

> 
> > diff --git a/gnu/packages/patches/swish-e-format-security.patch b/gnu/packages/patches/swish-e-format-security.patch
> > new file mode 100644
> > index 0000000..be9d7cb
> > --- /dev/null
> > +++ b/gnu/packages/patches/swish-e-format-security.patch
> > @@ -0,0 +1,33 @@
> > +Borrowed from Debian.
> > +
> > +--- swish-e-2.4.7/src/parser.c	2009-04-05 03:58:32.000000000 +0200
> > ++++ swish-e-2.4.7/src/parser.c	2013-06-11 13:53:08.196559035 +0200
> > +@@ -1760,7 +1760,7 @@
> > +     va_start(args, msg);
> > +     vsnprintf(str, 1000, msg, args );
> > +     va_end(args);
> > +-    xmlParserError(parse_data->ctxt, str);
> > ++    xmlParserError(parse_data->ctxt, "%s", str);
> > + }
> > + 
> > + static void warning(void *data, const char *msg, ...)
> > +@@ -1772,7 +1772,7 @@
> > +     va_start(args, msg);
> > +     vsnprintf(str, 1000, msg, args );
> > +     va_end(args);
> > +-    xmlParserWarning(parse_data->ctxt, str);
> > ++    xmlParserWarning(parse_data->ctxt, "%s", str);
> > + }  
> 
> My understanding is that xmlParserWarning() is from libxml2, defined in
> 'xmlerror.h' like this:
> 
> XMLPUBFUN void XMLCDECL
>     xmlParserWarning            (void *ctx,
>                                  const char *msg,
>                                  ...) LIBXML_ATTR_FORMAT(2,3);
> 
> I don't understand this definition very much, but in libxml2 file
> 'xmlversion.h', LIBXML_ATTR_FORMAT is commented with "Macro used to
> indicate to GCC the parameter are printf like".
> 
> Somebody else should review this.
> 
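A minimal sketch of what the "%s" change guards against, assuming GCC or
Clang. The names report() and forward_safely() are hypothetical stand-ins
for a printf-like function such as xmlParserWarning() and its caller; they
are not libxml2 or Swish-e code. The format attribute plays the role that
LIBXML_ATTR_FORMAT(2,3) is described as playing above: argument 2 is the
format string and the variadic arguments start at position 3.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical buffer that records the last formatted message. */
static char last_message[256];

/* Ask the compiler to check argument 2 as a printf format string,
   with the variadic arguments starting at position 3. */
static void report(void *ctx, const char *msg, ...)
    __attribute__((format(printf, 2, 3)));

static void report(void *ctx, const char *msg, ...)
{
    va_list args;

    (void) ctx;                 /* unused in this sketch */
    va_start(args, msg);
    vsnprintf(last_message, sizeof last_message, msg, args);
    va_end(args);
}

/* Forward an already-formatted string.  The constant "%s" is the format,
   so any '%' inside str is copied literally instead of being interpreted
   as a conversion.  Writing report(ctx, str) instead is what
   -Werror=format-security rejects. */
static void forward_safely(void *ctx, const char *str)
{
    assert(str != NULL);
    report(ctx, "%s", str);
}
```

With this, a message that itself contains "%s" or "%n" is stored verbatim
rather than being read as a format.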
> > +--- swish-e-2.4.7/src/result_output.c	2009-04-05 03:58:32.000000000 +0200
> > ++++ swish-e-2.4.7/src/result_output.c	2013-06-11 13:53:38.593550825 +0200
> > +@@ -752,7 +752,7 @@
> > +             s = (char *) emalloc(MAXWORDLEN + 1);
> > +             n = strftime(s, (size_t) MAXWORDLEN, fmt, localtime(&(pv->value.v_date)));
> > +             if (n && f)
> > +-                fprintf(f, s);
> > ++                fprintf(f, "%s", s);  
> 
> LGTM
> 
> > diff --git a/gnu/packages/patches/swish-e-search.patch b/gnu/packages/patches/swish-e-search.patch
> > new file mode 100644
> > index 0000000..2a57a31
> > --- /dev/null
> > +++ b/gnu/packages/patches/swish-e-search.patch
> > @@ -0,0 +1,43 @@
> > +From http://swish-e.org/archive/2015-09/13295.html  
> 
> The site is offline, but I found it on archive.org:
> https://web.archive.org/web/20150907203848/http://www.swish-e.org/archive/2015-09/13295.html
> 
> Interestingly, I'm a few blocks from the patch author's office :)
> 
> As far as I can tell, nobody from swish-e ever replied.

AFAICT that's right.

> 
> > +
> > +--- a/src/compress.c	
> > ++++ a/src/compress.c	
> > +@@ -995,7 +995,7 @@ void    remove_worddata_longs(unsigned char *worddata,int *sz_worddata)
> > +             progerr("Internal error in remove_worddata_longs");
> > + 
> > +         /* dst may be smaller than src. So move the data */
> > +-        memcpy(dst,src,data_len);
> > ++        memmove(dst,src,data_len);  
> 
> LGTM
> 
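A minimal sketch of why memmove() is the right call when, as the comment in
the patch says, dst may sit below src inside the same buffer. The helper
drop_prefix() is hypothetical and only illustrates the overlapping case; it
is not taken from compress.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: drop the first `skip` bytes of buf in place and
   return the new length. */
static size_t drop_prefix(unsigned char *buf, size_t len, size_t skip)
{
    assert(skip <= len);

    /* Source (buf + skip) and destination (buf) overlap whenever
       skip < len - skip.  memcpy() has undefined behavior for overlapping
       regions; memmove() is specified to handle them. */
    memmove(buf, buf + skip, len - skip);
    return len - skip;
}
```

Calling drop_prefix() on the eight bytes "abcdefgh" with skip set to 2
leaves "cdefgh" in the first six bytes.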
> > + 
> > +         /* Increase pointers */
> > +         src += data_len;
> > +--- a/src/headers.c	
> > ++++ a/src/headers.c	
> > +@@ -280,7 +280,7 @@ static SWISH_HEADER_VALUE fetch_single_header( IndexFILE *indexf, HEADER_MAP *he
> > + 
> > +         case SWISH_NUMBER:
> > +         case SWISH_BOOL:
> > +-            value.number = *(unsigned long *) data_pointer;
> > ++            value.number = *(unsigned int *) data_pointer;  
> 
> Could there be any risk in reducing the size of the variable like this?

Assuming the value is indeed a boolean, probably not.
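
A minimal sketch of the width question, under the assumption (not confirmed
anywhere in this thread) that the index stores the value in 4 bytes. On
platforms where unsigned long is 8 bytes, dereferencing the pointer as
unsigned long would then also pick up the 4 bytes that follow. The helpers
read_u32() and read_u64() are hypothetical and use fixed-width types so the
behavior does not depend on the platform's long.

```c
#include <stdint.h>
#include <string.h>

/* Read exactly 4 bytes from p.  memcpy() avoids the alignment and
   aliasing problems of casting the pointer. */
static uint32_t read_u32(const unsigned char *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);
    return v;
}

/* Read 8 bytes from p, as a dereference through an 8-byte type would. */
static uint64_t read_u64(const unsigned char *p)
{
    uint64_t v;

    memcpy(&v, p, sizeof v);
    return v;
}
```

If a buffer holds the 4-byte value 7 followed by another non-zero 4-byte
field, read_u32() returns 7 while read_u64() returns something else, since
it folds the neighboring field into the result.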

> 
> > + 
> > +             /* $$$ Ugly hack alert! */
> > +             /* correct for removed files */
> > +--- a/src/swishspider	
> > ++++ a/src/swishspider	
> > +@@ -27,6 +27,7 @@ use LWP::UserAgent;
> > + use HTTP::Status;
> > + use HTML::Parser 3.00;
> > + use HTML::LinkExtor;
> > ++use Encode;
> > + 
> > +     if (scalar(@ARGV) != 2) {
> > +         print STDERR "Usage: $0 localpath url\n";
> > +@@ -94,7 +95,7 @@ use HTML::LinkExtor;
> > +     # Don't allow links above the base
> > +     $URI::ABS_REMOTE_LEADING_DOTS = 1;
> > + 
> > +-    $p->parse( $$content_ref );
> > ++    $p->parse( decode_utf8 $$content_ref );  
> 
> Can you explain why we need this?

Presumably to better handle utf8-encoded input.

Tomb developers have expressed interest in replacing their use of
swish-e with the "Recoll" search tool; see
https://github.com/dyne/Tomb/issues/211.  If maintenance of this
package turns out to be burdensome, we might be able to drop it.

Thanks for reviewing,

`~Eric
