unofficial mirror of guile-user@gnu.org 
From: Andy Wingo <wingo@pobox.com>
Subject: Re: (ice-9 syncase)
Date: Tue, 05 Oct 2004 21:30:50 +0200
Message-ID: <1097004650.3932.144.camel@localhost>
In-Reply-To: <1096921106.3932.139.camel@localhost>

[-- Attachment #1: Type: text/plain, Size: 1165 bytes --]

Sorry to reply to my own message -- I haven't even seen the responses
yet :/

On Mon, 2004-10-04 at 22:18 +0200, Andy Wingo wrote:
> guile> ssax:make-parser
> $1 = #<macro! sc-macro>
> guile> (ssax:make-parser)
> ERROR: invalid syntax (find k-args (DOCTYPE . default) DOCTYPE val . others)
> ABORT: (misc-error)
> 
> This same code gives me a more proper error about not passing the right
> arguments if I put myself in the (sxml ssax) module first.

It also works if I:

(module-use! (current-module) (resolve-module '(sxml ssax)))

Note that this uses the module itself, not its public interface. I tried
exporting all bindings in (sxml ssax), but that's not enough. I suspect
it has something to do with the module eval closure.

In (sxml ssax), I reluctantly put this hack:

(module-use! (module-public-interface (current-module))
             (current-module))

That way users won't have to do it themselves. I'm not happy with it,
but it's better now.

As an example, I attached a script that parses an OpenOffice document
and outputs all words that aren't in the dictionary. Useful e.g. if
you're writing a document in multiple languages.

Cheers,
-- 
Andy Wingo <wingo@pobox.com>
http://ambient.2y.net/wingo/

[-- Attachment #2: sxw2words --]
[-- Type: text/plain, Size: 2295 bytes --]

#!/usr/bin/guile -s
!#
(use-modules (sxml ssax)
             (os process)
             (ice-9 rdelim)
             (srfi srfi-1)   ; remove!, append-map
             (srfi srfi-13)  ; string-trim-both (built in on newer Guile)
             (srfi srfi-14))

(or (= (length (program-arguments)) 2)
    (begin
      (display "usage: sxw2words SXW-FILE\n" (current-error-port))
      (exit 1)))

(define sxw-file (cadr (program-arguments)))

;; Read the system word list and return it sorted case-insensitively.
(define (get-dict-words)
  (let ((port (open-input-file "/usr/share/dict/words")))
    (let lp ((words '()) (line (read-line port)))
      (if (eof-object? line)
          (begin
            (close-port port)
            (sort! (reverse! words) string-ci<?))
          (lp (cons line words) (read-line port))))))

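;; Drop adjacent case-insensitive duplicates from the sorted list L.
;; Because last-word starts as "", leading empty strings are dropped too.
(define (uniq l)
  (let lp ((last-word "") (in l) (out '()))
    (cond ((null? in) (reverse! out))
          ((string-ci=? last-word (car in)) (lp last-word (cdr in) out))
          (else (lp (car in) (cdr in) (cons (car in) out))))))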

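;; Anything that is not a letter gets trimmed from both ends of a word.
(define trim-char-set (char-set-complement char-set:letter))

;; Parse content.xml from the .sxw archive and return a list of its words.
;; The seed threaded through the handlers is the list of words seen so far.
(define (get-sxw-words)
  ((ssax:make-parser
    NEW-LEVEL-SEED
    (lambda (elem-gi attributes namespaces
                     expected-content seed)
      seed)

    FINISH-ELEMENT
    (lambda (elem-gi attributes namespaces parent-seed seed)
      seed)

    CHAR-DATA-HANDLER
    (lambda (string1 string2 seed)
      ;; Split on newlines and spaces, drop empty pieces, then trim
      ;; non-letters from both ends of each remaining piece.
      (let* ((strs (map
                    (lambda (x) (string-trim-both x trim-char-set))
                    (remove!
                     string-null?
                     (append-map
                      (lambda (x) (string-split x #\space))
                      (string-split string1 #\newline)))))
             (seed (append! strs seed)))
        (if (string-null? string2) seed
            (cons string2 seed)))))
   ;; In "r" (read-only) mode, run-with-pipe returns (pid . port);
   ;; the parser only needs the port.
   (cdr (run-with-pipe
         "r" "unzip" "-p" sxw-file "content.xml"))
   '()))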

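;; Walk both sorted lists in step and collect the words that are
;; missing from the dictionary.
(let lp ((words (uniq (sort! (get-sxw-words) string-ci<?)))
         (dict-words (get-dict-words))
         (out '()))
  (cond
   ((null? words)
    (for-each (lambda (x) (display x) (newline)) (reverse! out)))
   ;; Dictionary exhausted: every remaining word is unknown.
   ((null? dict-words)
    (lp (cdr words) dict-words (cons (car words) out)))
   ((string-ci=? (car words) (car dict-words))
    (lp (cdr words) (cdr dict-words) out))
   ((string-ci>? (car words) (car dict-words))
    (lp words (cdr dict-words) out))
   (else
    (lp (cdr words) dict-words (cons (car words) out)))))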

;;; arch-tag: 6c2617d3-32a4-4a4d-8914-48c7ee1b5ad8

[-- Attachment #3: Type: text/plain, Size: 140 bytes --]

_______________________________________________
Guile-user mailing list
Guile-user@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-user

Thread overview: 3+ messages
2004-10-04 20:18 (ice-9 syncase) Andy Wingo
2004-10-05 19:30 ` Andy Wingo [this message]
2004-10-18 12:39   ` Marius Vollmer
