unofficial mirror of guile-devel@gnu.org 
* [PATCH v3] rdelim: Add new procedure `for-line-in-file`.
@ 2024-12-16  8:43 Adam Faiz
  2024-12-16 11:27 ` Tomas Volf
  2024-12-16 11:33 ` Ricardo Wurmus
  0 siblings, 2 replies; 5+ messages in thread
From: Adam Faiz @ 2024-12-16  8:43 UTC (permalink / raw)
  To: guile-devel; +Cc: Nala Ginrut, Ricardo Wurmus, Maxime Devos

From fe113e9efc08aae2a7e3792a1018c496212bd774 Mon Sep 17 00:00:00 2001
From: AwesomeAdam54321 <adam.faiz@disroot.org>
Date: Sun, 15 Dec 2024 23:48:30 +0800
Subject: [PATCH v3] rdelim: Add new procedure `for-line-in-file`.

* module/ice-9/rdelim.scm (for-line-in-file): Add it.

This procedure makes it convenient to do per-line processing of a text
file.
---
 module/ice-9/rdelim.scm | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/module/ice-9/rdelim.scm b/module/ice-9/rdelim.scm
index d2cd081d7..b4c55c12e 100644
--- a/module/ice-9/rdelim.scm
+++ b/module/ice-9/rdelim.scm
@@ -23,7 +23,8 @@
 ;;; similar to (scsh rdelim) but somewhat incompatible.
 
 (define-module (ice-9 rdelim)
-  #:export (read-line
+  #:export (for-line-in-file
+            read-line
             read-line!
             read-delimited
             read-delimited!
@@ -206,3 +207,23 @@ characters to read.  By default, there is no limit."
 	      line)
       (else
        (error "unexpected handle-delim value: " handle-delim)))))
+
+(define* (for-line-in-file file proc
+                           #:key (encoding #f) (guess-encoding #f))
+  "Call PROC for every line in FILE until the eof-object is reached.
+FILE can either be a filename string or an already opened input port.
+The corresponding port is closed upon completion.
+
+The line provided to PROC is guaranteed to be a string."
+  (let ((port
+         (if (input-port? file)
+             file
+             (open-input-file file
+                              #:encoding encoding
+                              #:guess-encoding guess-encoding))))
+    (let loop ((line (read-line port)))
+      (cond ((eof-object? line)
+             (close-port port))
+            (else
+             (proc line)
+             (loop (read-line port)))))))
-- 
2.41.0




* Re: [PATCH v3] rdelim: Add new procedure `for-line-in-file`.
  2024-12-16  8:43 [PATCH v3] rdelim: Add new procedure `for-line-in-file` Adam Faiz
@ 2024-12-16 11:27 ` Tomas Volf
  2024-12-16 13:08   ` Nala Ginrut
  2024-12-16 11:33 ` Ricardo Wurmus
  1 sibling, 1 reply; 5+ messages in thread
From: Tomas Volf @ 2024-12-16 11:27 UTC (permalink / raw)
  To: Adam Faiz; +Cc: guile-devel, Nala Ginrut, Ricardo Wurmus, Maxime Devos

Adam Faiz <adam.faiz@disroot.org> writes:

> From fe113e9efc08aae2a7e3792a1018c496212bd774 Mon Sep 17 00:00:00 2001
> From: AwesomeAdam54321 <adam.faiz@disroot.org>
> Date: Sun, 15 Dec 2024 23:48:30 +0800
> Subject: [PATCH v3] rdelim: Add new procedure `for-line-in-file`.
>
> * module/ice-9/rdelim.scm (for-line-in-file): Add it.
>
> This procedure makes it convenient to do per-line processing of a text
> file.
> ---
>  module/ice-9/rdelim.scm | 23 ++++++++++++++++++++++-

I think it should also be documented in the manual.

>  1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/module/ice-9/rdelim.scm b/module/ice-9/rdelim.scm
> index d2cd081d7..b4c55c12e 100644
> --- a/module/ice-9/rdelim.scm
> +++ b/module/ice-9/rdelim.scm
> @@ -23,7 +23,8 @@
>  ;;; similar to (scsh rdelim) but somewhat incompatible.
>
>  (define-module (ice-9 rdelim)
> -  #:export (read-line
> +  #:export (for-line-in-file
> +            read-line
>              read-line!
>              read-delimited
>              read-delimited!
> @@ -206,3 +207,23 @@ characters to read.  By default, there is no limit."
>  	      line)
>        (else
>         (error "unexpected handle-delim value: " handle-delim)))))
> +
> +(define* (for-line-in-file file proc

What about naming it for-delimited-in-file and adding #:delims argument?
That would allow reading files that have "lines" terminated with #\nul
instead of just #\newline, which would be handy for processing output of
shell commands (-z, -0, ...).  Delims could default to "\n", so
ergonomics of your use case would not be impacted.

On a completely separate note, having (fold|reduce)-delimited-in-file
would be cool too (/me makes a note to write it).
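
A minimal sketch of what that suggestion could look like, assuming the
name for-delimited-in-file and the #:delims keyword from the paragraph
above (neither exists in Guile or in the patch).  It keeps the v3
behaviour of closing the port at end of file and only swaps read-line
for read-delimited:

```scheme
(use-modules (ice-9 rdelim))

;; Hypothetical generalisation of the patch's for-line-in-file.
;; DELIMS is a string of delimiter characters, defaulting to newline.
(define* (for-delimited-in-file file proc
                                #:key (delims "\n")
                                (encoding #f) (guess-encoding #f))
  (let ((port
         (if (input-port? file)
             file
             (open-input-file file
                              #:encoding encoding
                              #:guess-encoding guess-encoding))))
    ;; read-delimited trims the delimiter by default and returns the
    ;; eof-object once nothing is left to read.
    (let loop ((chunk (read-delimited delims port)))
      (cond ((eof-object? chunk)
             (close-port port))
            (else
             (proc chunk)
             (loop (read-delimited delims port)))))))

;; Usage: NUL-terminated records, as produced by tools run with -z or -0.
(for-delimited-in-file (open-input-string (string #\a #\nul #\b #\nul))
                       (lambda (record) (display record) (newline))
                       #:delims (string #\nul))
```

With the default #:delims the call sites of the original use case would
stay unchanged.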

> +                           #:key (encoding #f) (guess-encoding #f))
> +  "Call PROC for every line in FILE until the eof-object is reached.
> +FILE can either be a filename string or an already opened input port.
> +The corresponding port is closed upon completion.
> +
> +The line provided to PROC is guaranteed to be a string."
> +  (let ((port
> +         (if (input-port? file)
> +             file
> +             (open-input-file file
> +                              #:encoding encoding
> +                              #:guess-encoding guess-encoding))))
> +    (let loop ((line (read-line port)))
> +      (cond ((eof-object? line)
> +             (close-port port))

I know you were told that it should close the port, but I am not sure
about it.  It should close the port it opened, but should it also close
the one it was given?  That would prevent the same port from being
processed multiple times, which could be annoying.
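
One way to express that distinction, as a sketch only: the procedure
below is a hypothetical variant of the patch (named for-line-in-file*
to keep it apart from the submitted code) that closes the port only
when it opened the port itself from a filename.

```scheme
(use-modules (ice-9 rdelim))

;; Hypothetical variant: a port passed in by the caller is left open.
(define* (for-line-in-file* file proc
                            #:key (encoding #f) (guess-encoding #f))
  (define (process port)
    (let loop ((line (read-line port)))
      (unless (eof-object? line)
        (proc line)
        (loop (read-line port)))))
  (if (input-port? file)
      ;; The caller owns this port and decides when to close it.
      (process file)
      ;; We opened this port, so we close it once the loop is done.
      (let ((port (open-input-file file
                                   #:encoding encoding
                                   #:guess-encoding guess-encoding)))
        (process port)
        (close-port port))))

;; Usage: the string port is still open after the call.
(define demo-port (open-input-string "one\ntwo\n"))
(for-line-in-file* demo-port (lambda (line) (display line) (newline)))
(port-closed? demo-port)
```

This sketch does not address what happens when PROC exits non-locally.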

> +            (else
> +             (proc line)
> +             (loop (read-line port)))))))

Have a nice day,
Tomas

-- 
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.


* Re: [PATCH v3] rdelim: Add new procedure `for-line-in-file`.
  2024-12-16  8:43 [PATCH v3] rdelim: Add new procedure `for-line-in-file` Adam Faiz
  2024-12-16 11:27 ` Tomas Volf
@ 2024-12-16 11:33 ` Ricardo Wurmus
  2024-12-16 12:18   ` Maxime Devos
  1 sibling, 1 reply; 5+ messages in thread
From: Ricardo Wurmus @ 2024-12-16 11:33 UTC (permalink / raw)
  To: Adam Faiz; +Cc: guile-devel, Nala Ginrut, Maxime Devos

Adam Faiz <adam.faiz@disroot.org> writes:

> +(define* (for-line-in-file file proc
> +                           #:key (encoding #f) (guess-encoding #f))
> +  "Call PROC for every line in FILE until the eof-object is reached.
> +FILE can either be a filename string or an already opened input port.
> +The corresponding port is closed upon completion.
> +
> +The line provided to PROC is guaranteed to be a string."
> +  (let ((port
> +         (if (input-port? file)
> +             file
> +             (open-input-file file
> +                              #:encoding encoding
> +                              #:guess-encoding guess-encoding))))
> +    (let loop ((line (read-line port)))
> +      (cond ((eof-object? line)
> +             (close-port port))
> +            (else
> +             (proc line)
> +             (loop (read-line port)))))))

I think the port would leak if PROC were to raise an exception.

-- 
Ricardo




* RE: [PATCH v3] rdelim: Add new procedure `for-line-in-file`.
  2024-12-16 11:33 ` Ricardo Wurmus
@ 2024-12-16 12:18   ` Maxime Devos
  0 siblings, 0 replies; 5+ messages in thread
From: Maxime Devos @ 2024-12-16 12:18 UTC (permalink / raw)
  To: Ricardo Wurmus, Adam Faiz; +Cc: guile-devel@gnu.org, Nala Ginrut

>I think the port would leak if PROC were to raise an exception.

To my knowledge, this is currently kind of impossible to handle properly, since Scheme doesn’t have ‘finally’. The closest thing is ‘dynamic-wind’ with the port closed in the ‘out-guard’, but that isn’t quite right, since (re)winding can happen because of scheduling (e.g. Fibers) or for reasons other than exceptions, in which case the port shouldn’t be closed.
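
For reference, the dynamic-wind approach described above can be sketched as follows. The helper name call-with-port-closed-on-exit is hypothetical, and the sketch has exactly the flaw mentioned: the out-guard runs on every exit from the dynamic extent, not only on exceptions.

```scheme
;; Hypothetical helper: close PORT whenever control leaves PROC.
(define (call-with-port-closed-on-exit port proc)
  (dynamic-wind
    (lambda () #t)                   ; in-guard: nothing to set up
    (lambda () (proc port))          ; body
    (lambda () (close-port port))))  ; out-guard: runs on ANY exit

;; Usage: the port ends up closed even though the body raises.
(define demo-port (open-input-string "a\nb\n"))
(catch #t
  (lambda ()
    (call-with-port-closed-on-exit
     demo-port
     (lambda (port) (error "boom"))))
  (lambda args #f))
(port-closed? demo-port)
```

If the body were suspended and later resumed by a scheduler, the port would already be closed on re-entry, which is the problem case.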

Another potential option is to implement ‘finally’ in terms of exception handling, but even in the case of exceptions, sometimes the port shouldn’t be closed: if the exception is continuable and it is continued, then the port shouldn’t be closed.
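
A rough sketch of such a ‘finally’, assuming Guile 3.0's with-exception-handler and raise-exception; the name call-with-finally is hypothetical. Because the handler unwinds before running the cleanup, a continuable exception can no longer be continued, which is the limitation described above.

```scheme
;; Hypothetical helper: run CLEANUP once, after THUNK returns or raises.
(define (call-with-finally thunk cleanup)
  (call-with-values
    (lambda ()
      (with-exception-handler
        (lambda (exn)
          ;; The stack is already unwound here (#:unwind? #t), so the
          ;; exception cannot be continued; clean up and re-raise.
          (cleanup)
          (raise-exception exn))
        thunk
        #:unwind? #t))
    ;; Normal return: clean up, then pass the results through.
    (lambda results
      (cleanup)
      (apply values results))))

;; Usage: close a port whether or not the body raises.
(define demo-port (open-input-string "a\nb\n"))
(call-with-finally
 (lambda () 'done)
 (lambda () (close-port demo-port)))
```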

I think the solution to this is to make dynamic-wind overridable: the current dynamic-wind would be renamed to primitive-dynamic-wind, dynamic-wind would default to primitive-dynamic-wind but could be overridden (maybe with a parameter), and userspace scheduler libraries could override ‘dynamic-wind’ such that the ‘in-guard’ and ‘out-guard’ are _not_ run when the (re)winding is for scheduling purposes.

Then, if the user needs a dynamic-wind for implementing parameter-like things (e.g. adjusting a C thread-local variable with an API similar to parameters), they would use primitive-dynamic-wind, and if they need a dynamic-wind for things like resource cleanup, they would use ‘dynamic-wind’.
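
As a toy illustration of the proposal (not how Guile or Fibers work today), the split could look roughly like this. Every name below is hypothetical, and the suspending? flag merely stands in for whatever a real scheduler would use to mark a scheduling-related (re)wind.

```scheme
;; Hypothetical names throughout; Guile does not provide these today.
(define primitive-dynamic-wind dynamic-wind)

;; The overridable entry point; defaults to the primitive behaviour.
(define current-dynamic-wind (make-parameter primitive-dynamic-wind))

(define (overridable-dynamic-wind in-guard thunk out-guard)
  ((current-dynamic-wind) in-guard thunk out-guard))

;; Toy stand-in for a scheduler: it would set this flag to #t while it
;; suspends or resumes a task, so that cleanup guards are skipped.
(define suspending? #f)

(define (scheduler-aware-dynamic-wind in-guard thunk out-guard)
  (primitive-dynamic-wind
    (lambda () (unless suspending? (in-guard)))
    thunk
    (lambda () (unless suspending? (out-guard)))))

;; Usage: resource cleanup goes through the overridable version.
(parameterize ((current-dynamic-wind scheduler-aware-dynamic-wind))
  (overridable-dynamic-wind
    (lambda () #t)
    (lambda () 'work)
    (lambda () (display "cleanup\n"))))
```

Code that needs the guards to run on every (re)wind, such as the parameter-like case, would call primitive-dynamic-wind directly.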

While not yet integrated in Guile like this, for practical implementations see:
• https://github.com/wingo/fibers/commit/cc0e84cd56df3b07d378f710df39f8822317a2a2
• https://git.sr.ht/~old/guile-parallel/tree/master/item/parallel.scm#L29
(What’s missing here is a way to override Guile’s dynamic-wind in a transparent manner.)

Best regards,
Maxime Devos



* Re: [PATCH v3] rdelim: Add new procedure `for-line-in-file`.
  2024-12-16 11:27 ` Tomas Volf
@ 2024-12-16 13:08   ` Nala Ginrut
  0 siblings, 0 replies; 5+ messages in thread
From: Nala Ginrut @ 2024-12-16 13:08 UTC (permalink / raw)
  To: Adam Faiz, guile-devel, Nala Ginrut, Ricardo Wurmus, Maxime Devos

>
> I know you were told that it should close the port, but I am not sure
> about it.  It should close the port it opened, but should it also close the
> one it got?  This will prevent the same port being processed multiple
> times, which could be annoying.
>

I'm sorry, folks. I regret making this suggestion; after careful thought,
it's not a good idea. It's better to let the caller decide.

Best regards.
