unofficial mirror of guile-devel@gnu.org 
* RFC: Changing the initial value of %default-port-conversion-strategy
@ 2024-05-29 20:26 Tomas Volf
  2024-05-30  7:54 ` Attila Lendvai
  0 siblings, 1 reply; 2+ messages in thread
From: Tomas Volf @ 2024-05-29 20:26 UTC (permalink / raw)
  To: guile-devel

Greetings,

during my current quest to get more G-expressions working with UTF-8 input, I
have read Guile's documentation, in particular '(guile)Encoding', and I
think a change in the default behavior is warranted.

Currently the initial value of %default-port-conversion-strategy is 'substitute.
I would like to propose changing it to 'error on the grounds of preventing
subtle bugs and data corruption.

Just a reminder: when 'substitute is used, any non-representable character is
replaced with #\?.  No error is signaled and the user has no way to detect that
it even happened.  I just do not believe that to be a reasonable default.
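
To make the difference concrete, here is a minimal sketch that pushes one
non-Latin-1 character through a port under each strategy.  The helper name
encode-through-port is made up for illustration, and the bytevector port with
an explicit ISO-8859-1 encoding is only a convenient way to observe the bytes;
the strategy is set on the port explicitly rather than taken from the default.

```scheme
(use-modules (ice-9 binary-ports))   ; open-bytevector-output-port

;; Hypothetical helper: write STR to a Latin-1 port using STRATEGY.
;; Returns the bytes written, or the symbol `failed' if writing raised.
(define (encode-through-port str strategy)
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytes)
      (set-port-encoding! port "ISO-8859-1")
      (set-port-conversion-strategy! port strategy)
      (catch #t
        (lambda ()
          (display str port)
          (force-output port)
          (get-bytes))
        (lambda (key . args) 'failed)))))

;; U+03BB (Greek small lambda) has no Latin-1 representation.
(define lambda-string (string (integer->char #x3bb)))

(define substituted (encode-through-port lambda-string 'substitute))
(define strict      (encode-through-port lambda-string 'error))
```

With 'substitute the caller gets back a byte for #\? and nothing else; with
'error the failure surfaces at the point of the write, where it can be handled.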

Let us take a look for example at test-suite/standalone/test-mb-regexp.  It
contains this code:

    (regexp-exec
     (make-regexp "(.)(.)(.)")
     (string (integer->char 200) #\x (integer->char 202)))

That might look sensible until you realize that the following regexp *also*
matches:

    (make-regexp "(\\?)(.)(\\?)")
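
A small sketch for checking this locally.  The helper try-pattern is a
hypothetical name, and what it reports depends on the locale Guile is running
in and on the current conversion strategy, so no particular output is claimed
for the second pattern.

```scheme
;; The same subject string the test uses: U+00C8, #\x, U+00CA.
(define input (string (integer->char 200) #\x (integer->char 202)))

;; Hypothetical helper: report whether PATTERN matches INPUT.
;; Returns 'match, 'no-match, or the key of any error that was raised.
(define (try-pattern pattern)
  (catch #t
    (lambda ()
      (if (regexp-exec (make-regexp pattern) input)
          'match
          'no-match))
    (lambda (key . args) key)))

(for-each
 (lambda (pattern)
   (format #t "~s: ~a~%" pattern (try-pattern pattern)))
 '("(.)(.)(.)" "(\\?)(.)(\\?)"))
```

If the second line prints "match", the two non-ASCII characters were turned
into literal question marks before the regexp engine ever saw them.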

This is just asking for potential bugs (possibly security related) and data
corruption.  The 'substitute strategy should of course stay (if someone actually
needs it), but the default should really be changed to 'error.

Work-wise it is very feasible: the change is minimal (a single line in each of
ports.c and the documentation) and just a few tests break:
* test-mb-regexp:
    But this just demonstrates code that should not have worked in the first
    place, IMO.
* test-bad-identifiers:
    Requires calling setlocale with a UTF-8 locale and converting one source
    file (guardians.c) from latin1 to UTF-8.
* ports.test:
    This explicitly tests the default value, so it needs to be adjusted.
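
For anyone who wants the stricter behavior today without patching Guile, a
sketch of opting in per dynamic extent follows.  The variable is a fluid, so
with-fluids can rebind it; the wrapper name call-with-strict-conversion is
made up for this example.

```scheme
;; Hypothetical wrapper: run THUNK with 'error as the default conversion
;; strategy.  The previous value is restored when THUNK returns or unwinds.
(define (call-with-strict-conversion thunk)
  (with-fluids ((%default-port-conversion-strategy 'error))
    (thunk)))

;; The value outside the wrapper (what ports.test checks at startup).
(define strategy-before
  (fluid-ref %default-port-conversion-strategy))

;; The value as seen by code running inside the wrapper.
(define strategy-inside
  (call-with-strict-conversion
   (lambda () (fluid-ref %default-port-conversion-strategy))))
```

A global opt-in would be (fluid-set! %default-port-conversion-strategy 'error)
early in the program, at the cost of affecting every library loaded afterwards.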

Real-world impact should be limited, since most people are likely to run with
LANG set to *some* UTF-8 locale.  And if you do not have that, I (and, I expect,
the majority of engineers) would prefer correctness over convenience.

I strongly believe the current default is wrong and dangerous, but I am
obviously interested in what other people think, hence this message.  Please
let me know what you think.  Should I turn this into an actual patch?  Does it
have a chance of being accepted and merged into master?

Thank you for reading and have a nice day,
Tomas Volf

--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.


* Re: RFC: Changing the initial value of %default-port-conversion-strategy
  2024-05-29 20:26 RFC: Changing the initial value of %default-port-conversion-strategy Tomas Volf
@ 2024-05-30  7:54 ` Attila Lendvai
  0 siblings, 0 replies; 2+ messages in thread
From: Attila Lendvai @ 2024-05-30  7:54 UTC (permalink / raw)
  To: Tomas Volf; +Cc: guile-devel

> I strongly believe the current default is wrong and dangerous, but I am
> obviously interested what other people think, hence this message. Please let me
> know what you think. Should I put this into actual patch? Does it have chance
> to be accepted and merged into the master?


i have no authority to speak on behalf of guile, but as a longtime lisper i have spent so much time debugging swallowed/silent errors that i feel anger when i come across a default like that.

i came to believe that it's an insult towards your fellow men to silently swallow any error (as in unexpected outcome). and that also includes the author himself two weeks down the road, when he will have purged the context from his brain already.

code for long enough, and you'll learn painfully well that the extra minutes spent on properly dealing with errors will pay off several times over in time *not* spent on debugging. and there's really no excuse in a language that has some form of exceptions.

just some 0.02,

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“Death is not the greatest loss in life. The greatest loss is what dies inside us while we live.”
	— Norman Cousins



