all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
* bug#44675: guix lint: support for spellchecker or basic grammar
@ 2020-11-16  1:53 Vagrant Cascadian
  2020-11-16  5:55 ` zimoun
  2021-04-21 23:10 ` Vagrant Cascadian
  0 siblings, 2 replies; 16+ messages in thread
From: Vagrant Cascadian @ 2020-11-16  1:53 UTC (permalink / raw)
  To: 44675

[-- Attachment #1: Type: text/plain, Size: 1026 bytes --]

Please consider a guix lint description/synopsis check for basic
spelling, typo and rudimentary grammar issues.

Most of the ones I've found were caught by debian's "lintian" tool:

  https://tracker.debian.org/lintian


Common issues appear to be:

  "This packages" -> "This package"
  "allows to X" -> "Xs" or "Xing"


I've fixed many of these in the past:

  git log --author=vagrant --extended-regexp --grep='spelling|typo|grammar' --patch

But some of the very same patterns keep reappearing!


Many of these are likely to be caught by most spell checking routines;
I'm not sure if there is anything that would be implementable in pure
guile, or it if would make sense to call out to an external
spellchecker.

Some of them might be harder, and obviously we do not want too many
false positives, but no need to get perfectionist on solving this; even
just checking for "This packages" would haved detected many of these
issues!

That is, of course, if "guix lint" is being used consistently... :)


live well,
  vagrant

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2020-11-16  1:53 bug#44675: guix lint: support for spellchecker or basic grammar Vagrant Cascadian
@ 2020-11-16  5:55 ` zimoun
  2021-04-21 23:10 ` Vagrant Cascadian
  1 sibling, 0 replies; 16+ messages in thread
From: zimoun @ 2020-11-16  5:55 UTC (permalink / raw)
  To: Vagrant Cascadian, 44675

Hi Vagrant,

On Sun, 15 Nov 2020 at 17:53, Vagrant Cascadian <vagrant@debian.org> wrote:
> Please consider a guix lint description/synopsis check for basic
> spelling, typo and rudimentary grammar issues.
>
> Most of the ones I've found were caught by debian's "lintian" tool:
>
>   https://tracker.debian.org/lintian

[...]

> Many of these are likely to be caught by most spell checking routines;
> I'm not sure if there is anything that would be implementable in pure
> guile, or it if would make sense to call out to an external
> spellchecker.

The tool is ’spellintian’ [1], right?  If yes, the work seems done by
[2] but I am not sure to understand if it is only regexp and Perl or if
an external tool is called.  And the list in debian/control is not very
helpful.

1:
https://salsa.debian.org/lintian/lintian/-/blob/master/bin/spellintian
2:
https://salsa.debian.org/lintian/lintian/-/blob/master/lib/Lintian/Spelling.pm 


> That is, of course, if "guix lint" is being used consistently... :)

It should be! :-)


All the best,
simon




^ permalink raw reply	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2020-11-16  1:53 bug#44675: guix lint: support for spellchecker or basic grammar Vagrant Cascadian
  2020-11-16  5:55 ` zimoun
@ 2021-04-21 23:10 ` Vagrant Cascadian
  2021-04-22 16:42   ` Maxime Devos
  2021-04-25  7:27   ` Efraim Flashner
  1 sibling, 2 replies; 16+ messages in thread
From: Vagrant Cascadian @ 2021-04-21 23:10 UTC (permalink / raw)
  To: 44675


[-- Attachment #1.1: Type: text/plain, Size: 1100 bytes --]

Control: tags 44675 +patch

On 2020-11-15, Vagrant Cascadian wrote:
> Please consider a guix lint description/synopsis check for basic
> spelling, typo and rudimentary grammar issues.
...
> Many of these are likely to be caught by most spell checking routines;
> I'm not sure if there is anything that would be implementable in pure
> guile, or it if would make sense to call out to an external
> spellchecker.
>
> Some of them might be harder, and obviously we do not want too many
> false positives, but no need to get perfectionist on solving this; even
> just checking for "This packages" would haved detected many of these
> issues!

In the attached patch, I've implemented a simple lint check for "This
packages", which has been fixed in ... 42 packages so far in the git
repository, so maybe this could help catch future ones!

I haven't implemented a more complicated spellchecker or grammar checker
or anything, but at least this is a start.

I think it is also within my skills to address "allows to" and "permits
to", if I'm not heading down the wrong path here...


live well,
  vagrant


[-- Attachment #1.2: 0001-lint-Add-description-check-for-check-pluralized-pack.patch --]
[-- Type: text/x-diff, Size: 2453 bytes --]

From d4b851f5722cd6f8d514a4254884d1f7a016b74f Mon Sep 17 00:00:00 2001
From: Vagrant Cascadian <vagrant@debian.org>
Date: Wed, 21 Apr 2021 09:26:45 -0700
Subject: [PATCH] lint: Add description check for check-pluralized-package

Fixes: https://issues.guix.gnu.org/44675

* guix/lint.scm: Check for occurances of "This packages" in package
  descriptions.
* tests/lint.scm: Add test.
---
 guix/lint.scm  | 9 +++++++++
 tests/lint.scm | 7 +++++++
 2 files changed, 16 insertions(+)

diff --git a/guix/lint.scm b/guix/lint.scm
index 1bebfe03d3..ffeac18077 100644
--- a/guix/lint.scm
+++ b/guix/lint.scm
@@ -221,6 +221,14 @@ markup is valid return a plain-text version of DESCRIPTION, otherwise #f."
                       (G_ "Texinfo markup in description is invalid")
                       #:field 'description))))
 
+  (define (check-pluralized-this-package description)
+    "Check that DESCRIPTION does not contain This packages"
+    (if (string-match "This packages" description)
+	(list
+	 (make-warning package
+		       (G_ "description contains This Packages but should just be This package")))
+	'()))
+
   (define (check-trademarks description)
     "Check that DESCRIPTION does not contain '™' or '®' characters.  See
 http://www.gnu.org/prep/standards/html_node/Trademarks.html."
@@ -283,6 +291,7 @@ by two spaces; possible infraction~p at ~{~a~^, ~}")
          (check-not-empty description)
          (check-quotes description)
          (check-trademarks description)
+         (check-pluralized-this-package description)
          ;; Use raw description for this because Texinfo rendering
          ;; automatically fixes end of sentence space.
          (check-end-of-sentence-space description)
diff --git a/tests/lint.scm b/tests/lint.scm
index a2c8665142..6cb7a98686 100644
--- a/tests/lint.scm
+++ b/tests/lint.scm
@@ -160,6 +160,13 @@
                              (description "This is a 'quoted' thing."))))
      (check-description-style pkg))))
 
+(test-equal "description: pluralized this package"
+  "description contains This Packages but should just be This package"
+  (single-lint-warning-message
+   (let ((pkg (dummy-package "x"
+                             (description "This packages is a typo."))))
+     (check-description-style pkg))))
+
 (test-equal "synopsis: not a string"
   "invalid synopsis: #f"
   (single-lint-warning-message
-- 
2.30.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2021-04-21 23:10 ` Vagrant Cascadian
@ 2021-04-22 16:42   ` Maxime Devos
  2021-04-22 17:57     ` Vagrant Cascadian
  2021-04-25  7:27   ` Efraim Flashner
  1 sibling, 1 reply; 16+ messages in thread
From: Maxime Devos @ 2021-04-22 16:42 UTC (permalink / raw)
  To: Vagrant Cascadian, 44675

[-- Attachment #1: Type: text/plain, Size: 903 bytes --]

+  (define (check-pluralized-this-package description)
+    "Check that DESCRIPTION does not contain This packages"

The sentence structure would be clearer if you used quotes here,
something like "Check that DESCRIPTION does not contain ‘This packages’".

+    (if (string-match "This packages" description)
+	(list
+	 (make-warning package
+		       (G_ "description contains This Packages but should just be This package")))

There are no package descriptions containing "This Packages".
Did you mean "This packages"?
 
> +(test-equal "description: pluralized this package"
Quotes: "description: pluralized ‘this package’".

> +  "description contains This Packages but should just be This package"
Capitalisation error: This Packages --> This packages
Also, quotes: "description contains ‘This packages’ but should just be ‘This package’".

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2021-04-22 16:42   ` Maxime Devos
@ 2021-04-22 17:57     ` Vagrant Cascadian
  2021-04-22 18:05       ` Maxime Devos
  2021-05-04 16:40       ` Ludovic Courtès
  0 siblings, 2 replies; 16+ messages in thread
From: Vagrant Cascadian @ 2021-04-22 17:57 UTC (permalink / raw)
  To: Maxime Devos, 44675


[-- Attachment #1.1: Type: text/plain, Size: 1358 bytes --]

On 2021-04-22, Maxime Devos wrote:
> +  (define (check-pluralized-this-package description)
> +    "Check that DESCRIPTION does not contain This packages"
>
> The sentence structure would be clearer if you used quotes here,
> something like "Check that DESCRIPTION does not contain ‘This packages’".

Any compelling reason to use ‘This packages’ vs. 'This packages' ? It
seems other quotes in guix/lint.scm use '' also, and I'm not apparently
skilled enough with a keyboard to generate ‘’-style quotes... :)


> +    (if (string-match "This packages" description)
> +	(list
> +	 (make-warning package
> +		       (G_ "description contains This Packages but should just be This package")))
>
> There are no package descriptions containing "This Packages".
> Did you mean "This packages"?

Nice catch, thanks!


>> +(test-equal "description: pluralized this package"
> Quotes: "description: pluralized ‘this package’".

Noted.


>> +  "description contains This Packages but should just be This package"
> Capitalisation error: This Packages --> This packages
> Also, quotes: "description contains ‘This packages’ but should just be ‘This package’".

Again, nice catch!


Updated the commit message and incorporated the above suggestions into
the updated attached patch.


live well,
  vagrant

[-- Attachment #1.2: 0001-lint-Add-description-check-for-pluralized-This-packa.patch --]
[-- Type: text/x-diff, Size: 2512 bytes --]

From 4e724fbe9815e1c27967b835f08d2259164538ba Mon Sep 17 00:00:00 2001
From: Vagrant Cascadian <vagrant@debian.org>
Date: Wed, 21 Apr 2021 09:26:45 -0700
Subject: [PATCH] lint: Add description check for pluralized "This package"

Partial fix for: https://issues.guix.gnu.org/44675

* guix/lint.scm (check-pluralized-this-package): Add check for
  occurances of "This packages" in package descriptions.
* tests/lint.scm: Add test.
---
 guix/lint.scm  | 9 +++++++++
 tests/lint.scm | 7 +++++++
 2 files changed, 16 insertions(+)

diff --git a/guix/lint.scm b/guix/lint.scm
index 1bebfe03d3..e00048349b 100644
--- a/guix/lint.scm
+++ b/guix/lint.scm
@@ -221,6 +221,14 @@ markup is valid return a plain-text version of DESCRIPTION, otherwise #f."
                       (G_ "Texinfo markup in description is invalid")
                       #:field 'description))))
 
+  (define (check-pluralized-this-package description)
+    "Check that DESCRIPTION does not contain 'This packages'"
+    (if (string-match "This packages" description)
+	(list
+	 (make-warning package
+		       (G_ "description contains 'This packages' but should just be 'This package'")))
+	'()))
+
   (define (check-trademarks description)
     "Check that DESCRIPTION does not contain '™' or '®' characters.  See
 http://www.gnu.org/prep/standards/html_node/Trademarks.html."
@@ -283,6 +291,7 @@ by two spaces; possible infraction~p at ~{~a~^, ~}")
          (check-not-empty description)
          (check-quotes description)
          (check-trademarks description)
+         (check-pluralized-this-package description)
          ;; Use raw description for this because Texinfo rendering
          ;; automatically fixes end of sentence space.
          (check-end-of-sentence-space description)
diff --git a/tests/lint.scm b/tests/lint.scm
index a2c8665142..3e1b95680a 100644
--- a/tests/lint.scm
+++ b/tests/lint.scm
@@ -160,6 +160,13 @@
                              (description "This is a 'quoted' thing."))))
      (check-description-style pkg))))
 
+(test-equal "description: pluralized 'This package'"
+  "description contains 'This packages' but should just be 'This package'"
+  (single-lint-warning-message
+   (let ((pkg (dummy-package "x"
+                             (description "This packages is a typo."))))
+     (check-description-style pkg))))
+
 (test-equal "synopsis: not a string"
   "invalid synopsis: #f"
   (single-lint-warning-message
-- 
2.30.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2021-04-22 17:57     ` Vagrant Cascadian
@ 2021-04-22 18:05       ` Maxime Devos
  2021-05-04 16:40       ` Ludovic Courtès
  1 sibling, 0 replies; 16+ messages in thread
From: Maxime Devos @ 2021-04-22 18:05 UTC (permalink / raw)
  To: Vagrant Cascadian, 44675

[-- Attachment #1: Type: text/plain, Size: 1371 bytes --]

Vagrant Cascadian schreef op do 22-04-2021 om 10:57 [-0700]:
> On 2021-04-22, Maxime Devos wrote:
> > +  (define (check-pluralized-this-package description)
> > +    "Check that DESCRIPTION does not contain This packages"
> > 
> > The sentence structure would be clearer if you used quotes here,
> > something like "Check that DESCRIPTION does not contain ‘This packages’".
> 
> Any compelling reason to use ‘This packages’ vs. 'This packages' ?

I find ‘curly quotes’ more aesthetically pleasing, though that's a bit subjective
I guess.

> It seems other quotes in guix/lint.scm use '' also,
I believe they should use ‘curly quotes’ as well, though I would like to hear
what other things about that first.

>  and I'm not apparently
> skilled enough with a keyboard to generate ‘’-style quotes... :)

If your keyboard is azerty, you could choose ‘Belgian alternative’, and type
‘ with alt-gr+f and ’ with alt-gr+g.

> Updated the commit message and incorporated the above suggestions into
> the updated attached patch.

One other suggestion: you used "string-match" in 'check-pluralized-this-package',
which is a bit overkill.  string-match interprets its first argument as a regex.
The procedure "string-contains" is simpler and probably more efficient.

The patch looks good otherwise.

Greetings,
Maxime.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2021-04-21 23:10 ` Vagrant Cascadian
  2021-04-22 16:42   ` Maxime Devos
@ 2021-04-25  7:27   ` Efraim Flashner
  2021-04-25 16:43     ` Vagrant Cascadian
  1 sibling, 1 reply; 16+ messages in thread
From: Efraim Flashner @ 2021-04-25  7:27 UTC (permalink / raw)
  To: Vagrant Cascadian; +Cc: 44675

[-- Attachment #1: Type: text/plain, Size: 1762 bytes --]

On Wed, Apr 21, 2021 at 04:10:40PM -0700, Vagrant Cascadian wrote:
> Control: tags 44675 +patch
> 
> On 2020-11-15, Vagrant Cascadian wrote:
> > Please consider a guix lint description/synopsis check for basic
> > spelling, typo and rudimentary grammar issues.
> ...
> > Many of these are likely to be caught by most spell checking routines;
> > I'm not sure if there is anything that would be implementable in pure
> > guile, or it if would make sense to call out to an external
> > spellchecker.
> >
> > Some of them might be harder, and obviously we do not want too many
> > false positives, but no need to get perfectionist on solving this; even
> > just checking for "This packages" would haved detected many of these
> > issues!
> 
> In the attached patch, I've implemented a simple lint check for "This
> packages", which has been fixed in ... 42 packages so far in the git
> repository, so maybe this could help catch future ones!
> 
> I haven't implemented a more complicated spellchecker or grammar checker
> or anything, but at least this is a start.
> 
> I think it is also within my skills to address "allows to" and "permits
> to", if I'm not heading down the wrong path here...
> 
> 
> live well,
>   vagrant

It might make more sense to name it something more like
'catch-common-typos' and to search for 'This packages', 'allows to',
'permits to', 'file-name' and then print out the different mistakes in
the description. Then we can add more as we find them, rather than one
check per mistake.


-- 
Efraim Flashner   <efraim@flashner.co.il>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2021-04-25  7:27   ` Efraim Flashner
@ 2021-04-25 16:43     ` Vagrant Cascadian
  0 siblings, 0 replies; 16+ messages in thread
From: Vagrant Cascadian @ 2021-04-25 16:43 UTC (permalink / raw)
  To: Efraim Flashner; +Cc: 44675

[-- Attachment #1: Type: text/plain, Size: 2252 bytes --]

On 2021-04-25, Efraim Flashner wrote:
> On Wed, Apr 21, 2021 at 04:10:40PM -0700, Vagrant Cascadian wrote:
>> Control: tags 44675 +patch
>> 
>> On 2020-11-15, Vagrant Cascadian wrote:
>> > Please consider a guix lint description/synopsis check for basic
>> > spelling, typo and rudimentary grammar issues.
>> ...
>> > Many of these are likely to be caught by most spell checking routines;
>> > I'm not sure if there is anything that would be implementable in pure
>> > guile, or it if would make sense to call out to an external
>> > spellchecker.
>> >
>> > Some of them might be harder, and obviously we do not want too many
>> > false positives, but no need to get perfectionist on solving this; even
>> > just checking for "This packages" would haved detected many of these
>> > issues!
>> 
>> In the attached patch, I've implemented a simple lint check for "This
>> packages", which has been fixed in ... 42 packages so far in the git
>> repository, so maybe this could help catch future ones!
>> 
>> I haven't implemented a more complicated spellchecker or grammar checker
>> or anything, but at least this is a start.
>> 
>> I think it is also within my skills to address "allows to" and "permits
>> to", if I'm not heading down the wrong path here...
>> 
>> 
>> live well,
>>   vagrant
>
> It might make more sense to name it something more like
> 'catch-common-typos' and to search for 'This packages', 'allows to',
> 'permits to', 'file-name' and then print out the different mistakes in
> the description. Then we can add more as we find them, rather than one
> check per mistake.

That makes sense, though 'This packages' is very straightforward and has
a simple recommendation to fix it, whereas 'allows to' requires more
complicated english skills to come up with the correct solution... it
could just simply flag those cases as "wrong" without a solution.

Basically, I already stretched my cargo-culting, er, guile skills just
to get something obvious fixed that I keep seeing over and over
again. It would be a good excercise for me to better learn guile to
extend to further typos, though ... limited time.

Playing whack-a-mole with typos does get tiring :)


live well,
  vagrant

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2021-04-22 17:57     ` Vagrant Cascadian
  2021-04-22 18:05       ` Maxime Devos
@ 2021-05-04 16:40       ` Ludovic Courtès
  2021-06-09 15:33         ` Vagrant Cascadian
  1 sibling, 1 reply; 16+ messages in thread
From: Ludovic Courtès @ 2021-05-04 16:40 UTC (permalink / raw)
  To: Vagrant Cascadian; +Cc: 44675

Hi Vagrant,

Vagrant Cascadian <vagrant@debian.org> skribis:

> From 4e724fbe9815e1c27967b835f08d2259164538ba Mon Sep 17 00:00:00 2001
> From: Vagrant Cascadian <vagrant@debian.org>
> Date: Wed, 21 Apr 2021 09:26:45 -0700
> Subject: [PATCH] lint: Add description check for pluralized "This package"
>
> Partial fix for: https://issues.guix.gnu.org/44675
>
> * guix/lint.scm (check-pluralized-this-package): Add check for
>   occurances of "This packages" in package descriptions.
> * tests/lint.scm: Add test.

I had missed this patch, nice!

> +  (define (check-pluralized-this-package description)
> +    "Check that DESCRIPTION does not contain 'This packages'"
> +    (if (string-match "This packages" description)
> +	(list
> +	 (make-warning package
> +		       (G_ "description contains 'This packages' but should just be 'This package'")))
> +	'()))

How about making this ‘check-spelling’ and generalizing a bit so that it
iterates over a bunch of regexps or strings?

Like:

    (if (any (cut string-contains description <>) patterns)
         …)

where ‘patterns’ is a list of strings.

(Note that ‘string-match’ invokes libc’s regcomp + regexec, so it’s more
heavyweight than needed here.)

Thanks,
Ludo’.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2021-05-04 16:40       ` Ludovic Courtès
@ 2021-06-09 15:33         ` Vagrant Cascadian
  2021-10-21 23:18           ` Vagrant Cascadian
  0 siblings, 1 reply; 16+ messages in thread
From: Vagrant Cascadian @ 2021-06-09 15:33 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 44675

[-- Attachment #1: Type: text/plain, Size: 1711 bytes --]

On 2021-05-04, Ludovic Courtès wrote:
> Vagrant Cascadian <vagrant@debian.org> skribis:
>
>> From 4e724fbe9815e1c27967b835f08d2259164538ba Mon Sep 17 00:00:00 2001
>> From: Vagrant Cascadian <vagrant@debian.org>
>> Date: Wed, 21 Apr 2021 09:26:45 -0700
>> Subject: [PATCH] lint: Add description check for pluralized "This package"
>>
>> Partial fix for: https://issues.guix.gnu.org/44675
>>
>> * guix/lint.scm (check-pluralized-this-package): Add check for
>>   occurances of "This packages" in package descriptions.
>> * tests/lint.scm: Add test.
>
> I had missed this patch, nice!
>
>> +  (define (check-pluralized-this-package description)
>> +    "Check that DESCRIPTION does not contain 'This packages'"
>> +    (if (string-match "This packages" description)
>> +	(list
>> +	 (make-warning package
>> +		       (G_ "description contains 'This packages' but should just be 'This package'")))
>> +	'()))
>
> How about making this ‘check-spelling’ and generalizing a bit so that it
> iterates over a bunch of regexps or strings?
>
> Like:
>
>     (if (any (cut string-contains description <>) patterns)
>          …)
>
> where ‘patterns’ is a list of strings.

Love the idea, but would need some help in implementing!

There have been at least three newly added "This packages" since I
submitted this patch, so wondering if we can at least get the simple
case merged before getting too caught up in all the potential
improvements?


> (Note that ‘string-match’ invokes libc’s regcomp + regexec, so it’s more
> heavyweight than needed here.)

I can probably manage that! (I'll dig up where the simpler suggestion was)


live well,
  vagrant

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2021-06-09 15:33         ` Vagrant Cascadian
@ 2021-10-21 23:18           ` Vagrant Cascadian
  2021-10-22  8:33             ` zimoun
  0 siblings, 1 reply; 16+ messages in thread
From: Vagrant Cascadian @ 2021-10-21 23:18 UTC (permalink / raw)
  To: 44675


[-- Attachment #1.1: Type: text/plain, Size: 742 bytes --]

On 2021-06-09, Vagrant Cascadian wrote:
> There have been at least three newly added "This packages" since I
> submitted this patch, so wondering if we can at least get the simple
> case merged before getting too caught up in all the potential
> improvements?

And up until today, that list grew to 7! (fixed by 

Long delayed updated patch ... I think it addresses almost all of the
issues brought up, and maybe introduces a few new ones!

It has been rewritten to easily add new typo checks, but this one so far
only addresses pluralized "This packages". Would be easy enough to add
"allows to" but hard to add a suggested fix...


Big thanks to rekado, vivien and nckx who helped via #guix IRC!


live well,
  vagrant

[-- Attachment #1.2: 0001-lint-Add-description-check-for-pluralized-This-packa.patch --]
[-- Type: text/x-diff, Size: 2642 bytes --]

From 3ab46ca7932614ab4c699512c2fbfa8207ffa964 Mon Sep 17 00:00:00 2001
From: Vagrant Cascadian <vagrant@debian.org>
Date: Thu, 21 Oct 2021 15:51:11 -0700
Subject: [PATCH] lint: Add description check for pluralized "This package"

Partial fix for: https://issues.guix.gnu.org/44675

* guix/lint.scm (check-description-typo): Add check for
  occurances of "This packages" in package descriptions.
* tests/lint.scm: Add test.
---
 guix/lint.scm  | 12 ++++++++++++
 tests/lint.scm |  7 +++++++
 2 files changed, 19 insertions(+)

diff --git a/guix/lint.scm b/guix/lint.scm
index 7b02b9cec0..b22454fd31 100644
--- a/guix/lint.scm
+++ b/guix/lint.scm
@@ -321,6 +321,17 @@ markup is valid return a plain-text version of DESCRIPTION, otherwise #f."
                       (G_ "Texinfo markup in description is invalid")
                       #:field 'description))))
 
+  (define (check-description-typo description typo correction)
+    "Check that DESCRIPTION does not contain typo, with optional correction"
+    (if (string-contains description typo)
+        (list
+	 (make-warning package
+		       (G_
+                        (format #false
+                                "description contains typo '~a'~@[, should be '~a'~]"
+		                typo correction))))
+        '()))
+
   (define (check-trademarks description)
     "Check that DESCRIPTION does not contain '™' or '®' characters.  See
 http://www.gnu.org/prep/standards/html_node/Trademarks.html."
@@ -401,6 +412,7 @@ by two spaces; possible infraction~p at ~{~a~^, ~}")
          (check-not-empty description)
          (check-quotes description)
          (check-trademarks description)
+         (check-description-typo description "This packages" "This package")
          ;; Use raw description for this because Texinfo rendering
          ;; automatically fixes end of sentence space.
          (check-end-of-sentence-space description)
diff --git a/tests/lint.scm b/tests/lint.scm
index 699a750eb9..1902a87354 100644
--- a/tests/lint.scm
+++ b/tests/lint.scm
@@ -177,6 +177,13 @@
                               (description "Whitespace. "))))
      (check-description-style pkg))))
 
+(test-equal "description: pluralized 'This package'"
+  "description contains typo 'This packages', should be 'This package'"
+  (single-lint-warning-message
+   (let ((pkg (dummy-package "x"
+                             (description "This packages is a typo."))))
+     (check-description-style pkg))))
+
 (test-equal "synopsis: not a string"
   "invalid synopsis: #f"
   (single-lint-warning-message
-- 
2.30.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2021-10-21 23:18           ` Vagrant Cascadian
@ 2021-10-22  8:33             ` zimoun
  2021-10-24 11:22               ` Vagrant Cascadian
  0 siblings, 1 reply; 16+ messages in thread
From: zimoun @ 2021-10-22  8:33 UTC (permalink / raw)
  To: Vagrant Cascadian, 44675

Hi Vagrant,

On Thu, 21 Oct 2021 at 16:18, Vagrant Cascadian <vagrant@debian.org> wrote:

> It has been rewritten to easily add new typo checks, but this one so far
> only addresses pluralized "This packages". Would be easy enough to add
> "allows to" but hard to add a suggested fix...

If I remember correctly the previous discussion, Debian uses an external
tool for spellchecking.  Here, the patch uses a list of “common”
mistakes.  It is seems an easy good start. :-)


> +  (define (check-description-typo description typo correction)

Instead, I would use a list of alist ’typo-corrections’ as argument.
For instance,

--8<---------------cut here---------------start------------->8---
(define (check-description-typo description typo-corrections)
  (for-each
   (match-lambda
     ((typo . correction)
      (if (string-contains description typo)
          (list
           (make-warning ...))
          '())))
   typo-corrections))
--8<---------------cut here---------------end--------------->8---

> +    "Check that DESCRIPTION does not contain typo, with optional
> correction"
> +    (if (string-contains description typo)
> +        (list
> +	   (make-warning package
> +	     	       (G_
> +                        (format #false
> +                                "description contains typo '~a'~@[, should be '~a'~]"
> +		                typo correction))))
> +        '()))
> +

[...]

> +         (check-description-typo description "This packages" "This package")

And the call reads,

            (check-description-typo description '(("This packages" . "This package")))

which allows easily to add new pattern; such as,

        '(("This packages" . "This package")
          ("this packages" . "this package")
          ("This modules" . "This module"))

Then, as a second step, depending on the patterns listed, let see if
there is a pattern inside these patterns. ;-) (Check both capitalize and
lower-case, etc.)


Cheers,
simon




^ permalink raw reply	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2021-10-22  8:33             ` zimoun
@ 2021-10-24 11:22               ` Vagrant Cascadian
  2021-10-24 11:56                 ` zimoun
  0 siblings, 1 reply; 16+ messages in thread
From: Vagrant Cascadian @ 2021-10-24 11:22 UTC (permalink / raw)
  To: zimoun; +Cc: 44675


[-- Attachment #1.1: Type: text/plain, Size: 2249 bytes --]

On 2021-10-22, zimoun wrote:
> On Thu, 21 Oct 2021 at 16:18, Vagrant Cascadian <vagrant@debian.org> wrote:
>
>> It has been rewritten to easily add new typo checks, but this one so far
>> only addresses pluralized "This packages". Would be easy enough to add
>> "allows to" but hard to add a suggested fix...
>
> If I remember correctly the previous discussion, Debian uses an external
> tool for spellchecking.  Here, the patch uses a list of “common”
> mistakes.  It is seems an easy good start. :-)

Yes, I definitely went for "catch some useful things" rather than trying
to be comprehensive... these checks should catch the majority of
historical ones I've noticed, but I didn't bother adding one-off typos.


> Instead, I would use a list of alist ’typo-corrections’ as argument.
> For instance,
>
> --8<---------------cut here---------------start------------->8---
> (define (check-description-typo description typo-corrections)
>   (for-each
>    (match-lambda
>      ((typo . correction)
>       (if (string-contains description typo)
>           (list
>            (make-warning ...))
>           '())))
>    typo-corrections))
> --8<---------------cut here---------------end--------------->8---

So close! Ludo spotted that the for-each needed to be replaced with
append-map, and that basically did it!


>             (check-description-typo description '(("This packages" . "This package")))
>
> which allows easily to add new pattern; such as,
>
>         '(("This packages" . "This package")
>           ("this packages" . "this package")
>           ("This modules" . "This module"))
>
> Then, as a second step, depending on the patterns listed, let see if
> there is a pattern inside these patterns. ;-) (Check both capitalize and
> lower-case, etc.)

I haven't seen case issues in practice for these.

I've added "This packages", "This modules", "allows to" and "permits to"
to the list while I was at it. I only added tests for "This packages"
and "allows to" as the others are so similar I'm not sure it's worth
testing...

Here's to hoping for a guix with fewer glaring typos with which to craft
guix poetry. :)


New patch attached!


live well,
  vagrant

[-- Attachment #1.2: 0001-lint-Add-description-check-for-common-typos.patch --]
[-- Type: text/x-diff, Size: 3389 bytes --]

From a171769da0f737468e06866164eadf1e720764ba Mon Sep 17 00:00:00 2001
From: Vagrant Cascadian <vagrant@debian.org>
Date: Sun, 24 Oct 2021 04:00:15 -0700
Subject: [PATCH] lint: Add description check for common typos.

Fixes: https://issues.guix.gnu.org/44675

* guix/lint.scm (check-description-typo): Add check for occurances of
  "This packages", "This modules", "allows to" and "permits to" in package
  descriptions.
* tests/lint.scm: Add tests for "This packages" and "allows to".
---
 guix/lint.scm  | 19 +++++++++++++++++++
 tests/lint.scm | 14 ++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/guix/lint.scm b/guix/lint.scm
index 7b02b9cec0..ac2e7b3841 100644
--- a/guix/lint.scm
+++ b/guix/lint.scm
@@ -321,6 +321,21 @@ markup is valid return a plain-text version of DESCRIPTION, otherwise #f."
                       (G_ "Texinfo markup in description is invalid")
                       #:field 'description))))
 
+  (define (check-description-typo description typo-corrections)
+    "Check that DESCRIPTION does not contain typo, with optional correction"
+    (append-map
+     (match-lambda
+      ((typo . correction)
+       (if (string-contains description typo)
+           (list
+            (make-warning package
+                          (G_
+                           (format #false
+                                   "description contains typo '~a'~@[, should be '~a'~]"
+                                   typo correction))))
+           '())))
+     typo-corrections))
+
   (define (check-trademarks description)
     "Check that DESCRIPTION does not contain '™' or '®' characters.  See
 http://www.gnu.org/prep/standards/html_node/Trademarks.html."
@@ -401,6 +416,10 @@ by two spaces; possible infraction~p at ~{~a~^, ~}")
          (check-not-empty description)
          (check-quotes description)
          (check-trademarks description)
+         (check-description-typo description '(("This packages" . "This package")
+                                               ("This modules" . "This module")
+                                               ("allows to" . #f)
+                                               ("permits to" . #f)))
          ;; Use raw description for this because Texinfo rendering
          ;; automatically fixes end of sentence space.
          (check-end-of-sentence-space description)
diff --git a/tests/lint.scm b/tests/lint.scm
index 699a750eb9..6a7eed02e0 100644
--- a/tests/lint.scm
+++ b/tests/lint.scm
@@ -177,6 +177,20 @@
                               (description "Whitespace. "))))
      (check-description-style pkg))))
 
+(test-equal "description: pluralized 'This package'"
+  "description contains typo 'This packages', should be 'This package'"
+  (single-lint-warning-message
+   (let ((pkg (dummy-package "x"
+                             (description "This packages is a typo."))))
+     (check-description-style pkg))))
+
+(test-equal "description: grammar 'allows to'"
+  "description contains typo 'allows to'"
+  (single-lint-warning-message
+   (let ((pkg (dummy-package "x"
+                             (description "This package allows to do stuff."))))
+     (check-description-style pkg))))
+
 (test-equal "synopsis: not a string"
   "invalid synopsis: #f"
   (single-lint-warning-message
-- 
2.30.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2021-10-24 11:22               ` Vagrant Cascadian
@ 2021-10-24 11:56                 ` zimoun
  2021-10-24 19:02                   ` Vagrant Cascadian
  0 siblings, 1 reply; 16+ messages in thread
From: zimoun @ 2021-10-24 11:56 UTC (permalink / raw)
  To: Vagrant Cascadian; +Cc: 44675

Hi,

On Sun, 24 Oct 2021 at 04:22, Vagrant Cascadian <vagrant@debian.org> wrote:

> So close! Ludo spotted that the for-each needed to be replaced with
> append-map, and that basically did it!

Indeed, I have not checked the type of full chain. :-) And the return
value probably needs to be strongly a list, instead of not-specified. ;-)


> Here's to hoping for a guix with fewer glaring typos with which to craft
> guix poetry. :)

Guix poetry, oh well. :-)


> From a171769da0f737468e06866164eadf1e720764ba Mon Sep 17 00:00:00 2001
> From: Vagrant Cascadian <vagrant@debian.org>
> Date: Sun, 24 Oct 2021 04:00:15 -0700
> Subject: [PATCH] lint: Add description check for common typos.
>
> Fixes: https://issues.guix.gnu.org/44675
>
> * guix/lint.scm (check-description-typo): Add check for occurances of
                                                          ----^^

Occurrence, no?  Well, maybe a US vs UK vs French-glish. :-)


>   "This packages", "This modules", "allows to" and "permits to" in package
>   descriptions.
> * tests/lint.scm: Add tests for "This packages" and "allows to".
> ---
>  guix/lint.scm  | 19 +++++++++++++++++++
>  tests/lint.scm | 14 ++++++++++++++
>  2 files changed, 33 insertions(+)

Otherwise, LGTM.


Cheers,
simon




^ permalink raw reply	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2021-10-24 11:56                 ` zimoun
@ 2021-10-24 19:02                   ` Vagrant Cascadian
  2021-10-24 21:41                     ` Vagrant Cascadian
  0 siblings, 1 reply; 16+ messages in thread
From: Vagrant Cascadian @ 2021-10-24 19:02 UTC (permalink / raw)
  To: zimoun; +Cc: 44675

[-- Attachment #1: Type: text/plain, Size: 1078 bytes --]

On 2021-10-24, zimoun wrote:
> On Sun, 24 Oct 2021 at 04:22, Vagrant Cascadian <vagrant@debian.org> wrote:
>> From a171769da0f737468e06866164eadf1e720764ba Mon Sep 17 00:00:00 2001
>> From: Vagrant Cascadian <vagrant@debian.org>
>> Date: Sun, 24 Oct 2021 04:00:15 -0700
>> Subject: [PATCH] lint: Add description check for common typos.
>>
>> Fixes: https://issues.guix.gnu.org/44675
>>
>> * guix/lint.scm (check-description-typo): Add check for occurances of
>                                                           ----^^
>
> Occurrence, no?  Well, maybe a US vs UK vs French-glish. :-)

You are correct! French-glish FTW! :)

>>   "This packages", "This modules", "allows to" and "permits to" in package
>>   descriptions.
>> * tests/lint.scm: Add tests for "This packages" and "allows to".
>> ---
>>  guix/lint.scm  | 19 +++++++++++++++++++
>>  tests/lint.scm | 14 ++++++++++++++
>>  2 files changed, 33 insertions(+)
>
> Otherwise, LGTM.

Shall I fix the above typo (the irony!) and push before we get any more
of these typos in the repository? :)


live well,
  vagrant

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* bug#44675: guix lint: support for spellchecker or basic grammar
  2021-10-24 19:02                   ` Vagrant Cascadian
@ 2021-10-24 21:41                     ` Vagrant Cascadian
  0 siblings, 0 replies; 16+ messages in thread
From: Vagrant Cascadian @ 2021-10-24 21:41 UTC (permalink / raw)
  To: zimoun; +Cc: 44675-done

[-- Attachment #1: Type: text/plain, Size: 701 bytes --]

On 2021-10-24, Vagrant Cascadian wrote:
> On 2021-10-24, zimoun wrote:
>> On Sun, 24 Oct 2021 at 04:22, Vagrant Cascadian <vagrant@debian.org> wrote:
>>>   "This packages", "This modules", "allows to" and "permits to" in package
>>>   descriptions.
>>> * tests/lint.scm: Add tests for "This packages" and "allows to".
>>> ---
>>>  guix/lint.scm  | 19 +++++++++++++++++++
>>>  tests/lint.scm | 14 ++++++++++++++
>>>  2 files changed, 33 insertions(+)
>>
>> Otherwise, LGTM.
>
> Shall I fix the above typo (the irony!) and push before we get any more
> of these typos in the repository? :)

Pushed to master as b5f45a21c27b80210753e184e52708bb75a347bb.

Thanks for all your help!


live well,
  vagrant

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-10-24 21:42 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-16  1:53 bug#44675: guix lint: support for spellchecker or basic grammar Vagrant Cascadian
2020-11-16  5:55 ` zimoun
2021-04-21 23:10 ` Vagrant Cascadian
2021-04-22 16:42   ` Maxime Devos
2021-04-22 17:57     ` Vagrant Cascadian
2021-04-22 18:05       ` Maxime Devos
2021-05-04 16:40       ` Ludovic Courtès
2021-06-09 15:33         ` Vagrant Cascadian
2021-10-21 23:18           ` Vagrant Cascadian
2021-10-22  8:33             ` zimoun
2021-10-24 11:22               ` Vagrant Cascadian
2021-10-24 11:56                 ` zimoun
2021-10-24 19:02                   ` Vagrant Cascadian
2021-10-24 21:41                     ` Vagrant Cascadian
2021-04-25  7:27   ` Efraim Flashner
2021-04-25 16:43     ` Vagrant Cascadian

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.