unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#66614: 29.1.50; Support not capitalizing words inside symbols
@ 2023-10-18 16:32 Spencer Baugh
  2023-10-18 17:01 ` Spencer Baugh
  0 siblings, 1 reply; 11+ messages in thread
From: Spencer Baugh @ 2023-10-18 16:32 UTC
  To: 66614


Quick definitions:
- word: a sequence of characters whose syntax is word constituent
- symbol: a sequence of characters whose syntax is either word
constituent or symbol constituent

In some programming languages and styles, a symbol (or every symbol in a
sequence of symbols) might be capitalized, but the individual words
making up the symbol should never be capitalized.

For example, in OCaml, type names Look_like_this and variable names
look_like_this, but it is basically never correct for something to
Look_Like_This.  And one might have "aa_bb cc_dd ee_ff" or "Aa_bb Cc_dd
Ee_ff", but never "Aa_Bb Cc_Dd Ee_Ff".

Currently, case handling in casefiddle.c and Freplace_match always
capitalizes individual words, which has undesirable effects when
programming in these styles.  Three examples:

- If I have a variable "hash_set" and a type name "Hash_set", and I type
  "Ha" and dabbrev-expand, then depending on the context Ha may expand
  to Hash_Set instead of hash_set.  But it is never correct to
  capitalize internal words in this style, so we could avoid this.

- If I have both foo and Foo, and I query-replace foo with
  bar_baz, the replacements will be bar_baz and Bar_Baz.  But again it
  is never correct to capitalize internal words in this style.

- More concretely,
  (progn 
    (string-match "az" "Az")
    (replace-match "az_bz" nil nil "Az"))
  yields Az_Bz, but in this programming style it should always yield
  Az_bz.
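
The following is a minimal, self-contained sketch of how the case
pattern of the replaced text could be classified.  It is not the code
in src/search.c: the names classify_case and is_case_word, the
ASCII-only classifier (letters and digits are word constituents, '_' is
a symbol constituent), and the final decision step are all assumptions
made for illustration.  Only the three per-character branches follow
the hunks quoted in this thread.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* What to do with the replacement text (illustrative names).  */
enum case_action { NOCHANGE, CAP_INITIAL, ALL_CAPS };

/* Hypothetical stand-in for a syntax-table lookup: ASCII letters and
   digits count as word constituents, and '_' counts as part of a word
   only when SYMBOLS_AS_WORDS is true.  */
static bool
is_case_word (unsigned char c, bool symbols_as_words)
{
  return isalnum (c) || (symbols_as_words && c == '_');
}

/* Classify the case pattern of MATCHED, the text being replaced.  */
static enum case_action
classify_case (const char *matched, bool symbols_as_words)
{
  assert (matched != NULL);
  bool some_multiletter_word = false;
  bool some_lowercase = false;
  bool some_uppercase = false;
  bool some_nonuppercase_initial = false;
  unsigned char prevc = '\n';	/* Nothing precedes the first char.  */
  size_t len = strlen (matched);

  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) matched[i];
      bool prev_in_word = is_case_word (prevc, symbols_as_words);

      if (islower (c))
	{
	  some_lowercase = true;
	  if (!prev_in_word)
	    some_nonuppercase_initial = true;
	  else
	    some_multiletter_word = true;
	}
      else if (isupper (c))
	{
	  some_uppercase = true;
	  if (prev_in_word)
	    some_multiletter_word = true;
	}
      else if (!prev_in_word)
	/* A caseless initial is treated like a lowercase initial.  */
	some_nonuppercase_initial = true;

      prevc = c;
    }

  /* Assumed decision step; it is not shown in this thread.  */
  if (!some_lowercase && some_multiletter_word)
    return ALL_CAPS;
  if (!some_nonuppercase_initial && some_multiletter_word)
    return CAP_INITIAL;
  if (!some_nonuppercase_initial && some_uppercase)
    return ALL_CAPS;
  return NOCHANGE;
}
```

Under these assumptions, "Az" classifies as CAP_INITIAL, which is why
the replacement text is then capitalized word by word.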

A naive solution is to change the syntax class of symbol constituents so
that they are treated as part of words.  Or, equivalently, to use
superword-mode.  This solution is incorrect, though: the distinction
between symbols and words is still useful for word and symbol navigation
commands, and other purposes besides.  Changing the syntax class will
break those other use cases.  The only thing that needs to be changed is
the case behavior.

This is straightforwardly solvable by supporting a behavior where symbol
constituents are treated as part of words only for case operations.

A patch follows that adds a variable enabling this behavior.
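
As a rough illustration of the intended behavior, here is a small
stand-alone C model of upcasing word initials.  It does not use Emacs
internals: toy_syntax_of is a hypothetical ASCII-only classifier
standing in for the syntax table, and toy_upcase_initials is an
invented name.  Only the shape of the predicate (word constituent, or
symbol constituent when the flag is set) is taken from the proposal.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy syntax classes; a stand-in for a real syntax table.  */
enum toy_syntax { TOY_OTHER, TOY_WORD, TOY_SYMBOL };

/* Hypothetical classifier: ASCII letters and digits are word
   constituents, '_' is a symbol constituent, all else is other.  */
static enum toy_syntax
toy_syntax_of (unsigned char c)
{
  if (isalnum (c))
    return TOY_WORD;
  if (c == '_')
    return TOY_SYMBOL;
  return TOY_OTHER;
}

/* True if a character of class S continues a word for the purpose of
   case operations.  */
static bool
toy_is_case_word (enum toy_syntax s, bool symbols_as_words)
{
  return s == TOY_WORD || (symbols_as_words && s == TOY_SYMBOL);
}

/* Copy IN to OUT, upcasing the first character of each case-word.
   OUT must have room for strlen (IN) + 1 bytes.  */
static void
toy_upcase_initials (const char *in, char *out, bool symbols_as_words)
{
  assert (in != NULL && out != NULL);
  size_t len = strlen (in);
  bool inword = false;

  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) in[i];
      bool was_inword = inword;
      inword = toy_is_case_word (toy_syntax_of (c), symbols_as_words);
      /* Only a character that starts a case-word is upcased.  */
      out[i] = (inword && !was_inword) ? (char) toupper (c) : (char) c;
    }
  out[len] = '\0';
}
```

With the flag off, "aa_bb cc_dd" becomes "Aa_Bb Cc_Dd" in this model;
with the flag on, the underscore no longer ends the word and the result
is "Aa_bb Cc_dd".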


In GNU Emacs 29.1.50 (build 11, x86_64-pc-linux-gnu, X toolkit, cairo
 version 1.15.12, Xaw scroll bars) of 2023-10-18 built on

Repository revision: 9163e634e296435aa7a78bc6b77b4ee90666d2ac
Repository branch: emacs-29
Windowing system distributor 'The X.Org Foundation', version 11.0.12011000







* bug#66614: 29.1.50; Support not capitalizing words inside symbols
  2023-10-18 16:32 bug#66614: 29.1.50; Support not capitalizing words inside symbols Spencer Baugh
@ 2023-10-18 17:01 ` Spencer Baugh
  2023-10-18 18:24   ` Eli Zaretskii
  2023-10-18 18:34   ` Eli Zaretskii
  0 siblings, 2 replies; 11+ messages in thread
From: Spencer Baugh @ 2023-10-18 17:01 UTC
  To: 66614

[-- Attachment #1: Type: text/plain, Size: 17 bytes --]


Patch follows.


[-- Attachment #2: 0001-Add-case-symbols-as-words-to-configure-symbol-case-b.patch --]
[-- Type: text/x-patch, Size: 8683 bytes --]

From e11c5096b2e0a3eddec8fac692142ff31c889109 Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh@janestreet.com>
Date: Wed, 18 Oct 2023 12:51:37 -0400
Subject: [PATCH] Add case-symbols-as-words to configure symbol case behavior

In some programming languages and styles, a symbol (or every symbol in
a sequence of symbols) might be capitalized, but the individual words
making up the symbol should never be capitalized.

For example, in OCaml, type names Look_like_this and variable names
look_like_this, but it is basically never correct for something to
Look_Like_This.  And one might have "aa_bb cc_dd ee_ff" or "Aa_bb
Cc_dd Ee_ff", but never "Aa_Bb Cc_Dd Ee_Ff".

To support this, the new variable case-symbols-as-words causes symbol
constituents to be treated as part of words only for case operations.

* src/casefiddle.c (case_ch_is_word): Add.
(case_character_impl): Use case_ch_is_word.
(case_character): Use case_ch_is_word.
(syms_of_casefiddle): Define case-symbols-as-words. (bug#66614)
* src/search.c (Freplace_match): Use case-symbols-as-words when
calculating case pattern.
* test/src/casefiddle-tests.el (casefiddle-tests--check-syms)
(casefiddle-case-symbols-as-words): Test case-symbols-as-words.
* etc/NEWS: Announce case-symbols-as-words.
* doc/lispref/strings.texi (Case Conversion): Document
case-symbols-as-words.
---
 doc/lispref/strings.texi     |  8 ++++++--
 etc/NEWS                     |  8 ++++++++
 src/casefiddle.c             | 23 +++++++++++++++++++++--
 src/search.c                 | 11 +++++++----
 test/src/casefiddle-tests.el | 12 ++++++++++++
 5 files changed, 54 insertions(+), 8 deletions(-)

diff --git a/doc/lispref/strings.texi b/doc/lispref/strings.texi
index 7d11db49def..417614c9320 100644
--- a/doc/lispref/strings.texi
+++ b/doc/lispref/strings.texi
@@ -1510,7 +1510,9 @@ Case Conversion
 
 The definition of a word is any sequence of consecutive characters that
 are assigned to the word constituent syntax class in the current syntax
-table (@pxref{Syntax Class Table}).
+table (@pxref{Syntax Class Table}), or if @var{case-symbols-as-words} is
+non-nil, also characters assigned to the symbol constituent syntax
+class.
 
 When @var{string-or-char} is a character, this function does the same
 thing as @code{upcase}.
@@ -1542,7 +1544,9 @@ Case Conversion
 
 The definition of a word is any sequence of consecutive characters that
 are assigned to the word constituent syntax class in the current syntax
-table (@pxref{Syntax Class Table}).
+table (@pxref{Syntax Class Table}), or if @var{case-symbols-as-words} is
+non-nil, also characters assigned to the symbol constituent syntax
+class.
 
 When the argument to @code{upcase-initials} is a character,
 @code{upcase-initials} has the same result as @code{upcase}.
diff --git a/etc/NEWS b/etc/NEWS
index 129017f7dbe..23078f18273 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1100,6 +1100,14 @@ instead of "ctags", "ebrowse", "etags", "hexl", "emacsclient", and
 "rcs2log", when starting one of these built in programs in a
 subprocess.
 
++++
+** New variable 'case-symbols-as-words' to change case behavior for symbols.
+If this is set to non-nil, then case operations such as
+'upcase-initials' or 'replace-match' (with nil FIXEDCASE) will treat
+symbol constituents as if they were part of words.  This is useful for
+programming languages and style where words in the middle of symbols
+are never capitalized.
+
 +++
 ** 'x-popup-menu' now understands touch screen events.
 When a 'touchscreen-begin' or 'touchscreen-end' event is passed as the
diff --git a/src/casefiddle.c b/src/casefiddle.c
index d567a5e353a..3f1f5680dd5 100644
--- a/src/casefiddle.c
+++ b/src/casefiddle.c
@@ -92,6 +92,12 @@ prepare_casing_context (struct casing_context *ctx,
     SETUP_BUFFER_SYNTAX_TABLE ();	/* For syntax_prefix_flag_p.  */
 }
 
+static bool
+case_ch_is_word (enum syntaxcode syntax)
+{
+  return syntax == Sword || (case_symbols_as_words && syntax == Ssymbol);
+}
+
 struct casing_str_buf
 {
   unsigned char data[max (6, MAX_MULTIBYTE_LENGTH)];
@@ -115,7 +121,7 @@ case_character_impl (struct casing_str_buf *buf,
 
   /* Update inword state */
   bool was_inword = ctx->inword;
-  ctx->inword = SYNTAX (ch) == Sword &&
+  ctx->inword = case_ch_is_word (SYNTAX (ch)) &&
     (!ctx->inbuffer || was_inword || !syntax_prefix_flag_p (ch));
 
   /* Normalize flag so its one of CASE_UP, CASE_DOWN or CASE_CAPITALIZE.  */
@@ -222,7 +228,7 @@ case_character (struct casing_str_buf *buf, struct casing_context *ctx,
      has a word syntax (i.e. current character is end of word), use final
      sigma.  */
   if (was_inword && ch == GREEK_CAPITAL_LETTER_SIGMA && changed
-      && (!next || SYNTAX (STRING_CHAR (next)) != Sword))
+      && (!next || !case_ch_is_word (SYNTAX (STRING_CHAR (next)))))
     {
       buf->len_bytes = CHAR_STRING (GREEK_SMALL_LETTER_FINAL_SIGMA, buf->data);
       buf->len_chars = 1;
@@ -720,6 +726,19 @@ syms_of_casefiddle (void)
   3rd argument.  */);
   Vregion_extract_function = Qnil; /* simple.el sets this.  */
 
+  DEFVAR_BOOL ("case-symbols-as-words", case_symbols_as_words,
+	       doc: /* If non-nil, case functions treat symbol syntax as part of words.
+
+Functions such as `upcase-initials' and `replace-match' check or modify
+the case pattern of sequences of characters.  Normally, these operate on
+sequences of characters whose syntax is word constituent.  If this
+variable is non-nil, then they operate on sequences of characters who
+syntax is either word constituent or symbol constituent.
+
+This is useful for programming styles which wish to capitalize the
+beginning of symbols, but not capitalize individual words in a symbol.*/);
+  case_symbols_as_words = 0;
+
   defsubr (&Supcase);
   defsubr (&Sdowncase);
   defsubr (&Scapitalize);
diff --git a/src/search.c b/src/search.c
index e9b29bb7179..b15ec52fa46 100644
--- a/src/search.c
+++ b/src/search.c
@@ -2365,7 +2365,7 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
 convert NEWTEXT to all caps.  Otherwise if all words are capitalized
 in the replaced text, capitalize each word in NEWTEXT.  Note that
 what exactly is a word is determined by the syntax tables in effect
-in the current buffer.
+in the current buffer, and the variable `case-symbols-as-words'.
 
 If optional third arg LITERAL is non-nil, insert NEWTEXT literally.
 Otherwise treat `\\' as special:
@@ -2479,7 +2479,8 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
 	      /* Cannot be all caps if any original char is lower case */
 
 	      some_lowercase = 1;
-	      if (SYNTAX (prevc) != Sword)
+	      if (SYNTAX (prevc) != Sword
+		  && (!case_symbols_as_words || SYNTAX (prevc) != Ssymbol))
 		some_nonuppercase_initial = 1;
 	      else
 		some_multiletter_word = 1;
@@ -2487,7 +2488,8 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
 	  else if (uppercasep (c))
 	    {
 	      some_uppercase = 1;
-	      if (SYNTAX (prevc) != Sword)
+	      if (SYNTAX (prevc) != Sword
+		  && (!case_symbols_as_words || SYNTAX (prevc) != Ssymbol))
 		;
 	      else
 		some_multiletter_word = 1;
@@ -2496,7 +2498,8 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
 	    {
 	      /* If the initial is a caseless word constituent,
 		 treat that like a lowercase initial.  */
-	      if (SYNTAX (prevc) != Sword)
+	      if (SYNTAX (prevc) != Sword
+		  && (!case_symbols_as_words || SYNTAX (prevc) != Ssymbol))
 		some_nonuppercase_initial = 1;
 	    }
 
diff --git a/test/src/casefiddle-tests.el b/test/src/casefiddle-tests.el
index e7f4348b0c6..12984d898b9 100644
--- a/test/src/casefiddle-tests.el
+++ b/test/src/casefiddle-tests.el
@@ -294,4 +294,16 @@ casefiddle-turkish
     ;;(should (string-equal (capitalize "indIá") "İndıa"))
     ))
 
+(defun casefiddle-tests--check-syms (init with-words with-symbols)
+  (let ((case-symbols-as-words nil))
+    (should (string-equal (upcase-initials init) with-words)))
+  (let ((case-symbols-as-words t))
+    (should (string-equal (upcase-initials init) with-symbols))))
+
+(ert-deftest casefiddle-case-symbols-as-words ()
+  (casefiddle-tests--check-syms "Aa_bb Cc_dd" "Aa_Bb Cc_Dd" "Aa_bb Cc_dd")
+  (casefiddle-tests--check-syms "Aa_bb cc_DD" "Aa_Bb Cc_DD" "Aa_bb Cc_DD")
+  (casefiddle-tests--check-syms "aa_bb cc_dd" "Aa_Bb Cc_Dd" "Aa_bb Cc_dd")
+  (casefiddle-tests--check-syms "Aa_Bb Cc_Dd" "Aa_Bb Cc_Dd" "Aa_Bb Cc_Dd"))
+
 ;;; casefiddle-tests.el ends here
-- 
2.39.3



* bug#66614: 29.1.50; Support not capitalizing words inside symbols
  2023-10-18 17:01 ` Spencer Baugh
@ 2023-10-18 18:24   ` Eli Zaretskii
  2023-10-18 18:55     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-10-18 18:34   ` Eli Zaretskii
  1 sibling, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2023-10-18 18:24 UTC
  To: Spencer Baugh, Stefan Monnier; +Cc: 66614

> From: Spencer Baugh <sbaugh@janestreet.com>
> Date: Wed, 18 Oct 2023 13:01:43 -0400
> 
> From e11c5096b2e0a3eddec8fac692142ff31c889109 Mon Sep 17 00:00:00 2001
> From: Spencer Baugh <sbaugh@janestreet.com>
> Date: Wed, 18 Oct 2023 12:51:37 -0400
> Subject: [PATCH] Add case-symbols-as-words to configure symbol case behavior
> 
> In some programming languages and styles, a symbol (or every symbol in
> a sequence of symbols) might be capitalized, but the individual words
> making up the symbol should never be capitalized.
> 
> For example, in OCaml, type names Look_like_this and variable names
> look_like_this, but it is basically never correct for something to
> Look_Like_This.  And one might have "aa_bb cc_dd ee_ff" or "Aa_bb
> Cc_dd Ee_ff", but never "Aa_Bb Cc_Dd Ee_Ff".
> 
> To support this, the new variable case-symbols-as-words causes symbol
> constituents to be treated as part of words only for case operations.
> 
> * src/casefiddle.c (case_ch_is_word): Add.
> (case_character_impl): Use case_ch_is_word.
> (case_character): Use case_ch_is_word.
> (syms_of_casefiddle): Define case-symbols-as-words. (bug#66614)
> * src/search.c (Freplace_match): Use case-symbols-as-words when
> calculating case pattern.
> * test/src/casefiddle-tests.el (casefiddle-tests--check-syms)
> (casefiddle-case-symbols-as-words): Test case-symbols-as-words.
> * etc/NEWS: Announce case-symbols-as-words.
> * doc/lispref/strings.texi (Case Conversion): Document
> case-symbols-as-words.

Stefan, any comments?






* bug#66614: 29.1.50; Support not capitalizing words inside symbols
  2023-10-18 17:01 ` Spencer Baugh
  2023-10-18 18:24   ` Eli Zaretskii
@ 2023-10-18 18:34   ` Eli Zaretskii
  2023-10-18 19:38     ` Spencer Baugh
  1 sibling, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2023-10-18 18:34 UTC
  To: Spencer Baugh; +Cc: 66614

> From: Spencer Baugh <sbaugh@janestreet.com>
> Date: Wed, 18 Oct 2023 13:01:43 -0400
> 
> --- a/doc/lispref/strings.texi
> +++ b/doc/lispref/strings.texi
> @@ -1510,7 +1510,9 @@ Case Conversion
>  
>  The definition of a word is any sequence of consecutive characters that
>  are assigned to the word constituent syntax class in the current syntax
> -table (@pxref{Syntax Class Table}).
> +table (@pxref{Syntax Class Table}), or if @var{case-symbols-as-words} is
> +non-nil, also characters assigned to the symbol constituent syntax
> +class.
>  
>  When @var{string-or-char} is a character, this function does the same
>  thing as @code{upcase}.
> @@ -1542,7 +1544,9 @@ Case Conversion
>  
>  The definition of a word is any sequence of consecutive characters that
>  are assigned to the word constituent syntax class in the current syntax
> -table (@pxref{Syntax Class Table}).
> +table (@pxref{Syntax Class Table}), or if @var{case-symbols-as-words} is
> +non-nil, also characters assigned to the symbol constituent syntax
> +class.

These two hunks use @var incorrectly: case-symbols-as-words is a
literal symbol, so it should have the @code markup.

> ++++
> +** New variable 'case-symbols-as-words' to change case behavior for symbols.

"Case behavior" is confusing.  I think you mean

  New variable 'case-symbols-as-words' affects case operations for symbols.

> +If this is set to non-nil, then case operations such as
> +'upcase-initials' or 'replace-match' (with nil FIXEDCASE) will treat
> +symbol constituents as if they were part of words.

Don't you mean

  will treat the entire symbol name as a single word

?  I find the text you used confusing, FWIW.

>                                                    This is useful for
> +programming languages and style where words in the middle of symbols
> +are never capitalized.

Likewise here: instead of talking about "words in the middle of
symbols", wouldn't it be better to say something like

  ...style where only the first letter of a symbol's name is ever
  capitalized.

?

Also, please say here that the default of this new variable is nil.

> +  DEFVAR_BOOL ("case-symbols-as-words", case_symbols_as_words,
> +	       doc: /* If non-nil, case functions treat symbol syntax as part of words.
> +
> +Functions such as `upcase-initials' and `replace-match' check or modify
> +the case pattern of sequences of characters.  Normally, these operate on
> +sequences of characters whose syntax is word constituent.  If this
> +variable is non-nil, then they operate on sequences of characters who
> +syntax is either word constituent or symbol constituent.
> +
> +This is useful for programming styles which wish to capitalize the
> +beginning of symbols, but not capitalize individual words in a symbol.*/);

Similar comments about this doc string.

Also, shouldn't this variable be buffer-local?  You want certain major
modes to set it, right?

> -	      if (SYNTAX (prevc) != Sword)
> +	      if (SYNTAX (prevc) != Sword
> +		  && (!case_symbols_as_words || SYNTAX (prevc) != Ssymbol))

I think the code will be more clear if you use

		  && !(case_symbols_as_words && SYNTAX (prevc) == Ssymbol))
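
The two spellings are logically equivalent by De Morgan's law; the
suggestion is purely about readability.  A small stand-alone check,
using invented names (enum syn, cond_original, cond_suggested) rather
than the Emacs macros, could look like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy syntax classes standing in for the real syntax codes.  */
enum syn { SYN_OTHER, SYN_WORD, SYN_SYMBOL };

/* Condition as written in the first version of the patch.  */
static bool
cond_original (bool symbols_as_words, enum syn prev)
{
  return prev != SYN_WORD
	 && (!symbols_as_words || prev != SYN_SYMBOL);
}

/* Condition as suggested in review.  */
static bool
cond_suggested (bool symbols_as_words, enum syn prev)
{
  return prev != SYN_WORD
	 && !(symbols_as_words && prev == SYN_SYMBOL);
}

/* Exhaustively compare both forms over every input combination.  */
static void
check_conditions_agree (void)
{
  for (int flag = 0; flag <= 1; flag++)
    for (int s = SYN_OTHER; s <= SYN_SYMBOL; s++)
      assert (cond_original (flag != 0, (enum syn) s)
	      == cond_suggested (flag != 0, (enum syn) s));
}
```

Both forms are true exactly when the previous character is not part of
a word for case purposes.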

>  	  else if (uppercasep (c))
>  	    {
>  	      some_uppercase = 1;
> -	      if (SYNTAX (prevc) != Sword)
> +	      if (SYNTAX (prevc) != Sword
> +		  && (!case_symbols_as_words || SYNTAX (prevc) != Ssymbol))

Same here.

>  	      /* If the initial is a caseless word constituent,
>  		 treat that like a lowercase initial.  */
> -	      if (SYNTAX (prevc) != Sword)
> +	      if (SYNTAX (prevc) != Sword
> +		  && (!case_symbols_as_words || SYNTAX (prevc) != Ssymbol))
>  		some_nonuppercase_initial = 1;

And here.

Thanks.






* bug#66614: 29.1.50; Support not capitalizing words inside symbols
  2023-10-18 18:24   ` Eli Zaretskii
@ 2023-10-18 18:55     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 11+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-10-18 18:55 UTC
  To: Eli Zaretskii; +Cc: Spencer Baugh, 66614

> Stefan, any comments?

Not really, no.  Ideally, we could introduce some kind of
`case-word-forward-function` hook so that we can accommodate even more
conventions, but this boolean var doesn't cost much.


        Stefan







* bug#66614: 29.1.50; Support not capitalizing words inside symbols
  2023-10-18 18:34   ` Eli Zaretskii
@ 2023-10-18 19:38     ` Spencer Baugh
  2023-10-19  4:35       ` Eli Zaretskii
  2023-10-19 10:54       ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 11+ messages in thread
From: Spencer Baugh @ 2023-10-18 19:38 UTC
  To: Eli Zaretskii; +Cc: 66614

[-- Attachment #1: Type: text/plain, Size: 4298 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:
>> From: Spencer Baugh <sbaugh@janestreet.com>
>> Date: Wed, 18 Oct 2023 13:01:43 -0400
>> 
>> --- a/doc/lispref/strings.texi
>> +++ b/doc/lispref/strings.texi
>> @@ -1510,7 +1510,9 @@ Case Conversion
>>  
>>  The definition of a word is any sequence of consecutive characters that
>>  are assigned to the word constituent syntax class in the current syntax
>> -table (@pxref{Syntax Class Table}).
>> +table (@pxref{Syntax Class Table}), or if @var{case-symbols-as-words} is
>> +non-nil, also characters assigned to the symbol constituent syntax
>> +class.
>>  
>>  When @var{string-or-char} is a character, this function does the same
>>  thing as @code{upcase}.
>> @@ -1542,7 +1544,9 @@ Case Conversion
>>  
>>  The definition of a word is any sequence of consecutive characters that
>>  are assigned to the word constituent syntax class in the current syntax
>> -table (@pxref{Syntax Class Table}).
>> +table (@pxref{Syntax Class Table}), or if @var{case-symbols-as-words} is
>> +non-nil, also characters assigned to the symbol constituent syntax
>> +class.
>
> These two hunks use @var incorrectly: case-symbols-as-words is a
> literal symbol, so it should have the @code markup.

Fixed.

>> ++++
>> +** New variable 'case-symbols-as-words' to change case behavior for symbols.
>
> "Case behavior" is confusing.  I think you mean
>
>   New variable 'case-symbols-as-words' affects case operations for symbols.

Fixed.

>> +If this is set to non-nil, then case operations such as
>> +'upcase-initials' or 'replace-match' (with nil FIXEDCASE) will treat
>> +symbol constituents as if they were part of words.
>
> Don't you mean
>
>   will treat the entire symbol name as a single word
>
> ?  I find the text you used confusing, FWIW.

Fixed.

>>                                                    This is useful for
>> +programming languages and style where words in the middle of symbols
>> +are never capitalized.
>
> Likewise here: instead of talking about "words in the middle of
> symbols", wouldn't it be better to say something like
>
>   ...style where only the first letter of a symbol's name is ever
>   capitalized.
>
> ?
>
> Also, please say here that the default of this new variable is nil.

Fixed.

>> +  DEFVAR_BOOL ("case-symbols-as-words", case_symbols_as_words,
>> +	       doc: /* If non-nil, case functions treat symbol syntax as part of words.
>> +
>> +Functions such as `upcase-initials' and `replace-match' check or modify
>> +the case pattern of sequences of characters.  Normally, these operate on
>> +sequences of characters whose syntax is word constituent.  If this
>> +variable is non-nil, then they operate on sequences of characters who
>> +syntax is either word constituent or symbol constituent.
>> +
>> +This is useful for programming styles which wish to capitalize the
>> +beginning of symbols, but not capitalize individual words in a symbol.*/);
>
> Similar comments about this doc string.

Fixed.

> Also, shouldn't this variable be buffer-local?  You want certain major
> modes to set it, right?

Yes, I want certain major modes to set it, although it's also possible
that some users will want to set it globally.

Are you suggesting it should be a DEFVAR_PER_BUFFER?  I can do that, but
I didn't think it was worth putting another slot into struct buffer.
Plus DEFVAR_PER_BUFFER has bad performance (O(#buffers)) when you
let-bind it, which I expect users might want to do sometimes.

>> -	      if (SYNTAX (prevc) != Sword)
>> +	      if (SYNTAX (prevc) != Sword
>> +		  && (!case_symbols_as_words || SYNTAX (prevc) != Ssymbol))
>
> I think the code will be more clear if you use
>
> 		  && !(case_symbols_as_words && SYNTAX (prevc) == Ssymbol))

Fixed.

>>  	  else if (uppercasep (c))
>>  	    {
>>  	      some_uppercase = 1;
>> -	      if (SYNTAX (prevc) != Sword)
>> +	      if (SYNTAX (prevc) != Sword
>> +		  && (!case_symbols_as_words || SYNTAX (prevc) != Ssymbol))
>
> Same here.
>

Fixed.

>>  	      /* If the initial is a caseless word constituent,
>>  		 treat that like a lowercase initial.  */
>> -	      if (SYNTAX (prevc) != Sword)
>> +	      if (SYNTAX (prevc) != Sword
>> +		  && (!case_symbols_as_words || SYNTAX (prevc) != Ssymbol))
>>  		some_nonuppercase_initial = 1;
>
> And here.
>

Fixed.


[-- Attachment #2: 0001-Add-case-symbols-as-words-to-configure-symbol-case-b.patch --]
[-- Type: text/x-patch, Size: 8673 bytes --]

From 8286118c70288217badbbb2afd7863ae2ba6848c Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh@janestreet.com>
Date: Wed, 18 Oct 2023 12:51:37 -0400
Subject: [PATCH] Add case-symbols-as-words to configure symbol case behavior

In some programming languages and styles, a symbol (or every symbol in
a sequence of symbols) might be capitalized, but the individual words
making up the symbol should never be capitalized.

For example, in OCaml, type names Look_like_this and variable names
look_like_this, but it is basically never correct for something to
Look_Like_This.  And one might have "aa_bb cc_dd ee_ff" or "Aa_bb
Cc_dd Ee_ff", but never "Aa_Bb Cc_Dd Ee_Ff".

To support this, the new variable case-symbols-as-words causes symbol
constituents to be treated as part of words only for case operations.

* src/casefiddle.c (case_ch_is_word): Add.
(case_character_impl): Use case_ch_is_word.
(case_character): Use case_ch_is_word.
(syms_of_casefiddle): Define case-symbols-as-words. (bug#66614)
* src/search.c (Freplace_match): Use case-symbols-as-words when
calculating case pattern.
* test/src/casefiddle-tests.el (casefiddle-tests--check-syms)
(casefiddle-case-symbols-as-words): Test case-symbols-as-words.
* etc/NEWS: Announce case-symbols-as-words.
* doc/lispref/strings.texi (Case Conversion): Document
case-symbols-as-words.
---
 doc/lispref/strings.texi     |  8 ++++++--
 etc/NEWS                     |  8 ++++++++
 src/casefiddle.c             | 23 +++++++++++++++++++++--
 src/search.c                 | 11 +++++++----
 test/src/casefiddle-tests.el | 12 ++++++++++++
 5 files changed, 54 insertions(+), 8 deletions(-)

diff --git a/doc/lispref/strings.texi b/doc/lispref/strings.texi
index 7d11db49def..665d4f9a8dc 100644
--- a/doc/lispref/strings.texi
+++ b/doc/lispref/strings.texi
@@ -1510,7 +1510,9 @@ Case Conversion
 
 The definition of a word is any sequence of consecutive characters that
 are assigned to the word constituent syntax class in the current syntax
-table (@pxref{Syntax Class Table}).
+table (@pxref{Syntax Class Table}), or if @code{case-symbols-as-words}
+is non-nil, also characters assigned to the symbol constituent syntax
+class.
 
 When @var{string-or-char} is a character, this function does the same
 thing as @code{upcase}.
@@ -1542,7 +1544,9 @@ Case Conversion
 
 The definition of a word is any sequence of consecutive characters that
 are assigned to the word constituent syntax class in the current syntax
-table (@pxref{Syntax Class Table}).
+table (@pxref{Syntax Class Table}), or if @code{case-symbols-as-words}
+is non-nil, also characters assigned to the symbol constituent syntax
+class.
 
 When the argument to @code{upcase-initials} is a character,
 @code{upcase-initials} has the same result as @code{upcase}.
diff --git a/etc/NEWS b/etc/NEWS
index 129017f7dbe..23867aafe6f 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1100,6 +1100,14 @@ instead of "ctags", "ebrowse", "etags", "hexl", "emacsclient", and
 "rcs2log", when starting one of these built in programs in a
 subprocess.
 
++++
+** New variable 'case-symbols-as-words' affects case operations for symbols.
+If non-nil, then case operations such as 'upcase-initials' or
+'replace-match' (with nil FIXEDCASE) will treat the entire symbol name
+as a single word.  This is useful for programming languages and styles
+where only the first letter of a symbol's name is ever capitalized.
+It defaults to nil.
+
 +++
 ** 'x-popup-menu' now understands touch screen events.
 When a 'touchscreen-begin' or 'touchscreen-end' event is passed as the
diff --git a/src/casefiddle.c b/src/casefiddle.c
index d567a5e353a..47e8950cda6 100644
--- a/src/casefiddle.c
+++ b/src/casefiddle.c
@@ -92,6 +92,12 @@ prepare_casing_context (struct casing_context *ctx,
     SETUP_BUFFER_SYNTAX_TABLE ();	/* For syntax_prefix_flag_p.  */
 }
 
+static bool
+case_ch_is_word (enum syntaxcode syntax)
+{
+  return syntax == Sword || (case_symbols_as_words && syntax == Ssymbol);
+}
+
 struct casing_str_buf
 {
   unsigned char data[max (6, MAX_MULTIBYTE_LENGTH)];
@@ -115,7 +121,7 @@ case_character_impl (struct casing_str_buf *buf,
 
   /* Update inword state */
   bool was_inword = ctx->inword;
-  ctx->inword = SYNTAX (ch) == Sword &&
+  ctx->inword = case_ch_is_word (SYNTAX (ch)) &&
     (!ctx->inbuffer || was_inword || !syntax_prefix_flag_p (ch));
 
   /* Normalize flag so its one of CASE_UP, CASE_DOWN or CASE_CAPITALIZE.  */
@@ -222,7 +228,7 @@ case_character (struct casing_str_buf *buf, struct casing_context *ctx,
      has a word syntax (i.e. current character is end of word), use final
      sigma.  */
   if (was_inword && ch == GREEK_CAPITAL_LETTER_SIGMA && changed
-      && (!next || SYNTAX (STRING_CHAR (next)) != Sword))
+      && (!next || !case_ch_is_word (SYNTAX (STRING_CHAR (next)))))
     {
       buf->len_bytes = CHAR_STRING (GREEK_SMALL_LETTER_FINAL_SIGMA, buf->data);
       buf->len_chars = 1;
@@ -720,6 +726,19 @@ syms_of_casefiddle (void)
   3rd argument.  */);
   Vregion_extract_function = Qnil; /* simple.el sets this.  */
 
+  DEFVAR_BOOL ("case-symbols-as-words", case_symbols_as_words,
+	       doc: /* If non-nil, case functions treat symbol syntax as part of words.
+
+Functions such as `upcase-initials' and `replace-match' check or modify
+the case pattern of sequences of characters.  Normally, these operate on
+sequences of characters whose syntax is word constituent.  If this
+variable is non-nil, then they operate on sequences of characters whose
+syntax is either word constituent or symbol constituent.
+
+This is useful for programming languages and styles where only the first
+letter of a symbol's name is ever capitalized.*/);
+  case_symbols_as_words = 0;
+
   defsubr (&Supcase);
   defsubr (&Sdowncase);
   defsubr (&Scapitalize);
diff --git a/src/search.c b/src/search.c
index e9b29bb7179..692d8488049 100644
--- a/src/search.c
+++ b/src/search.c
@@ -2365,7 +2365,7 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
 convert NEWTEXT to all caps.  Otherwise if all words are capitalized
 in the replaced text, capitalize each word in NEWTEXT.  Note that
 what exactly is a word is determined by the syntax tables in effect
-in the current buffer.
+in the current buffer, and the variable `case-symbols-as-words'.
 
 If optional third arg LITERAL is non-nil, insert NEWTEXT literally.
 Otherwise treat `\\' as special:
@@ -2479,7 +2479,8 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
 	      /* Cannot be all caps if any original char is lower case */
 
 	      some_lowercase = 1;
-	      if (SYNTAX (prevc) != Sword)
+	      if (SYNTAX (prevc) != Sword
+		  && !(case_symbols_as_words && SYNTAX (prevc) == Ssymbol))
 		some_nonuppercase_initial = 1;
 	      else
 		some_multiletter_word = 1;
@@ -2487,7 +2488,8 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
 	  else if (uppercasep (c))
 	    {
 	      some_uppercase = 1;
-	      if (SYNTAX (prevc) != Sword)
+	      if (SYNTAX (prevc) != Sword
+		  && !(case_symbols_as_words && SYNTAX (prevc) == Ssymbol))
 		;
 	      else
 		some_multiletter_word = 1;
@@ -2496,7 +2498,8 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
 	    {
 	      /* If the initial is a caseless word constituent,
 		 treat that like a lowercase initial.  */
-	      if (SYNTAX (prevc) != Sword)
+	      if (SYNTAX (prevc) != Sword
+		  && !(case_symbols_as_words && SYNTAX (prevc) == Ssymbol))
 		some_nonuppercase_initial = 1;
 	    }
 
diff --git a/test/src/casefiddle-tests.el b/test/src/casefiddle-tests.el
index e7f4348b0c6..12984d898b9 100644
--- a/test/src/casefiddle-tests.el
+++ b/test/src/casefiddle-tests.el
@@ -294,4 +294,16 @@ casefiddle-turkish
     ;;(should (string-equal (capitalize "indIá") "İndıa"))
     ))
 
+(defun casefiddle-tests--check-syms (init with-words with-symbols)
+  (let ((case-symbols-as-words nil))
+    (should (string-equal (upcase-initials init) with-words)))
+  (let ((case-symbols-as-words t))
+    (should (string-equal (upcase-initials init) with-symbols))))
+
+(ert-deftest casefiddle-case-symbols-as-words ()
+  (casefiddle-tests--check-syms "Aa_bb Cc_dd" "Aa_Bb Cc_Dd" "Aa_bb Cc_dd")
+  (casefiddle-tests--check-syms "Aa_bb cc_DD" "Aa_Bb Cc_DD" "Aa_bb Cc_DD")
+  (casefiddle-tests--check-syms "aa_bb cc_dd" "Aa_Bb Cc_Dd" "Aa_bb Cc_dd")
+  (casefiddle-tests--check-syms "Aa_Bb Cc_Dd" "Aa_Bb Cc_Dd" "Aa_Bb Cc_Dd"))
+
 ;;; casefiddle-tests.el ends here
-- 
2.39.3
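
For readers following the search.c hunks above, here is a minimal,
self-contained C sketch of how the three patched conditionals change the
case-pattern flags.  It is not the Emacs implementation: `toy_syntax',
`prev_is_word', `scan_case_pattern' and `struct case_flags' are
hypothetical names, the syntax table is reduced to "alphanumerics are
word constituents, `_' is a symbol constituent", and starting the scan
with prevc set to a newline is an assumption.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the Emacs syntax classes used in the patch.  */
enum syntaxcode { Sother, Sword, Ssymbol };

/* Assumption: alphanumerics are word constituents, '_' is a symbol
   constituent, everything else is "other".  */
static enum syntaxcode
toy_syntax (int c)
{
  if (isalnum (c))
    return Sword;
  if (c == '_')
    return Ssymbol;
  return Sother;
}

struct case_flags
{
  bool some_lowercase;
  bool some_uppercase;
  bool some_multiletter_word;
  bool some_nonuppercase_initial;
};

/* Negation of the patched condition: true if PREVC counts as being
   inside a word for case purposes.  */
static bool
prev_is_word (int prevc, bool case_symbols_as_words)
{
  enum syntaxcode s = toy_syntax (prevc);
  return s == Sword || (case_symbols_as_words && s == Ssymbol);
}

/* Walk TEXT and collect the flags that the patched conditionals set.  */
static struct case_flags
scan_case_pattern (const char *text, bool case_symbols_as_words)
{
  struct case_flags f = { false, false, false, false };
  int prevc = '\n';		/* Assumption: nothing precedes TEXT.  */

  for (const char *p = text; *p; p++)
    {
      int c = (unsigned char) *p;
      bool inside = prev_is_word (prevc, case_symbols_as_words);

      if (islower (c))
	{
	  f.some_lowercase = true;
	  if (!inside)
	    f.some_nonuppercase_initial = true;
	  else
	    f.some_multiletter_word = true;
	}
      else if (isupper (c))
	{
	  f.some_uppercase = true;
	  if (inside)
	    f.some_multiletter_word = true;
	}
      else if (!inside)
	/* Caseless character at the start of a word.  */
	f.some_nonuppercase_initial = true;

      prevc = c;
    }
  return f;
}
```

With this sketch, scanning "Az_bz" with the flag off sets
some_nonuppercase_initial (the `b' follows a non-word character), while
scanning it with the flag on leaves that flag clear, so the text reads
as a single capitalized word.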



* bug#66614: 29.1.50; Support not capitalizing words inside symbols
  2023-10-18 19:38     ` Spencer Baugh
@ 2023-10-19  4:35       ` Eli Zaretskii
  2023-10-21 15:11         ` sbaugh
  2023-10-19 10:54       ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2023-10-19  4:35 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: 66614

> From: Spencer Baugh <sbaugh@janestreet.com>
> Cc: 66614@debbugs.gnu.org
> Date: Wed, 18 Oct 2023 15:38:34 -0400
> 
> >> +  DEFVAR_BOOL ("case-symbols-as-words", case_symbols_as_words,
> >> +	       doc: /* If non-nil, case functions treat symbol syntax as part of words.
> >> +
> >> +Functions such as `upcase-initials' and `replace-match' check or modify
> >> +the case pattern of sequences of characters.  Normally, these operate on
> >> +sequences of characters whose syntax is word constituent.  If this
> >> +variable is non-nil, then they operate on sequences of characters whose
> >> +syntax is either word constituent or symbol constituent.
> >> +
> >> +This is useful for programming styles which wish to capitalize the
> >> +beginning of symbols, but not capitalize individual words in a symbol.*/);
> >
> > Similar comments about this doc string.
> 
> Fixed.
> 
> > Also, shouldn't this variable be buffer-local?  You want certain major
> > modes to set it, right?
> 
> Yes, I want certain major modes to set it, although it's also possible
> that some users will want to set it globally.
> 
> Are you suggesting it should be a DEFVAR_PER_BUFFER?  I can do that, but
> I didn't think it was worth putting another slot into struct buffer.

You don't have to add it to the buffer structure, you could call
Fmake_variable_buffer_local instead.  We already do that for some
variables.

> Plus DEFVAR_PER_BUFFER has bad performance (O(#buffers)) when you
> let-bind it, which I expect users might want to do sometimes.

That is a separate problem, and adding one more buffer-local variable
will hardly change the fact that let-binding and/or temporarily
switching buffers is expensive.  We should think about correctness
before we think about performance.  And correctness requires that this
variable be buffer-local, as making it global makes no sense IMO.






* bug#66614: 29.1.50; Support not capitalizing words inside symbols
  2023-10-18 19:38     ` Spencer Baugh
  2023-10-19  4:35       ` Eli Zaretskii
@ 2023-10-19 10:54       ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-10-21 15:13         ` sbaugh
  1 sibling, 1 reply; 11+ messages in thread
From: Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-10-19 10:54 UTC (permalink / raw)
  To: Spencer Baugh; +Cc: Eli Zaretskii, 66614

Spencer Baugh <sbaugh@janestreet.com> writes:

> +  DEFVAR_BOOL ("case-symbols-as-words", case_symbols_as_words,
> +	       doc: /* If non-nil, case functions treat symbol syntax as part of words.
> +
> +Functions such as `upcase-initials' and `replace-match' check or modify
> +the case pattern of sequences of characters.  Normally, these operate on
> +sequences of characters whose syntax is word constituent.  If this
> +variable is non-nil, then they operate on sequences of characters whose
> +syntax is either word constituent or symbol constituent.
> +
> +This is useful for programming languages and styles where only the first
> +letter of a symbol's name is ever capitalized.*/);
> +  case_symbols_as_words = 0;

Incidentally:

Let's not introduce further instances of the anti-pattern where the
``doc:'' marker in DEFVAR constructs is aligned with its opening paren,
rather than two columns into the word DEFVAR itself.  Generally, this
deprives you of latitude for the first line of the doc string, for you
are either compelled to contravene the 80 column limit or to render that
line exceptionally laconic, both of which are ultimately
counterproductive.

TIA.






* bug#66614: 29.1.50; Support not capitalizing words inside symbols
  2023-10-19  4:35       ` Eli Zaretskii
@ 2023-10-21 15:11         ` sbaugh
  2023-10-29 11:42           ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: sbaugh @ 2023-10-21 15:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Spencer Baugh, 66614

[-- Attachment #1: Type: text/plain, Size: 1489 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Spencer Baugh <sbaugh@janestreet.com>
>> Cc: 66614@debbugs.gnu.org
>> Date: Wed, 18 Oct 2023 15:38:34 -0400
>> 
>> >> +  DEFVAR_BOOL ("case-symbols-as-words", case_symbols_as_words,
>> >> +	       doc: /* If non-nil, case functions treat symbol syntax as part of words.
>> >> +
>> >> +Functions such as `upcase-initials' and `replace-match' check or modify
>> >> +the case pattern of sequences of characters.  Normally, these operate on
>> >> +sequences of characters whose syntax is word constituent.  If this
>> >> +variable is non-nil, then they operate on sequences of characters whose
>> >> +syntax is either word constituent or symbol constituent.
>> >> +
>> >> +This is useful for programming styles which wish to capitalize the
>> >> +beginning of symbols, but not capitalize individual words in a symbol.*/);
>> >
>> > Similar comments about this doc string.
>> 
>> Fixed.
>> 
>> > Also, shouldn't this variable be buffer-local?  You want certain major
>> > modes to set it, right?
>> 
>> Yes, I want certain major modes to set it, although it's also possible
>> that some users will want to set it globally.
>> 
>> Are you suggesting it should be a DEFVAR_PER_BUFFER?  I can do that, but
>> I didn't think it was worth putting another slot into struct buffer.
>
> You don't have to add it to the buffer structure, you could call
> Fmake_variable_buffer_local instead.  We already do that for some
> variables.

Oh, of course.  Done.


[-- Attachment #2: 0001-Add-case-symbols-as-words-to-configure-symbol-case-b.patch --]
[-- Type: text/x-patch, Size: 8787 bytes --]

From 22540be262399f3ec232da713b3ba454299e18d2 Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh@catern.com>
Date: Sat, 21 Oct 2023 11:09:39 -0400
Subject: [PATCH] Add case-symbols-as-words to configure symbol case behavior

In some programming languages and styles, a symbol (or every symbol in
a sequence of symbols) might be capitalized, but the individual words
making up the symbol should never be capitalized.

For example, in OCaml, type names Look_like_this and variable names
look_like_this, but it is basically never correct for something to
Look_Like_This.  And one might have "aa_bb cc_dd ee_ff" or "Aa_bb
Cc_dd Ee_ff", but never "Aa_Bb Cc_Dd Ee_Ff".

To support this, the new variable case-symbols-as-words causes symbol
constituents to be treated as part of words only for case operations.

* src/casefiddle.c (case_ch_is_word): Add.
(case_character_impl): Use case_ch_is_word.
(case_character): Use case_ch_is_word.
(syms_of_casefiddle): Define case-symbols-as-words. (bug#66614)
* src/search.c (Freplace_match): Use case-symbols-as-words when
calculating case pattern.
* test/src/casefiddle-tests.el (casefiddle-tests--check-syms)
(casefiddle-case-symbols-as-words): Test case-symbols-as-words.
* etc/NEWS: Announce case-symbols-as-words.
* doc/lispref/strings.texi (Case Conversion): Document
case-symbols-as-words.
---
 doc/lispref/strings.texi     |  8 ++++++--
 etc/NEWS                     |  8 ++++++++
 src/casefiddle.c             | 25 +++++++++++++++++++++++--
 src/search.c                 | 11 +++++++----
 test/src/casefiddle-tests.el | 12 ++++++++++++
 5 files changed, 56 insertions(+), 8 deletions(-)

diff --git a/doc/lispref/strings.texi b/doc/lispref/strings.texi
index 7d11db49def..665d4f9a8dc 100644
--- a/doc/lispref/strings.texi
+++ b/doc/lispref/strings.texi
@@ -1510,7 +1510,9 @@ Case Conversion
 
 The definition of a word is any sequence of consecutive characters that
 are assigned to the word constituent syntax class in the current syntax
-table (@pxref{Syntax Class Table}).
+table (@pxref{Syntax Class Table}), or if @code{case-symbols-as-words}
+is non-nil, also characters assigned to the symbol constituent syntax
+class.
 
 When @var{string-or-char} is a character, this function does the same
 thing as @code{upcase}.
@@ -1542,7 +1544,9 @@ Case Conversion
 
 The definition of a word is any sequence of consecutive characters that
 are assigned to the word constituent syntax class in the current syntax
-table (@pxref{Syntax Class Table}).
+table (@pxref{Syntax Class Table}), or if @code{case-symbols-as-words}
+is non-nil, also characters assigned to the symbol constituent syntax
+class.
 
 When the argument to @code{upcase-initials} is a character,
 @code{upcase-initials} has the same result as @code{upcase}.
diff --git a/etc/NEWS b/etc/NEWS
index 4a44782f972..9d1a81789c6 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1131,6 +1131,14 @@ instead of "ctags", "ebrowse", "etags", "hexl", "emacsclient", and
 "rcs2log", when starting one of these built in programs in a
 subprocess.
 
++++
+** New variable 'case-symbols-as-words' affects case operations for symbols.
+If non-nil, then case operations such as 'upcase-initials' or
+'replace-match' (with nil FIXEDCASE) will treat the entire symbol name
+as a single word.  This is useful for programming languages and styles
+where only the first letter of a symbol's name is ever capitalized.
+It defaults to nil.
+
 +++
 ** 'x-popup-menu' now understands touch screen events.
 When a 'touchscreen-begin' or 'touchscreen-end' event is passed as the
diff --git a/src/casefiddle.c b/src/casefiddle.c
index d567a5e353a..3afb131c50e 100644
--- a/src/casefiddle.c
+++ b/src/casefiddle.c
@@ -92,6 +92,12 @@ prepare_casing_context (struct casing_context *ctx,
     SETUP_BUFFER_SYNTAX_TABLE ();	/* For syntax_prefix_flag_p.  */
 }
 
+static bool
+case_ch_is_word (enum syntaxcode syntax)
+{
+  return syntax == Sword || (case_symbols_as_words && syntax == Ssymbol);
+}
+
 struct casing_str_buf
 {
   unsigned char data[max (6, MAX_MULTIBYTE_LENGTH)];
@@ -115,7 +121,7 @@ case_character_impl (struct casing_str_buf *buf,
 
   /* Update inword state */
   bool was_inword = ctx->inword;
-  ctx->inword = SYNTAX (ch) == Sword &&
+  ctx->inword = case_ch_is_word (SYNTAX (ch)) &&
     (!ctx->inbuffer || was_inword || !syntax_prefix_flag_p (ch));
 
   /* Normalize flag so its one of CASE_UP, CASE_DOWN or CASE_CAPITALIZE.  */
@@ -222,7 +228,7 @@ case_character (struct casing_str_buf *buf, struct casing_context *ctx,
      has a word syntax (i.e. current character is end of word), use final
      sigma.  */
   if (was_inword && ch == GREEK_CAPITAL_LETTER_SIGMA && changed
-      && (!next || SYNTAX (STRING_CHAR (next)) != Sword))
+      && (!next || !case_ch_is_word (SYNTAX (STRING_CHAR (next)))))
     {
       buf->len_bytes = CHAR_STRING (GREEK_SMALL_LETTER_FINAL_SIGMA, buf->data);
       buf->len_chars = 1;
@@ -720,6 +726,21 @@ syms_of_casefiddle (void)
   3rd argument.  */);
   Vregion_extract_function = Qnil; /* simple.el sets this.  */
 
+  DEFVAR_BOOL ("case-symbols-as-words", case_symbols_as_words,
+    doc: /* If non-nil, case functions treat symbol syntax as part of words.
+
+Functions such as `upcase-initials' and `replace-match' check or modify
+the case pattern of sequences of characters.  Normally, these operate on
+sequences of characters whose syntax is word constituent.  If this
+variable is non-nil, then they operate on sequences of characters whose
+syntax is either word constituent or symbol constituent.
+
+This is useful for programming languages and styles where only the first
+letter of a symbol's name is ever capitalized.*/);
+  case_symbols_as_words = 0;
+  DEFSYM (Qcase_symbols_as_words, "case-symbols-as-words");
+  Fmake_variable_buffer_local (Qcase_symbols_as_words);
+
   defsubr (&Supcase);
   defsubr (&Sdowncase);
   defsubr (&Scapitalize);
diff --git a/src/search.c b/src/search.c
index e9b29bb7179..692d8488049 100644
--- a/src/search.c
+++ b/src/search.c
@@ -2365,7 +2365,7 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
 convert NEWTEXT to all caps.  Otherwise if all words are capitalized
 in the replaced text, capitalize each word in NEWTEXT.  Note that
 what exactly is a word is determined by the syntax tables in effect
-in the current buffer.
+in the current buffer, and the variable `case-symbols-as-words'.
 
 If optional third arg LITERAL is non-nil, insert NEWTEXT literally.
 Otherwise treat `\\' as special:
@@ -2479,7 +2479,8 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
 	      /* Cannot be all caps if any original char is lower case */
 
 	      some_lowercase = 1;
-	      if (SYNTAX (prevc) != Sword)
+	      if (SYNTAX (prevc) != Sword
+		  && !(case_symbols_as_words && SYNTAX (prevc) == Ssymbol))
 		some_nonuppercase_initial = 1;
 	      else
 		some_multiletter_word = 1;
@@ -2487,7 +2488,8 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
 	  else if (uppercasep (c))
 	    {
 	      some_uppercase = 1;
-	      if (SYNTAX (prevc) != Sword)
+	      if (SYNTAX (prevc) != Sword
+		  && !(case_symbols_as_words && SYNTAX (prevc) == Ssymbol))
 		;
 	      else
 		some_multiletter_word = 1;
@@ -2496,7 +2498,8 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0,
 	    {
 	      /* If the initial is a caseless word constituent,
 		 treat that like a lowercase initial.  */
-	      if (SYNTAX (prevc) != Sword)
+	      if (SYNTAX (prevc) != Sword
+		  && !(case_symbols_as_words && SYNTAX (prevc) == Ssymbol))
 		some_nonuppercase_initial = 1;
 	    }
 
diff --git a/test/src/casefiddle-tests.el b/test/src/casefiddle-tests.el
index e7f4348b0c6..12984d898b9 100644
--- a/test/src/casefiddle-tests.el
+++ b/test/src/casefiddle-tests.el
@@ -294,4 +294,16 @@ casefiddle-turkish
     ;;(should (string-equal (capitalize "indIá") "İndıa"))
     ))
 
+(defun casefiddle-tests--check-syms (init with-words with-symbols)
+  (let ((case-symbols-as-words nil))
+    (should (string-equal (upcase-initials init) with-words)))
+  (let ((case-symbols-as-words t))
+    (should (string-equal (upcase-initials init) with-symbols))))
+
+(ert-deftest casefiddle-case-symbols-as-words ()
+  (casefiddle-tests--check-syms "Aa_bb Cc_dd" "Aa_Bb Cc_Dd" "Aa_bb Cc_dd")
+  (casefiddle-tests--check-syms "Aa_bb cc_DD" "Aa_Bb Cc_DD" "Aa_bb Cc_DD")
+  (casefiddle-tests--check-syms "aa_bb cc_dd" "Aa_Bb Cc_Dd" "Aa_bb Cc_dd")
+  (casefiddle-tests--check-syms "Aa_Bb Cc_Dd" "Aa_Bb Cc_Dd" "Aa_Bb Cc_Dd"))
+
 ;;; casefiddle-tests.el ends here
-- 
2.41.0
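
As an illustration of what the new tests in casefiddle-tests.el check,
the following self-contained C sketch mimics `upcase-initials' using a
predicate shaped like the patch's case_ch_is_word.  It is a toy under
stated assumptions, not the casefiddle.c code: `toy_syntax' and
`toy_upcase_initials' are hypothetical, the flag is passed as a
parameter instead of being a global variable, only ASCII is handled,
and the syntax table is reduced to "alphanumerics are word
constituents, `_' is a symbol constituent".

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the Emacs syntax classes.  */
enum syntaxcode { Sother, Sword, Ssymbol };

/* Assumption: alphanumerics are word constituents, '_' is a symbol
   constituent, everything else is "other".  */
static enum syntaxcode
toy_syntax (int c)
{
  if (isalnum (c))
    return Sword;
  if (c == '_')
    return Ssymbol;
  return Sother;
}

/* Same shape as the predicate added by the patch, except that the
   flag is an explicit argument here.  */
static bool
case_ch_is_word (enum syntaxcode syntax, bool case_symbols_as_words)
{
  return syntax == Sword || (case_symbols_as_words && syntax == Ssymbol);
}

/* Upcase, in place, the first character of every word in BUF and
   leave all other characters untouched.  */
static void
toy_upcase_initials (char *buf, bool case_symbols_as_words)
{
  bool inword = false;

  for (char *p = buf; *p; p++)
    {
      int c = (unsigned char) *p;
      bool was_inword = inword;

      inword = case_ch_is_word (toy_syntax (c), case_symbols_as_words);
      if (inword && !was_inword)
	*p = (char) toupper (c);
    }
}
```

Calling toy_upcase_initials on a buffer holding "aa_bb cc_dd" gives
"Aa_Bb Cc_Dd" with the flag off and "Aa_bb Cc_dd" with it on, matching
the third case in the ERT test above.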



* bug#66614: 29.1.50; Support not capitalizing words inside symbols
  2023-10-19 10:54       ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-10-21 15:13         ` sbaugh
  0 siblings, 0 replies; 11+ messages in thread
From: sbaugh @ 2023-10-21 15:13 UTC (permalink / raw)
  To: Po Lu; +Cc: Spencer Baugh, Eli Zaretskii, 66614

Po Lu <luangruo@yahoo.com> writes:
> Spencer Baugh <sbaugh@janestreet.com> writes:
>
>> +  DEFVAR_BOOL ("case-symbols-as-words", case_symbols_as_words,
>> +	       doc: /* If non-nil, case functions treat symbol syntax as part of words.
>> +
>> +Functions such as `upcase-initials' and `replace-match' check or modify
>> +the case pattern of sequences of characters.  Normally, these operate on
>> +sequences of characters whose syntax is word constituent.  If this
>> +variable is non-nil, then they operate on sequences of characters whose
>> +syntax is either word constituent or symbol constituent.
>> +
>> +This is useful for programming languages and styles where only the first
>> +letter of a symbol's name is ever capitalized.*/);
>> +  case_symbols_as_words = 0;
>
> Incidentally:
>
> Let's not introduce further instances of the anti-pattern where the
> ``doc:'' marker in DEFVAR constructs is aligned with its opening paren,
> rather than two columns into the word DEFVAR itself.  Generally, this
> deprives you of latitude for the first line of the doc string, for you
> are either compelled to contravene the 80 column limit or to render that
> line exceptionally laconic, both of which are ultimately
> counterproductive.
>
> TIA.

I agree and I made this change, but if you want this style of
indentation for doc: markers to be more common, then you should probably
make c-indent-line-or-region do this.  Right now if I indent a region it
will re-indent the doc: marker to align with the opening paren.






* bug#66614: 29.1.50; Support not capitalizing words inside symbols
  2023-10-21 15:11         ` sbaugh
@ 2023-10-29 11:42           ` Eli Zaretskii
  0 siblings, 0 replies; 11+ messages in thread
From: Eli Zaretskii @ 2023-10-29 11:42 UTC (permalink / raw)
  To: sbaugh; +Cc: sbaugh, 66614-done

> From: sbaugh@catern.com
> Date: Sat, 21 Oct 2023 15:11:08 +0000 (UTC)
> Cc: Spencer Baugh <sbaugh@janestreet.com>, 66614@debbugs.gnu.org
> 
> > You don't have to add it to the buffer structure, you could call
> > Fmake_variable_buffer_local instead.  We already do that for some
> > variables.
> 
> Oh, of course.  Done.

Thanks, installed on master, and closing the bug.

Btw, I consistently need to make minor fixups in your commit log
messages.  Please try to follow our style better, as indicated by my
changes.






