* Fwd: Re: Unrecognized \ sequences and Elisp
From: Roland Orre @ 2004-01-27 3:37 UTC
I'm a little concerned about the idea that general string handling
should include interpretation of the content of the string. I don't
really like it. We will get more and more issues like the one below
about elisp.
R5RS defines only \\ and \"; ``Scheme does not specify the effect
of a backslash within a string that is not followed by a
doublequote or backslash.''
I consider the reason for escape sequences in strings is to be able
to express a " within a string, i.e. to be able to express a
character that we would otherwise not be able to put in the string.
I consider that it is better to have routines like read-ansi-string
and write-ansi-string, which were previously suggested, and otherwise
leave the rest of formatting to specific format strings as general
interpretation of escape sequences within strings will affect the
general handling of strings in different applications.
I think it is better to go back to the R5RS specification, which I
consider has been somewhat misinterpreted.
Best regards
Roland Orre
On Tue, 2004-01-27 at 03:45, Paul Jarc wrote:
> Neil Jerram <neil@ossau.uklinux.net> wrote:
> > The recent change to signal an error for "unrecognized" \ sequences
> > has negatively affected the Elisp translator, because Elisp code often
> > uses "\(" in doc strings. (I think this is when the "(" would
> > otherwise be in column 0, to avoid Emacs thinking that it is the start
> > of a new defun.)
>
> Would it work to change the Emacs docstrings to look like this?
> "foo...\
> \n(bar...)"
>
> > Is it reasonable to again allow "\(" as a special case
>
> That would avoid the need to edit the Emacs sources, though it's a bit
> ugly. I guess the Right Way would be to fix Emacs so it doesn't get
> confused by parentheses inside strings.
>
>
> paul
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel
* Re: Fwd: Re: Unrecognized \ sequences and Elisp
From: Paul Jarc @ 2004-01-27 9:04 UTC
Cc: guile-devel

Roland Orre <orre@nada.kth.se> wrote:
> R5RS defines only \\ and \"; ``Scheme does not specify the effect
> of a backslash within a string that is not followed by a
> doublequote or backslash.''

Then portable programs cannot use other escape sequences, which means
that implementations can invent meanings for other escape sequences
without breaking portable programs.

> I consider the reason for escape sequences in strings is to be able
> to express a " within a string, i.e. to be able to express a
> character that we would otherwise not be able to put in the string.

They're also useful for representing characters that would be
difficult to work with literally - control characters, etc.

> I consider that it is better to have routines like read-ansi-string
> and write-ansi-string, which were previously suggested,

Do you have a reference? I don't know what those are intended to do,
but it doesn't sound like the same job that escape sequences do.

> general interpretation of escape sequences within strings will
> affect the general handling of strings in different applications.

Well, if we see the two characters \( in the middle of a string, then
the program is not portable, and we have to decide on our own what to
do about it. Treating it like plain ( eliminates the possibility of
future extensions, so now Guile has been changed to throw an error -
i.e., it refuses to interpret this escape sequence at all. Isn't that
what you want?


paul
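The portability rule discussed above (only \\ and \" are specified by R5RS) can be sketched as a small C check. This is only an illustration: the function name `first_nonportable_escape` is hypothetical and is not part of Guile, and the input is assumed to be the text between the surrounding double quotes, exactly as it appears in the source file.

```c
#include <stddef.h>

/* Hypothetical helper, not a Guile function.
   Scans the body of a string literal and returns the offset of the first
   backslash sequence that R5RS leaves unspecified, or -1 if the body uses
   only the two specified sequences, \\ and \".  A lone backslash at the
   very end of the body is also reported. */
long first_nonportable_escape(const char *body)
{
    for (size_t i = 0; body[i] != '\0'; i++) {
        if (body[i] != '\\')
            continue;                 /* ordinary character */
        char next = body[i + 1];
        if (next == '\\' || next == '"') {
            i++;                      /* skip the escaped character */
            continue;
        }
        return (long)i;               /* unspecified by R5RS */
    }
    return -1;
}
```

Under this check, the literal body foo\(bar is reported at offset 3, which is the case the thread is about; what a reader should then do with it is a separate policy question.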
* Re: Unrecognized \ sequences and Elisp
From: Roland Orre @ 2004-01-27 12:50 UTC
Cc: guile-devel

On Tue, 2004-01-27 at 10:04, Paul Jarc wrote:
> Then portable programs cannot use other escape sequences, which means
> that implementations can invent meanings for other escape sequences
> without breaking portable programs.

OK, the question then is what to do with those escape sequences which
are not understood? My view is that with a minimum of defined escape
sequences as in R5RS you have a lot of freedom but not understood
sequences should then not generate error, only be passed through.
The follow up question then is of course, should "foo\(bar" pass
as "foo(bar" or "foo\(bar"?

From my view it sounds quite wrong to have to change the actual
sequences now occurring in elisp to fit guile. This is the wrong way
around, as I consider that guile should be a general platform.

> > I consider the reason for escape sequences in strings is to be able
> > to express a " within a string, i.e. to be able to express a
> > character that we would otherwise not be able to put in the string.
>
> They're also useful for representing characters that would be
> difficult to work with literally - control characters, etc.

Yes, I agree to a certain extent, but this can be handled other ways,
(string-append "foo" (string #\nl) "bar")
is of course a lot uglier than "foo\nbar" but in ordinary code you
don't need this very often. This facility is mostly useful in
format strings, and I consider it is better to be kept there.

One way to view the question is, do we want the strings to actually
contain what control characters etc we feed into them, or do we
only want them to be printed in a certain way?

When I (define foobar "foo\nbar") do I want it to really evaluate to
"foo
bar"
or do I just want it to print a newline when passed through a certain
write command? For my own I consider the latter the more useful
approach.

> > I consider that it is better to have routines like read-ansi-string
> > and write-ansi-string, which were previously suggested,
>
> Do you have a reference? I don't know what those are intended to do,
> but it doesn't sound like the same job that escape sequences do.

Sorry, this was no deep thinking, I just found some statements about
this in the mail archives:
http://mail.gnu.org/archive/html/guile-devel/2001-05/msg00465.html

> > general interpretation of escape sequences within strings will
> > affect the general handling of strings in different applications.
>
> Well, if we see the two characters \( in the middle of a string, then
> the program is not portable, and we have to decide on our own what to
> do about it. Treating it like plain ( eliminates the possibility of
> future extensions, so now Guile has been changed to throw an error -
> i.e., it refuses to interpret this escape sequence at all. Isn't that
> what you want?

Well, emacs regexp expressions uses e.g \( and \| where guile regexp
uses only ( and |.

I see the whole issue as tricky. Even Aubrey Jaffer's scm and emacs
lisp interpret e.g. "foo\nbar" as
"foo
bar"

I think I would have preferred if escape sequences in read/write of
strings in general never had become an issue, as these mostly have
to do with formatting, which should be kept separate.

My view is somewhat that when escape sequences are used in strings,
then they should work so that any character after the escape character
is fed into the string, which makes \\ and \" meaningful, but this
then creates insane strings like "\\(foo\\|bar\\)" if the escape
character itself is used by the application.

I'm stuck, I can't find a good solution to this, but my worry is that
more complex handling of escape sequences in strings will cause more
problems and extra work than it solves.

Best regards
Roland
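The two readings asked about above ("foo\(bar" as foo(bar or as foo\(bar), together with the error behaviour Paul describes, can be put side by side in a short C sketch. Everything here is an assumption made for illustration: `decode_literal` and the `unknown_escape_policy` enum are hypothetical names, and this is not how Guile's reader is written.

```c
#include <stddef.h>

/* Hypothetical policies for a backslash followed by a character other
   than \ or ". */
enum unknown_escape_policy {
    UNKNOWN_IS_ERROR,        /* refuse the literal                    */
    UNKNOWN_DROP_BACKSLASH,  /* body foo\(bar  reads as  foo(bar      */
    UNKNOWN_KEEP_BACKSLASH   /* body foo\(bar  reads as  foo\(bar     */
};

/* Decodes the body of a string literal into out (capacity out_size,
   including the terminating NUL).  Returns the decoded length, or -1 on
   an unknown escape under UNKNOWN_IS_ERROR, a dangling backslash, or a
   buffer that is too small. */
long decode_literal(const char *body, enum unknown_escape_policy policy,
                    char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return -1;
    for (size_t i = 0; body[i] != '\0'; i++) {
        char c = body[i];
        if (c == '\\') {
            char next = body[i + 1];
            if (next == '\0')
                return -1;                   /* dangling backslash */
            i++;
            if (next != '\\' && next != '"') {
                if (policy == UNKNOWN_IS_ERROR)
                    return -1;
                if (policy == UNKNOWN_KEEP_BACKSLASH) {
                    if (n + 1 >= out_size)
                        return -1;
                    out[n++] = '\\';         /* pass the backslash through */
                }
            }
            c = next;
        }
        if (n + 1 >= out_size)
            return -1;
        out[n++] = c;
    }
    out[n] = '\0';
    return (long)n;
}
```

Note that UNKNOWN_DROP_BACKSLASH is the "any character after the escape character is fed into the string" rule, and it is also the choice that closes off later extensions such as \n, since every sequence already has a meaning.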
* Re: Unrecognized \ sequences and Elisp
From: tomas @ 2004-01-27 16:26 UTC
Cc: Paul Jarc, guile-devel

On Tue, Jan 27, 2004 at 01:50:14PM +0100, Roland Orre wrote:
> On Tue, 2004-01-27 at 10:04, Paul Jarc wrote:
> > Then portable programs cannot use other escape sequences, which means
> > that implementations can invent meanings for other escape sequences
> > without breaking portable programs.

[...]

> When I (define foobar "foo\nbar") do I want it to really evaluate to
> "foo
> bar"
> or do I just want it to print a newline when passed through a certain
> write command? For my own I consider the latter the more useful
> approach.

Hmm. I'd rather expect the former (that's what ``all others'', from
C through Elisp settled on).

[...]

Regards
-- tomás
* Re: Unrecognized \ sequences and Elisp
From: Stephen Compall @ 2004-01-27 17:43 UTC
Cc: Paul Jarc, guile-devel

Roland Orre <orre@nada.kth.se> (not really) writes:
> > Well, if we see the two characters \( in the middle of a string, then
> > the program is not portable, and we have to decide on our own what to
> > do about it. Treating it like plain ( eliminates the possibility of
> > future extensions, so now Guile has been changed to throw an error -
> > i.e., it refuses to interpret this escape sequence at all. Isn't that
> > what you want?

Speaking of future extensions, why not just make this extensible,
then, and use the needed extensions when reading elisp?

--
Stephen Compall or s11 or sirian
* Re: Unrecognized \ sequences and Elisp
From: Neil Jerram @ 2004-01-28 17:03 UTC
Cc: Paul Jarc, guile-devel

>>>>> "Roland" == Roland Orre <orre@nada.kth.se> writes:

Roland> On Tue, 2004-01-27 at 10:04, Paul Jarc wrote:
>> Then portable programs cannot use other escape sequences, which means
>> that implementations can invent meanings for other escape sequences
>> without breaking portable programs.

Roland> OK, the question then is what to do with those escape sequences which
Roland> are not understood? My view is that with a minimum of defined escape
Roland> sequences as in R5RS you have a lot of freedom but not understood
Roland> sequences should then not generate error, only be passed through.
Roland> The follow up question then is of course, should "foo\(bar" pass
Roland> as "foo(bar" or "foo\(bar"?

The position as regards portable Scheme programs is clear: a portable
R5RS program should not use anything that is not defined by R5RS,
which includes \(. If a supposedly portable program accidentally uses
something non-portable, it is usually better to get a clearly signaled
error, than (silently) some behavior which may not be what was
intended.

Guile by default goes a long way beyond R5RS, so there is no good
portability argument against allowing \( in Guile at all. If there
was, then an exactly analogous argument could be made against Guile's
many non-R5RS primitives that are provided in the default environment:
it could be argued that Guile should signal an error if any non-R5RS
procedure was assumed to exist.

Guile also provides a strict R5RS mode, though, where the last clause
is true. In this mode, use of \( should signal an error.

Finally, there is the fact that \( per se isn't actually useful, other
than for handling existing Elisp code. This tends to suggest that \(
should not be on by default, even outside strict R5RS mode.

Therefore I propose to reinstate \( support, but subject to the
SCM_ENABLE_ELISP #define and a runtime read option. I also propose to
do the same for any other bits of Elisp read syntax that I need.
(Currently this is [ ] for vectors and \).) This means that to make
\( etc. work, you'd need both to configure with --enable-elisp, and to
set a read option using (read-enable 'escaped-parentheses).

Does this sound OK?

Roland> Yes, I agree to a certain extent, but this can be handled other ways,
Roland> (string-append "foo" (string #\nl) "bar")
Roland> is of course a lot uglier than "foo\nbar" but in ordinary code you
Roland> don't need this very often. This facility is mostly useful in
Roland> format strings, and I consider it is better to be kept there.

Roland> One way to view the question is, do we want the strings to actually
Roland> contain what control characters etc we feed into them, or do we
Roland> only want them to be printed in a certain way?

Roland> When I (define foobar "foo\nbar") do I want it to really evaluate to
Roland> "foo
Roland> bar"
Roland> or do I just want it to print a newline when passed through a certain
Roland> write command? For my own I consider the latter the more useful
Roland> approach.

What a fascinating idea! I'd never thought of things like this
before.

However, I agree with others' replies that it would be odd for Guile
to differ in its interpretation here from most other programming
languages. I also think it would be counter intuitive to be able to
form a string by concatenating "foo\" and "nbar", and then for this
string to print with a newline in the middle of it. I also worry that
your approach would affect the ability to repeatedly write and reread
strings (the intention being that you get an equivalent string back,
of course).

So in summary, I'm afraid I don't think we should switch to your
alternative.

Roland> Well, emacs regexp expressions uses e.g \( and \| where guile regexp
Roland> uses only ( and |.

Just to avert any possible confusion here, note that this difference
is one level higher than the current string syntax issue. As strings,
these are:

  "\\(" and "\\|" in emacs

  "(" and "|" in guile.

Regards,
Neil
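The write/reread concern and the "one level higher" remark above can both be seen in a small round-trip sketch in C. It assumes the conventional interpretation in which \n in a literal denotes a real newline stored in the string; `write_literal` and `read_literal` are hypothetical helper names, not Guile functions, and only \\, \" and \n are handled.

```c
#include <stddef.h>

/* Hypothetical helper: renders string contents as the body of a string
   literal, escaping backslash, double quote and newline.  Returns the
   length written, or -1 if out is too small. */
long write_literal(const char *contents, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return -1;
    for (size_t i = 0; contents[i] != '\0'; i++) {
        char c = contents[i];
        char esc = 0;
        if (c == '\\' || c == '"')
            esc = c;
        else if (c == '\n')
            esc = 'n';
        if (esc) {
            if (n + 2 >= out_size)
                return -1;
            out[n++] = '\\';
            out[n++] = esc;
        } else {
            if (n + 1 >= out_size)
                return -1;
            out[n++] = c;
        }
    }
    out[n] = '\0';
    return (long)n;
}

/* Hypothetical helper: the inverse of write_literal.  Any other escape,
   or a dangling backslash, is rejected with -1. */
long read_literal(const char *body, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return -1;
    for (size_t i = 0; body[i] != '\0'; i++) {
        char c = body[i];
        if (c == '\\') {
            char next = body[++i];
            if (next == '\\' || next == '"')
                c = next;
            else if (next == 'n')
                c = '\n';
            else
                return -1;
        }
        if (n + 1 >= out_size)
            return -1;
        out[n++] = c;
    }
    out[n] = '\0';
    return (long)n;
}
```

With these two functions, a string containing a real newline is written as foo\nbar and read back unchanged, and the two-character Emacs regexp \( is written as the three-character literal body \\(. The round trip works because the escape is resolved once, at read time, rather than being left in the string for a later write step to interpret.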
* Re: Unrecognized \ sequences and Elisp
From: Neil Jerram @ 2004-02-08 19:16 UTC
Cc: Paul Jarc

>>>>> "Neil" == Neil Jerram <neil@ossau.uklinux.net> writes:

Neil> Therefore I propose to reinstate \( support, but subject to the
Neil> SCM_ENABLE_ELISP #define and a runtime read option. I also propose to
Neil> do the same for any other bits of Elisp read syntax that I need.
Neil> (Currently this is [ ] for vectors and \).) This means that to make
Neil> \( etc. work, you'd need both to configure with --enable-elisp, and to
Neil> set a read option using (read-enable 'escaped-parentheses).

Neil> Does this sound OK?

In the absence of any responses, I've committed this change:

	* read.c (scm_read_opts): New opts `elisp-vectors' and
	`escaped-parens'.
	(s_vector): New.
	(scm_lreadr): Use scm_lreadparen1 instead of scm_lreadparen. Make
	handling of elisp vector syntax dependent on SCM_ENABLE_ELISP and
	`elisp-vectors' option instead of SCM_ELISP_READ_EXTENSIONS.
	Allow "\(" and "\)" in strings when SCM_ENABLE_ELISP defined and
	`escaped-parens' option set.
	(scm_read_token): If elisp vector syntax active, disallow [ and ]
	in tokens.
	(scm_lreadparen): Rewrite as interface to scm_lreadparen1.
	(scm_lreadparen1): New.

	* read.h: Remove conditionally compiled last arg to
	scm_lreadparen.
	(SCM_ELISP_VECTORS_P, SCM_ESCAPED_PARENS_P): New.

Because it is declared with SCM_API, I've taken care, and introduced a
slight extra complexity, to preserve the interface of scm_lreadparen.
Is this really necessary?

Neil
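The double gating described in the change above (a compile-time switch plus a runtime read option) might look roughly like the following C sketch. It does not reproduce Guile's read.c: `SKETCH_ENABLE_ELISP`, `struct sketch_read_opts` and `sketch_read_string` are stand-in names invented here to mirror SCM_ENABLE_ELISP and the `escaped-parens' option.

```c
#include <stddef.h>

/* Stand-in for a configure-time switch such as SCM_ENABLE_ELISP. */
#define SKETCH_ENABLE_ELISP 1

/* Hypothetical runtime read options; escaped_parens stands in for the
   `escaped-parens' read option. */
struct sketch_read_opts {
    int escaped_parens;
};

/* Returns nonzero if the character next may follow a backslash. */
static int escape_allowed(char next, const struct sketch_read_opts *opts)
{
    if (next == '\\' || next == '"')
        return 1;                       /* always allowed (R5RS) */
#if SKETCH_ENABLE_ELISP
    if (opts->escaped_parens && (next == '(' || next == ')'))
        return 1;                       /* Elisp-style \( and \) */
#else
    (void)opts;                         /* option has no effect */
#endif
    return 0;
}

/* Decodes the body of a string literal into out (capacity out_size,
   including the terminating NUL).  Returns the decoded length, or -1 on
   a disallowed escape, a dangling backslash, or a buffer too small. */
long sketch_read_string(const char *body,
                        const struct sketch_read_opts *opts,
                        char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return -1;
    for (size_t i = 0; body[i] != '\0'; i++) {
        char c = body[i];
        if (c == '\\') {
            c = body[++i];
            if (c == '\0' || !escape_allowed(c, opts))
                return -1;
        }
        if (n + 1 >= out_size)
            return -1;
        out[n++] = c;
    }
    out[n] = '\0';
    return (long)n;
}
```

With the option off, the body foo\(bar\) is rejected, as in the strict behaviour; with it on, the same body reads as foo(bar). Keeping the check in one small predicate is one way to leave room for the kind of extensibility suggested earlier in the thread.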
* Re: Unrecognized \ sequences and Elisp
From: Marius Vollmer @ 2004-02-08 22:06 UTC
Cc: Paul Jarc, guile-devel

Neil Jerram <neil@ossau.uklinux.net> writes:

> In the absence of any responses, I've committed this change:
>
> * read.c (scm_read_opts): New opts `elisp-vectors' and
> `escaped-parens'.

What about having 'elisp-strings' instead of escaped-parens?
Escaped-parens seems very, very specific to me but still its only use
is to allow the reading of elisp strings.

In the long run, I think we should have a configurable reader, much
like Common Lisp specifies it.

> Because it is declared with SCM_API, I've taken care, and introduced a
> slight extra complexity, to preserve the interface of scm_lreadparen.
> Is this really necessary?

It is a good thing to do, but in this case, not strictly necessary, I'd
say. You could also have renamed scm_lreadparen to scm_i_lreadparen
(or made it static, maybe) and thus removed it from the public
interface of libguile.

--
GPG: D5D4E405 - 2F9B BCCC 8527 692A 04E3 331E FAF8 226A D5D4 E405
* Re: Unrecognized \ sequences and Elisp
From: Neil Jerram @ 2004-02-10 19:27 UTC
Cc: Paul Jarc, guile-devel

>>>>> "Marius" == Marius Vollmer <mvo@zagadka.de> writes:

Marius> What about having 'elisp-strings' instead of escaped-parens?
Marius> Escaped-parens seems very, very specific to me but still its only use
Marius> is to allow the reading of elisp strings.

OK, I'll make this change.

Marius> In the long run, I think we should have a configurable reader, much
Marius> like Common Lisp specifies it.

Can you give a reference?

>> Because it is declared with SCM_API, I've taken care, and introduced a
>> slight extra complexity, to preserve the interface of scm_lreadparen.
>> Is this really necessary?

Marius> It is a good thing to do, but in this case, not strictly necessary, I'd
Marius> say. You could also have renamed scm_lreadparen to scm_i_lreadparen
Marius> (or made it static, maybe) and thus removed it from the public
Marius> interface of libguile.

So ...

(1) In this case, should I simplify my change by:

    - eliminating scm_lreadparen (as it is now)

    - renaming scm_lreadparen1 to scm_lreadparen

    - making scm_lreadparen static?

(2) In general, how do users know what in the C API we are trying to
    preserve? (I think the answer should be "whatever is documented
    in the manual", but currently that probably doesn't cover enough.)

Thanks,
Neil
* Re: Unrecognized \ sequences and Elisp
From: Marius Vollmer @ 2004-02-18 21:10 UTC
Cc: Paul Jarc, guile-devel

Neil Jerram <neil@ossau.uklinux.net> writes:

> Marius> What about having 'elisp-strings' instead of
> Marius> escaped-parens? Escaped-parens seems very, very
> Marius> specific to me but still its only use is to allow the
> Marius> reading of elisp strings.
>
> OK, I'll make this change.

Thanks!

> Marius> In the long run, I think we should have a configurable
> Marius> reader, much like Common Lisp specifies it.
>
> Can you give a reference?

Here:

http://www.lispworks.com/reference/HyperSpec/Body/02_aa.htm

> Marius> It is a good thing to do, but in this case, not strictly
> Marius> necessary, I'd say. You could also have renamed
> Marius> scm_lreadparen to scm_i_lreadparen (or made it static,
> Marius> maybe) and thus removed it from the public interface of
> Marius> libguile.
>
> So ...
>
> (1) In this case, should I simplify my change by:
>
> - eliminating scm_lreadparen (as it is now)
>
> - renaming scm_lreadparen1 to scm_lreadparen
>
> - making scm_lreadparen static?

Yes.

> (2) In general, how do users know what in the C API we are trying to
> preserve? (I think the answer should be "whatever is documented
> in the manual", but currently that probably doesn't cover enough.)

There is a naming convention: everything starting with "scm_i_" is
internal. But not everything that should be internal has been marked
this way yet...

--
GPG: D5D4E405 - 2F9B BCCC 8527 692A 04E3 331E FAF8 226A D5D4 E405