From: pjb@informatimago.com (Pascal J. Bourguignon)
To: help-gnu-emacs@gnu.org
Subject: Re: avoid interpretation of \n, \t, ... in string
Date: Wed, 28 Jan 2009 15:02:07 +0100
Message-ID: <7czlhbil5s.fsf@pbourguignon.anevia.com>
In-Reply-To: 986c2f5b-67bb-4059-a4b3-f1748dc45e50@z27g2000prd.googlegroups.com

Peter Tury <tury.peter@gmail.com> writes:

> Hi,
>
> Pascal J. Bourguignon wrote:
>
>> Switch to Common Lisp.  There's no reader macro in emacs lisp, so you
>> cannot do much about it.  In Common Lisp, you can trivially implement
>
> I think this will be a longer journey sometime in the future. CL is on
> my "todo" list for some time ;-)
>
>> Ok, another way to do it would be to store your paths in a file, and
>> to read it:
>>
>> (defun read-paths (file)
>>   (with-temp-buffer
>>     (insert-file-contents file)
>>     (delete "" (split-string (buffer-substring-no-properties
>>                               (point-min) (point-max))
>>                    "[\n\r]+"))))
>
> Great, thanks!
> I've checked it and found that in fact
> `buffer-substring-no-properties' does the trick here. So my original
> question can be reformulated now:
>
> ---> is there a way to get string (text) representation in a form as
> `buffer-substring-no-properties' do it, i.e. duplicating single `\'-s
> automatically (without(!) interpreting "pseudo-escape-sequences" (\n,
> \t, ...) in the original text)?

buffer-substring-no-properties doesn't do anything special: it returns
the characters of the buffer unchanged.  There is absolutely no
duplicating of any character.
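
One way to check this, as a minimal sketch: insert some text into a
temporary buffer and look at the character codes that come back.  The
helper name my-buffer-chars is made up for this illustration; it is
not part of emacs.

```elisp
;; Hypothetical helper: return the text of a temp buffer as a list of
;; character codes, so no printing or quoting is involved.
(defun my-buffer-chars (text)
  (with-temp-buffer
    (insert text)
    (string-to-list
     (buffer-substring-no-properties (point-min) (point-max)))))

;; The literal "a\\nb" denotes four characters: a, back-slash, n, b.
;; The same four characters come back out of the buffer.
(my-buffer-chars "a\\nb")   ; => (97 92 110 98)
```

92 is the code of the back-slash, and it appears only once.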

Try to understand that there is only one character in the string "\\".

(length "\\") --> 1

  (insert (format "%s %S" "\\" "\\")) 

inserts:

  \ "\\"


The double backslash comes from the string quoting.
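
As a small sketch of the same point, here are the lengths of a few
literals.  The constant name my-literal-lengths is only an
illustration.

```elisp
;; What you type in the source is on the left, the number of
;; characters actually stored in the string is on the right.
(defconst my-literal-lengths
  (list (length "\\")      ; 1: a single back-slash
        (length "\n")      ; 1: a single newline
        (length "\\n")     ; 2: a back-slash followed by n
        (length "\\\\")))  ; 2: two back-slashes

my-literal-lengths   ; => (1 1 2 2)
```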

Here are some characters:  abc'\"def

Now the problem is to quote these characters to be able to put them in
a program, as a string literal, so they aren't interpreted as code.
We do that by surrounding the characters with double-quotes:

                          "abc'\"def"

Oops!  That is broken because one of these characters is a
double-quote, so we'd interpret that as the string containing the
characters:
                           abc'\
followed by the symbol named:   def
and a stray double-quote           "

The problem here is that we need a way to escape the meaning of the
double-quote, so that it no longer closes the string literal.
The idea is to use an 'escape' character, back-slash.

                          "abc'\\"def"

Oops!  Still a problem here.  Since there is also a back-slash in the
string, it needs to be escaped too, otherwise it would be taken to
escape the following character...

                          "abc'\\\"def"

Ok, so now we can tell that this is a string literal because of the
opening double-quote:     "
that contains the normal characters:
                           abc'                            *
then an escaped character prefixed by:
                               \
which is a back-slash character itself:
                                \                          *
then an escaped character prefixed by:
                                  \
which is a double-quote character itself:
                                   "                       *
followed by the normal characters:  def                    *
and closed by a double-quote:          "

So finally, this string literal only contains the characters:
                           abc'\"def
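
You can verify this by evaluating the literal and counting.  A small
sketch, where the constant name my-example-string is only an
illustration:

```elisp
;; The literal takes 13 characters to type (with its double-quotes and
;; escapes), but the string it denotes has only 9 characters.
(defconst my-example-string "abc'\\\"def")

(length my-example-string)          ; => 9
(string-to-list my-example-string)  ; => (97 98 99 39 92 34 100 101 102)
```

39 is the single quote, 92 the back-slash and 34 the double-quote,
each of them present exactly once.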


This algorithm for reading string literals is implemented by the emacs
lisp reader.  And of course, when you want to print (format) a string,
you can either output the characters contained in the string
((format "%s" ...), princ), or output characters that will be read
back as a string literal, with double-quotes and escaping back-slashes
((format "%S" ...), prin1, print).

(let ((string "abc with escape: \\ and with substring: \"abc\"."))
   (terpri)
   (princ "with princ: ") (princ string)
   (terpri)
   (princ "with prin1: ") (prin1 string)
   (terpri)
   (princ "with print: ") (print string)
   (terpri))

inserts:

with princ: abc with escape: \ and with substring: "abc".
with prin1: "abc with escape: \\ and with substring: \"abc\"."
with print: 
"abc with escape: \\ and with substring: \"abc\"."

returns: t  
   
The double-quotes and back-slashes are added by prin1 and print just
to allow reading back data that has been printed.
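
Here is a sketch of that round trip, using prin1-to-string and
read-from-string.  The function name my-round-trip is hypothetical.

```elisp
;; Hypothetical helper: print STRING as a literal, then read it back.
;; Returns a list (PRINTED READ-BACK).
(defun my-round-trip (string)
  (let ((printed (prin1-to-string string)))
    ;; read-from-string returns (OBJECT . END-POSITION); keep OBJECT.
    (list printed (car (read-from-string printed)))))

;; For a string holding one back-slash, the printed form has four
;; characters (quote, back-slash, back-slash, quote), and reading it
;; back gives the original one-character string again.
(let ((result (my-round-trip "\\")))
  (list (length (car result))      ; 4
        (length (cadr result))))   ; 1
```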







The Common Lisp reader algorithm is more sophisticated: it allows for
hooks called reader macros, which let you implement your own string
reading algorithm.  For example, you could change the escape
character, or not have any, and this would let you write strings
containing back-slashes without doubling them.

We would have to change the function read1 in lread.c to add this
feature.  Unfortunately we cannot just redefine such a function in
emacs lisp, because all the code written in C is already linked to the
old function written in C, and wouldn't use our implementation in
emacs lisp.  We would have to modify the C sources (and have the patch
accepted by RMS).


-- 
__Pascal Bourguignon__

