From: pjb@informatimago.com (Pascal J. Bourguignon)
Newsgroups: gmane.emacs.help
Subject: Re: avoid interpretation of \n, \t, ... in string
Date: Wed, 28 Jan 2009 15:02:07 +0100
Organization: Anevia SAS
Message-ID: <7czlhbil5s.fsf@pbourguignon.anevia.com>
To: help-gnu-emacs@gnu.org

Peter Tury writes:

> Hi,
>
> Pascal J. Bourguignon wrote:
>
>> Switch to Common Lisp. There's no reader macro in emacs lisp, so you
>> cannot do much about it. In Common Lisp, you can trivially implement
>
> I think this will be a longer journey sometime in the future. CL is on
> my "todo" list for some time ;-)
>
>> Ok, another way to do it would be to store your paths in a file, and
>> to read it:
>>
>> (defun read-paths (file)
>>   (with-temp-buffer
>>     (insert-file-contents file)
>>     (delete "" (split-string (buffer-substring-no-properties
>>                               (point-min) (point-max))
>>                              "[\n\r]+"))))
>
> Great, thanks!
> I've checked it and found that in fact `buffer-substring-no-properties'
> does the trick here. So my original question can be reformulated now:
>
> ---> is there a way to get string (text) representation in a form as
> `buffer-substring-no-properties' do it, i.e. duplicating single `\'-s
> automatically (without(!) interpreting "pseudo-escape-sequences" (\n,
> \t, ...) in the original text)?

buffer-substring-no-properties doesn't do anything special: it returns
the characters of the buffer unchanged. There is absolutely no
duplicating of any character.

Try to understand that there is only one character in the string "\\".

    (length "\\") --> 1

    (insert (format "%s %S" "\\" "\\"))
    inserts: \ "\\"

The double backslash comes from the string quoting.

Here are some characters:

    abc'\"def

Now the problem is to quote these characters so that we can put them in
a program, as a string literal, without them being interpreted as code.
We do that by surrounding the characters with double-quotes:

    "abc'\"def"

Oops! That is broken, because one of these characters is a double-quote,
so we'd interpret that as the string containing the characters:

    abc'\

followed by the symbol named:

    def

and a stray double-quote:

    "

The problem here is that we need a way to escape the meaning of the
double-quote, so that it no longer closes the string literal. The idea
is to use an 'escape' character, the back-slash:

    "abc'\\"def"

Oops! Still a problem here. Since there is also a back-slash in the
string, it needs to be escaped too, otherwise we would consider that it
escapes the following character...

    "abc'\\\"def"

Ok, so now we can tell that this is a string literal:

* because of the opening double-quote: "
* that contains the normal characters: abc'
* then an escaped character prefixed by: \
  which is a back-slash character itself: \
* then an escaped character prefixed by: \
  which is a double-quote character itself: "
* followed by the normal characters: def
* and closed by a double-quote: "

So finally, this string literal only contains the characters:

    abc'\"def

This algorithm for reading string literals is implemented by the emacs
lisp reader.

And of course, when you want to print (format) a string, you can either
output the characters contained in the string ((format "%s" ...),
princ), or output characters that will be read back as a string literal,
with double-quotes and escaping back-slashes ((format "%S" ...), prin1,
print).
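The reading and printing steps above can be checked directly. What follows is only a sketch: the function name quoting-round-trip-demo and the variables literal and chars are made up for illustration, while read-from-string, prin1-to-string and format are the standard functions. The source text of the literal is built with concat and make-string so that the example itself needs as little escaping as possible.

```elisp
;; Illustrative sketch; `quoting-round-trip-demo', `literal' and `chars'
;; are hypothetical names, not something from the original discussion.
(defun quoting-round-trip-demo ()
  "Read the literal \"abc'\\\\\\\"def\" and print the result back."
  (let* (;; The 13 characters of the source text: a double-quote, abc',
         ;; three back-slashes, a double-quote, def, a double-quote.
         (literal (concat "\"abc'" (make-string 3 ?\\) "\"def\""))
         ;; The reader turns that source text into a string object.
         ;; `read-from-string' returns (OBJECT . END-POSITION).
         (chars (car (read-from-string literal))))
    (list (length literal)                          ; size of the source text
          (length chars)                            ; size of the string itself
          (equal (format "%s" chars) chars)         ; %s: raw characters
          (equal (prin1-to-string chars) literal)   ; prin1: re-quoted form
          (equal (format "%S" chars) literal))))    ; %S: same as prin1

(quoting-round-trip-demo)   ; --> (13 9 t t t)
```

The string object has 9 characters, with a single back-slash and a single double-quote in the middle. The extra back-slashes exist only in the 13-character source text, and prin1 puts them back so that the output can be read again.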

    (let ((string "abc with escape: \\ and with substring: \"abc\"."))
      (terpri) (princ "with princ: ") (princ string)
      (terpri) (princ "with prin1: ") (prin1 string)
      (terpri) (princ "with print: ") (print string)
      (terpri))

inserts:

    with princ: abc with escape: \ and with substring: "abc".
    with prin1: "abc with escape: \\ and with substring: \"abc\"."
    with print:
    "abc with escape: \\ and with substring: \"abc\"."

returns:

    t

The double-quotes and back-slashes are added by prin1 and print just to
allow reading back data that has been printed.

The Common Lisp reader algorithm is more sophisticated: it allows for
hooks called reader macros, which let you implement your own string
reading algorithm. For example, you could change the escape character,
or not have any, and this would let you write strings containing
back-slashes directly.

To add this feature to emacs, we would have to change the function read1
in lread.c. Unfortunately we cannot just redefine such a function in
emacs lisp, because all the code written in C is already linked to the
old function written in C, and wouldn't use our implementation in emacs
lisp. We would have to modify the C sources (and have the patch accepted
by RMS).

-- 
__Pascal Bourguignon__