From: Emanuel Berg
Newsgroups: gmane.emacs.help
Original-Newsgroups: comp.emacs,gnu.emacs.help
Subject: Re: regular expression
Date: Mon, 30 Jun 2014 22:04:39 +0200
Message-ID: <871tu6b1uw.fsf@debian.uxu>
To: help-gnu-emacs@gnu.org

renato.pontefice@gmail.com writes:

> Hi, I'm newbe on this group.
>
> I know, that I can use Regexp in emacs. And, I would
> do that. Can someone help me?
>
> I have a text file, that is a converted .pdf
> file. So, I have many dirty character inside.
>
> I've found some reg- expression.
>
> i.e.: in a line like this:
>
> 40 STREET DW...
>
> I want to made substitution like this:
>
> 40#STREEDW...
>
> can someone help me to build this expression?

I suspect it is better to use gnu.emacs.help for this
kind of question, as that group is much more active.
Therefore, I am posting this to both groups. You can
later remove the crosspost depending on where the action
is from now on.

As for your question, you only give one example, so I
had to guess a bit what the general case is. For just
one example, you might as well use one (non-regexp)
search-and-replace, right? But I suspect you want to do
this on all cases like this:

    40 STREET DW
    6 ROAD EW
    666 A Z
    666 a z

So try the command below:

    (replace-regexp "\\([0-9]+\\) \\([A-Z]+\\) \\([A-Z]+\\)" "\\1#\\2\\3")

Here is how it works:

    [...]      is a character set; [0-9] and [A-Z] are
               ranges inside such a set
    +          is "one or more (but never zero) of the
               previous"
    a space    matches exactly one literal space
    \\(...\\)  is a group - groups are used in the
               "replace with" expression
    \\1        means insert group 1 (counted from left to
               right), and so on

Note that [A-Z] matches [a-z] as well (the lowercase
equivalent) unless the variable case-fold-search is nil.
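
To try the pattern on single strings instead of a whole
buffer, here is a minimal sketch. The helper name
my-join-fields and its FOLD argument are made up for
illustration and are not part of Emacs;
replace-regexp-in-string is the standard function that
does the work.

```elisp
;; Hypothetical helper, not from the original post: applies the same
;; match and replacement to a string instead of the current buffer.
(defun my-join-fields (text &optional fold)
  "Rewrite \"NUMBER WORD WORD\" in TEXT as \"NUMBER#WORDWORD\".
With non-nil FOLD, [A-Z] also matches lowercase letters."
  (let ((case-fold-search fold))
    (replace-regexp-in-string
     "\\([0-9]+\\) \\([A-Z]+\\) \\([A-Z]+\\)"
     "\\1#\\2\\3"
     text
     t)))   ; FIXEDCASE: insert the matched text without case conversion

(my-join-fields "40 STREET DW")   ; => "40#STREETDW"
(my-join-fields "666 a z")        ; => "666 a z" (no match)
(my-join-fields "666 a z" t)      ; => "666#az"
```

Evaluating the three calls with C-x C-e shows each result
in the echo area, which is a quick way to check a regexp
before running it on the real file.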
If you want case-sensitive replacement (where [A-Z]
really means uppercase only), you can wrap the command
like this:

    (let ((case-fold-search nil))
      (replace-regexp "\\([0-9]+\\) \\([A-Z]+\\) \\([A-Z]+\\)" "\\1#\\2\\3"))

You can watch this in action by running it on the
examples above - see how the "a z" one is now left alone!

Yes, you can do this without writing code - but it is
easier to write it as code and execute it. The reason is
that you get a better overview, and it is easier to
adjust the regexp (both the match and the replacement
parts) - and that is something you often do a couple of
times to get it right. So it is easier than typing it
all in again and again interactively.

Come back with more questions if you have any. Otherwise,
tell us if you got it to work. Good luck!

-- 
underground experts united:
http://user.it.uu.se/~embe8573