From: Marcin Borkowski
Newsgroups: gmane.emacs.help
Subject: Re: For text processing, which is more powerful, emacs or perl?
Date: Fri, 08 Oct 2021 18:52:54 +0200
Message-ID: <87tuhr5uhl.fsf@mbork.pl>
To: Hongyi Zhao
Cc: help-gnu-emacs

On 2020-12-19, at 08:43, Hongyi Zhao wrote:

> It's well known that perl's regexp is very powerful for its capability
> of text processing. So, which is more powerful, emacs or perl, in this
> scenario?

While others offered free jokes and musings about performance (which is
usually irrelevant unless you process multi-gigabyte files, craft
malicious regexen or do something extremely complicated), let me mention
something that wasn't mentioned in this thread (or so I think).

Perl (or Python, or whatever "mainstream" language you take) usually
processes texts in a traditional way, using strings and regexen. Emacs,
OTOH, is a text editor, with all its concepts, so you may process texts
using buffers, which differ from strings in one important respect: the
notion of /point/. (There are other differences, too, e.g., buffers are
usually more performant.)

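To make the notion of point a bit more concrete, here is a rough sketch in Python (not Emacs Lisp) of a toy "buffer" next to the usual regex approach. Everything in it is made up for illustration: the `Buffer` class and its `forward_word` and `kill_line` methods are hypothetical and only loosely named after the Emacs commands `forward-word` and `kill-line`. Their behaviour is simplified (for instance, this `kill_line` never removes the newline itself), so treat it as an analogy rather than a description of how Emacs works internally.

```python
import re


class Buffer:
    """Toy model of an editor buffer: some text plus a movable point."""

    # A "word" here is a run of \w characters, optionally preceded by
    # non-word characters. Real Emacs uses syntax tables instead.
    _WORD = re.compile(r"\W*\w+")

    def __init__(self, text):
        self.text = text
        self.point = 0  # offset into the text; 0 is the beginning

    def forward_word(self):
        """Move point to the end of the next word, if there is one."""
        match = self._WORD.match(self.text, self.point)
        if match:
            self.point = match.end()

    def kill_line(self):
        """Delete from point to the end of the current line (newline kept)."""
        end = self.text.find("\n", self.point)
        if end == -1:
            end = len(self.text)
        self.text = self.text[:self.point] + self.text[end:]


def keep_two_words_buffer(text):
    """Point-based style: skip two words, then delete the rest of the line."""
    buf = Buffer(text)
    buf.forward_word()
    buf.forward_word()
    buf.kill_line()
    return buf.text


def keep_two_words_regex(text):
    """String-and-regex style: one substitution on the first line."""
    return re.sub(r"^(\W*\w+\W*\w+).*$", r"\1", text, count=1, flags=re.M)
```

Both functions turn "alpha beta gamma\ndelta" into "alpha beta\ndelta". The first one reads as a list of editing steps, while the second packs the same intent into a single pattern.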
This means that you can write code that processes text like a human
editor would do, in terms of "moving the point a word forward,
transposing two sentences, deleting from point to the end of line" etc.
It's much like the difference between classical graphical operations
(draw a point, a line segment between two points, a circle with given
center and radius etc.) and turtle graphics known from LOGO.

I'm not saying that the Emacs way is definitely better - that probably
depends on the context and the nature of your text processing - but you
might find it quite intuitive and easier, both in implementing and in
studying existing code (when you e.g. need to improve something you
wrote 6 months earlier and you remember nothing about the
implementation). What is the point of having unreadable code running in
2 milliseconds instead of clear and easy to understand code running in
even 200 milliseconds when you don't want to run it a million times, but
you need to modify it from time to time?

Hth,

-- 
Marcin Borkowski
http://mbork.pl