From: Thierry Volpiatto
Newsgroups: gmane.emacs.help
Subject: Re: uniq without sort <-------------- GURU NEEDED
Date: Fri, 25 Jan 2008 08:56:10 +0100
Message-ID: <87lk6e9w4l.fsf@gmail.com>
References: <08698dfa-d5dc-484d-a5fb-fe84bb0d2893@s13g2000prd.googlegroups.com>
To: gnuist006@gmail.com
Cc: help-gnu-emacs@gnu.org

gnuist006@gmail.com writes:

> This is a tough problem, and needs a guru.
>
> I know it is very easy to find uniq or non-uniq lines if you scramble
> all of them and sort them. It's trivially
>
> echo -e "a\nc\nd\nb\nc\nd" | sort | uniq
>
> $ echo -e "a\nc\nd\nb\nc\nd"
> a
> c
> d
> b
> c
> d
>
> $ echo -e "a\nc\nd\nb\nc\nd"|sort|uniq
> a
> b
> c
> d
>
>
> So it is TRIVIAL with sort.
>
> I want uniq without sorting the initial order.
>
> The algorithm is this. For every line, look above if there is another
> line like it. If so, then ignore it. If not, then output it. I am
> sure, I can spend some time to write this in C. But what is the
> solution using shell ? This way I can get an output that preserves the
> order of first occurrence. It is needed in many problems.

Here it is in Python, but the same can be done in Lisp or shell:

In [13]: B = ["a", "c", "d", "b", "e", "a", "d", "e"]

In [14]: A = []

In [15]: for i in B:
   ....:     if i not in A: A.append(i)

In [16]: A
Out[16]: ['a', 'c', 'd', 'b', 'e']

-- 
A + Thierry
Pub key: http://pgp.mit.edu
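
A note on the loop in the message: testing `i not in A` against a list rescans the list for every item, which is fine for short inputs but gets slow for long ones. A minimal sketch of the same "keep only the first occurrence" idea, assuming the lines are hashable strings, tracks what has been seen in a set instead. The names `unique_in_order` and `seen` are made up for this illustration and do not come from the thread.

```python
def unique_in_order(lines):
    """Return the items of `lines` without duplicates, keeping first occurrences in order."""
    seen = set()      # items already emitted (hypothetical helper name)
    result = []
    for line in lines:
        if line not in seen:   # set membership instead of scanning a list
            seen.add(line)
            result.append(line)
    return result


# The input from the original question: a, c, d, b, c, d
print(unique_in_order(["a", "c", "d", "b", "c", "d"]))
# prints ['a', 'c', 'd', 'b']
```

The result is the same as with the list-based loop; only the lookup changes. To filter real input the same way, the function could be fed the lines of a file or of standard input.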