* uniq without sort <-------------- GURU NEEDED
@ 2008-01-25 2:45 gnuist006
2008-01-25 7:56 ` Thierry Volpiatto
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: gnuist006 @ 2008-01-25 2:45 UTC
To: help-gnu-emacs
This is a tough problem, and needs a guru.
I know it is very easy to find unique or non-unique lines if you
scramble their order by sorting them. It's trivially:
echo -e "a\nc\nd\nb\nc\nd" | sort | uniq
$ echo -e "a\nc\nd\nb\nc\nd"
a
c
d
b
c
d
$ echo -e "a\nc\nd\nb\nc\nd"|sort|uniq
a
b
c
d
So it is TRIVIAL with sort.
I want uniq without sorting the initial order.
The algorithm is this: for every line, look above to see whether there
is another line like it. If so, ignore it. If not, output it. I am
sure I could spend some time writing this in C, but what is the
solution using the shell? This way I get an output that preserves the
order of first occurrence. It is needed in many problems.
Thanks to the star who can help
gnuist
* Re: uniq without sort <-------------- GURU NEEDED
From: Thierry Volpiatto @ 2008-01-25 7:56 UTC
To: gnuist006; +Cc: help-gnu-emacs
gnuist006@gmail.com writes:
> This is a tough problem, and needs a guru.
>
> I know it is very easy to find unique or non-unique lines if you
> scramble their order by sorting them. It's trivially:
>
> echo -e "a\nc\nd\nb\nc\nd" | sort | uniq
>
> $ echo -e "a\nc\nd\nb\nc\nd"
> a
> c
> d
> b
> c
> d
>
> $ echo -e "a\nc\nd\nb\nc\nd"|sort|uniq
> a
> b
> c
> d
>
>
> So it is TRIVIAL with sort.
>
> I want uniq without sorting the initial order.
>
> The algorithm is this: for every line, look above to see whether there
> is another line like it. If so, ignore it. If not, output it. I am
> sure I could spend some time writing this in C, but what is the
> solution using the shell? This way I get an output that preserves the
> order of first occurrence. It is needed in many problems.
Here it is in Python, but the same can be done in Lisp or shell:
In [13]: B = ["a", "c", "d", "b", "e", "a", "d", "e"]
In [14]: A = []
In [15]: for i in B:
   ....:     if i not in A: A.append(i)
   ....:
In [16]: A
Out[16]: ['a', 'c', 'd', 'b', 'e']
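The test `i not in A` rescans the list A for every element, so the loop above is quadratic in the worst case. A minimal sketch of the same idea with a set for the membership test, assuming the items are hashable (the function name `unique_in_order` is made up for illustration):

```python
def unique_in_order(items):
    """Return items with duplicates dropped, keeping first occurrences in order."""
    seen = set()    # values already emitted; set lookup is O(1) on average
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


B = ["a", "c", "d", "b", "e", "a", "d", "e"]
print(unique_in_order(B))  # ['a', 'c', 'd', 'b', 'e']
```

The list `result` keeps the order; the set only answers "have I seen this before?".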
--
A + Thierry
Pub key: http://pgp.mit.edu
* Re: uniq without sort <-------------- GURU NEEDED
From: Peter Dyballa @ 2008-01-25 9:11 UTC
To: gnuist006; +Cc: help-gnu-emacs
On 25.01.2008 at 03:45, gnuist006@gmail.com wrote:
> The algorithm is this: for every line, look above to see whether there
> is another line like it. If so, ignore it. If not, output it. I am
> sure I could spend some time writing this in C, but what is the
> solution using the shell?
Put the lines you want to make unique into an array. Mark each
duplicate with some invalid value. Then filter the array so that all
invalid entries are eliminated.
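A minimal sketch of that mark-and-filter idea, written in Python rather than shell purely for illustration. The function name `uniq_mark_and_filter` and the `INVALID` sentinel are made up here, and the pairwise scan is quadratic, so this is only meant for small inputs:

```python
def uniq_mark_and_filter(lines):
    """Mark later duplicates with a sentinel, then filter the sentinels out."""
    INVALID = object()      # sentinel that can never equal a real line
    marked = list(lines)    # work on a copy of the input array
    for i in range(len(marked)):
        if marked[i] is INVALID:
            continue        # already marked as a duplicate of an earlier line
        for j in range(i + 1, len(marked)):
            if marked[j] is not INVALID and marked[j] == marked[i]:
                marked[j] = INVALID
    # Keep everything that was never marked; the original order is untouched.
    return [line for line in marked if line is not INVALID]


print(uniq_mark_and_filter(["a", "c", "d", "b", "c", "d"]))  # ['a', 'c', 'd', 'b']
```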
--
Greetings
Pete
To drink without thirst and to make love all the time, madam, it is
only these which distinguish us from the other beasts.
– Beaumarchais
* Re: uniq without sort <-------------- GURU NEEDED
From: thermate @ 2008-01-28 16:51 UTC
To: help-gnu-emacs
On Jan 26, 6:35 pm, gnuist...@gmail.com wrote:
> cat input|awk '!_[$0]++' <---- I am interested in understanding
> this and other one liners.
Below are the equivalences, line by line, with the reason for each one
in a comment.
uniq without sort: a one-liner without any further pipes, based on an
associative array (a symbol-value table)
-----------------
NOTE: In tcsh, each instance of ! (NOT) must be escaped, i.e. written
as \!.
echo -e "a\nc\nd\nb\nc\nd\nc" |                                # the input data
awk ' ! count [ $0 ] ++ '                                 <=>  # print $0 is the default action
awk '!_[$0]++'                                            <=>  # _ is a cryptic name for the associative array
awk ' !_[$0]++ { print $0 } '                             <=>  # explicit pattern { action } form; the action runs when the pattern is true
awk ' /.*/ { if ( !_[$0]++ ) { print $0 } } '             <=>  # /.*/ matches every line (note: /.*/, not /*/)
awk ' /.*/ { if ( !_[$0]++ != 0 ) { print $0 } } '        <=>  # as in C, zero is false (in awk the empty string is false too)
awk ' /.*/ { if ( _[$0]++ == 0 ) { print $0 } } '         <=>  # NOTE: /.*/ can be omitted everywhere
awk ' /.*/ { if ( ++_[$0] == 1 ) { print $0 } } '         <=>
awk ' { _[$0]++ ; if ( _[$0] == 1 ){ print $0 } } '       <=>  # omitting the default pattern /.*/
awk ' /.*/ { a[$0]++ ; if ( a[$0] == 1 ){ print $0 } } '       # associative array a[index], where the index is
                                                               # the line and the value is the count; print only
                                                               # when the count is 1
perl -ne ' if ( ! $count{ $_ } ++ ){ print $_ } '              # Perl keeps the counts in the hash %count and the line in $_;
                                                               # -n supplies the read loop and there is no pattern, so no outer {}
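For comparison, a rough Python rendering of the counting variant (the `++_[$0] == 1` form). The dict `count` plays the role of the awk associative array, and the function name `uniq_by_count` is invented for this sketch:

```python
def uniq_by_count(lines):
    """Keep a line only the first time it is seen, by counting occurrences."""
    count = {}    # line -> number of times seen so far (awk: _[$0])
    out = []
    for line in lines:
        count[line] = count.get(line, 0) + 1    # awk: ++_[$0]
        if count[line] == 1:                    # first occurrence only
            out.append(line)
    return out


data = "a\nc\nd\nb\nc\nd\nc".split("\n")    # same input as the echo -e above
print("\n".join(uniq_by_count(data)))
```

Unlike awk, Python does not create missing keys on lookup, hence `count.get(line, 0)`.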
* Re: uniq without sort <-------------- GURU NEEDED
From: Michele Dondi @ 2008-01-29 13:16 UTC
To: help-gnu-emacs
On Thu, 24 Jan 2008 18:45:24 -0800 (PST), gnuist006@gmail.com wrote:
>I want uniq without sorting the initial order.
>
>The algorithm is this: for every line, look above to see whether there
>is another line like it. If so, ignore it. If not, output it. I am
>sure I could spend some time writing this in C, but what is the
>solution using the shell? This way I get an output that preserves the
>order of first occurrence. It is needed in many problems.
In shell I don't know. In Perl it's well known to be as trivial as
perl -ne 'print unless $saw{$_}++' file
(And it's not even the most golfed down solution!)
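A comparably short idiom exists in Python, sketched here under the assumption of Python 3.7 or later, where dicts are guaranteed to preserve insertion order. The function name `uniq_fromkeys` is hypothetical:

```python
def uniq_fromkeys(lines):
    """Drop duplicates while keeping first-occurrence order."""
    # dict.fromkeys builds a dict whose keys are the lines; a repeated key
    # keeps its original position, so the keys come out in first-seen order.
    return list(dict.fromkeys(lines))


print(uniq_fromkeys(["a", "c", "d", "b", "c", "d"]))  # ['a', 'c', 'd', 'b']
```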
Michele
--
If, on the night she conceived the Duce,
Donna Rosa, touched by divine light,
had offered the blacksmith of Predappio
her backside rather than her front,
Rosa alone would have been buggered that evening,
and not the whole of Italy.
- Antifascist poem