From: Bob Proulx <bob@proulx.com>
To: help-gnu-emacs@gnu.org
Subject: Re: Most used words in current buffer
Date: Thu, 19 Jul 2018 14:42:36 -0600
Message-ID: <20180719140935156302029@bob.proulx.com>
In-Reply-To: <piq3hm$sff$1@dont-email.me>

Udyant Wig wrote:
> Bob Proulx wrote:
> > Hmm...  I think looking behind the abstract data type at how it might
> > or might not be implemented is a stretch to say the least.  The entire
> > reason it is an abstract data type is to hide those types of
> > implementation details.  :-)
> 
> Indeed.  The principle generalizes to abstract functionality as well,
> doesn't it?  E.g. qsort in the C library may or may not be an
> implementation of quick sort; or closer to the topic of the newsgroup,
> one ought not to care about the actual algorithm of the SORT function in
> Emacs.

Yes!  That is it exactly!

> > The naming of things usually says more about the person that named the
> > thing than the thing itself.  Associative arrays is a naming that
> > reflects the concepts involved in what it does.  This is the same as
> > when someone else names it a map table, or a dictionary.  Those are
> > all the same thing.  Just using different names because people took
> > different paths to get there.
> 
> Yes.  Just as, arguably, vectors are a special case of the general
> concept of arrays, though the terms are commonly used to name the same
> thing.

Agree completely.

> > For such things I generally prefer balanced tree structures because
> > work is amortized instead of lumped.  But the important point here is
> > that for every algorithm + data structure there is a trade-off of some
> > sort between one thing and another thing.
> 
> Hmm.  I had written a tree version of the word counter I had mentioned
> before.  I had stumbled upon the AVL tree package in Emacs and thought I
> might try using it.  This tree-based attempt turned out to be slower
> than my straightforward hashing solution.
> 
> I have no doubts this code could be written better by someone more
> experienced than I.

I don't know if the AVL package you used was implemented in elisp or
in C or otherwise.  And even though I am a long time user of emacs I
have never acquired the elisp skill to the same level as other
languages and therefore can't comment on that part.  But I know that
when people have implemented such data structures in Perl that the
result has never been as fast and efficient as in a native C version.
If the AVL package is implemented in elisp then that may easily
account for the performance difference.  Also, the native
implementations of "hashes" in awk, perl, python, and ruby are highly
optimized and very fast.  They have had years of eyes and tweaking
upon them.
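
For illustration, here is a minimal sketch of a word counter built on
awk's native associative arrays.  The function name word_freq and the
sample sentence are hypothetical, chosen only for this example, and
this is not the elisp code discussed in the thread.

```shell
# Hypothetical helper: read text on stdin and print "count word"
# lines, most frequent first (ties broken alphabetically).
word_freq() {
    # Put each run of letters on its own line, then lowercase it.
    tr -cs '[:alpha:]' '\n' |
        tr '[:upper:]' '[:lower:]' |
        # count[] is an awk associative array keyed by the word.
        awk 'NF { count[$1]++ } END { for (w in count) print count[w], w }' |
        sort -k1,1nr -k2,2
}

# Show the three most frequent words of a sample sentence.
printf 'the cat and the hat and the bat\n' | word_freq | sed 3q
```

For the sample sentence this should report "the" three times and
"and" twice, followed by one of the words that occur once.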

> > I am in total agreement over using sed instead of head if you want to
> > do that.  Seeing 'sed 20q' should roll off the keyboard as print lines
> > until line 20 and then quit.  Very simple and to the point.  There is
> > definitely no need for a separate head command.  Other than for
> > symmetry with tail which is not as simple in sed.
> 
> I see that.  You could implement head on top of sed if you wanted to.  I
> myself have been using head for long enough for its stated purpose that
> grasping a sed equivalent was not immediately obvious.

Writing clear code that can be understood immediately by the entire
range of programmer skill is important in my not so humble opinion.
One shouldn't need to be a master experienced programmer to understand
what has been written.  Therefore I usually use 'head' specifically
for the clarity of it to everyone.  Seeing "head -n40" is not going to
confuse anyone, so I prefer it to "sed 40q" even though I could remove
'head' entirely from my system if I were to uniformly implement one in
terms of the other.  Clarity is more important.
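
A small sketch of the equivalence, assuming 'seq' is available to
generate sample input (the variable names are only for illustration):

```shell
# Generate 100 numbered lines as sample input (illustrative only).
# Each command prints the first 40 lines of its input.
with_head=$(seq 1 100 | head -n 40)
with_sed=$(seq 1 100 | sed 40q)

# The two outputs should be identical.
if [ "$with_head" = "$with_sed" ]; then
    echo "same output"
fi
```

Both forms print the same 40 lines.  The 'head' form simply states
the intent more directly.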

And before someone mentions performance let me remind everyone that we
are talking shell scripts.  In a shell script clarity is more important
than performance.  Always.  If the resulting shell script has a
performance problem then choosing a better algorithm will almost
certainly be the better solution.  And if not, then choosing a
different language more efficient at the task is next.

I do expect some skill to be learned with 'awk' however.  It is so
very useful that seeing "awk '{print$1}'" should not be confusing: it
prints the first field (column).  Likewise '{print$NF}' is a common
idiom for printing the last field.  (NF is the Number of
Fields in the line that was split by whitespace.  $NF is therefore the
last field.  If NF is 5 then $NF is saying $5 and therefore always the
last field of the line.)  A little bit of awk learning pays back a
large return on the investment.
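
A short sketch of both idioms, using a made-up five-field line (the
sample text and variable names are hypothetical):

```shell
# A sample line with five whitespace-separated fields (illustrative only).
line='alpha beta gamma delta epsilon'

# $1 is the first field.
first=$(printf '%s\n' "$line" | awk '{print $1}')

# NF is the number of fields, so $NF is the last field.
last=$(printf '%s\n' "$line" | awk '{print $NF}')
count=$(printf '%s\n' "$line" | awk '{print NF}')

echo "first=$first last=$last fields=$count"
```

Here NF is 5, so $NF names the same field as $5.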

> These things do take time to gain currency, don't they?  Under Linux,
> for example, the ip set of commands has been named the successor to
> ifconfig, and it too is taking time to diffuse into general knowledge.

Yes.  And 'ip' is an excellent example!  Even I have converted to
using ip and the iproute2 family instead of ifconfig.

One thing to note about the iproute2 family is that it is reasonably
well written.  We are not forced to use it.  Instead we are attracted
to using it in order to get access to the entire set of new networking
features available only through it.  It is a carrot, not a stick.

> (And, although there have been a number of revisions of Standard C since
> 1989/1990, a lot of projects still write to that now legacy standard.
> But there may be other issues to consider here.)

Another good example. :-)

Bob


