From: <tomas@tuxteam.de>
To: help-gnu-emacs@gnu.org
Subject: Re: alist keys: strings or symbols
Date: Mon, 20 Jul 2020 11:01:35 +0200
Message-ID: <20200720090135.GA8851@tuxteam.de>
In-Reply-To: <MCbssBv--3-2@tutanota.com>

On Sun, Jul 19, 2020 at 06:23:52PM +0200, excalamus--- via Users list for the GNU Emacs text editor wrote:
> Some questions about alists:
> 
> - Is it a better practice to convert string keys to symbols?

It depends. Strings have an "inner life", i.e. are sequences
of characters, symbols are atomic and have no innards (but
see below).

So if you just want to know whether two keys are equal or not,
symbols are the more appropriate choice: it'll be faster, too;
if you find yourself asking whether one key is "greater" (that'd
be lexicographically, I guess) or "less" than another, or whether
it has such-and-such a prefix, you'd rather want a string.

The borders are somewhat fuzzy, since it's possible to extract
the string representation of a symbol. In Emacs Lisp they are
even fuzzier, since you can treat, given the right context, a
symbol as a string. This works in Emacs Lisp:

  (string< 'boo "far")
  => t

Emacs Lisp transforms 'boo to "boo" and compares the strings
lexicographically.
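
A small sketch of that fuzziness, which you can evaluate in the scratch
buffer: =symbol-name= does the extraction explicitly, =string<= does
it for you.

```emacs-lisp
;; Explicit: get the string representation of a symbol.
(symbol-name 'boo)                 ; => "boo"

;; Implicit: string< accepts symbols and compares their names.
(string< 'boo "far")               ; => t
(string< (symbol-name 'boo) "far") ; => t, the same comparison spelled out
```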

* Different equalities:

What you have to bear in mind is that there are different measures
of equality: if you are comparing just the "objects" (if you come
from C, that's --basically-- the objects' addresses), you use =eq=.
In that case, asking for "greater" or "less" doesn't make much sense.

If you are comparing the objects' "innards", you use =equal=.
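
To see the difference, here is a minimal sketch. It uses =copy-sequence=
only to be sure of getting a second string object with the same
contents:

```emacs-lisp
(let* ((a "title")
       (b (copy-sequence a)))     ; same characters, different object
  (list (eq a a)                  ; => t   same object
        (eq a b)                  ; => nil different objects
        (equal a b)               ; => t   same contents
        (eq 'title 'title)))      ; => t   an interned symbol is one object
;; => (t nil t t)
```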

>  Is =intern= best for this?  What about handling illegal symbol names?

Yes. And... there are few, if any, illegal symbol names. Try

  (setq foo (intern ".("))

It works. It's a funny symbol, but who cares ;-)
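
If you want to convince yourself, a quick check along these lines
should do (just a sketch, nothing here is specific to your code):

```emacs-lisp
(let ((sym (intern ".(")))
  (list (symbolp sym)              ; => t
        (symbol-name sym)          ; => ".("
        (eq sym (intern ".("))))   ; => t, interning again returns the same symbol
;; => (t ".(" t)
```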

> - If a symbol is used as a key and that symbol is already in use
>   elsewhere, is there potential for conflict with the existing symbol?

No. Interning something gives you an address (well, there's a type
tag attached to it). If it's used somewhere else, it'll reuse that,
otherwise, a new symbol is created. As an alist key you only rely on
the symbol's identity, which never changes, and you don't touch its
value or function cell, so you don't care.

[...]

> Notice that the keys are strings.  This means that they require
> an equality predicate like ='string-equal= to retrieve unless I use
> =assoc= and =cdr=:

They only require it because you want them compared _as strings_. Had
you put symbols in there, then you could have used =eq= as comparison,
which is the default for =alist-get= (so you can leave it out).
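
Side by side, with made-up data modelled on your keys (the alists
below are my assumption, not your parser's actual output):

```emacs-lisp
;; Hypothetical data, modelled on the keys in the quoted example.
(let ((by-string (list (cons "AUTHOR" "Excalamus") (cons "DATE" "2020-07-17")))
      (by-symbol (list (cons 'AUTHOR "Excalamus") (cons 'DATE "2020-07-17"))))
  (list
   ;; String keys: alist-get needs an explicit TESTFN (Emacs 26.1 or later) ...
   (alist-get "AUTHOR" by-string nil nil #'string-equal)
   ;; ... or you go through assoc, which compares with equal.
   (cdr (assoc "AUTHOR" by-string))
   ;; Symbol keys: the default eq test is all you need.
   (alist-get 'AUTHOR by-symbol)))
;; => ("Excalamus" "Excalamus" "Excalamus")
```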

[...]

> This works, but now the code is getting messy. There are two forms of
> lookup: the verbose =alist-get= and the brute force =assoc/cdr=.  One
> requires ='string-equal=, the other does not.  If I forget the
> predicate, the lookup will fail silently.

"fail silently" meaning that it's looking for the wrong thing in your
assoc list and not finding it.
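
A sketch of that silent failure. The =copy-sequence= is only there so
that the key stored in the alist is certainly a different object from
the literal used for the lookup, as it would be after parsing a buffer:

```emacs-lisp
(let ((alist (list (cons (copy-sequence "AUTHOR") "Excalamus"))))
  (list
   ;; Default test is eq: different string objects, so nothing is found.
   (alist-get "AUTHOR" alist)                          ; => nil, no error
   ;; With a string predicate the contents are compared instead.
   (alist-get "AUTHOR" alist nil nil #'string-equal))) ; => "Excalamus"
;; => (nil "Excalamus")
```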

> I could convert the keys to symbols using =intern=.  

All that said, I'd say go with this... unless you find yourself
looking at the innards of your keys too often (extracting prefixes,
doing case-insensitive search, that kind of thing). Remember that
=eq= is just one comparison (address, basically), whereas =equal=
has to first dereference the string and then compare character by
character.

Your keywords are a choice from a limited set, and are immutable,
so to me, they /look/ like symbols. That seems to be the fitting
representation.
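
If you go that way, the conversion itself is short. A minimal sketch,
where =my-alist-intern-keys= is a name I made up for illustration (it
is neither part of Emacs nor of your code):

```emacs-lisp
(defun my-alist-intern-keys (alist)
  "Return a copy of ALIST with every string key replaced by a symbol.
Hypothetical helper, not part of Emacs or of the quoted code."
  (mapcar (lambda (pair)
            (let ((key (car pair)))
              ;; Leave keys that are already symbols (or anything else) alone.
              (cons (if (stringp key) (intern key) key)
                    (cdr pair))))
          alist))

(alist-get 'AUTHOR
           (my-alist-intern-keys '(("AUTHOR" . "Excalamus")
                                   ("DATE" . "2020-07-17"))))
;; => "Excalamus"
```

After that, every lookup is a plain =alist-get= with the default test.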

> This has several apparent problems.
> 
> As I understand it, this would pollute the global obarray. Is that a
> real concern?

Shouldn't be. The global obarray is built for this.

> [...]  Regardless, I
> don't want my package to conflict with (i.e. overwrite) a person's
> environment unknowingly.

It won't. The obarray just maps a string to some immutable thingy
(basically a pointer with some decorations). This thingy can be
used for many things in different contexts. If some package out
there, say =shiny-widgets.el= binds some variable to the symbol
named "THE TITLE", that won't interfere with your usage. You just
happen to both use the symbol =0xdeadbeef-plus-some-type-tags=
(which points to the symbol "THE TITLE" in the obarray) for
different things.
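
To make that concrete, here is a sketch in which the =set= call stands
in for whatever the (hypothetical) =shiny-widgets.el= might do with the
symbol; the value 42 is arbitrary:

```emacs-lisp
(let ((sym (intern "THE TITLE")))
  ;; Stand-in for the other package: give the symbol a global value.
  (set sym 42)
  ;; Our own use: the same symbol, but only as an alist key.
  (let ((alist (list (cons sym "Test post"))))
    (list (alist-get sym alist)    ; => "Test post"
          (symbol-value sym))))    ; => 42, untouched by our alist
;; => ("Test post" 42)
```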

> 
> The string may also have characters illegal for use as a symbol.  
> Here's what happens with illegal symbol characters in the string.
> #+begin_src emacs-lisp :results verbatim :session exc
> (setq exc-bad-meta-data
>   (concat
>    "#+THE TITLE: Test post\n"
>    "#+AUTHOR: Excalamus\n"
>    "#+DATE: 2020-07-17\n"
>    "#+POST TAGS: blogging tests\n"
>    "\n"))
> 
> (setq exc-alist-i-bad (exc-parse-org-meta-data-intern exc-bad-meta-data))

I haven't had a look at your code, but "THE TITLE" interns fine as a
symbol here.
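
Roughly what I tried (a sketch, independent of your parsing function).
The only wrinkle is that the space needs a backslash when you write
the symbol in source code:

```emacs-lisp
(let ((sym (intern "THE TITLE")))
  (list (symbol-name sym)                              ; => "THE TITLE"
        (eq sym 'THE\ TITLE)                           ; => t
        (alist-get sym (list (cons sym "Test post"))))) ; => "Test post"
;; => ("THE TITLE" t "Test post")
```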

The important thing is that you make a choice and stick consistently
to it. That includes being aware of the comparison functions used.

Cheers
-- t
