The function `follow_key` was changed by the problematic commit.
Formerly, several keymaps were passed in an array, and each keymap
was checked in turn for a binding. One of the keymaps is
`evil-esc-map`. When this keymap was checked, no binding was found, so
the next keymap was checked; it may contain a binding for M-x, in
which case that binding was used.
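To make the old behaviour concrete, here is a rough Lisp-level sketch
of "check each keymap in turn". It is only an analogy for what
`follow_key` did in C, not the actual implementation; `my-esc-map`,
`my-global-map` and `my-first-binding` are hypothetical names made up
for the illustration.

```elisp
(require 'seq)

;; Hypothetical stand-ins for `evil-esc-map' and the global map.
(defvar my-esc-map (make-sparse-keymap))
(define-key my-esc-map (kbd "ESC") #'ignore)   ; binds ESC itself, nothing for ESC x

(defvar my-global-map (make-sparse-keymap))
(define-key my-global-map (kbd "M-x") #'execute-extended-command)

(defun my-first-binding (key maps)
  "Return the first usable binding of KEY in MAPS, checking each map in turn."
  (seq-some (lambda (map)
              (let ((binding (lookup-key map key)))
                ;; `lookup-key' may return nil or a number when KEY has no
                ;; binding in MAP; treat both as "not found here".
                (and binding (not (numberp binding)) binding)))
            maps))

(my-first-binding (kbd "M-x") (list my-esc-map my-global-map))
;; => execute-extended-command
```

The first map yields nothing usable for M-x, so the search falls
through to the second one, which is the fall-through the old code
depended on.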
Oh, I think I see what's going on. So the Evil code (and Viper, since
it seems to use the same gymnastics) really relies on some pretty nasty
detail of the level at which the M-x => ESC x rewriting took place,
which was subtly changed.
That could also explain why `f1 f M-x' already didn't find the binding
in the old code.
Anyhow, the real problem is to "multiplex" the (kbd "ESC") event in
the terminal. Any solution that sends 'escape instead of (kbd "ESC")
if no other event arrives within a short period should solve the
problem.
Now my question is: why do it with a minor-mode map rather than with
an input-decode-map (which would also save you from having to rely on
unread-command-events)? Oh, yes, of course, that input-decode-map
binding would collide with the escape-sequence remappings.
How 'bout something like:
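;; Remember whatever ESC was bound to before, i.e. the escape-sequence
;; remappings already present in input-decode-map.
(defvar evil-normal-esc-map (lookup-key input-decode-map [?\e]))
(define-key input-decode-map
  [?\e] `(menu-item "" ,evil-normal-esc-map
          :filter ,(lambda (map)
                     ;; No further input within the delay: a lone ESC.
                     ;; Otherwise keep decoding the escape sequence.
                     (if (sit-for evil-esc-delay) [escape] map))))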