unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#56824: 29.0.50; mail-header-parse-address drops the 1st character from the name
@ 2022-07-29 13:31 Sam Steingold
  2022-07-29 13:36 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 5+ messages in thread
From: Sam Steingold @ 2022-07-29 13:31 UTC (permalink / raw)
  To: 56824

On several occasions:

https://debbugs.gnu.org/cgi/bugreport.cgi?bug=10406
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=56422

Lars told me to use `mail-header-parse-address' instead of
`mail-extract-address-components'.

Well, I tried, with a positive effect (thanks Lars!),
but here is the 1st time the former is deficient:

--8<---------------cut here---------------start------------->8---
(mail-header-parse-addresses "Štěpán Němec <stepnem@gmail.com>")
==> (("stepnem@gmail.com" . "těpán Němec"))
(mail-extract-address-components "Štěpán Němec <stepnem@gmail.com>")
==> ("Štěpán Němec" "stepnem@gmail.com")
--8<---------------cut here---------------end--------------->8---

Thank you.


In GNU Emacs 29.0.50 (build 5, x86_64-apple-darwin21.5.0, NS appkit-2113.50 Version 12.4 (Build 21F79))
 of 2022-07-25 built on 3c22fb11fdab.ant.amazon.com
Repository revision: ffe12ff2503917e47c0356195b31430996c148f9
Repository branch: master
Windowing system distributor 'Apple', version 10.3.2113
System Description:  macOS 12.4

Configured using:
 'configure --with-imagemagick --with-mailutils --with-ns
 PKG_CONFIG_PATH='

Configured features:
ACL GIF GMP GNUTLS IMAGEMAGICK JPEG JSON LCMS2 LIBXML2 MODULES NOTIFY
KQUEUE NS PDUMPER PNG SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS WEBP ZLIB

Important settings:
  value of $LANG: C
  locale-coding-system: utf-8-unix

Major mode: ELisp/l

Minor modes in effect:
  shell-dirtrack-mode: t
  pyvenv-mode: t
  global-edit-server-edit-mode: t
  winner-mode: t
  which-function-mode: t
  url-handler-mode: t
  desktop-save-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  show-paren-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  column-number-mode: t
  line-number-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  abbrev-mode: t

Load-path shadows:
None found.

Features:
(edebug ein-ipynb-mode js ein-process kmacro tramp-cmds ein-jupyter
ein-dev ein-notebook ein-python-send ein-traceback ein-pytools ein-pager
ein-completer ein-notification ein-scratchsheet ein-worksheet poly-ein
quail polymode poly-lock polymode-base polymode-weave polymode-export
polymode-compat polymode-methods polymode-core polymode-classes
eieio-custom eieio-base ein-kill-ring ein-cell ein-shared-output
ein-output-area ein-kernelinfo ein-kernel ein-ipdb ein-events
ein-websocket websocket bindat ein-file ein-node ein-notebooklist
ein-contents-api ein-query ein-log ein-classes ein-core request
autorevert anaphora ein-utils deferred dash ein vc-annotate face-remap
log-edit apropos gnus-fun pulse ffap cal-move conf-mode shadow emacsbug
find-dired markdown-mode color cl-print debug backtrace flow-fill
misearch multi-isearch smtpmail skeleton shortdoc dabbrev rot13
bbdb-message mailalias cookie1 sort smiley gnus-cite mail-extr qp nndoc
textsec uni-scripts idna-mapping uni-confusable textsec-check gnus-async
gnus-bcklg gnus-dup gnus-ml hl-line disp-table spam spam-stat gnus-uu
yenc utf-7 nndraft nnmh gnus-agent gnus-srvr gnus-score nnvirtual
gnus-msg gnus-cache bbdb-gnus nntp vc-src vc-sccs vc-svn vc-cvs vc-rcs
log-view pcvs-util tempo make-mode loaddefs-gen lisp-mnt tar-mode
mm-archive network-stream url-http url-gw nsm url-cache url-auth
display-line-numbers finder-inf package vc-hg vc-bzr tramp-cache
tramp-sh tramp tramp-loaddefs trampver tramp-integration tramp-compat
shell ls-lisp remember sgml-mode facemenu arc-mode archive-mode
company-oddmuse company-keywords company-etags company-gtags
company-dabbrev-code company-dabbrev company-files company-clang
company-template company-cmake company-bbdb yasnippet-snippets yasnippet
flymake-proc flymake company-capf company help-fns radix-tree elpy
elpy-rpc pyvenv eshell esh-cmd esh-ext esh-opt esh-proc esh-io esh-arg
esh-module esh-groups esh-util elpy-shell elpy-profile elpy-django s
elpy-refactor ido hideshow grep compile files-x etags fileloop xref
project cus-edit pp cus-start python add-log score-mode texinfo
texinfo-loaddefs pcase emacs-news-mode dired-aux cc-mode cc-fonts
cc-guess cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs
smerge-mode diff vc-dir ewoc vc bug-reference flyspell ispell
org-element avl-tree generator cl-extra help-mode ol-eww eww xdg
url-queue thingatpt mm-url ol-rmail ol-mhe ol-irc ol-info ol-gnus
nnselect gnus-art mm-uu mml2015 mm-view mml-smime smime gnutls dig
gnus-sum shr pixel-fill kinsoku url-file url-dired svg dom browse-url
url url-proxy url-privacy url-expand url-methods url-history url-cookie
generate-lisp-file url-domsuf url-util gnus-group gnus-undo gnus-start
gnus-dbus dbus xml gnus-cloud nnimap nnmail mail-source utf7 netrc nnoo
parse-time gnus-spec gnus-int gnus-range message sendmail mailcap
yank-media puny rfc822 mml mml-sec epa derived epg rfc6068 epg-config
mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045
ietf-drums gmm-utils mailheader gnus-win ol-docview doc-view filenotify
jka-compr image-mode exif dired dired-loaddefs ol-bibtex ol-bbdb ol-w3m
ol-doi org-link-doi org ob ob-tangle ob-ref ob-lob ob-table ob-exp
org-macro org-footnote org-src ob-comint org-pcomplete pcomplete comint
ansi-color org-list org-faces org-entities noutline outline org-version
ob-emacs-lisp ob-core ob-eval org-table oc-basic bibtex iso8601 ol rx
org-keys oc org-compat org-macs org-loaddefs format-spec find-func cal-x
view cal-china cal-bahai cal-islam holidays holiday-loaddefs bbdb-anniv
cal-iso cal-hebrew lunar cal-julian solar cal-dst appt diary-lib
diary-loaddefs cal-menu calendar cal-loaddefs vc-git diff-mode
whitespace easy-mmode vc-dispatcher midnight warnings gnus nnheader
gnus-util text-property-search time-date mail-utils range mm-util
mail-prsvr wid-edit bbdb-mua bbdb-com crm mailabbrev bbdb bbdb-site
timezone edit-server advice server winner ring which-func imenu
url-handlers url-parse auth-source cl-seq eieio eieio-core cl-macs
password-cache json subr-x map byte-opt gv bytecomp byte-compile cconv
url-vars help-at-pt desktop frameset cl-loaddefs cl-lib cus-load info
bbdb-autoloads ein-autoloads elpy-autoloads company-autoloads
async-autoloads f-autoloads dash-autoloads markdown-mode-autoloads
request-autoloads with-editor-autoloads compat-autoloads
yasnippet-snippets-autoloads rmc iso-transl tooltip eldoc paren electric
uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel
term/ns-win ns-win ucs-normalize mule-util term/common-win tool-bar dnd
fontset image regexp-opt fringe tabulated-list replace newcomment
text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow
isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax
font-core term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
faces cus-face macroexp files window text-properties overlay sha1 md5
base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads kqueue cocoa ns lcms2
multi-tty make-network-process emacs)

Memory information:
((conses 16 2715614 260146)
 (symbols 48 63656 81)
 (strings 32 558432 23159)
 (string-bytes 1 16334268)
 (vectors 16 178969)
 (vector-slots 8 3589518 493193)
 (floats 8 1638 1494)
 (intervals 56 269645 10452)
 (buffers 992 253))

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.2113
http://childpsy.net http://calmchildstories.com http://steingoldpsychology.com
https://iris.org.il https://honestreporting.com https://thereligionofpeace.com
Warning! Dates in calendar are closer than they appear!





From: Lars Ingebrigtsen @ 2022-07-29 13:36 UTC (permalink / raw)
  To: Sam Steingold; +Cc: 56824

Sam Steingold <sds@gnu.org> writes:

> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=10406
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=56422
>
> Lars told me to use `mail-header-parse-address' instead of
> `mail-extract-address-components'.

Well, not exactly.

> Well, I tried, with a positive effect (thanks Lars!),
> but here is the 1st time the former is deficient:
>
> (mail-header-parse-addresses "Štěpán Němec <stepnem@gmail.com>")
> ==> (("stepnem@gmail.com" . "těpán Němec"))
> (mail-extract-address-components "Štěpán Němec <stepnem@gmail.com>")
> ==> ("Štěpán Němec" "stepnem@gmail.com")

`mail-header-parse-addresses' is for parsing RFC822bis mail addresses.
"Štěpán Němec <stepnem@gmail.com>" is definitely not one of those.

You're probably looking for `mail-header-parse-address-lax':

(mail-header-parse-address-lax "Štěpán Němec <stepnem@gmail.com>")
-> ("stepnem@gmail.com" . "Štěpán Němec")

This should probably be documented better.







From: Sam Steingold @ 2022-07-29 13:58 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 56824

> * Lars Ingebrigtsen <ynefv@tahf.bet> [2022-07-29 15:36:20 +0200]:
>
> Sam Steingold <sds@gnu.org> writes:
>
>> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=10406
>> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=56422
>>
>> Lars told me to use `mail-header-parse-address' instead of
>> `mail-extract-address-components'.
>
> Well, not exactly.

oops. sorry. ;-)

>> Well, I tried, with a positive effect (thanks Lars!),
>> but here is the 1st time the former is deficient:
>>
>> (mail-header-parse-addresses "Štěpán Němec <stepnem@gmail.com>")
>> ==> (("stepnem@gmail.com" . "těpán Němec"))
>> (mail-extract-address-components "Štěpán Němec <stepnem@gmail.com>")
>> ==> ("Štěpán Němec" "stepnem@gmail.com")
>
> `mail-header-parse-addresses' is for parsing RFC822bis mail addresses.
> "Štěpán Němec <stepnem@gmail.com>" is definitely not one of those.

Hmmm, it _used to be_
--8<---------------cut here---------------start------------->8---
"=?utf-8?B?xaB0xJtww6FuIE7Em21lYw==?= <stepnem@gmail.com>"
--8<---------------cut here---------------end--------------->8---
(is that what RFC822bis specifies?)

but was passed through `mail-decode-encoded-word-string':
--8<---------------cut here---------------start------------->8---
(mail-decode-encoded-word-string "=?utf-8?B?xaB0xJtww6FuIE7Em21lYw==?= <stepnem@gmail.com>")
==> "Štěpán Němec <stepnem@gmail.com>"
--8<---------------cut here---------------end--------------->8---
and then passed on to `mail-header-parse-addresses'.

What is TRT here?

Should I call `mail-header-parse-addresses' _first_ and then
`mail-decode-encoded-word-string'?
Come to think of it, passing _email address_ through
`mail-decode-encoded-word-string' seems unnecessary...

--8<---------------cut here---------------start------------->8---
(mail-header-parse-addresses "=?utf-8?B?xaB0xJtww6FuIE7Em21lYw==?= <stepnem@gmail.com>")
==> (("stepnem@gmail.com" . "=?utf-8?B?xaB0xJtww6FuIE7Em21lYw==?="))
(mail-decode-encoded-word-string "=?utf-8?B?xaB0xJtww6FuIE7Em21lYw==?=")
==> "Štěpán Němec"
--8<---------------cut here---------------end--------------->8---

> You're probably looking for `mail-header-parse-address-lax':
>
> (mail-header-parse-address-lax "Štěpán Němec <stepnem@gmail.com>")
> -> ("stepnem@gmail.com" . "Štěpán Němec")

... Or maybe I should keep the current order of processing and use
`mail-header-parse-address-lax'?

Thank you for your speedy and enlightening reply!

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.2113
http://childpsy.net http://calmchildstories.com http://steingoldpsychology.com
http://think-israel.org https://thereligionofpeace.com https://iris.org.il
In the race between idiot-proof software and idiots, the idiots are winning.





From: Lars Ingebrigtsen @ 2022-07-30 11:46 UTC (permalink / raw)
  To: Sam Steingold; +Cc: 56824

Sam Steingold <sds@gnu.org> writes:

> Should I call `mail-header-parse-addresses' _first_ and then
> `mail-decode-encoded-word-string'?

Yup.  Or use the RAWP parameter to `mail-header-parse-addresses' and
then parse with `mail-header-parse-address' with a DECODE parameter.
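
A minimal sketch of both orderings, assuming the header string is still
RFC 2047-encoded when it reaches the parser.  The helper names
`my/parse-then-decode' and `my/parse-raw-then-decode' are hypothetical;
the `mail-*' functions come from mail-parse.el, and RAWP and DECODE are
passed as the optional second arguments described above.

```elisp
(require 'mail-parse)

(defun my/parse-then-decode (header)
  "Parse still-encoded HEADER, then decode each display name.
Return a list of (ADDRESS . NAME) pairs; NAME may be nil."
  (mapcar (lambda (pair)
            (cons (car pair)
                  ;; Decode only the name; the address needs no decoding.
                  (and (cdr pair)
                       (mail-decode-encoded-word-string (cdr pair)))))
          (mail-header-parse-addresses header)))

(defun my/parse-raw-then-decode (header)
  "Split still-encoded HEADER into raw address strings, then parse each.
The first call passes RAWP non-nil; the second passes DECODE non-nil."
  (mapcar (lambda (raw) (mail-header-parse-address raw t))
          (mail-header-parse-addresses header t)))
```

Going by the results shown earlier in this thread, calling
`my/parse-then-decode' on
"=?utf-8?B?xaB0xJtww6FuIE7Em21lYw==?= <stepnem@gmail.com>" should give
(("stepnem@gmail.com" . "Štěpán Němec")).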

>> You're probably looking for `mail-header-parse-address-lax':
>>
>> (mail-header-parse-address-lax "Štěpán Němec <stepnem@gmail.com>")
>> -> ("stepnem@gmail.com" . "Štěpán Němec")
>
> ... Or maybe I should keep the current order of processing and use
> `mail-header-parse-address-lax'?

No, that'll always be less reliable, because the -lax version uses
heuristics.  This makes a difference when the address is something weird,
like a name part containing an @ character.  For example,
"=?utf-8?Q?=40ndr=C3=A9?= <andre@example.com>" decodes to "@ndré
<andre@example.com>", which is valid and appears in the wild.
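
A small sketch for comparing the two routes on that example, assuming an
Emacs that provides `mail-header-parse-address-lax' (as used earlier in
this thread).  `my/compare-parsers' is a hypothetical helper; no results
are asserted here, since the heuristic output may vary.

```elisp
(require 'mail-parse)

(defun my/compare-parsers (encoded)
  "Return (STRICT . LAX) parse results for the still-encoded header ENCODED.
STRICT parses the raw header and decodes the name afterwards.
LAX decodes the whole header first, then applies the heuristic parser."
  (cons (mail-header-parse-address encoded t)
        (mail-header-parse-address-lax
         (mail-decode-encoded-word-string encoded))))

;; Evaluate this to inspect both results side by side.
(my/compare-parsers "=?utf-8?Q?=40ndr=C3=A9?= <andre@example.com>")
```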






From: Lars Ingebrigtsen @ 2022-07-30 11:47 UTC (permalink / raw)
  To: Sam Steingold; +Cc: 56824

Anyway, I've now amended the doc strings of these functions in Emacs 29,
because this wasn't explained at all, unfortunately, and has led to a
lot of confusion.

And with that, I'm closing this bug report.




