From: Lars Ingebrigtsen <larsi@gnus.org>
To: 20258@debbugs.gnu.org
Cc: stefan@marxist.se, gunnar.horrigmo@usit.uio.no
Subject: bug#20258: 24.5; format-time-string miscounting of multibyte characters
Date: Mon, 30 Sep 2019 05:09:08 +0200
Message-ID: <87blv2a3aj.fsf@gnus.org>
In-Reply-To: <CADwFkmn+YiVh+b0HKisYMVw-3bzG6fD7xuQx5HKUMdgz8STxgQ@mail.gmail.com> (Stefan Kangas's message of "Mon, 30 Sep 2019 02:35:08 +0200")

Stefan Kangas <stefan@marxist.se> writes:

>>> As the subject says, format-time-string miscounts multibyte characters.
>>> Simple example with nb_NO.utf8 locale, where ø is two bytes:
>>>
>>> (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015"))
>>> "  lø."
>>>
>>> (length (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015")))
>>> 5
>>
>> 'length' counts characters, not bytes.  If you need to count bytes,
>> use 'string-bytes' instead:
>>
>>   (string-bytes "  lø.") => 6
>
> I can see no bug here, only a misunderstanding about the length
> function.  I'm therefore closing this bug.  If that's incorrect, please
> reopen this bug report.

But the issue here is that "%6a" should give you a string that's six
characters long, I think?  Admittedly the doc string is vague here:

---
A field width N is an unsigned decimal integer with a leading digit nonzero.
%NX is like %X, but takes up at least N positions.
---

But the natural interpretation of "positions" isn't bytes, I think, and
if it is, then the doc string should say so.

(let ((system-time-locale "nb_NO.UTF-8"))
  (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015")))
=> "  lø."

(if you have that locale in /etc/locale.gen.)

But I seem to remember from previous discussions that this quirk is in
the C strftime function?  And Emacs just calls it?  I haven't checked.
But this means that you can't use format-time-string to line stuff up,
and have to use `format' instead:

(let ((system-time-locale "nb_NO.UTF-8"))
  (format "%6s" (format-time-string "%a" (date-to-time "Sat Apr  4 16:14:40 2015"))))
=> "   lø."
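
To make the workaround above a bit more concrete, here's a minimal
sketch of padding by characters instead of bytes.  `my-pad-left' is a
hypothetical helper made up for illustration, not anything in Emacs,
and the locale is assumed to be installed:

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-pad-left (string width)
  "Left-pad STRING with spaces until it is at least WIDTH characters long.
Characters are counted with `length', not bytes."
  (concat (make-string (max 0 (- width (length string))) ?\s)
          string))

;; Three characters (four bytes) padded to six characters.
(my-pad-left "lø." 6)
;; => "   lø."

;; Same idea applied to the abbreviated day name.
;; Assumes the nb_NO.UTF-8 locale is available on the system.
(let ((system-time-locale "nb_NO.UTF-8"))
  (my-pad-left (format-time-string "%a" (date-to-time "Sat Apr  4 16:14:40 2015"))
               6))
```

If the goal is lining things up on screen, `string-width' may be a
better measure than `length', since it counts display columns.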

So I think what WIDTH means should be said explicitly in the doc string.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





