From mboxrd@z Thu Jan 1 00:00:00 1970
References: <86lfg6z0lm.fsf@gmail.com> <87tuuuvum5.fsf@elephly.net> <87r1pyvsn4.fsf@elephly.net> <87lffz6fmc.fsf@gnu.org> <878sbzut6q.fsf@elephly.net> <87zh4etqnj.fsf@elephly.net>
User-agent: mu4e 1.4.13; emacs 27.1
From: Ricardo Wurmus
To: Ludovic Courtès
Subject: Re: Manual PDF and translation (modular texlive?)
In-reply-to: <87zh4etqnj.fsf@elephly.net>
X-URL: https://elephly.net
X-PGP-Key: https://elephly.net/rekado.pubkey
X-PGP-Fingerprint: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
Date: Thu, 22 Oct 2020 14:50:40 +0200
Message-ID: <87wnzitof3.fsf@elephly.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
List-Id: "Development of GNU Guix and the GNU System distribution."
Cc: Guix Devel

Ricardo Wurmus writes:

> Ricardo Wurmus writes:
>
>>> What’s interesting is that it breaks accents in the table of contents,
>>> but not elsewhere.
>>
>> These double caret sequences are representations of multi-byte
>> characters.  “^^c3^^b6”, for example, is a lowercase o with umlaut (ö).
>>
>> The TeX log file contains a whole bunch of these messages:
>>
>>   l.139: Unicode char @u8:^^e5^^8f^^82 not defined for Texinfo
>>
>> Then later things like this:
>>
>>   Missing character: There is no ^^c3 in font cmr10!
>>   Missing character: There is no ^^9f in font cmr10!
>>   Missing character: There is no ^^c3 in font cmr10!
>>   Missing character: There is no ^^9f in font cmr10!
>>   Missing character: There is no ^^c3 in font cmr10!
>>   Missing character: There is no ^^a4 in font cmr10!
>>
>> I’m not sure this is correct, because it seems to me that “^^c3” is only
>> part of a longer multi-byte sequence, but this error indicates that
>> individual bytes are looked up in the font.
>
> With the full “texlive” package I also see “not defined for Texinfo” in
> the logs, but the characters use octal notation instead of double caret
> notation.  The generated guix.de.toc contains the correct characters
> with umlauts, while the .toc file generated with the modular TeX Live
> contains caret-notated characters.
>
> I’ll try to figure out why that is.

The reason is that the generated guix.de.toc file is ASCII-encoded in
the modular case but UTF-8 encoded in the monolithic case.

Why is that?  texinfo.tex enables byte-I/O for engines that do not have
native UTF-8 support; it uses native UTF-8 for LuaTeX and XeTeX only.
Sure enough, with

    PDFTEX=xetex make doc/guix.de.pdf

the TOC actually looks fine!  LuaTeX is broken due to a botched upgrade
(I’m working on a fix), so I haven’t tested it.

Two things are weird here:

1) texi2dvi still fails, because apparently “xetex” didn’t return a good
status code; the PDF was built fine, though.

2) We aren’t using XeTeX or LuaTeX with the monolithic “texlive”
package, so why does pdfTeX behave differently here?
I see in the logs that the date of the format file differs; does this
indicate that our pdfTeX format file is wrong?  I will compare the two
files.

Another observation: the pdftex.map file in the monolithic “texlive”
package is huge and mentions a great many fonts; in the modular TeX Live
it is generated only for fonts that are actually available.  It’s not
impossible that this font map needs more entries, but perhaps everything
is fine already.  I just can’t say for sure.

-- 
Ricardo
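
Supplementary sketch: the double-caret sequences quoted from the TeX log are hexadecimal escapes for single bytes, so decoding a run of them as UTF-8 recovers the intended character. The Python below is a minimal illustration under that assumption, not something taken from texinfo.tex or the Guix build. The names decode_carets and classify_encoding are hypothetical; the second helper shows one way to check whether a file such as guix.de.toc is pure ASCII or contains UTF-8.

```python
import re

# A run of one or more TeX double-caret byte escapes, e.g. "^^c3^^b6".
_CARET_RUN = re.compile(r"(?:\^\^[0-9a-f]{2})+")
# A single escape; the group captures the two hex digits.
_CARET_BYTE = re.compile(r"\^\^([0-9a-f]{2})")


def decode_carets(text):
    """Replace runs of ^^xx byte escapes with the UTF-8 text they encode."""

    def _decode_run(match):
        raw = bytes(int(h, 16) for h in _CARET_BYTE.findall(match.group(0)))
        # An incomplete sequence (a lone ^^c3, say) is not valid UTF-8;
        # it is rendered as U+FFFD instead of raising an error.
        return raw.decode("utf-8", errors="replace")

    # Maximal runs are decoded together so multi-byte characters survive.
    return _CARET_RUN.sub(_decode_run, text)


def classify_encoding(data):
    """Return "ascii", "utf-8", or "other" for the given bytes."""
    if data.isascii():
        return "ascii"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "other"
    return "utf-8"
```

With these definitions, decode_carets("^^c3^^9f") returns "ß" and decode_carets("^^e5^^8f^^82") returns "参", whereas a lone "^^c3" decodes to the replacement character. That matches the suspicion in the quoted text that “^^c3” on its own is only part of a longer sequence. Passing the bytes of each generated .toc file to classify_encoding would be one way to confirm the ASCII versus UTF-8 difference between the two builds.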