unofficial mirror of emacs-devel@gnu.org 
* Re: master updated (e6041a0 -> 6220fae)
       [not found] <20170215164632.13342.86743@vcs0.savannah.gnu.org>
@ 2017-02-17  5:26 ` Glenn Morris
  2017-02-17 14:51   ` Michal Nazarewicz
  2017-02-17 15:43   ` Michal Nazarewicz
  0 siblings, 2 replies; 7+ messages in thread
From: Glenn Morris @ 2017-02-17  5:26 UTC (permalink / raw)
  To: Michal Nazarewicz; +Cc: emacs-devel


Hi - this fails to bootstrap from a truly clean tree (probably you need
to "make extraclean" in an existing tree to see it).

Ref: http://hydra.nixos.org/build/48774928

  Loading lisp/international/characters.el (source)...
  Wrong type argument: char-table-p, nil




* Re: master updated (e6041a0 -> 6220fae)
  2017-02-17  5:26 ` master updated (e6041a0 -> 6220fae) Glenn Morris
@ 2017-02-17 14:51   ` Michal Nazarewicz
  2017-02-17 15:26     ` Eli Zaretskii
  2017-02-17 15:43   ` Michal Nazarewicz
  1 sibling, 1 reply; 7+ messages in thread
From: Michal Nazarewicz @ 2017-02-17 14:51 UTC (permalink / raw)
  To: Glenn Morris; +Cc: emacs-devel

On Fri, Feb 17 2017, Glenn Morris wrote:
> Hi - this fails to bootstrap from a truly clean tree (probably you
> need to "make extraclean" in an existing tree to see it).
>
> Ref: http://hydra.nixos.org/build/48774928
>
>   Loading lisp/international/characters.el (source)...
>   Wrong type argument: char-table-p, nil

Unicode tables aren’t available and this causes the failure. ;/

This is unfortunate since without Unicode tables it seems to me that
characters.el needs to be kept in sync with Unicode manually.

I don’t fully comprehend how bootstrapping works just yet, though.  In
particular, characters.el has:

    (when (setq unicode-category-table
                (unicode-property-table-internal 'general-category))
      (map-char-table #'(lambda (key val) …)
                      unicode-category-table))

So perhaps it’s enough to hard-code the syntax table for ASCII and
update the rest with a similar conditional?
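
A minimal sketch of that idea, not the code that was eventually committed:
ASCII letters get word syntax unconditionally, and everything else is
filled in only when a category table is actually present.  All names
prefixed `demo-' are hypothetical, and the hand-built char-table stands in
for whatever (unicode-property-table-internal 'general-category) returns in
a finished build.

```elisp
;; Names prefixed `demo-' are hypothetical; they are not part of Emacs.
(defun demo-apply-word-syntax (syn-table category-table)
  "Give letters word syntax in SYN-TABLE and return it.
ASCII letters are hard-coded.  Other letters are taken from
CATEGORY-TABLE, which may be nil when Unicode data is unavailable."
  (modify-syntax-entry '(?A . ?Z) "w" syn-table)
  (modify-syntax-entry '(?a . ?z) "w" syn-table)
  ;; Guard: `map-char-table' must never be handed nil.
  (when category-table
    (map-char-table
     (lambda (ch cat)
       ;; CH is a single character or a (FROM . TO) range; both are
       ;; accepted by `modify-syntax-entry'.
       (when (memq cat '(Lu Ll Lt))
         (modify-syntax-entry ch "w" syn-table)))
     category-table))
  syn-table)

;; Stand-in for the real general-category table: two Greek letters only.
(defvar demo-categories
  (let ((tbl (make-char-table 'demo-categories)))
    (aset tbl #x39B 'Lu)                ; GREEK CAPITAL LETTER LAMDA
    (aset tbl #x3BB 'Ll)                ; GREEK SMALL LETTER LAMDA
    tbl))

;; A parentless syntax table in which everything starts as punctuation.
(defvar demo-syntax
  (make-char-table 'syntax-table (string-to-syntax ".")))

(demo-apply-word-syntax demo-syntax nil)             ; early bootstrap
(demo-apply-word-syntax demo-syntax demo-categories) ; tables available
```

The first call only touches ASCII; the second call, with a table, fills in
the rest, so running both leaves the same result as running the second
alone.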

-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»




* Re: master updated (e6041a0 -> 6220fae)
  2017-02-17 14:51   ` Michal Nazarewicz
@ 2017-02-17 15:26     ` Eli Zaretskii
  0 siblings, 0 replies; 7+ messages in thread
From: Eli Zaretskii @ 2017-02-17 15:26 UTC (permalink / raw)
  To: Michal Nazarewicz; +Cc: rgm, emacs-devel

> From: Michal Nazarewicz <mina86@mina86.com>
> Date: Fri, 17 Feb 2017 15:51:00 +0100
> Cc: emacs-devel@gnu.org
> 
> On Fri, Feb 17 2017, Glenn Morris wrote:
> > Hi - this fails to bootstrap from a truly clean tree (probably you
> > need to "make extraclean" in an existing tree to see it).
> >
> > Ref: http://hydra.nixos.org/build/48774928
> >
> >   Loading lisp/international/characters.el (source)...
> >   Wrong type argument: char-table-p, nil
> 
> Unicode tables aren’t available and this causes the failure. ;/

I think they are not available because the uni-*.el files were not yet
produced.  Is that right?

If so, you could try conditioning the code on the availability of the
Unicode tables, because by the time characters.el is byte-compiled,
they will be.  And when we load characters.el as source, we hopefully
don't need syntax-table yet.
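
To illustrate the two-pass behaviour, here is a small sketch under stated
assumptions: `demo-collect-letters' is a hypothetical helper, and the
hand-built char-table again stands in for the real Unicode property table.
The first form reproduces the error from the build log; the guarded helper
then returns nothing on the early pass and the full result on the later
one.

```elisp
;; Names prefixed `demo-' are hypothetical; they are not part of Emacs.

;; Unguarded: handing nil to `map-char-table' signals the same error as
;; in the build log (wrong-type-argument char-table-p nil).
(defvar demo-error
  (condition-case err
      (map-char-table #'ignore nil)
    (wrong-type-argument err)))

(defun demo-collect-letters (category-table)
  "Return the list of letter characters found in CATEGORY-TABLE.
Return nil without signalling when CATEGORY-TABLE is nil."
  (let (acc)
    (when category-table
      (map-char-table
       (lambda (key cat)
         (when (memq cat '(Lu Ll Lt))
           ;; KEY is either one character or a (FROM . TO) range.
           (let ((from (if (consp key) (car key) key))
                 (to (if (consp key) (cdr key) key)))
             (while (<= from to)
               (push from acc)
               (setq from (1+ from))))))
       category-table))
    (sort acc #'<)))

;; Stand-in for the real table: part of the Greek alphabet.
(defvar demo-table
  (let ((tbl (make-char-table 'demo-table)))
    (set-char-table-range tbl '(#x391 . #x3A1) 'Lu)
    (set-char-table-range tbl '(#x3B1 . #x3C1) 'Ll)
    tbl))

(defvar demo-first-pass (demo-collect-letters nil))         ; loaded as source
(defvar demo-second-pass (demo-collect-letters demo-table)) ; tables built
```

Because the guarded code does nothing at all on the first pass, the second
pass alone determines the final state.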

> I don’t fully comprehend how bootstrapping works just yet, though.  In
> particular, characters.el has:
> 
>     (when (setq unicode-category-table
>                 (unicode-property-table-internal 'general-category))
>       (map-char-table #'(lambda (key val) …)
>                       unicode-category-table))
> 
> So perhaps it’s enough to hard-code the syntax table for ASCII and
> update the rest with a similar conditional?

Yes.  But first I'd try not to hard-code anything at all.

Thanks.




* Re: master updated (e6041a0 -> 6220fae)
  2017-02-17  5:26 ` master updated (e6041a0 -> 6220fae) Glenn Morris
  2017-02-17 14:51   ` Michal Nazarewicz
@ 2017-02-17 15:43   ` Michal Nazarewicz
  2017-02-17 16:33     ` Glenn Morris
  1 sibling, 1 reply; 7+ messages in thread
From: Michal Nazarewicz @ 2017-02-17 15:43 UTC (permalink / raw)
  To: Glenn Morris; +Cc: emacs-devel

On Fri, Feb 17 2017, Glenn Morris wrote:
> Hi - this fails to bootstrap from a truly clean tree (probably you need
> to "make extraclean" in an existing tree to see it).
>
> Ref: http://hydra.nixos.org/build/48774928
>
>   Loading lisp/international/characters.el (source)...
>   Wrong type argument: char-table-p, nil

The attached patch fixes this.  Does anyone know of any downside to this
approach?  If not, I’ll go ahead and push it.

From 9f9863e50298a3506165cc1f056ab3238f37cb9f Mon Sep 17 00:00:00 2001
From: Michal Nazarewicz <mina86@mina86.com>
Date: Fri, 17 Feb 2017 16:36:44 +0100
Subject: [PATCH] Fix build failure caused by ‘Generate upcase and downcase
 tables from Unicode’
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The [5ec3a584: Generate upcase and downcase tables from Unicode data]
commit broke bootstrap from a truly clean tree (e.g. a fresh clone or
one created with ‘make extraclean’), see
<http://hydra.nixos.org/build/48774928>.

The failure was caused by characters.el trying to read Unicode
property tables which aren’t available so early in the build process.

Wrap the part that requires Unicode property tables in a condition
checking if those are available.  If they aren’t, the case and syntax
tables won’t be fully set, but later on the characters.el file will be
evaluated again, this time with Unicode properties available, so the
final Emacs ends up with the exact same case and syntax tables.
---
 lisp/international/characters.el | 132 ++++++++++++++++++++-------------------
 1 file changed, 68 insertions(+), 64 deletions(-)

diff --git a/lisp/international/characters.el b/lisp/international/characters.el
index 9e993e7060a..3eb287fd963 100644
--- a/lisp/international/characters.el
+++ b/lisp/international/characters.el
@@ -665,70 +665,74 @@ ?L
   ;; Combining marks
   (modify-category-entry '(#x20d0 . #x20ff) ?^)
 
-  ;; Set all Letter, uppercase; Letter, lowercase and Letter, titlecase syntax
-  ;; to word.
-  (let ((syn-tab (standard-syntax-table)))
-    (map-char-table
-     (lambda (ch cat)
-       (when (memq cat '(Lu Ll Lt))
-         (modify-syntax-entry ch "w   " syn-tab)))
-     (unicode-property-table-internal 'general-category))
-
-    ;; Ⅰ through Ⅻ had word syntax in the past so set it here as well.
-    ;; The general category of those characters is Number, Letter.
-    (modify-syntax-entry '(#x2160 . #x216b) "w   " syn-tab)
-
-    ;; ⓐ through ⓩ are symbols, other according to Unicode but Emacs set
-    ;; their syntax to word in the past so keep backwards compatibility.
-    (modify-syntax-entry '(#x24D0 . #x24E9) "w   " syn-tab))
-
-  ;; Set downcase and upcase from Unicode properties
-
-  ;; In some languages, such as Turkish, U+0049 LATIN CAPITAL LETTER I and
-  ;; U+0131 LATIN SMALL LETTER DOTLESS I make a case pair, and so do U+0130
-  ;; LATIN CAPITAL LETTER I WITH DOT ABOVE and U+0069 LATIN SMALL LETTER I.
-
-  ;; We used to set up half of those correspondence unconditionally, but that
-  ;; makes searches slow.  So now we don't set up either half of these
-  ;; correspondences by default.
-
-  ;; (set-downcase-syntax  ?İ ?i tbl)
-  ;; (set-upcase-syntax    ?I ?ı tbl)
-
-  (let ((map-unicode-property
-         (lambda (property func)
-           (map-char-table
-            (lambda (ch cased)
-              ;; ASCII characters skipped due to reasons outlined above.  As of
-              ;; Unicode 9.0, this exception affects the following:
-              ;;   lc(U+0130 İ) = i
-              ;;   uc(U+0131 ı) = I
-              ;;   uc(U+017F ſ) = S
-              ;;   uc(U+212A K) = k
-              (when (> cased 127)
-                (let ((end (if (consp ch) (cdr ch) ch)))
-                  (setq ch (max 128 (if (consp ch) (car ch) ch)))
-                  (while (<= ch end)
-                    (funcall func ch cased)
-                    (setq ch (1+ ch))))))
-            (unicode-property-table-internal property))))
-        (down tbl)
-        (up (case-table-get-table tbl 'up)))
-
-    ;; This works on an assumption that if toUpper(x) != x then toLower(x) ==
-    ;; x (and the opposite for toLower/toUpper).  This doesn’t hold for title
-    ;; case characters but those incorrect mappings will be overwritten later.
-    (funcall map-unicode-property 'uppercase
-             (lambda (lc uc) (aset down lc lc) (aset up uc uc)))
-    (funcall map-unicode-property 'lowercase
-             (lambda (uc lc) (aset down lc lc) (aset up uc uc)))
-
-    ;; Now deal with the actual mapping.  This will correctly assign casing for
-    ;; title-case characters.
-    (funcall map-unicode-property 'uppercase
-             (lambda (lc uc) (aset up lc uc) (aset up uc uc)))
-    (funcall map-unicode-property 'lowercase
-             (lambda (uc lc) (aset down uc lc) (aset down lc lc))))
+  (let ((gc (unicode-property-table-internal 'general-category))
+        (syn-table (standard-syntax-table)))
+    ;; In early bootstrapping Unicode tables are not available so we need to
+    ;; skip this step in those cases.
+    (when gc
+      ;; Set all Letter, uppercase; Letter, lowercase and Letter,
+      ;; titlecase syntax to word.
+      (map-char-table
+       (lambda (ch cat)
+         (when (memq cat '(Lu Ll Lt))
+           (modify-syntax-entry ch "w   " syn-table)))
+       gc)
+      ;; Ⅰ through Ⅻ had word syntax in the past so set it here as well.
+      ;; The general category of those characters is Number, Letter.
+      (modify-syntax-entry '(#x2160 . #x216b) "w   " syn-table)
+
+      ;; ⓐ through ⓩ are symbols, other according to Unicode but Emacs set
+      ;; their syntax to word in the past so keep backwards compatibility.
+      (modify-syntax-entry '(#x24D0 . #x24E9) "w   " syn-table)
+
+      ;; Set downcase and upcase from Unicode properties
+
+      ;; In some languages, such as Turkish, U+0049 LATIN CAPITAL LETTER I and
+      ;; U+0131 LATIN SMALL LETTER DOTLESS I make a case pair, and so do U+0130
+      ;; LATIN CAPITAL LETTER I WITH DOT ABOVE and U+0069 LATIN SMALL LETTER I.
+
+      ;; We used to set up half of those correspondence unconditionally, but
+      ;; that makes searches slow.  So now we don't set up either half of these
+      ;; correspondences by default.
+
+      ;; (set-downcase-syntax  ?İ ?i tbl)
+      ;; (set-upcase-syntax    ?I ?ı tbl)
+
+      (let ((map-unicode-property
+             (lambda (property func)
+               (map-char-table
+                (lambda (ch cased)
+                  ;; ASCII characters skipped due to reasons outlined above.  As
+                  ;; of Unicode 9.0, this exception affects the following:
+                  ;;   lc(U+0130 İ) = i
+                  ;;   uc(U+0131 ı) = I
+                  ;;   uc(U+017F ſ) = S
+                  ;;   uc(U+212A K) = k
+                  (when (> cased 127)
+                    (let ((end (if (consp ch) (cdr ch) ch)))
+                      (setq ch (max 128 (if (consp ch) (car ch) ch)))
+                      (while (<= ch end)
+                        (funcall func ch cased)
+                        (setq ch (1+ ch))))))
+                (unicode-property-table-internal property))))
+            (down tbl)
+            (up (case-table-get-table tbl 'up)))
+
+        ;; This works on an assumption that if toUpper(x) != x then toLower(x)
+        ;; == x (and the opposite for toLower/toUpper).  This doesn’t hold for
+        ;; title case characters but those incorrect mappings will be
+        ;; overwritten later.
+        (funcall map-unicode-property 'uppercase
+                 (lambda (lc uc) (aset down lc lc) (aset up uc uc)))
+        (funcall map-unicode-property 'lowercase
+                 (lambda (uc lc) (aset down lc lc) (aset up uc uc)))
+
+        ;; Now deal with the actual mapping.  This will correctly assign casing
+        ;; for title-case characters.
+        (funcall map-unicode-property 'uppercase
+                 (lambda (lc uc) (aset up lc uc) (aset up uc uc)))
+        (funcall map-unicode-property 'lowercase
+                 (lambda (uc lc) (aset down uc lc) (aset down lc lc))))))
 
   ;; Clear out the extra slots so that they will be recomputed from the main
   ;; (downcase) table and upcase table.  Since we’re side-stepping the usual
-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»




* Re: master updated (e6041a0 -> 6220fae)
  2017-02-17 15:43   ` Michal Nazarewicz
@ 2017-02-17 16:33     ` Glenn Morris
  2017-02-17 18:35       ` Michal Nazarewicz
  0 siblings, 1 reply; 7+ messages in thread
From: Glenn Morris @ 2017-02-17 16:33 UTC (permalink / raw)
  To: Michal Nazarewicz; +Cc: emacs-devel


I didn't read the patch at all, but sounds like the right approach.
See 20372d0c891 for how a similar case was solved.
I suggest you commit sooner rather than waiting, since not being able to
bootstrap is a bit of a pain.




* Re: master updated (e6041a0 -> 6220fae)
  2017-02-17 16:33     ` Glenn Morris
@ 2017-02-17 18:35       ` Michal Nazarewicz
  2017-02-17 22:27         ` Glenn Morris
  0 siblings, 1 reply; 7+ messages in thread
From: Michal Nazarewicz @ 2017-02-17 18:35 UTC (permalink / raw)
  To: Glenn Morris; +Cc: emacs-devel

On Fri, Feb 17 2017, Glenn Morris wrote:
> I didn't read the patch at all, but sounds like the right approach.
> See 20372d0c891 for how a similar case was solved.
> I suggest you commit sooner rather than waiting, since not being able to
> bootstrap is a bit of a pain.

Pushed.  Sorry it took so long.  I got tangled up with stuff at work.

-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»




* Re: master updated (e6041a0 -> 6220fae)
  2017-02-17 18:35       ` Michal Nazarewicz
@ 2017-02-17 22:27         ` Glenn Morris
  0 siblings, 0 replies; 7+ messages in thread
From: Glenn Morris @ 2017-02-17 22:27 UTC (permalink / raw)
  To: Michal Nazarewicz; +Cc: emacs-devel


I thought you fixed it quickly! :)
Thanks.




Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
