>>     ;; '(#x10A00 . #x10A5F)
>
> This last line should be removed, no?

Indeed, it is left over from an earlier experiment.

>>     '(#x10A3F . #x10A3F)
>>     (list
>>      (vector
>>       (concat consonant
>>               "\\(?:" virama consonant "\\)*"
>>               modifier "*"
>>               virama "?"
>>               vowel "*"
>>               modifier "*")
>>       1 'font-shape-gstring))))
>
> Note that according to the rule above, a sequence
>
>     consonant modifier vowel
>
> will not be composed, although it matches the regexp, because its
> second character is not a virama.  Is this okay?

Because of the '(#x10A3F . #x10A3F) bit?  Yes, that may be a problem.
With Kharosthi Unicode, out of the following three examples, the middle
one (consonant + modifier + vowel) has its vowel attached incorrectly:

    𐨗𐨸𐨁 𐨣𐨸𐨁 𐨐𐨿𐨮𐨸𐨁

Cf. 𐨣𐨸𐨁 with modifier and 𐨣𐨁 without modifier: