From: Mark H Weaver
Newsgroups: gmane.lisp.guile.user
Subject: Re: Unicode numeric value
Date: Mon, 17 Dec 2018 13:42:17 -0500
Message-ID: <87bm5kxae3.fsf@netris.org>
References: <87pnu199cm.fsf@netris.org>
To: Freeman Gilmore
Cc: guile-user@gnu.org
In-Reply-To:
  (Freeman Gilmore's message of "Sun, 16 Dec 2018 06:24:08 -0500")

Hi,

Freeman Gilmore writes:

> On Sun, Dec 16, 2018 at 3:15 AM Mark H Weaver wrote:
>
> Freeman Gilmore writes:
>
> > I am looking for a procedure that will read the numeric value, field 8, of
> > a Unicode numeric character. Has anyone written this procedure or know
> > where I can find it?
>
> The 'r7rs-wip' branch of the Guile git repository contains a procedure
> that does this, with a lookup table derived from Unicode 6.3.0.
>
> https://git.savannah.gnu.org/cgit/guile.git/tree/module/scheme/char.scm?h=r7rs-wip
>
> The file is written as an R7RS library form, which won't work on current
> releases of Guile, but for now you could simply extract the
> 'digit-value' procedure from it, provided that you preserve the
> copyright notice.
>
> Mark
>
> Thank you Mark:
>
> That is only half the battle, let me explain. I do not want to read
> the standard Unicode table. I want to directly read field 8 of a
> numeric character in the private use area of the Unicode.
>
> This is not part of scheme. The other half, I need to figure out how
> to put the numeric values in field 8 for the characters I want to use.

If the mapping from code points to numeric values is static, then you
could simply modify the lookup table in the code I suggested above.

If the mapping is dynamic, then you'll need a different strategy.
One simple approach would be to use a hash table mapping from characters
to digit values:

  (define digit-value-table (make-hash-table))

  (define (set-digit-value! char value)
    (hashv-set! digit-value-table char value))

  ;; Returns #f for characters with no assigned value.
  (define (digit-value char)
    (hashv-ref digit-value-table char #f))

If the range of relevant code points is small enough, another approach
would be to use a vector:

  (define private-code-point-start #xE000)
  (define private-code-point-end   #xF900)

  ;; The end bound is exclusive, so that every code point in range
  ;; maps to a valid index of the vector defined below.
  (define (code-point-in-range? cp)
    (and (<= private-code-point-start cp)
         (< cp private-code-point-end)))

  (define digit-value-table
    (make-vector (- private-code-point-end
                    private-code-point-start)
                 #f))

  (define (set-digit-value! char value)
    (let ((cp (char->integer char)))
      (unless (code-point-in-range? cp)
        (error "set-digit-value!: code point out of range:" cp))
      (vector-set! digit-value-table
                   (- cp private-code-point-start)
                   value)))

  (define (digit-value char)
    (let ((cp (char->integer char)))
      (and (code-point-in-range? cp)
           (vector-ref digit-value-table
                       (- cp private-code-point-start)))))

For a more compact representation, you could use a SRFI-4 homogeneous
numeric vector instead, although you'd need to designate a special
numeric value to represent "not a digit".

     Regards,
       Mark
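For reference, the SRFI-4 variant mentioned above might look roughly like
the sketch below. It is not part of the original message. It assumes
Guile's (srfi srfi-4) module, a signed 16-bit vector, the same code point
bounds as the vector example (with an exclusive end), and a hypothetical
sentinel name 'no-digit-value' set to -1 to mean "not a digit".

```scheme
(use-modules (srfi srfi-4))

;; Assumed bounds: U+E000 up to, but not including, U+F900.
(define private-code-point-start #xE000)
(define private-code-point-end   #xF900)

;; Hypothetical sentinel meaning "not a digit"; stored values must
;; therefore be non-negative integers that fit in 16 bits.
(define no-digit-value -1)

(define (code-point-in-range? cp)
  (and (<= private-code-point-start cp)
       (< cp private-code-point-end)))

;; One signed 16-bit slot per code point, initially all "not a digit".
(define digit-value-table
  (make-s16vector (- private-code-point-end
                     private-code-point-start)
                  no-digit-value))

(define (set-digit-value! char value)
  (let ((cp (char->integer char)))
    (unless (code-point-in-range? cp)
      (error "set-digit-value!: code point out of range:" cp))
    (s16vector-set! digit-value-table
                    (- cp private-code-point-start)
                    value)))

;; Returns the stored value, or #f if none was set or the
;; character is outside the table's range.
(define (digit-value char)
  (let ((cp (char->integer char)))
    (and (code-point-in-range? cp)
         (let ((v (s16vector-ref digit-value-table
                                 (- cp private-code-point-start))))
           (and (not (= v no-digit-value)) v)))))
```

Note that this representation only holds small integers, so fractional
numeric values (which field 8 of the Unicode table can contain) would
need the hash table or ordinary vector approach instead.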