unofficial mirror of emacs-devel@gnu.org 
* describe-char and unicode data
@ 2003-05-09 18:31 James H. Cloos Jr.
  2003-05-10 10:06 ` Eli Zaretskii
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: James H. Cloos Jr. @ 2003-05-09 18:31 UTC (permalink / raw)


Describe-char shows the unicode hex value of the character in question
if it exists (some chars do not translate to unicode).

Would a patch that expands that to also show the relevant data from
UnicodeData.txt be accepted?

Step one would be code to convert UnicodeData.txt to a suitable elisp
structure, generating a unicodedata.el file.  Given that, the
additional logic in describe-char is trivial.

To give an idea of the amount of data available, UnicodeData.txt is a
semicolon-separated text db with 15 fields per record, and currently
has 15100 records, so loading this may be an issue.  The related
Unihan.txt has up to 78 possible entries for each of 71098 characters.
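
As a rough illustration of that record layout, one line can be split
into its fields as below.  This is only a sketch: `my-ucd-parse-record'
is a hypothetical helper, not an existing Emacs function, and the field
values in the sample record beyond the code point and the name are
illustrative.

```elisp
;; Sketch: split one UnicodeData.txt record into its fields.
;; `my-ucd-parse-record' is a hypothetical name used for illustration.
(defun my-ucd-parse-record (line)
  "Split LINE, a semicolon-separated UnicodeData.txt record, into a list.
Empty fields are kept as empty strings so that positions stay stable."
  (split-string line ";"))

;; Field 0 is the code point in hex, field 1 is the character name.
(let ((fields (my-ucd-parse-record
               "02BB;MODIFIER LETTER TURNED COMMA;Lm;0;L;;;;;N;;;;;")))
  (list (length fields) (nth 0 fields) (nth 1 fields)))
;; => (15 "02BB" "MODIFIER LETTER TURNED COMMA")
```

Keeping the empty fields matters here: dropping them would shift the
position of every later field in the record.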

The name entry from UnicodeData.txt and probably the kDefinition
entries from Unihan.txt would be the useful additions for
describe-char.  The rest of the data may however be useful elsewhere.

What is therefore the best structure to use for this data?

-JimC


* Re: describe-char and unicode data
  2003-05-09 18:31 describe-char and unicode data James H. Cloos Jr.
@ 2003-05-10 10:06 ` Eli Zaretskii
  2003-05-10 16:23   ` James H. Cloos Jr.
  2003-05-10 16:23 ` Florian Weimer
  2003-05-11 12:55 ` Richard Stallman
  2 siblings, 1 reply; 22+ messages in thread
From: Eli Zaretskii @ 2003-05-10 10:06 UTC (permalink / raw)
  Cc: emacs-devel

> From: "James H. Cloos Jr." <cloos@jhcloos.com>
> Date: 09 May 2003 14:31:52 -0400
> 
> Would a patch that expands that to also show the relevant data from
> UnicodeData.txt be accepted?

What data do you have in mind, specifically?  Can you show a
(fictitious) example of such output, so we could discuss that first?


* Re: describe-char and unicode data
  2003-05-09 18:31 describe-char and unicode data James H. Cloos Jr.
  2003-05-10 10:06 ` Eli Zaretskii
@ 2003-05-10 16:23 ` Florian Weimer
  2003-05-10 16:39   ` James H. Cloos Jr.
  2003-05-10 18:52   ` Simon Josefsson
  2003-05-11 12:55 ` Richard Stallman
  2 siblings, 2 replies; 22+ messages in thread
From: Florian Weimer @ 2003-05-10 16:23 UTC (permalink / raw)
  Cc: emacs-devel

"James H. Cloos Jr." <cloos@jhcloos.com> writes:

> The related Unihan.txt has up to 78 possible entries for each of
> 71098 characters.

Unihan.txt is strongly non-free, so it can't be distributed anyway.
Data derived from it falls under the same restrictive license, I fear.


* Re: describe-char and unicode data
  2003-05-10 10:06 ` Eli Zaretskii
@ 2003-05-10 16:23   ` James H. Cloos Jr.
  0 siblings, 0 replies; 22+ messages in thread
From: James H. Cloos Jr. @ 2003-05-10 16:23 UTC (permalink / raw)
  Cc: emacs-devel

>>>>> "Eli" == Eli Zaretskii <eliz@elta.co.il> writes:

>> Would a patch that expands that to also show the relevant data from
>> UnicodeData.txt be accepted?

Eli> What data do you have in mind, specifically?  Can you show a
Eli> (fictitious) example of such output, so we could discuss that
Eli> first?

A current example output is:
==================================================================
  character: ʻ (01211133, 332379, 0x5125b)
    charset: mule-unicode-0100-24ff (Unicode characters of the range U+0100..U+24FF.)
 code point: 36 91
     syntax: w 	which means: word
   category:
buffer code: 0x9C 0xF4 0xA4 0xDB
  file code: 0xCA 0xBB (encoded by coding system mule-utf-8)
    Unicode: 02BB
       font: -Misc-Fixed-Medium-R-Normal--18-120-100-100-C-90-ISO10646-1

There are text properties here:
  lazy-lock            t
==================================================================

I propose expanding the Unicode: line to look like (same character):

    Unicode: 02BB  MODIFIER LETTER TURNED COMMA

The alternate names, if any, should also appear on that line.
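
A minimal sketch of how such a line might be built, assuming the code
point and names have already been looked up.  The function name
`my-format-unicode-line' and the choice to show an alternate name in
parentheses are assumptions made for illustration, not part of any
existing describe-char code.

```elisp
;; Sketch: build the proposed "Unicode:" line.
;; `my-format-unicode-line' is a hypothetical name; showing the
;; alternate name in parentheses is an assumed presentation.
(defun my-format-unicode-line (code name &optional alt-name)
  "Return a \"Unicode:\" line for integer CODE with character NAME.
ALT-NAME, if non-nil and non-empty, is appended in parentheses."
  (concat (format "    Unicode: %04X  %s" code name)
          (if (and alt-name (not (string= alt-name "")))
              (format " (%s)" alt-name)
            "")))

(my-format-unicode-line #x02BB "MODIFIER LETTER TURNED COMMA")
;; => "    Unicode: 02BB  MODIFIER LETTER TURNED COMMA"
```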

The rest of the data in UnicodeData.txt and its related files are
useful to have, especially things like bidi info, combining class,
normalization data, etc.  As the files do get updated, a script or
some elisp to convert each of the text files into elisp seems like
the way to go.  But if all of the data should be incorporated, I'm
not sure what data structure would be best.

-JimC


* Re: describe-char and unicode data
  2003-05-10 16:23 ` Florian Weimer
@ 2003-05-10 16:39   ` James H. Cloos Jr.
  2003-05-11 12:56     ` Richard Stallman
  2003-05-10 18:52   ` Simon Josefsson
  1 sibling, 1 reply; 22+ messages in thread
From: James H. Cloos Jr. @ 2003-05-10 16:39 UTC (permalink / raw)
  Cc: emacs-devel

>>>>> "Florian" == Florian Weimer <fw@deneb.enyo.de> writes:

Florian> Unihan.txt is strongly non-free, so it can't be distributed
Florian> anyway.  Data derived from it falls under the same
Florian> restrictive license, I fear.

Ack.  I didn't notice that before. :(

On the plus side, the license says:

Unihan.txt> Recipient is granted the right ... to freely use
Unihan.txt> the information supplied in the creation
Unihan.txt> of products supporting Unicode.

The license on the other files is in UCD.html, and does allow
redistribution so long as the copyright notice is included.

cf:  http://www.unicode.org/Public/UNIDATA/UCD.html

-JimC


* Re: describe-char and unicode data
  2003-05-10 16:23 ` Florian Weimer
  2003-05-10 16:39   ` James H. Cloos Jr.
@ 2003-05-10 18:52   ` Simon Josefsson
  2003-05-11 13:05     ` Florian Weimer
  2003-05-12  7:38     ` Richard Stallman
  1 sibling, 2 replies; 22+ messages in thread
From: Simon Josefsson @ 2003-05-10 18:52 UTC (permalink / raw)
  Cc: emacs-devel

Florian Weimer <fw@deneb.enyo.de> writes:

> "James H. Cloos Jr." <cloos@jhcloos.com> writes:
>
>> The related Unihan.txt has up to 78 possible entries for each of
>> 71098 characters.
>
> Unihan.txt is strongly non-free, so it can't be distributed anyway.
> Data derived from it falls under the same restrictive license, I fear.

Note the last sentence below.

UCD Terms of Use
Disclaimer

    The Unicode Character Database is provided as is by Unicode, Inc. No claims are made as to fitness for any particular purpose. No warranties of any kind are expressed or implied. The recipient agrees to determine applicability of information provided. If this file has been purchased on magnetic or optical media from Unicode, Inc., the sole remedy for any claim will be exchange of defective media within 90 days of receipt.

    This disclaimer is applicable for all other data files accompanying the Unicode Character Database, some of which have been compiled by the Unicode Consortium, and some of which have been supplied by other sources.

Limitations on Rights to Redistribute This Data

    Recipient is granted the right to make copies in any form for internal distribution and to freely use the information supplied in the creation of products supporting the UnicodeTM Standard. The files in the Unicode Character Database can be redistributed to third parties or other organizations (whether for profit or not) as long as this notice and the disclaimer notice are retained. Information can be extracted from these files and used in documentation or programs, as long as there is an accompanying notice indicating the source.

    The file Unihan.txt contains older and inconsistent Terms of Use. That language is overridden by these terms.


* Re: describe-char and unicode data
  2003-05-09 18:31 describe-char and unicode data James H. Cloos Jr.
  2003-05-10 10:06 ` Eli Zaretskii
  2003-05-10 16:23 ` Florian Weimer
@ 2003-05-11 12:55 ` Richard Stallman
  2003-05-11 17:24   ` Stephen J. Turnbull
                     ` (2 more replies)
  2 siblings, 3 replies; 22+ messages in thread
From: Richard Stallman @ 2003-05-11 12:55 UTC (permalink / raw)
  Cc: emacs-devel

    Would a patch that expands that to also show the relevant data from
    UnicodeData.txt be accepted?

Sure, unless Handa sees some problem with it--provided that
UnicodeData.txt has a license that lets us use it.

    Step one would be code to convert UnicodeData.txt to a suitable elisp
    structure, generating a unicodedata.el file.  Given that, the
    additional logic in describe-char is trivial.

That method would use a lot of space in the Lisp world.
It might be better to load UnicodeData.txt into a buffer
and search it, then kill the buffer.
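
A sketch of that approach, under stated assumptions: FILE is a local
copy of UnicodeData.txt, and `my-ucd-lookup-name' is a hypothetical
name, not a function that exists in Emacs.  `with-temp-buffer' creates
the buffer and kills it again when the body finishes, so nothing stays
in the Lisp world after the search.

```elisp
;; Sketch: load the data file into a temporary buffer, search it,
;; and let `with-temp-buffer' kill the buffer afterwards.
;; `my-ucd-lookup-name' is a hypothetical name used for illustration.
(defun my-ucd-lookup-name (code file)
  "Return the name field for integer CODE found in FILE, or nil.
FILE is assumed to be in UnicodeData.txt format.  Characters that the
file only covers by First/Last range entries are not handled."
  (with-temp-buffer
    (insert-file-contents file)
    (let ((case-fold-search nil))       ; hex digits in the file are upper case
      (when (re-search-forward
             (format "^%04X;\\([^;\n]*\\);" code) nil t)
        (match-string 1)))))
```

It could be called as `(my-ucd-lookup-name #x02BB "/path/to/UnicodeData.txt")',
where the path is a placeholder for wherever the file ends up.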


* Re: describe-char and unicode data
  2003-05-10 16:39   ` James H. Cloos Jr.
@ 2003-05-11 12:56     ` Richard Stallman
  2003-05-11 13:04       ` Florian Weimer
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Stallman @ 2003-05-11 12:56 UTC (permalink / raw)
  Cc: fw

    On the plus side, the license says:

    Unihan.txt> Recipient is granted the right ... to freely use
    Unihan.txt> the information supplied in the creation
    Unihan.txt> of products supporting Unicode.

It is not clear that that gives us permission to transform
this data into anything that would be under a free license,
but we could ask those who released it whether they meant it
to allow that.  Would someone like to contact them and ask?


* Re: describe-char and unicode data
  2003-05-11 12:56     ` Richard Stallman
@ 2003-05-11 13:04       ` Florian Weimer
  0 siblings, 0 replies; 22+ messages in thread
From: Florian Weimer @ 2003-05-11 13:04 UTC (permalink / raw)
  Cc: emacs-devel

Richard Stallman <rms@gnu.org> writes:

>     On the plus side, the license says:
>
>     Unihan.txt> Recipient is granted the right ... to freely use
>     Unihan.txt> the information supplied in the creation
>     Unihan.txt> of products supporting Unicode.
>
> It is not clear that that gives us permission to transform
> this data into anything that would be under a free license,
> but we could ask those who released it whether they meant it
> to allow that.  Would someone like to contact them and ask?

I've already done this a couple of months ago, but never received any
reply.

http://news.gmane.org/onethread.php?group=gmane.text.unicode.general&root=%3C87heeplucn.fsf%40deneb.enyo.de%3E

Maybe some official entity should try an official channel. 8-/


* Re: describe-char and unicode data
  2003-05-10 18:52   ` Simon Josefsson
@ 2003-05-11 13:05     ` Florian Weimer
  2003-05-11 14:34       ` Simon Josefsson
  2003-05-12  7:38     ` Richard Stallman
  1 sibling, 1 reply; 22+ messages in thread
From: Florian Weimer @ 2003-05-11 13:05 UTC (permalink / raw)


Simon Josefsson <jas@extundo.com> writes:

>> Unihan.txt is strongly non-free, so it can't be distributed anyway.
>> Data derived from it falls under the same restrictive license, I fear.
>
> Note the last sentence below.

Still doesn't allow for modified distribution, does it?


* Re: describe-char and unicode data
  2003-05-11 13:05     ` Florian Weimer
@ 2003-05-11 14:34       ` Simon Josefsson
  0 siblings, 0 replies; 22+ messages in thread
From: Simon Josefsson @ 2003-05-11 14:34 UTC (permalink / raw)
  Cc: emacs-devel

Florian Weimer <fw@deneb.enyo.de> writes:

> Simon Josefsson <jas@extundo.com> writes:
>
>>> Unihan.txt is strongly non-free, so it can't be distributed anyway.
>>> Data derived from it falls under the same restrictive license, I fear.
>>
>> Note the last sentence below.
>
> Still doesn't allow for modified distribution, does it?

I don't see why it is restricted

,----
| Limitations on Rights to Redistribute This Data
| 
|     Recipient is granted the right to make copies in any form for
|     internal distribution and to freely use the information supplied
                                   -----------------------------------
|     in the creation of products supporting the UnicodeTM Standard. The
      --------------------------------------------------------------
|     files in the Unicode Character Database can be redistributed to
|     third parties or other organizations (whether for profit or not)
|     as long as this notice and the disclaimer notice are
|     retained. Information can be extracted from these files and used
                ------------------------------------------------------
|     in documentation or programs, as long as there is an accompanying
      ----------------------------
|     notice indicating the source.
`----

as long as we include a notice, but I'm not a lawyer.

PS. I have forwarded Richard's question to Rick McGowan of Unicode
Inc.; I discussed this issue with him about a month ago.


* Re: describe-char and unicode data
  2003-05-11 12:55 ` Richard Stallman
@ 2003-05-11 17:24   ` Stephen J. Turnbull
  2003-05-12 11:22   ` Kenichi Handa
  2003-05-21 21:52   ` James H. Cloos Jr.
  2 siblings, 0 replies; 22+ messages in thread
From: Stephen J. Turnbull @ 2003-05-11 17:24 UTC (permalink / raw)
  Cc: emacs-devel

>>>>> "rms" == Richard Stallman <rms@gnu.org> writes:

    >>     Step one would be code to convert UnicodeData.txt to a
    >> suitable elisp structure, generating a unicodedata.el file.
    >> Given that, the additional logic in describe-char is trivial.

    rms> That method would use a lot of space in the Lisp world.  It
    rms> might be better to load UnicodeData.txt into a buffer and
    rms> search it, then kill the buffer.

In XEmacs/UTF-2000, the Lisp databases take up 20-25MB.  (They contain
a lot more than just the Unicode and Mule databases, and XEmacs is
substantially less space efficient than Emacs, but still...)  You do
not want this in Lisp.


-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.


* Re: describe-char and unicode data
  2003-05-10 18:52   ` Simon Josefsson
  2003-05-11 13:05     ` Florian Weimer
@ 2003-05-12  7:38     ` Richard Stallman
  2003-05-12 11:24       ` Simon Josefsson
  2003-05-13  6:07       ` Simon Josefsson
  1 sibling, 2 replies; 22+ messages in thread
From: Richard Stallman @ 2003-05-12  7:38 UTC (permalink / raw)
  Cc: fw

	Recipient is granted the right to make copies in any form for
	internal distribution and to freely use the information
	supplied in the creation of products supporting the UnicodeTM
	Standard. The files in the Unicode Character Database can be
	redistributed to third parties or other organizations (whether
	for profit or not) as long as this notice and the disclaimer
	notice are retained. Information can be extracted from these
	files and used in documentation or programs, as long as there
	is an accompanying notice indicating the source.

Perhaps that last sentence gives us permission to release a free work
containing the full information, but we had better check that with a
lawyer first.

	The file Unihan.txt contains older and inconsistent Terms of
	Use. That language is overridden by these terms.

If these terms are free, then we can use Unihan also.
But we don't yet know if we can use any of it.


* Re: describe-char and unicode data
  2003-05-11 12:55 ` Richard Stallman
  2003-05-11 17:24   ` Stephen J. Turnbull
@ 2003-05-12 11:22   ` Kenichi Handa
  2003-05-14 13:49     ` Richard Stallman
  2003-05-21 21:52   ` James H. Cloos Jr.
  2 siblings, 1 reply; 22+ messages in thread
From: Kenichi Handa @ 2003-05-12 11:22 UTC (permalink / raw)
  Cc: emacs-devel

In article <E19EqMX-0002WZ-00@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
>     Would a patch that expands that to also show the relevant data from
>     UnicodeData.txt be accepted?

> Sure, unless Handa sees some problem with it--provided that
> UnicodeData.txt has a license that lets us use it.

I see no problem with such a patch as long as there's no
license problem.

>     Step one would be code to convert UnicodeData.txt to a suitable elisp
>     structure, generating a unicodedata.el file.  Given that, the
>     additional logic in describe-char is trivial.

> That method would use a lot of space in the Lisp world.
> It might be better to load UnicodeData.txt into a buffer
> and search it, then kill the buffer.

As UnicodeData.txt is less than 1M-byte, the above method
will be ok, but Unihan.txt is about 26M-byte which, I think,
is too big even for just including in the Emacs
distribution.

---
Ken'ichi HANDA
handa@m17n.org


* Re: describe-char and unicode data
  2003-05-12  7:38     ` Richard Stallman
@ 2003-05-12 11:24       ` Simon Josefsson
  2003-05-13  6:07       ` Simon Josefsson
  1 sibling, 0 replies; 22+ messages in thread
From: Simon Josefsson @ 2003-05-12 11:24 UTC (permalink / raw)
  Cc: fw

Richard Stallman <rms@gnu.org> writes:

> 	Recipient is granted the right to make copies in any form for
> 	internal distribution and to freely use the information
> 	supplied in the creation of products supporting the UnicodeTM
> 	Standard. The files in the Unicode Character Database can be
> 	redistributed to third parties or other organizations (whether
> 	for profit or not) as long as this notice and the disclaimer
> 	notice are retained. Information can be extracted from these
> 	files and used in documentation or programs, as long as there
> 	is an accompanying notice indicating the source.
>
> Perhaps that last sentence gives us permission to release a free work
> containing the full information, but we had better check that with a
> lawyer first.

Is someone working on that?

Hopefully Unicode Inc. will clarify their license.


* Re: describe-char and unicode data
  2003-05-12  7:38     ` Richard Stallman
  2003-05-12 11:24       ` Simon Josefsson
@ 2003-05-13  6:07       ` Simon Josefsson
  2003-05-15  4:54         ` Richard Stallman
  1 sibling, 1 reply; 22+ messages in thread
From: Simon Josefsson @ 2003-05-13  6:07 UTC (permalink / raw)
  Cc: fw

Richard Stallman <rms@gnu.org> writes:

> 	Recipient is granted the right to make copies in any form for
> 	internal distribution and to freely use the information
> 	supplied in the creation of products supporting the UnicodeTM
> 	Standard. The files in the Unicode Character Database can be
> 	redistributed to third parties or other organizations (whether
> 	for profit or not) as long as this notice and the disclaimer
> 	notice are retained. Information can be extracted from these
> 	files and used in documentation or programs, as long as there
> 	is an accompanying notice indicating the source.
>
> Perhaps that last sentence gives us permission to release a free work
> containing the full information, but we had better check that with a
> lawyer first.

Below is the response from Unicode.  Is this sufficient?

Rick also made the following comment:

     Note that if you use any Unihan data, you should look at the UCD
     documentation and pay particular attention to which fields are
     normative and which are not; and to what is "provisional". A lot
     of the data in Unihan.txt is not normative, it is spotty and
     provisional and subject to change and improvement without notice.

So I think anyone working on this should separate the display into a
normative part and a "provisional" part, so the user isn't led to
believe that some data are normative when they are really unchecked.


[-- Attachment #2: Type: message/rfc822, Size: 3886 bytes --]

From: Rick McGowan <rick@unicode.org>
To: jas@extundo.com
Cc: ken@unicode.org
Subject: Re: draft-rmcgowan-unicode-procs-02.txt
Date: Mon, 12 May 2003 17:05:02 -0700
Message-ID: <200305130005.h4D055r04467@unicode.org>

Hello Simon --

You asked:

> Has there been any progress?  I notice that the UCD-4.0.0.html says:
> |     Recipient is granted the right to make copies in any form ...
>

Yes there has been progress. You can take this as an official response.

We intend for the Unihan database to have the same rights and restrictions
as all of the UCD. Therefore we inserted the revised clause into
UCD-4.0.0.html, intending that it over-ride what is in the Unihan file.
The Unihan database has not yet been updated to a 4.0 version, so 3.2 is
the current one. But the 4.0 UCD clause over-rides the older terms in the  
Unihan 3.2 file.

When the Unihan database is (soon) updated to a 4.0 or later version, the
clause will be changed in the Unihan database itself to align with the new
intent, and to match the 4.0 UCD.

> In a discussion about adding support for this in the text editor
> application Emacs, Richard Stallman raised the following issue:
>
> ,----
> |     Unihan.txt> Recipient is granted the right ... to freely use
> |     Unihan.txt> the information supplied in the creation
> |     Unihan.txt> of products supporting Unicode.
> |
> | It is not clear that that gives us permission to transform
> | this data into anything that would be under a free license,
> | but we could ask those who released it whether they meant it
> | to allow that.  Would someone like to contact them and ask?
> |
> `----

Yes, we mean that. If people couldn't take our data and transform it by
compression, compilation, extraction, etc, then it wouldn't be very useful.
We definitely intend people to use it. What we don't really want is for
people to take our data verbatim and re-distribute it verbatim, although
such use is definitely allowed explicitly. We would prefer that, if
people need to distribute our data verbatim, they do so by referring to
the "latest version" on the web, and point to our web site. That way
customers of the products can know how and where to get the latest
versions.

But to make a product that uses Unicode, almost everyone needs to take the
character information and properties and somehow distill that information
into a form that is suitable for use by a program during the course of
execution. When you do so, it is nice to also allow a means for end-users
to get an upgrade of the data by supplying some distillation mechanism, or
explaining your data format (if you use one) so that users can do manual
upgrades if needed.

A good example of using the Unicode data files is provided by this program:

	http://www.agfamonotype.com/software/charinfo.asp

That program comes with a compressed database, and has an option for live
update to the most recent Unicode data files by including a parser within
it. Any user can download the Unicode data files from our website, and ask
the program to upgrade itself from those files.

Please let me know if you have any further questions.

All the best,

	Rick


* Re: describe-char and unicode data
  2003-05-12 11:22   ` Kenichi Handa
@ 2003-05-14 13:49     ` Richard Stallman
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Stallman @ 2003-05-14 13:49 UTC (permalink / raw)
  Cc: emacs-devel

    As UnicodeData.txt is less than 1M-byte, the above method
    will be ok, but Unihan.txt is about 26M-byte which, I think,
    is too big even for just including in the Emacs
    distribution.

I agree.


* Re: describe-char and unicode data
  2003-05-13  6:07       ` Simon Josefsson
@ 2003-05-15  4:54         ` Richard Stallman
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Stallman @ 2003-05-15  4:54 UTC (permalink / raw)
  Cc: fw

That is entirely convincing.  So we could make a transformed
version of some of the data in these Unicode files
and release that under a free license.


* Re: describe-char and unicode data
  2003-05-11 12:55 ` Richard Stallman
  2003-05-11 17:24   ` Stephen J. Turnbull
  2003-05-12 11:22   ` Kenichi Handa
@ 2003-05-21 21:52   ` James H. Cloos Jr.
  2003-05-22 15:29     ` Kevin Rodgers
  2003-05-23 12:04     ` Richard Stallman
  2 siblings, 2 replies; 22+ messages in thread
From: James H. Cloos Jr. @ 2003-05-21 21:52 UTC (permalink / raw)
  Cc: emacs-devel

>>>>> "Richard" == Richard Stallman <rms@gnu.org> writes:

Richard> It might be better to load UnicodeData.txt into a buffer and
Richard> search it, then kill the buffer.

As it looks like the licensing issues with the data files are worked
out, I'll start on this now.

The load-search-kill paradigm is not one I've done in elisp; is there
any defun I should look at for a good example of how to do it right?

Also, would UnicodeData.txt (and UCD.html for the license text?) go
into emacs's etc dir?

-JimC


* Re: describe-char and unicode data
  2003-05-21 21:52   ` James H. Cloos Jr.
@ 2003-05-22 15:29     ` Kevin Rodgers
  2003-05-22 19:25       ` James H. Cloos Jr.
  2003-05-23 12:04     ` Richard Stallman
  1 sibling, 1 reply; 22+ messages in thread
From: Kevin Rodgers @ 2003-05-22 15:29 UTC (permalink / raw)


James H. Cloos Jr. wrote:

> The load-search-kill paradigm is not one I've done in elisp; is there
> any defun I should look at for a good example of how to do it right?


Here's an excerpt from woman.el:

		  ;; Parse the file -- if no MANPATH data ignore it:
		  (with-temp-buffer
		    (insert-file-contents file)
		    (while (re-search-forward
			    "^[ \t]*\\(MANDATORY_\\)?MANPATH[ \t]+\\(\\S-+\\)"
			    nil t)
		      (setq manpath (cons (match-string 2) manpath)))
		    manpath)


> Also, would UnicodeData.txt (and UCD.html for the license text?) go
> into emacs's etc dir?

If so, that directory name is available in Emacs Lisp via the
data-directory variable.
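
For instance, a path under that directory could be built as sketched
below.  Whether UnicodeData.txt would actually be installed there was
still an open question at this point, and `my-ucd-data-file' is a
hypothetical name.

```elisp
;; Sketch: build the full path of a data file under Emacs's etc dir.
;; `data-directory' is the standard variable; `my-ucd-data-file' is a
;; hypothetical helper, and the file's presence there is an assumption.
(defun my-ucd-data-file ()
  "Return the path UnicodeData.txt would have under `data-directory'."
  (expand-file-name "UnicodeData.txt" data-directory))
```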

-- 
Kevin Rodgers <kevin.rodgers@ihs.com>


* Re: describe-char and unicode data
  2003-05-22 15:29     ` Kevin Rodgers
@ 2003-05-22 19:25       ` James H. Cloos Jr.
  0 siblings, 0 replies; 22+ messages in thread
From: James H. Cloos Jr. @ 2003-05-22 19:25 UTC (permalink / raw)
  Cc: emacs-devel

Thanks for the tip, but Dave Love beat me to it, adapting code from
the unicode branch according to his log message.  Cf:

http://savannah.gnu.org/cgi-bin/viewcvs/emacs/emacs/lisp/descr-text.el.diff?tr1=1.10&tr2=1.11&r1=text&r2=text

-JimC


* Re: describe-char and unicode data
  2003-05-21 21:52   ` James H. Cloos Jr.
  2003-05-22 15:29     ` Kevin Rodgers
@ 2003-05-23 12:04     ` Richard Stallman
  1 sibling, 0 replies; 22+ messages in thread
From: Richard Stallman @ 2003-05-23 12:04 UTC (permalink / raw)
  Cc: emacs-devel

    As it looks like the licensing issues with the data files are worked
    out, I'll start on this now.

They have been "worked out" only in a very limited sense.
We can use the data in that file to produce some other file
that we can release under a free license.  The file itself
is not free, however.

So in order to use it we have to do something to change it
a substantial amount.


