* Emacs 23.2 pretest freeze?
@ 2009-11-30 19:55 Karl Fogel
2009-11-30 22:48 ` Chong Yidong
2009-12-03 21:35 ` Emacs 23.2 Pretest next week Chong Yidong
0 siblings, 2 replies; 24+ messages in thread
From: Karl Fogel @ 2009-11-30 19:55 UTC (permalink / raw)
To: emacs-devel
In [1], Alan Mackenzie suggested we wait until the 23.2 pretest is out
before switching to Bazaar, and said he thought the pretest was close --
end of the month or so.
Is it almost ready?
The thread starting at [2] also implies the end of November. The most
recent exchange in that thread is [3], also with Alan, in which he says
his CC mode changes should be done by the end of the month (Yidong
replies "OK"). I don't know where that stands.
-Karl
[1] http://lists.gnu.org/archive/html/emacs-devel/2009-11/msg00674.html
[2] http://lists.gnu.org/archive/html/emacs-devel/2009-11/msg00173.html
[3] http://lists.gnu.org/archive/html/emacs-devel/2009-11/msg00248.html
* Re: Emacs 23.2 pretest freeze?
2009-11-30 19:55 Emacs 23.2 pretest freeze? Karl Fogel
@ 2009-11-30 22:48 ` Chong Yidong
2009-11-30 23:05 ` Karl Fogel
2009-12-02 13:34 ` Alan Mackenzie
2009-12-03 21:35 ` Emacs 23.2 Pretest next week Chong Yidong
1 sibling, 2 replies; 24+ messages in thread
From: Chong Yidong @ 2009-11-30 22:48 UTC (permalink / raw)
To: Karl Fogel; +Cc: emacs-devel@gnu.org
Karl Fogel <kfogel@red-bean.com> writes:
> In [1], Alan Mackenzie suggested we wait until the 23.2 pretest is out
> before switching to Bazaar, and said he thought the pretest was close --
> end of the month or so.
>
> Is it almost ready?
>
> The thread starting at [2] also implies the end of November. The most
> recent exchange in that thread is [3], also with Alan, in which he says
> his CC mode changes should be done by the end of the month (Yidong
> replies "OK"). I don't know where that stands.
We're mostly waiting for Alan's code to land now. Hopefully in a week
or so.
That said, I'm still uneasy about the status of message-mode as default;
I'm not sure we've done enough to smooth the transition from mail-mode
as default.
(One more thing: sometime after the feature freeze and the start of the
pretest, I will add one additional piece of code: the Python parser for
Semantic. We're currently waiting for the paperwork. This is an
extremely self-contained part of Semantic, so it should not impact the
pretest.)
* Re: Emacs 23.2 pretest freeze?
2009-11-30 22:48 ` Chong Yidong
@ 2009-11-30 23:05 ` Karl Fogel
2009-12-02 13:34 ` Alan Mackenzie
1 sibling, 0 replies; 24+ messages in thread
From: Karl Fogel @ 2009-11-30 23:05 UTC (permalink / raw)
To: Chong Yidong; +Cc: emacs-devel@gnu.org
Chong Yidong <cyd@stupidchicken.com> writes:
> We're mostly waiting for Alan's code to land now. Hopefully in a week
> or so.
Thanks.
> That said, I'm still uneasy about the status of message-mode as default;
> I'm not sure we've done enough to smooth the transition from mail-mode
> as default.
I don't know the details of the transition. All I can say is I heartily
support message-mode as default, as it's awesome :-).
Since the freeze doesn't apply to documentation changes, if what you're
worried about can mostly be addressed by doc improvements, then
message-mode isn't an issue as far as the pretest goes anyway, right?
> (One more thing: sometime after the feature freeze and the start of the
> pretest, I will add one additional piece of code: the Python parser for
> Semantic. We're currently waiting for the paperwork. This is an
> extremely self-contained part of Semantic, so it should not impact the
> pretest.)
*nod*
One nice thing about working with Bazaar will be that paperwork delays
will have less impact on development, because it'll be easier for any
developer to try out a branch containing the not-yet-approved changes.
(Whereas in CVS, the "apply a patch and revert it later" process is a
bit awkward -- a shallow but still noticeable gumption sink.)
-Karl
* Re: Emacs 23.2 pretest freeze?
2009-11-30 22:48 ` Chong Yidong
2009-11-30 23:05 ` Karl Fogel
@ 2009-12-02 13:34 ` Alan Mackenzie
1 sibling, 0 replies; 24+ messages in thread
From: Alan Mackenzie @ 2009-12-02 13:34 UTC (permalink / raw)
To: Chong Yidong; +Cc: Karl Fogel, emacs-devel@gnu.org
Hi, Karl and Yidong!
On Mon, Nov 30, 2009 at 05:48:58PM -0500, Chong Yidong wrote:
> Karl Fogel <kfogel@red-bean.com> writes:
> > In [1], Alan Mackenzie suggested we wait until the 23.2 pretest is out
> > before switching to Bazaar, and said he thought the pretest was close --
> > end of the month or so.
> > Is it almost ready?
> > The thread starting at [2] also implies the end of November. The most
> > recent exchange in that thread is [3], also with Alan, in which he says
> > his CC mode changes should be done by the end of the month (Yidong
> > replies "OK"). I don't know where that stands.
> We're mostly waiting for Alan's code to land now. Hopefully in a week
> or so.
Yes, I both hope so and expect it. Thanks for waiting!
--
Alan Mackenzie (Nuremberg, Germany).
* Emacs 23.2 Pretest next week
2009-11-30 19:55 Emacs 23.2 pretest freeze? Karl Fogel
2009-11-30 22:48 ` Chong Yidong
@ 2009-12-03 21:35 ` Chong Yidong
2009-12-04 11:23 ` faster unicode character name completion Kenichi Handa
2009-12-04 19:04 ` Emacs 23.2 Pretest next week Dan Nicolaescu
1 sibling, 2 replies; 24+ messages in thread
From: Chong Yidong @ 2009-12-03 21:35 UTC (permalink / raw)
To: emacs-devel@gnu.org
We are close to ready to begin the Emacs 23.2 pretest. If there are no
objections, I will roll the 23.1.90 pretest next Tuesday. If you still
have some last-minute changes you want to make, that gives the weekend
to wrap them up. After the pretest begins, we will be in feature freeze
(and we'll begin the cvs to bzr crossover).
* faster unicode character name completion
2009-12-03 21:35 ` Emacs 23.2 Pretest next week Chong Yidong
@ 2009-12-04 11:23 ` Kenichi Handa
2009-12-04 12:08 ` Deniz Dogan
` (3 more replies)
2009-12-04 19:04 ` Emacs 23.2 Pretest next week Dan Nicolaescu
1 sibling, 4 replies; 24+ messages in thread
From: Kenichi Handa @ 2009-12-04 11:23 UTC (permalink / raw)
To: Chong Yidong; +Cc: emacs-devel
In article <87fx7r68s4.fsf@stupidchicken.com>, Chong Yidong <cyd@stupidchicken.com> writes:
> We are close to ready to begin the Emacs 23.2 pretest. If there are no
> objections, I will roll the 23.1.90 pretest next Tuesday. If you still
> have some last-minute changes you want to make, that gives the weekend
> to wrap them up. After the pretest begins, we will be in feature freeze
> (and we'll begin the cvs to bzr crossover).
I'm now trying to make completion of Unicode character
names (used by read-char-by-name) faster, at least fast
enough for interactive use. Currently it is very slow the
first time and consumes a lot of memory. Attached is my
current working code. In the actual code, I'll eliminate the big
defvar of ucs-name-head-table, generate the value by
admin/unidata/unidata-gen, and store it in an extra slot of
the char-table for the `name' char-code-property.
The drawback of the new code is that one can see only the
list of the first words of character names in the completion
buffer at once by C-x 8 RET TAB, instead of all of the
unicode character names.
What do you think? If people think the above is not a
problem, I'll go ahead.
---
Kenichi Handa
handa@m17n.org
--- ucs-name.el ---
(defvar ucs-name-head-table
'(("SPACE" 0)
("EXCLAMATION" 0 8192)
("QUOTATION" 0)
("NUMBER" 0 9344)
("DOLLAR" 0)
("PERCENT" 0)
("AMPERSAND" 0)
("APOSTROPHE" 0)
("LEFT" 0 128 3968 8192 8576 8832 8960 9088 9600 9856 10112 10496 10624 11008 11776 12288)
("RIGHT" 0 128 3968 8192 8576 8704 8832 8960 9088 9600 10112 10496 10624 11776 12288)
("ASTERISK" 0 8704)
("PLUS" 0 128 10752)
("COMMA" 0)
("HYPHEN" 0 8192 11776)
("FULL" 0 9600 10112)
("SOLIDUS" 0 10624)
("DIGIT" 0 9344 127232)
("COLON" 0 8320 8704)
("SEMICOLON" 0)
("LESS" 0 8704 8832 10496 10752 10880)
("EQUALS" 0 8704 10496 10624 10752 10880 11008)
("GREATER" 0 8704 8832 10496 10752 10880)
("QUESTION" 0 8192)
("COMMERCIAL" 0 8192)
("LATIN" 0 128 256 384 512 640 7424 7552 7680 7808 8320 8576 9984 11264 42752 42880 64256)
("REVERSE" 0 10112 10624 11008)
("CIRCUMFLEX" 0)
("LOW" 0 8192 12288)
("GRAVE" 0)
("VERTICAL" 0 8192 8832 8960 9088 9856 10112 10624 10880 11776 12288)
("TILDE" 0 8704 10496 10752 11008 11776)
("NO" 128 9856)
("INVERTED" 128 8192 8448 8704 11776)
("CENT" 128)
("POUND" 128)
("CURRENCY" 128)
("YEN" 128)
("BROKEN" 128 9088)
("SECTION" 128)
("DIAERESIS" 128)
("COPYRIGHT" 128)
("FEMININE" 128)
("NOT" 128 8704 8832 8960)
("SOFT" 128)
("REGISTERED" 128)
("MACRON" 128)
("DEGREE" 128 8448)
("SUPERSCRIPT" 128 8192)
("ACUTE" 128 10624)
("MICRO" 128)
("PILCROW" 128)
("MIDDLE" 128)
("CEDILLA" 128)
("MASCULINE" 128)
("VULGAR" 128 8448 8576)
("MULTIPLICATION" 128 9984 10752)
("DIVISION" 128 8704 8832)
("MODIFIER" 640 4224 7424 7552 11264 42752 42880)
("CARON" 640)
("BREVE" 640)
("DOT" 640 8704 8832)
("RING" 640 8704 11776)
("OGONEK" 640)
("SMALL" 640 8448 8704 8832 10752 65024 68352)
("DOUBLE" 640 8192 8448 8704 8832 9344 10624 10752 10880 11776 12288 65024)
("COMBINING" 768 1152 7552 8320 11648 12416 42496 43136 65024 119296)
("GREEK" 768 896 7424 7936 8064 65792 65920 119296)
("COPTIC" 896 11392)
("CYRILLIC" 1024 1152 1280 7424 42496 42624)
("ARMENIAN" 1280 1408 64256)
("HEBREW" 1408 64256)
("ARABIC" 1536 1664 1792 64256 64384 64512 64640 64768 64896 65024 65152)
("AFGHANI" 1536)
("EXTENDED" 1664)
("SYRIAC" 1792)
("THAANA" 1920)
("NKO" 1920)
("SAMARITAN" 2048)
("DEVANAGARI" 2304 43136)
("BENGALI" 2432)
("GURMUKHI" 2560)
("GUJARATI" 2688)
("ORIYA" 2816)
("TAMIL" 2944)
("TELUGU" 3072)
("KANNADA" 3200)
("MALAYALAM" 3328)
("SINHALA" 3456)
("THAI" 3584)
("LAO" 3712)
("TIBETAN" 3840 3968)
("MYANMAR" 4096 4224 43520)
("GEORGIAN" 4224 11520)
("HANGUL" 4352 4480 12288 12544 12672 43264 44032 44160 44288 44416 44544 44672 44800 44928 45056 45184 45312 45440 45568 45696 45824 45952 46080 46208 46336 46464 46592 46720 46848 46976 47104 47232 47360 47488 47616 47744 47872 48000 48128 48256 48384 48512 48640 48768 48896 49024 49152 49280 49408 49536 49664 49792 49920 50048 50176 50304 50432 50560 50688 50816 50944 51072 51200 51328 51456 51584 51712 51840 51968 52096 52224 52352 52480 52608 52736 52864 52992 53120 53248 53376 53504 53632 53760 53888 54016 54144 54272 54400 54528 54656 54784 54912 55040 55168)
("ETHIOPIC" 4608 4736 4864 4992 11648)
("CHEROKEE" 4992)
("CANADIAN" 5120 5248 5376 5504 5632 6272)
("OGHAM" 5760)
("RUNIC" 5760)
("TAGALOG" 5888)
("HANUNOO" 5888)
("PHILIPPINE" 5888)
("BUHID" 5888)
("TAGBANWA" 5888)
("KHMER" 6016 6528)
("MONGOLIAN" 6144 6272)
("LIMBU" 6400)
("TAI" 6400 6656 6784 43648)
("NEW" 6528 8320)
("BUGINESE" 6656)
("BALINESE" 6912)
("SUNDANESE" 7040)
("LEPCHA" 7168)
("OL" 7168)
("VEDIC" 7296)
("EN" 8192)
("EM" 8192)
("THREE" 8192 8576 9856 10112 10752 11008)
("FOUR" 8192 9984)
("SIX" 8192 9984)
("FIGURE" 8192)
("PUNCTUATION" 8192)
("THIN" 8192)
("HAIR" 8192)
("ZERO" 8192 65152)
("NON" 8192)
("HORIZONTAL" 8192 9088 9856 11008)
("SINGLE" 8192)
("DAGGER" 8192)
("BULLET" 8192 8704)
("TRIANGULAR" 8192)
("ONE" 8192 11776)
("TWO" 8192 10624 10752 11776)
("HYPHENATION" 8192)
("LINE" 8192 10752)
("PARAGRAPH" 8192)
("POP" 8192)
("NARROW" 8192)
("PER" 8192 8448)
("PRIME" 8192)
("TRIPLE" 8192 8704 8832 10624 10752 10880)
("REVERSED" 8192 8448 8704 8832 8960 9728 10624 10880 11776 12288)
("CARET" 8192)
("REFERENCE" 8192)
("INTERROBANG" 8192)
("OVERLINE" 8192)
("UNDERTIE" 8192)
("CHARACTER" 8192)
("ASTERISM" 8192)
("FRACTION" 8192 8448)
("TIRONIAN" 8192)
("BLACK" 8192 8448 9600 9728 9856 9984 10112 10624 11008)
("CLOSE" 8192)
("SWUNG" 8192)
("FLOWER" 8192 9856)
("QUADRUPLE" 8192 10752)
("FIVE" 8192 11776)
("DOTTED" 8192 9600 10624 11008 11776)
("TRICOLON" 8192)
("MEDIUM" 8192 9600 9856 9984)
("WORD" 8192 11776)
("FUNCTION" 8192)
("INVISIBLE" 8192)
("INHIBIT" 8192)
("ACTIVATE" 8192)
("NATIONAL" 8192)
("NOMINAL" 8192)
("SUBSCRIPT" 8320)
("EURO" 8320)
("CRUZEIRO" 8320)
("FRENCH" 8320)
("LIRA" 8320)
("MILL" 8320)
("NAIRA" 8320)
("PESETA" 8320)
("RUPEE" 8320)
("WON" 8320)
("DONG" 8320)
("KIP" 8320)
("TUGRIK" 8320)
("DRACHMA" 8320)
("GERMAN" 8320)
("PESO" 8320)
("GUARANI" 8320)
("AUSTRAL" 8320)
("HRYVNIA" 8320)
("CEDI" 8320)
("LIVRE" 8320)
("SPESMILO" 8320)
("TENGE" 8320)
("ACCOUNT" 8448)
("ADDRESSED" 8448)
("CENTRE" 8448)
("CARE" 8448)
("CADA" 8448)
("EULER" 8448)
("SCRUPLE" 8448)
("SCRIPT" 8448)
("PLANCK" 8448)
("L" 8448)
("NUMERO" 8448)
("SOUND" 8448)
("PRESCRIPTION" 8448)
("RESPONSE" 8448)
("SERVICE" 8448)
("TELEPHONE" 8448 8960 9984)
("TRADE" 8448)
("VERSICLE" 8448)
("OUNCE" 8448)
("OHM" 8448)
("TURNED" 8448 8960 9856 10624)
("KELVIN" 8448)
("ANGSTROM" 8448)
("ESTIMATED" 8448)
("ALEF" 8448)
("BET" 8448)
("GIMEL" 8448)
("DALET" 8448)
("INFORMATION" 8448)
("ROTATED" 8448 9984)
("FACSIMILE" 8448)
("PROPERTY" 8448)
("AKTIESELSKAB" 8448)
("SYMBOL" 8448 9216)
("ROMAN" 8448 8576 65920)
("LEFTWARDS" 8576 10496 11008)
("UPWARDS" 8576 10112 10496 11008 11776)
("RIGHTWARDS" 8576 10496 11008)
("DOWNWARDS" 8576 10112 10496 11008 11776)
("UP" 8576 8832 8960 9600 10112 10496 10624 11008)
("NORTH" 8576 10496 11008 43008)
("SOUTH" 8576 10496 11008)
("ANTICLOCKWISE" 8576 8704 10112 10496 10752)
("CLOCKWISE" 8576 8704 10112 10496)
("FOR" 8704)
("COMPLEMENT" 8704)
("PARTIAL" 8704)
("THERE" 8704)
("EMPTY" 8704 10624)
("INCREMENT" 8704)
("NABLA" 8704)
("ELEMENT" 8704 8832 10112 10880)
("CONTAINS" 8704 8832)
("DOES" 8704 8832 10880)
("END" 8704)
("N" 8704 8832 10752 10880)
("MINUS" 8704 10752)
("SET" 8704)
("SQUARE" 8704 8832 8960 9088 9600 9856 10624 10880 11008 12928 13056 13184 127360 127488)
("CUBE" 8704)
("FOURTH" 8704)
("PROPORTIONAL" 8704)
("INFINITY" 8704 10624)
("ANGLE" 8704 10624)
("MEASURED" 8704 10624)
("SPHERICAL" 8704 10624)
("DIVIDES" 8704)
("PARALLEL" 8704 10880)
("LOGICAL" 8704 10752)
("INTERSECTION" 8704 10752)
("UNION" 8704 10752)
("INTEGRAL" 8704 9088 10752)
("CONTOUR" 8704)
("SURFACE" 8704)
("VOLUME" 8704)
("THEREFORE" 8704)
("BECAUSE" 8704)
("RATIO" 8704)
("PROPORTION" 8704)
("EXCESS" 8704)
("GEOMETRIC" 8704)
("HOMOTHETIC" 8704)
("SINE" 8704)
("WREATH" 8704)
("ASYMPTOTICALLY" 8704)
("APPROXIMATELY" 8704 10752)
("NEITHER" 8704 8832)
("ALMOST" 8704 10752)
("ALL" 8704 8960)
("EQUIVALENT" 8704 10752)
("GEOMETRICALLY" 8704)
("DIFFERENCE" 8704)
("APPROACHES" 8704)
("IMAGE" 8704 8832)
("CORRESPONDS" 8704)
("ESTIMATES" 8704)
("EQUIANGULAR" 8704)
("STAR" 8704 8832 9728 9984)
("DELTA" 8704)
("EQUAL" 8704 8832)
("QUESTIONED" 8704)
("IDENTICAL" 8704 10624 10752)
("STRICTLY" 8704)
("MUCH" 8704)
("BETWEEN" 8704)
("PRECEDES" 8704 8832 10880)
("SUCCEEDS" 8704 8832 10880)
("SUBSET" 8832 10496 10880)
("SUPERSET" 8832 10112 10496 10880)
("MULTISET" 8832)
("CIRCLED" 8832 9088 9216 9344 9856 9984 10112 10624 10752 12288 12800 12928 127232)
("SQUARED" 8832 9856 10624 11776 127232 127488)
("DOWN" 8832 8960 10496 10624 10880)
("ASSERTION" 8832)
("MODELS" 8832)
("TRUE" 8832)
("FORCES" 8832)
("NEGATED" 8832)
("NORMAL" 8832)
("ORIGINAL" 8832)
("MULTIMAP" 8832)
("HERMITIAN" 8832)
("INTERCALATE" 8832)
("XOR" 8832)
("NAND" 8832)
("NOR" 8832)
("DIAMOND" 8832 11008)
("BOWTIE" 8832 10624)
("CURLY" 8832 9088)
("PITCHFORK" 8832 10880)
("VERY" 8832)
("MIDLINE" 8832)
("Z" 8832 10624 10752)
("DIAMETER" 8960)
("ELECTRIC" 8960)
("HOUSE" 8960)
("PROJECTIVE" 8960)
("PERSPECTIVE" 8960)
("WAVY" 8960 12288 65024)
("BOTTOM" 8960 9088 10496 11776)
("TOP" 8960 9088 10496 11776)
("ARC" 8960)
("SEGMENT" 8960)
("SECTOR" 8960)
("POSITION" 8960)
("VIEWDATA" 8960)
("PLACE" 8960)
("WATCH" 8960)
("HOURGLASS" 8960)
("FROWN" 8960)
("SMILE" 8960)
("OPTION" 8960)
("ERASE" 8960)
("X" 8960)
("KEYBOARD" 8960)
("BENZENE" 8960 9088)
("CYLINDRICITY" 8960)
("SYMMETRY" 8960)
("TOTAL" 8960)
("DIMENSION" 8960)
("CONICAL" 8960)
("SLOPE" 8960)
("COUNTERBORE" 8960)
("COUNTERSINK" 8960)
("APL" 8960 9088)
("SHOULDERED" 8960)
("BELL" 8960)
("INSERTION" 9088)
("CONTINUOUS" 9088)
("DISCONTINUOUS" 9088)
("EMPHASIS" 9088)
("COMPOSITION" 9088)
("WHITE" 9088 9600 9728 9856 9984 10112 10624 10880 11008 65024)
("ENTER" 9088)
("ALTERNATIVE" 9088)
("HELM" 9088)
("UNDO" 9088)
("MONOSTABLE" 9088)
("HYSTERESIS" 9088)
("OPEN" 9088 9216 9984 10112)
("PASSIVE" 9088)
("DIRECT" 9088)
("SOFTWARE" 9088)
("DECIMAL" 9088)
("PREVIOUS" 9088)
("NEXT" 9088)
("PRINT" 9088)
("CLEAR" 9088)
("UPPER" 9088 9600 9984 10112)
("SUMMATION" 9088 10752)
("RADICAL" 9088)
("DENTISTRY" 9088)
("RETURN" 9088)
("EJECT" 9088)
("METRICAL" 9088)
("EARTH" 9088 9728)
("FUSE" 9088)
("STRAIGHTNESS" 9088)
("FLATNESS" 9088)
("AC" 9088)
("ELECTRICAL" 9088)
("BLANK" 9216)
("OCR" 9216)
("PARENTHESIZED" 9216 9344 12800 127232)
("NEGATIVE" 9344 127232 127360)
("BOX" 9472)
("LOWER" 9600 9984 10112 10496)
("LIGHT" 9600 9984)
("DARK" 9600)
("QUADRANT" 9600)
("FISHEYE" 9600)
("LOZENGE" 9600 10112)
("CIRCLE" 9600 10624)
("BULLSEYE" 9600)
("INVERSE" 9600)
("LARGE" 9600 10112 10752 10880 68352)
("CLOUD" 9728)
("UMBRELLA" 9728 9856)
("SNOWMAN" 9728 9856)
("COMET" 9728)
("LIGHTNING" 9728)
("THUNDERSTORM" 9728)
("SUN" 9728 9856)
("ASCENDING" 9728)
("DESCENDING" 9728)
("CONJUNCTION" 9728)
("OPPOSITION" 9728)
("BALLOT" 9728 9984)
("SALTIRE" 9728)
("HOT" 9728)
("SHAMROCK" 9728)
("SKULL" 9728)
("CAUTION" 9728)
("RADIOACTIVE" 9728)
("BIOHAZARD" 9728)
("CADUCEUS" 9728)
("ANKH" 9728)
("ORTHODOX" 9728)
("CHI" 9728)
("CROSS" 9728)
("FARSI" 9728)
("ADI" 9728)
("HAMMER" 9728 9856)
("PEACE" 9728)
("YIN" 9728)
("TRIGRAM" 9728)
("WHEEL" 9728)
("FIRST" 9728)
("LAST" 9728)
("MERCURY" 9728)
("FEMALE" 9728)
("MALE" 9728 9856)
("JUPITER" 9728)
("SATURN" 9728)
("URANUS" 9728)
("NEPTUNE" 9728)
("PLUTO" 9728)
("ARIES" 9728)
("TAURUS" 9728)
("GEMINI" 9728)
("CANCER" 9728)
("LEO" 9728)
("VIRGO" 9728)
("LIBRA" 9728)
("SCORPIUS" 9728)
("SAGITTARIUS" 9728)
("CAPRICORN" 9728)
("AQUARIUS" 9728)
("PISCES" 9728)
("QUARTER" 9728)
("EIGHTH" 9728)
("BEAMED" 9728)
("MUSIC" 9728)
("WEST" 9728)
("EAST" 9728)
("UNIVERSAL" 9728)
("RECYCLING" 9728)
("RECYCLED" 9728)
("PARTIALLY" 9728)
("PERMANENT" 9728)
("WHEELCHAIR" 9728)
("DIE" 9856)
("MONOGRAM" 9856 119552)
("DIGRAM" 9856 119552)
("ANCHOR" 9856)
("CROSSED" 9856 127360)
("STAFF" 9856)
("SCALES" 9856)
("ALEMBIC" 9856)
("GEAR" 9856)
("ATOM" 9856)
("FLEUR" 9856)
("OUTLINED" 9856 9984)
("WARNING" 9856)
("HIGH" 9856)
("DOUBLED" 9856)
("INTERLOCKED" 9856)
("MARRIAGE" 9856)
("DIVORCE" 9856)
("UNMARRIED" 9856)
("COFFIN" 9856)
("FUNERAL" 9856)
("NEUTER" 9856)
("CERES" 9856)
("PALLAS" 9856)
("JUNO" 9856)
("VESTA" 9856)
("CHIRON" 9856)
("SEXTILE" 9856)
("SEMISEXTILE" 9856)
("QUINCUNX" 9856)
("SESQUIQUADRATE" 9856)
("SOCCER" 9856)
("BASEBALL" 9856)
("RAIN" 9856)
("THUNDER" 9856)
("CROSSING" 9856)
("DISABLED" 9856)
("PICK" 9856)
("CAR" 9856)
("HELMET" 9856)
("CHAINS" 9856)
("ALTERNATE" 9856)
("DRIVE" 9856)
("HEAVY" 9856 9984 10112 11008)
("FALLING" 9856 10496)
("RESTRICTED" 9856)
("SHINTO" 9856)
("CHURCH" 9856)
("CASTLE" 9856)
("HISTORIC" 9856)
("MAP" 9856)
("MOUNTAIN" 9856)
("FOUNTAIN" 9856)
("FLAG" 9856)
("FERRY" 9856)
("SAILBOAT" 9856)
("SKIER" 9856)
("ICE" 9856)
("PERSON" 9856)
("TENT" 9856)
("JAPANESE" 9856 12288)
("HEADSTONE" 9856)
("FUEL" 9856)
("CUP" 9856)
("TAPE" 9984)
("AIRPLANE" 9984)
("ENVELOPE" 9984)
("VICTORY" 9984)
("WRITING" 9984)
("PENCIL" 9984)
("CHECK" 9984)
("SHADOWED" 9984)
("MALTESE" 9984)
("STRESS" 9984)
("PINWHEEL" 9984)
("EIGHT" 9984)
("TWELVE" 9984)
("SIXTEEN" 9984)
("TEARDROP" 9984 10112)
("SNOWFLAKE" 9984)
("TIGHT" 9984)
("SPARKLE" 9984)
("BALLOON" 9984)
("CURVED" 9984)
("FLORAL" 9984)
("DINGBAT" 9984 10112)
("DRAFTING" 10112)
("TRIANGLE" 10112 10624)
("DASHED" 10112 65024)
("SQUAT" 10112)
("BACK" 10112)
("FRONT" 10112)
("NOTCHED" 10112)
("WEDGE" 10112)
("PERPENDICULAR" 10112 10880)
("OR" 10112)
("LONG" 10112 10880 11008)
("AND" 10112)
("MATHEMATICAL" 10112 119808 119936 120064 120192 120320 120448 120576 120704)
("BRAILLE" 10240 10368)
("RISING" 10496)
("WAVE" 10496 11008 12288)
("ARROW" 10496)
("SHORT" 10496 10880)
("OBLIQUE" 10624)
("S" 10624)
("TIMES" 10624)
("INCOMPLETE" 10624)
("TIE" 10624)
("INCREASES" 10624)
("SHUFFLE" 10624)
("GLEICH" 10624)
("THERMODYNAMIC" 10624)
("ERROR" 10624)
("RULE" 10624)
("BIG" 10624)
("TINY" 10624 68352)
("MINY" 10624)
("MODULO" 10752)
("FINITE" 10752)
("CIRCULATION" 10752)
("QUATERNION" 10752)
("JOIN" 10752)
("VECTOR" 10752)
("SEMIDIRECT" 10752)
("SMASH" 10752)
("INTERIOR" 10752)
("RIGHTHAND" 10752)
("AMALGAMATION" 10752)
("CLOSED" 10752 10880)
("SLOPING" 10752)
("SIMILAR" 10752 10880)
("CONGRUENT" 10752)
("SLANTED" 10880)
("SMALLER" 10880)
("LARGER" 10880)
("TRANSVERSAL" 10880)
("FORKING" 10880)
("NONFORKING" 10880)
("GLAGOLITIC" 11264)
("TIFINAGH" 11520)
("RAISED" 11776)
("EDITORIAL" 11776)
("PARAGRAPHOS" 11776)
("FORKED" 11776)
("HYPODIASTOLE" 11776)
("PALM" 11776)
("CJK" 11904 12672)
("KANGXI" 12032 12160)
("IDEOGRAPHIC" 12160 12288 12672 12928 13056 13184)
("DITTO" 12288)
("POSTAL" 12288)
("GETA" 12288)
("HANGZHOU" 12288)
("MASU" 12288)
("PART" 12288)
("HIRAGANA" 12288 12416)
("KATAKANA" 12416 12672)
("BOPOMOFO" 12544 12672)
("PARTNERSHIP" 12800)
("KOREAN" 12800)
("LIMITED" 12928)
("HEXAGRAM" 19840)
("YI" 40960 41088 41216 41344 41472 41600 41728 41856 41984 42112)
("LISU" 42112)
("VAI" 42240 42368 42496)
("SLAVONIC" 42496)
("BAMUM" 42624)
("SYLOTI" 43008)
("PHAGS" 43008)
("SAURASHTRA" 43136)
("KAYAH" 43264)
("REJANG" 43264)
("JAVANESE" 43392)
("CHAM" 43520)
("MEETEI" 43904)
("ORNATE" 64768)
("RIAL" 64896)
("VARIATION" 65024 917760 917888)
("PRESENTATION" 65024)
("SESAME" 65024)
("CENTRELINE" 65024)
("FULLWIDTH" 65280 65408)
("HALFWIDTH" 65280 65408)
("INTERLINEAR" 65408)
("OBJECT" 65408)
("REPLACEMENT" 65408)
("LINEAR" 65536 65664)
("AEGEAN" 65792)
("PHAISTOS" 65920)
("LYCIAN" 66176)
("CARIAN" 66176)
("OLD" 66304 66432 68096 68608)
("GOTHIC" 66304)
("UGARITIC" 66432)
("DESERET" 66560)
("SHAVIAN" 66560)
("OSMANYA" 66688)
("CYPRIOT" 67584)
("IMPERIAL" 67584)
("PHOENICIAN" 67840)
("LYDIAN" 67840)
("KHAROSHTHI" 68096)
("AVESTAN" 68352)
("INSCRIPTIONAL" 68352)
("RUMI" 69120)
("KAITHI" 69760)
("CUNEIFORM" 73728 73856 73984 74112 74240 74368 74496 74752)
("EGYPTIAN" 77824 77952 78080 78208 78336 78464 78592 78720 78848)
("BYZANTINE" 118784 118912)
("MUSICAL" 119040 119168)
("TETRAGRAM" 119552)
("COUNTING" 119552)
("MAHJONG" 126976)
("DOMINO" 126976 127104)
("TORTOISE" 127232 127488)
("LANGUAGE" 917504)
("TAG" 917504)
("CANCEL" 917504)))

(defun ucs-name-expand-table (head)
  (let ((slot (assoc head ucs-name-head-table))
        names)
    (when slot
      (if (consp (cadr slot))
          (cdr slot)
        (dolist (elt (cdr slot))
          (dotimes (i #x80)
            (let* ((c (+ elt i))
                   (name (get-char-code-property c 'name)))
              (if (and name (eq (string-match head name) 0))
                  (push (cons name c) names)))))
        (setcdr slot names)))))

(defun ucs-name-filter (str names)
  (let (l)
    (dolist (elt names)
      (if (eq (string-match str (car elt)) 0)
          (push elt l)))
    l))

(defun ucs-name-completion (str)
  (when (string-match "^[A-Za-z]*" str)
    (let ((head (match-string 0 str))
          slot names)
      (if (and (= (length head) (length str))
               (not (assoc-string str ucs-name-head-table)))
          (ucs-name-filter str ucs-name-head-table)
        (ucs-name-filter str (ucs-name-expand-table head))))))

(defun read-char-by-name (prompt)
  (let* ((completion-ignore-case t)
         (input (completing-read
                 prompt (completion-table-dynamic 'ucs-name-completion))))
    (cond
     ((string-match-p "^[0-9a-fA-F]+$" input)
      (string-to-number input 16))
     ((string-match-p "^#" input)
      (read input))
     (t
      (or (and (string-match "^[A-Za-z]+" input)
               (cdr (assoc input
                           (ucs-name-expand-table (match-string 0 input)))))
          (error "Invalid character name: %s" input))))))
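
As an illustration only of how one entry of the head table is meant to be read: each number after a head word is the start of a 128-character block, and expanding the entry scans those blocks for names beginning with that word. The sketch below does this for a single block. The helper name my-ucs-block-names is hypothetical and not part of the posted code, and it assumes string-prefix-p is available (it is in current Emacs releases).

```elisp
;; Hypothetical helper, for illustration; not part of ucs-name.el.
(defun my-ucs-block-names (head block-start)
  "Return (NAME . CHAR) pairs from the 128-character block at BLOCK-START.
Only characters whose Unicode name starts with HEAD are returned,
in ascending code point order."
  (let (names)
    (dotimes (i #x80)
      (let* ((c (+ block-start i))
             (name (get-char-code-property c 'name)))
        ;; Unnamed code points yield nil, so check for a string first.
        (when (and (stringp name) (string-prefix-p head name))
          (push (cons name c) names))))
    (nreverse names)))

;; The table entry ("DOLLAR" 0) says: scan the block starting at 0.
(my-ucs-block-names "DOLLAR" 0)
;; => (("DOLLAR SIGN" . 36))
```

Unlike the posted code, this sketch compares a literal prefix rather than using the head word as a regexp; for purely alphabetic head words the two should behave the same.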
* Re: faster unicode character name completion
2009-12-04 11:23 ` faster unicode character name completion Kenichi Handa
@ 2009-12-04 12:08 ` Deniz Dogan
2009-12-04 13:04 ` Juanma Barranquero
` (2 subsequent siblings)
3 siblings, 0 replies; 24+ messages in thread
From: Deniz Dogan @ 2009-12-04 12:08 UTC (permalink / raw)
To: Kenichi Handa; +Cc: Chong Yidong, emacs-devel
2009/12/4 Kenichi Handa <handa@m17n.org>:
> In article <87fx7r68s4.fsf@stupidchicken.com>, Chong Yidong <cyd@stupidchicken.com> writes:
>
>> We are close to ready to begin the Emacs 23.2 pretest. If there are no
>> objections, I will roll the 23.1.90 pretest next Tuesday. If you still
>> have some last-minute changes you want to make, that gives the weekend
>> to wrap them up. After the pretest begins, we will be in feature freeze
>> (and we'll begin the cvs to bzr crossover).
>
> I'm now trying to make completion of Unicode character
> names (used by read-char-by-name) faster, at least fast
> enough for interactive use. Currently it is very slow the
> first time and consumes a lot of memory. Attached is my
> current working code. In the actual code, I'll eliminate the big
> defvar of ucs-name-head-table, generate the value by
> admin/unidata/unidata-gen, and store it in an extra slot of
> the char-table for the `name' char-code-property.
>
> The drawback of the new code is that one can see only the
> list of the first words of character names in the completion
> buffer at once by C-x 8 RET TAB, instead of all of the
> unicode character names.
>
> What do you think? If people think the above is not a
> problem, I'll go ahead.
>
Cool, I've been bugged for too long by the time it takes to insert a
named Unicode character.
What I've really been missing though is some "ido-like" equivalent of
ucs-insert, since I often don't know what the Unicode name for the
character is. As an example, if I want to insert λ, which is "GREEK
SMALL LETTER LAMBDA", it's not easy to know what the name is
beforehand.
--
Deniz Dogan
* Re: faster unicode character name completion
2009-12-04 11:23 ` faster unicode character name completion Kenichi Handa
2009-12-04 12:08 ` Deniz Dogan
@ 2009-12-04 13:04 ` Juanma Barranquero
2009-12-04 13:26 ` Florian Beck
2009-12-04 15:07 ` Stefan Monnier
3 siblings, 0 replies; 24+ messages in thread
From: Juanma Barranquero @ 2009-12-04 13:04 UTC (permalink / raw)
To: Kenichi Handa; +Cc: Chong Yidong, emacs-devel
On Fri, Dec 4, 2009 at 12:23, Kenichi Handa <handa@m17n.org> wrote:
> The drawback of the new code is that one can see only the
> list of the first words of character names in the completion
> buffer at once by C-x 8 RET TAB, instead of all of the
> unicode character names.
>
> What do you think? If people think the above is not a
> problem, I'll go ahead.
It's a small price to pay for such a speedup, IMHO.
Juanma
* Re: faster unicode character name completion
2009-12-04 11:23 ` faster unicode character name completion Kenichi Handa
2009-12-04 12:08 ` Deniz Dogan
2009-12-04 13:04 ` Juanma Barranquero
@ 2009-12-04 13:26 ` Florian Beck
2009-12-04 15:07 ` Stefan Monnier
3 siblings, 0 replies; 24+ messages in thread
From: Florian Beck @ 2009-12-04 13:26 UTC (permalink / raw)
To: emacs-devel
Kenichi Handa <handa@m17n.org> writes:
> The drawback of the new code is that one can see only the
> list of the first words of character names in the completion
> buffer at once by C-x 8 RET TAB, instead of all of the
> unicode character names.
>
> What do you think? If people think the above is not a
> problem, I'll go ahead.
I don't care about *seeing* all the character names, but I find myself
searching for things like '*arrow' quite a lot. This should work in my
opinion.
--
Florian Beck
* Re: faster unicode character name completion
2009-12-04 11:23 ` faster unicode character name completion Kenichi Handa
` (2 preceding siblings ...)
2009-12-04 13:26 ` Florian Beck
@ 2009-12-04 15:07 ` Stefan Monnier
2009-12-04 22:38 ` Miles Bader
2009-12-07 2:00 ` Kenichi Handa
3 siblings, 2 replies; 24+ messages in thread
From: Stefan Monnier @ 2009-12-04 15:07 UTC (permalink / raw)
To: Kenichi Handa; +Cc: Chong Yidong, emacs-devel
> The drawback of the new code is that one can see only the
> list of the first words of character names in the completion
> buffer at once by C-x 8 RET TAB, instead of all of the
> unicode character names.
That's a pretty serious drawback as it prevents uses such as
C-x 8 RET *arro TAB.
Maybe another way to speed things up is to precompute the
ucs-completions lazy completion table at compilation time and store it
in a .elc file, so it can be "computed" by reading that file.
This can be done simply by having an autoloaded `ucs-completions'
function in a file where the ucs-completions variable is defined with an
eval-when-compile expression.
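
A minimal sketch of the eval-when-compile idea, under stated assumptions: the names below (my-ucs-name-table as both variable and function) are hypothetical, and the range is cut to the first 256 code points so the example stays cheap. A real table would cover all named characters. When the file is byte-compiled, the alist is computed once and stored as a constant in the .elc, so loading it only has to read the data back.

```elisp
;; Hypothetical names; a sketch, not the actual ucs-completions code.
(defvar my-ucs-name-table
  (eval-when-compile
    (let (table)
      ;; Only the first 256 code points, to keep the example small.
      (dotimes (c #x100)
        (let ((name (get-char-code-property c 'name)))
          (when (stringp name)
            (push (cons name c) table))))
      (nreverse table)))
  "Alist of (NAME . CHAR), computed when this file is byte-compiled.")

;; In a real file this function would carry an autoload cookie, so that
;; the first call loads the file and thereby reads the stored table.
(defun my-ucs-name-table ()
  "Return the precomputed name table."
  my-ucs-name-table)

(cdr (assoc "LATIN SMALL LETTER A" (my-ucs-name-table)))
;; => 97
```

If the file is evaluated without byte-compiling, eval-when-compile simply evaluates its body at that point, so the sketch behaves the same but gains no speedup.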
> (defun ucs-name-filter (str names)
>   (let (l)
>     (dolist (elt names)
>       (if (eq (string-match str (car elt)) 0)
>           (push elt l)))
>     l))
> (defun ucs-name-completion (str)
>   (when (string-match "^[A-Za-z]*" str)
>     (let ((head (match-string 0 str))
>           slot names)
>       (if (and (= (length head) (length str))
>                (not (assoc-string str ucs-name-head-table)))
>           (ucs-name-filter str ucs-name-head-table)
>         (ucs-name-filter str (ucs-name-expand-table head))))))
I don't understand what ucs-name-filter is trying to do.
Stefan
* Re: Emacs 23.2 Pretest next week
2009-12-03 21:35 ` Emacs 23.2 Pretest next week Chong Yidong
2009-12-04 11:23 ` faster unicode character name completion Kenichi Handa
@ 2009-12-04 19:04 ` Dan Nicolaescu
2009-12-04 21:15 ` Chong Yidong
1 sibling, 1 reply; 24+ messages in thread
From: Dan Nicolaescu @ 2009-12-04 19:04 UTC (permalink / raw)
To: Chong Yidong, Stefan Monnier; +Cc: emacs-devel@gnu.org
Chong Yidong <cyd@stupidchicken.com> writes:
> We are close to ready to begin the Emacs 23.2 pretest. If there are no
> objections, I will roll the 23.1.90 pretest next Tuesday. If you still
> have some last-minute changes you want to make, that gives the weekend
> to wrap them up. After the pretest begins, we will be in feature freeze
> (and we'll begin the cvs to bzr crossover).
We have quite a few bugs tagged "patch available" and [patch] in the
subject in the bug tracker. (and many that contain patches but are not
tagged as such).
Probably nobody felt they had the authority/responsibility to apply the
patches.
IMHO it would be a good idea if the maintainers could make a pass
through the bug tracker and apply/defer/reject/etc. the patches
available before the feature freeze.
* Re: Emacs 23.2 Pretest next week
2009-12-04 19:04 ` Emacs 23.2 Pretest next week Dan Nicolaescu
@ 2009-12-04 21:15 ` Chong Yidong
0 siblings, 0 replies; 24+ messages in thread
From: Chong Yidong @ 2009-12-04 21:15 UTC (permalink / raw)
To: Dan Nicolaescu; +Cc: Stefan Monnier, emacs-devel@gnu.org
Dan Nicolaescu <dann@ics.uci.edu> writes:
> Chong Yidong <cyd@stupidchicken.com> writes:
>
> > We are close to ready to begin the Emacs 23.2 pretest. If there are no
> > objections, I will roll the 23.1.90 pretest next Tuesday. If you still
> > have some last-minute changes you want to make, that gives the weekend
> > to wrap them up. After the pretest begins, we will be in feature freeze
> > (and we'll begin the cvs to bzr crossover).
>
> We have quite a few bugs tagged "patch available" and [patch] in the
> subject in the bug tracker. (and many that contain patches but are not
> tagged as such).
> Probably nobody felt they had the authority/responsibility to apply the
> patches.
>
> IMHO it would be a good idea if the maintainers could make a pass
> through the bug tracker and apply/defer/reject/etc. the patches
> available before the feature freeze.
I will do this this weekend.
* Re: faster unicode character name completion
2009-12-04 15:07 ` Stefan Monnier
@ 2009-12-04 22:38 ` Miles Bader
2009-12-07 2:00 ` Kenichi Handa
1 sibling, 0 replies; 24+ messages in thread
From: Miles Bader @ 2009-12-04 22:38 UTC (permalink / raw)
To: Stefan Monnier; +Cc: Chong Yidong, emacs-devel, Kenichi Handa
Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> The drawback of the new code is that one can see only the
>> list of the first words of character names in the completion
>> buffer at once by C-x 8 RET TAB, instead of all of the
>> unicode character names.
>
> That's a pretty serious drawback as it prevents uses such as
> C-x 8 RET *arro TAB.
If so, I think this change is unacceptable -- the only sane way to
search for characters in the huge unicode name space with the existing
completion code is to use "*".
Because of the way unicode characters are named, using large numbers of
"noise" words, it's often surprisingly hard to remember the true name of
a character, much less find it given a vague idea, if one is required to
type the name in order...
-Miles
--
Year, n. A period of three hundred and sixty-five disappointments.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: faster unicode character name completion
2009-12-04 15:07 ` Stefan Monnier
2009-12-04 22:38 ` Miles Bader
@ 2009-12-07 2:00 ` Kenichi Handa
2009-12-07 8:13 ` Kenichi Handa
2009-12-07 14:57 ` Stefan Monnier
1 sibling, 2 replies; 24+ messages in thread
From: Kenichi Handa @ 2009-12-07 2:00 UTC (permalink / raw)
To: Stefan Monnier; +Cc: cyd, emacs-devel
In article <jwvy6lisryp.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:
> > The drawback of the new code is that one can see only the
> > list of the first words of character names in the completion
> > buffer at once by C-x 8 RET TAB, instead of all of the
> > unicode character names.
> That's a pretty serious drawback as it prevents uses such as
> C-x 8 RET *arro TAB.
It may be possible to automatically fall back to the current
way of building a full list in such a case.
> Maybe another way to speed things up is to precompute the
> ucs-completions lazy completion table at compilation time and store it
> in a .elc file, so it can be "computed" by reading that file.
> This can be done simply by having an autoloaded `ucs-completions'
> function in a file where the ucs-completions variable is defined with an
> eval-when-compile expression.
Yes, that's one solution.
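A minimal sketch of the precompute-at-compile-time idea, under stated assumptions: the variable name `my-ucs-sample-names' is hypothetical, and the range is deliberately tiny (U+0041..U+0043) so it stays cheap. When the file is byte-compiled, the body of `eval-when-compile' runs in the compiler and only the resulting alist ends up in the .elc; when the form is evaluated uncompiled, the body simply runs at load time.

```elisp
;; Hypothetical variable; not part of any patch in this thread.
(defvar my-ucs-sample-names
  (eval-when-compile
    (let (names)
      ;; Collect (NAME . CODE) pairs for ?A, ?B and ?C only.
      (dotimes (i 3)
        (let* ((c (+ ?A i))
               (name (get-char-code-property c 'name)))
          (when name
            (push (cons name c) names))))
      (nreverse names)))
  "Small alist of (CHAR-NAME . CHAR-CODE) pairs computed at compile time.")

;; Look up one of the precomputed entries.
(cdr (assoc "LATIN CAPITAL LETTER B" my-ucs-sample-names))
```

The full table would be built the same way over the whole named range, which is where the file size and duplicated strings become the trade-off.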
> > (defun ucs-name-filter (str names)
> > (let (l)
> > (dolist (elt names)
> > (if (eq (string-match str (car elt)) 0)
> > (push elt l)))
> > l))
> > (defun ucs-name-completion (str)
> > (when (string-match "^[A-Za-z]*" str)
> > (let ((head (match-string 0 str))
> > slot names)
> > (if (and (= (length head) (length str))
> > (not (assoc-string str ucs-name-head-table)))
> > (ucs-name-filter str ucs-name-head-table)
> > (ucs-name-filter str (ucs-name-expand-table head))))))
> I don't understand what ucs-name-filter is trying to do.
?? It simply filters out elements that don't match
STR from NAMES (an alist).
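To illustrate the filtering described here, a small stand-alone sketch. The name `my-ucs-name-filter' is hypothetical (renamed so it does not clash with the posted code), the sample alist is a three-entry excerpt in the style of the head table, and it quotes STR with `regexp-quote' and keeps the input order, which the quoted version does not do.

```elisp
;; Hypothetical stand-alone variant of the prefix filter.
(defun my-ucs-name-filter (str names)
  "Return the elements of alist NAMES whose key starts with STR."
  (let (result)
    (dolist (elt names)
      ;; `string-match' returns the leftmost match position, so 0
      ;; means STR is a prefix of the key.
      (when (eq (string-match (regexp-quote str) (car elt)) 0)
        (push elt result)))
    (nreverse result)))

(my-ucs-name-filter "LE" '(("LEFT" 0 128) ("LESS" 0 8704) ("RIGHT" 0 128)))
;; => (("LEFT" 0 128) ("LESS" 0 8704))
```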
---
Kenichi Handa
handa@m17n.org
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: faster unicode character name completion
2009-12-07 2:00 ` Kenichi Handa
@ 2009-12-07 8:13 ` Kenichi Handa
2009-12-07 14:57 ` Stefan Monnier
1 sibling, 0 replies; 24+ messages in thread
From: Kenichi Handa @ 2009-12-07 8:13 UTC (permalink / raw)
To: Kenichi Handa; +Cc: cyd, monnier, emacs-devel
In article <tl7ocmbbl12.fsf@m17n.org>, Kenichi Handa <handa@m17n.org> writes:
> > That's a pretty serious drawback as it prevents uses such as
> > C-x 8 RET *arro TAB.
> It may be possible to automatically fall back to the current
> way of building a full list in such a case.
Attached is a slightly modified version to make that work.
---
Kenichi Handa
handa@m17n.org
--- ucs-name.el ---
(defvar ucs-name-head-table
'(("SPACE" 0)
("EXCLAMATION" 0 8192)
("QUOTATION" 0)
("NUMBER" 0 9344)
("DOLLAR" 0)
("PERCENT" 0)
("AMPERSAND" 0)
("APOSTROPHE" 0)
("LEFT" 0 128 3968 8192 8576 8832 8960 9088 9600 9856 10112 10496 10624 11008 11776 12288)
("RIGHT" 0 128 3968 8192 8576 8704 8832 8960 9088 9600 10112 10496 10624 11776 12288)
("ASTERISK" 0 8704)
("PLUS" 0 128 10752)
("COMMA" 0)
("HYPHEN" 0 8192 11776)
("FULL" 0 9600 10112)
("SOLIDUS" 0 10624)
("DIGIT" 0 9344 127232)
("COLON" 0 8320 8704)
("SEMICOLON" 0)
("LESS" 0 8704 8832 10496 10752 10880)
("EQUALS" 0 8704 10496 10624 10752 10880 11008)
("GREATER" 0 8704 8832 10496 10752 10880)
("QUESTION" 0 8192)
("COMMERCIAL" 0 8192)
("LATIN" 0 128 256 384 512 640 7424 7552 7680 7808 8320 8576 9984 11264 42752 42880 64256)
("REVERSE" 0 10112 10624 11008)
("CIRCUMFLEX" 0)
("LOW" 0 8192 12288)
("GRAVE" 0)
("VERTICAL" 0 8192 8832 8960 9088 9856 10112 10624 10880 11776 12288)
("TILDE" 0 8704 10496 10752 11008 11776)
("NO" 128 9856)
("INVERTED" 128 8192 8448 8704 11776)
("CENT" 128)
("POUND" 128)
("CURRENCY" 128)
("YEN" 128)
("BROKEN" 128 9088)
("SECTION" 128)
("DIAERESIS" 128)
("COPYRIGHT" 128)
("FEMININE" 128)
("NOT" 128 8704 8832 8960)
("SOFT" 128)
("REGISTERED" 128)
("MACRON" 128)
("DEGREE" 128 8448)
("SUPERSCRIPT" 128 8192)
("ACUTE" 128 10624)
("MICRO" 128)
("PILCROW" 128)
("MIDDLE" 128)
("CEDILLA" 128)
("MASCULINE" 128)
("VULGAR" 128 8448 8576)
("MULTIPLICATION" 128 9984 10752)
("DIVISION" 128 8704 8832)
("MODIFIER" 640 4224 7424 7552 11264 42752 42880)
("CARON" 640)
("BREVE" 640)
("DOT" 640 8704 8832)
("RING" 640 8704 11776)
("OGONEK" 640)
("SMALL" 640 8448 8704 8832 10752 65024 68352)
("DOUBLE" 640 8192 8448 8704 8832 9344 10624 10752 10880 11776 12288 65024)
("COMBINING" 768 1152 7552 8320 11648 12416 42496 43136 65024 119296)
("GREEK" 768 896 7424 7936 8064 65792 65920 119296)
("COPTIC" 896 11392)
("CYRILLIC" 1024 1152 1280 7424 42496 42624)
("ARMENIAN" 1280 1408 64256)
("HEBREW" 1408 64256)
("ARABIC" 1536 1664 1792 64256 64384 64512 64640 64768 64896 65024 65152)
("AFGHANI" 1536)
("EXTENDED" 1664)
("SYRIAC" 1792)
("THAANA" 1920)
("NKO" 1920)
("SAMARITAN" 2048)
("DEVANAGARI" 2304 43136)
("BENGALI" 2432)
("GURMUKHI" 2560)
("GUJARATI" 2688)
("ORIYA" 2816)
("TAMIL" 2944)
("TELUGU" 3072)
("KANNADA" 3200)
("MALAYALAM" 3328)
("SINHALA" 3456)
("THAI" 3584)
("LAO" 3712)
("TIBETAN" 3840 3968)
("MYANMAR" 4096 4224 43520)
("GEORGIAN" 4224 11520)
("HANGUL" 4352 4480 12288 12544 12672 43264 44032 44160 44288 44416 44544 44672 44800 44928 45056 45184 45312 45440 45568 45696 45824 45952 46080 46208 46336 46464 46592 46720 46848 46976 47104 47232 47360 47488 47616 47744 47872 48000 48128 48256 48384 48512 48640 48768 48896 49024 49152 49280 49408 49536 49664 49792 49920 50048 50176 50304 50432 50560 50688 50816 50944 51072 51200 51328 51456 51584 51712 51840 51968 52096 52224 52352 52480 52608 52736 52864 52992 53120 53248 53376 53504 53632 53760 53888 54016 54144 54272 54400 54528 54656 54784 54912 55040 55168)
("ETHIOPIC" 4608 4736 4864 4992 11648)
("CHEROKEE" 4992)
("CANADIAN" 5120 5248 5376 5504 5632 6272)
("OGHAM" 5760)
("RUNIC" 5760)
("TAGALOG" 5888)
("HANUNOO" 5888)
("PHILIPPINE" 5888)
("BUHID" 5888)
("TAGBANWA" 5888)
("KHMER" 6016 6528)
("MONGOLIAN" 6144 6272)
("LIMBU" 6400)
("TAI" 6400 6656 6784 43648)
("NEW" 6528 8320)
("BUGINESE" 6656)
("BALINESE" 6912)
("SUNDANESE" 7040)
("LEPCHA" 7168)
("OL" 7168)
("VEDIC" 7296)
("EN" 8192)
("EM" 8192)
("THREE" 8192 8576 9856 10112 10752 11008)
("FOUR" 8192 9984)
("SIX" 8192 9984)
("FIGURE" 8192)
("PUNCTUATION" 8192)
("THIN" 8192)
("HAIR" 8192)
("ZERO" 8192 65152)
("NON" 8192)
("HORIZONTAL" 8192 9088 9856 11008)
("SINGLE" 8192)
("DAGGER" 8192)
("BULLET" 8192 8704)
("TRIANGULAR" 8192)
("ONE" 8192 11776)
("TWO" 8192 10624 10752 11776)
("HYPHENATION" 8192)
("LINE" 8192 10752)
("PARAGRAPH" 8192)
("POP" 8192)
("NARROW" 8192)
("PER" 8192 8448)
("PRIME" 8192)
("TRIPLE" 8192 8704 8832 10624 10752 10880)
("REVERSED" 8192 8448 8704 8832 8960 9728 10624 10880 11776 12288)
("CARET" 8192)
("REFERENCE" 8192)
("INTERROBANG" 8192)
("OVERLINE" 8192)
("UNDERTIE" 8192)
("CHARACTER" 8192)
("ASTERISM" 8192)
("FRACTION" 8192 8448)
("TIRONIAN" 8192)
("BLACK" 8192 8448 9600 9728 9856 9984 10112 10624 11008)
("CLOSE" 8192)
("SWUNG" 8192)
("FLOWER" 8192 9856)
("QUADRUPLE" 8192 10752)
("FIVE" 8192 11776)
("DOTTED" 8192 9600 10624 11008 11776)
("TRICOLON" 8192)
("MEDIUM" 8192 9600 9856 9984)
("WORD" 8192 11776)
("FUNCTION" 8192)
("INVISIBLE" 8192)
("INHIBIT" 8192)
("ACTIVATE" 8192)
("NATIONAL" 8192)
("NOMINAL" 8192)
("SUBSCRIPT" 8320)
("EURO" 8320)
("CRUZEIRO" 8320)
("FRENCH" 8320)
("LIRA" 8320)
("MILL" 8320)
("NAIRA" 8320)
("PESETA" 8320)
("RUPEE" 8320)
("WON" 8320)
("DONG" 8320)
("KIP" 8320)
("TUGRIK" 8320)
("DRACHMA" 8320)
("GERMAN" 8320)
("PESO" 8320)
("GUARANI" 8320)
("AUSTRAL" 8320)
("HRYVNIA" 8320)
("CEDI" 8320)
("LIVRE" 8320)
("SPESMILO" 8320)
("TENGE" 8320)
("ACCOUNT" 8448)
("ADDRESSED" 8448)
("CENTRE" 8448)
("CARE" 8448)
("CADA" 8448)
("EULER" 8448)
("SCRUPLE" 8448)
("SCRIPT" 8448)
("PLANCK" 8448)
("L" 8448)
("NUMERO" 8448)
("SOUND" 8448)
("PRESCRIPTION" 8448)
("RESPONSE" 8448)
("SERVICE" 8448)
("TELEPHONE" 8448 8960 9984)
("TRADE" 8448)
("VERSICLE" 8448)
("OUNCE" 8448)
("OHM" 8448)
("TURNED" 8448 8960 9856 10624)
("KELVIN" 8448)
("ANGSTROM" 8448)
("ESTIMATED" 8448)
("ALEF" 8448)
("BET" 8448)
("GIMEL" 8448)
("DALET" 8448)
("INFORMATION" 8448)
("ROTATED" 8448 9984)
("FACSIMILE" 8448)
("PROPERTY" 8448)
("AKTIESELSKAB" 8448)
("SYMBOL" 8448 9216)
("ROMAN" 8448 8576 65920)
("LEFTWARDS" 8576 10496 11008)
("UPWARDS" 8576 10112 10496 11008 11776)
("RIGHTWARDS" 8576 10496 11008)
("DOWNWARDS" 8576 10112 10496 11008 11776)
("UP" 8576 8832 8960 9600 10112 10496 10624 11008)
("NORTH" 8576 10496 11008 43008)
("SOUTH" 8576 10496 11008)
("ANTICLOCKWISE" 8576 8704 10112 10496 10752)
("CLOCKWISE" 8576 8704 10112 10496)
("FOR" 8704)
("COMPLEMENT" 8704)
("PARTIAL" 8704)
("THERE" 8704)
("EMPTY" 8704 10624)
("INCREMENT" 8704)
("NABLA" 8704)
("ELEMENT" 8704 8832 10112 10880)
("CONTAINS" 8704 8832)
("DOES" 8704 8832 10880)
("END" 8704)
("N" 8704 8832 10752 10880)
("MINUS" 8704 10752)
("SET" 8704)
("SQUARE" 8704 8832 8960 9088 9600 9856 10624 10880 11008 12928 13056 13184 127360 127488)
("CUBE" 8704)
("FOURTH" 8704)
("PROPORTIONAL" 8704)
("INFINITY" 8704 10624)
("ANGLE" 8704 10624)
("MEASURED" 8704 10624)
("SPHERICAL" 8704 10624)
("DIVIDES" 8704)
("PARALLEL" 8704 10880)
("LOGICAL" 8704 10752)
("INTERSECTION" 8704 10752)
("UNION" 8704 10752)
("INTEGRAL" 8704 9088 10752)
("CONTOUR" 8704)
("SURFACE" 8704)
("VOLUME" 8704)
("THEREFORE" 8704)
("BECAUSE" 8704)
("RATIO" 8704)
("PROPORTION" 8704)
("EXCESS" 8704)
("GEOMETRIC" 8704)
("HOMOTHETIC" 8704)
("SINE" 8704)
("WREATH" 8704)
("ASYMPTOTICALLY" 8704)
("APPROXIMATELY" 8704 10752)
("NEITHER" 8704 8832)
("ALMOST" 8704 10752)
("ALL" 8704 8960)
("EQUIVALENT" 8704 10752)
("GEOMETRICALLY" 8704)
("DIFFERENCE" 8704)
("APPROACHES" 8704)
("IMAGE" 8704 8832)
("CORRESPONDS" 8704)
("ESTIMATES" 8704)
("EQUIANGULAR" 8704)
("STAR" 8704 8832 9728 9984)
("DELTA" 8704)
("EQUAL" 8704 8832)
("QUESTIONED" 8704)
("IDENTICAL" 8704 10624 10752)
("STRICTLY" 8704)
("MUCH" 8704)
("BETWEEN" 8704)
("PRECEDES" 8704 8832 10880)
("SUCCEEDS" 8704 8832 10880)
("SUBSET" 8832 10496 10880)
("SUPERSET" 8832 10112 10496 10880)
("MULTISET" 8832)
("CIRCLED" 8832 9088 9216 9344 9856 9984 10112 10624 10752 12288 12800 12928 127232)
("SQUARED" 8832 9856 10624 11776 127232 127488)
("DOWN" 8832 8960 10496 10624 10880)
("ASSERTION" 8832)
("MODELS" 8832)
("TRUE" 8832)
("FORCES" 8832)
("NEGATED" 8832)
("NORMAL" 8832)
("ORIGINAL" 8832)
("MULTIMAP" 8832)
("HERMITIAN" 8832)
("INTERCALATE" 8832)
("XOR" 8832)
("NAND" 8832)
("NOR" 8832)
("DIAMOND" 8832 11008)
("BOWTIE" 8832 10624)
("CURLY" 8832 9088)
("PITCHFORK" 8832 10880)
("VERY" 8832)
("MIDLINE" 8832)
("Z" 8832 10624 10752)
("DIAMETER" 8960)
("ELECTRIC" 8960)
("HOUSE" 8960)
("PROJECTIVE" 8960)
("PERSPECTIVE" 8960)
("WAVY" 8960 12288 65024)
("BOTTOM" 8960 9088 10496 11776)
("TOP" 8960 9088 10496 11776)
("ARC" 8960)
("SEGMENT" 8960)
("SECTOR" 8960)
("POSITION" 8960)
("VIEWDATA" 8960)
("PLACE" 8960)
("WATCH" 8960)
("HOURGLASS" 8960)
("FROWN" 8960)
("SMILE" 8960)
("OPTION" 8960)
("ERASE" 8960)
("X" 8960)
("KEYBOARD" 8960)
("BENZENE" 8960 9088)
("CYLINDRICITY" 8960)
("SYMMETRY" 8960)
("TOTAL" 8960)
("DIMENSION" 8960)
("CONICAL" 8960)
("SLOPE" 8960)
("COUNTERBORE" 8960)
("COUNTERSINK" 8960)
("APL" 8960 9088)
("SHOULDERED" 8960)
("BELL" 8960)
("INSERTION" 9088)
("CONTINUOUS" 9088)
("DISCONTINUOUS" 9088)
("EMPHASIS" 9088)
("COMPOSITION" 9088)
("WHITE" 9088 9600 9728 9856 9984 10112 10624 10880 11008 65024)
("ENTER" 9088)
("ALTERNATIVE" 9088)
("HELM" 9088)
("UNDO" 9088)
("MONOSTABLE" 9088)
("HYSTERESIS" 9088)
("OPEN" 9088 9216 9984 10112)
("PASSIVE" 9088)
("DIRECT" 9088)
("SOFTWARE" 9088)
("DECIMAL" 9088)
("PREVIOUS" 9088)
("NEXT" 9088)
("PRINT" 9088)
("CLEAR" 9088)
("UPPER" 9088 9600 9984 10112)
("SUMMATION" 9088 10752)
("RADICAL" 9088)
("DENTISTRY" 9088)
("RETURN" 9088)
("EJECT" 9088)
("METRICAL" 9088)
("EARTH" 9088 9728)
("FUSE" 9088)
("STRAIGHTNESS" 9088)
("FLATNESS" 9088)
("AC" 9088)
("ELECTRICAL" 9088)
("BLANK" 9216)
("OCR" 9216)
("PARENTHESIZED" 9216 9344 12800 127232)
("NEGATIVE" 9344 127232 127360)
("BOX" 9472)
("LOWER" 9600 9984 10112 10496)
("LIGHT" 9600 9984)
("DARK" 9600)
("QUADRANT" 9600)
("FISHEYE" 9600)
("LOZENGE" 9600 10112)
("CIRCLE" 9600 10624)
("BULLSEYE" 9600)
("INVERSE" 9600)
("LARGE" 9600 10112 10752 10880 68352)
("CLOUD" 9728)
("UMBRELLA" 9728 9856)
("SNOWMAN" 9728 9856)
("COMET" 9728)
("LIGHTNING" 9728)
("THUNDERSTORM" 9728)
("SUN" 9728 9856)
("ASCENDING" 9728)
("DESCENDING" 9728)
("CONJUNCTION" 9728)
("OPPOSITION" 9728)
("BALLOT" 9728 9984)
("SALTIRE" 9728)
("HOT" 9728)
("SHAMROCK" 9728)
("SKULL" 9728)
("CAUTION" 9728)
("RADIOACTIVE" 9728)
("BIOHAZARD" 9728)
("CADUCEUS" 9728)
("ANKH" 9728)
("ORTHODOX" 9728)
("CHI" 9728)
("CROSS" 9728)
("FARSI" 9728)
("ADI" 9728)
("HAMMER" 9728 9856)
("PEACE" 9728)
("YIN" 9728)
("TRIGRAM" 9728)
("WHEEL" 9728)
("FIRST" 9728)
("LAST" 9728)
("MERCURY" 9728)
("FEMALE" 9728)
("MALE" 9728 9856)
("JUPITER" 9728)
("SATURN" 9728)
("URANUS" 9728)
("NEPTUNE" 9728)
("PLUTO" 9728)
("ARIES" 9728)
("TAURUS" 9728)
("GEMINI" 9728)
("CANCER" 9728)
("LEO" 9728)
("VIRGO" 9728)
("LIBRA" 9728)
("SCORPIUS" 9728)
("SAGITTARIUS" 9728)
("CAPRICORN" 9728)
("AQUARIUS" 9728)
("PISCES" 9728)
("QUARTER" 9728)
("EIGHTH" 9728)
("BEAMED" 9728)
("MUSIC" 9728)
("WEST" 9728)
("EAST" 9728)
("UNIVERSAL" 9728)
("RECYCLING" 9728)
("RECYCLED" 9728)
("PARTIALLY" 9728)
("PERMANENT" 9728)
("WHEELCHAIR" 9728)
("DIE" 9856)
("MONOGRAM" 9856 119552)
("DIGRAM" 9856 119552)
("ANCHOR" 9856)
("CROSSED" 9856 127360)
("STAFF" 9856)
("SCALES" 9856)
("ALEMBIC" 9856)
("GEAR" 9856)
("ATOM" 9856)
("FLEUR" 9856)
("OUTLINED" 9856 9984)
("WARNING" 9856)
("HIGH" 9856)
("DOUBLED" 9856)
("INTERLOCKED" 9856)
("MARRIAGE" 9856)
("DIVORCE" 9856)
("UNMARRIED" 9856)
("COFFIN" 9856)
("FUNERAL" 9856)
("NEUTER" 9856)
("CERES" 9856)
("PALLAS" 9856)
("JUNO" 9856)
("VESTA" 9856)
("CHIRON" 9856)
("SEXTILE" 9856)
("SEMISEXTILE" 9856)
("QUINCUNX" 9856)
("SESQUIQUADRATE" 9856)
("SOCCER" 9856)
("BASEBALL" 9856)
("RAIN" 9856)
("THUNDER" 9856)
("CROSSING" 9856)
("DISABLED" 9856)
("PICK" 9856)
("CAR" 9856)
("HELMET" 9856)
("CHAINS" 9856)
("ALTERNATE" 9856)
("DRIVE" 9856)
("HEAVY" 9856 9984 10112 11008)
("FALLING" 9856 10496)
("RESTRICTED" 9856)
("SHINTO" 9856)
("CHURCH" 9856)
("CASTLE" 9856)
("HISTORIC" 9856)
("MAP" 9856)
("MOUNTAIN" 9856)
("FOUNTAIN" 9856)
("FLAG" 9856)
("FERRY" 9856)
("SAILBOAT" 9856)
("SKIER" 9856)
("ICE" 9856)
("PERSON" 9856)
("TENT" 9856)
("JAPANESE" 9856 12288)
("HEADSTONE" 9856)
("FUEL" 9856)
("CUP" 9856)
("TAPE" 9984)
("AIRPLANE" 9984)
("ENVELOPE" 9984)
("VICTORY" 9984)
("WRITING" 9984)
("PENCIL" 9984)
("CHECK" 9984)
("SHADOWED" 9984)
("MALTESE" 9984)
("STRESS" 9984)
("PINWHEEL" 9984)
("EIGHT" 9984)
("TWELVE" 9984)
("SIXTEEN" 9984)
("TEARDROP" 9984 10112)
("SNOWFLAKE" 9984)
("TIGHT" 9984)
("SPARKLE" 9984)
("BALLOON" 9984)
("CURVED" 9984)
("FLORAL" 9984)
("DINGBAT" 9984 10112)
("DRAFTING" 10112)
("TRIANGLE" 10112 10624)
("DASHED" 10112 65024)
("SQUAT" 10112)
("BACK" 10112)
("FRONT" 10112)
("NOTCHED" 10112)
("WEDGE" 10112)
("PERPENDICULAR" 10112 10880)
("OR" 10112)
("LONG" 10112 10880 11008)
("AND" 10112)
("MATHEMATICAL" 10112 119808 119936 120064 120192 120320 120448 120576 120704)
("BRAILLE" 10240 10368)
("RISING" 10496)
("WAVE" 10496 11008 12288)
("ARROW" 10496)
("SHORT" 10496 10880)
("OBLIQUE" 10624)
("S" 10624)
("TIMES" 10624)
("INCOMPLETE" 10624)
("TIE" 10624)
("INCREASES" 10624)
("SHUFFLE" 10624)
("GLEICH" 10624)
("THERMODYNAMIC" 10624)
("ERROR" 10624)
("RULE" 10624)
("BIG" 10624)
("TINY" 10624 68352)
("MINY" 10624)
("MODULO" 10752)
("FINITE" 10752)
("CIRCULATION" 10752)
("QUATERNION" 10752)
("JOIN" 10752)
("VECTOR" 10752)
("SEMIDIRECT" 10752)
("SMASH" 10752)
("INTERIOR" 10752)
("RIGHTHAND" 10752)
("AMALGAMATION" 10752)
("CLOSED" 10752 10880)
("SLOPING" 10752)
("SIMILAR" 10752 10880)
("CONGRUENT" 10752)
("SLANTED" 10880)
("SMALLER" 10880)
("LARGER" 10880)
("TRANSVERSAL" 10880)
("FORKING" 10880)
("NONFORKING" 10880)
("GLAGOLITIC" 11264)
("TIFINAGH" 11520)
("RAISED" 11776)
("EDITORIAL" 11776)
("PARAGRAPHOS" 11776)
("FORKED" 11776)
("HYPODIASTOLE" 11776)
("PALM" 11776)
("CJK" 11904 12672)
("KANGXI" 12032 12160)
("IDEOGRAPHIC" 12160 12288 12672 12928 13056 13184)
("DITTO" 12288)
("POSTAL" 12288)
("GETA" 12288)
("HANGZHOU" 12288)
("MASU" 12288)
("PART" 12288)
("HIRAGANA" 12288 12416)
("KATAKANA" 12416 12672)
("BOPOMOFO" 12544 12672)
("PARTNERSHIP" 12800)
("KOREAN" 12800)
("LIMITED" 12928)
("HEXAGRAM" 19840)
("YI" 40960 41088 41216 41344 41472 41600 41728 41856 41984 42112)
("LISU" 42112)
("VAI" 42240 42368 42496)
("SLAVONIC" 42496)
("BAMUM" 42624)
("SYLOTI" 43008)
("PHAGS" 43008)
("SAURASHTRA" 43136)
("KAYAH" 43264)
("REJANG" 43264)
("JAVANESE" 43392)
("CHAM" 43520)
("MEETEI" 43904)
("ORNATE" 64768)
("RIAL" 64896)
("VARIATION" 65024 917760 917888)
("PRESENTATION" 65024)
("SESAME" 65024)
("CENTRELINE" 65024)
("FULLWIDTH" 65280 65408)
("HALFWIDTH" 65280 65408)
("INTERLINEAR" 65408)
("OBJECT" 65408)
("REPLACEMENT" 65408)
("LINEAR" 65536 65664)
("AEGEAN" 65792)
("PHAISTOS" 65920)
("LYCIAN" 66176)
("CARIAN" 66176)
("OLD" 66304 66432 68096 68608)
("GOTHIC" 66304)
("UGARITIC" 66432)
("DESERET" 66560)
("SHAVIAN" 66560)
("OSMANYA" 66688)
("CYPRIOT" 67584)
("IMPERIAL" 67584)
("PHOENICIAN" 67840)
("LYDIAN" 67840)
("KHAROSHTHI" 68096)
("AVESTAN" 68352)
("INSCRIPTIONAL" 68352)
("RUMI" 69120)
("KAITHI" 69760)
("CUNEIFORM" 73728 73856 73984 74112 74240 74368 74496 74752)
("EGYPTIAN" 77824 77952 78080 78208 78336 78464 78592 78720 78848)
("BYZANTINE" 118784 118912)
("MUSICAL" 119040 119168)
("TETRAGRAM" 119552)
("COUNTING" 119552)
("MAHJONG" 126976)
("DOMINO" 126976 127104)
("TORTOISE" 127232 127488)
("LANGUAGE" 917504)
("TAG" 917504)
("CANCEL" 917504)))
(defun ucs-name-expand-table (head)
(let ((slot (assoc head ucs-name-head-table))
names)
(when slot
(if (consp (cadr slot))
(cdr slot)
(dolist (elt (cdr slot))
(dotimes (i #x80)
(let* ((c (+ elt i))
(name (get-char-code-property c 'name)))
(if (and name (eq (string-match head name) 0))
(push (cons name c) names)))))
(setcdr slot names)))))
(defun ucs-name-filter (str names)
(let (l)
(dolist (elt names)
(if (eq (string-match (regexp-quote str) (car elt)) 0)
(push elt l)))
l))
(defun ucs-name-completion (str)
(or ucs-names
(if (string-match "\\*" str)
(setq ucs-names (ucs-names))
(if (string-match "^[A-Za-z]*" str)
(let ((head (match-string 0 str))
slot names)
(if (and (= (length head) (length str))
(not (assoc-string str ucs-name-head-table)))
(ucs-name-filter str ucs-name-head-table)
(ucs-name-filter str (ucs-name-expand-table head))))))))
(defun read-char-by-name (prompt)
(let* ((completion-ignore-case t)
(input (completing-read
prompt (completion-table-dynamic 'ucs-name-completion))))
(cond
((string-match-p "^[0-9a-fA-F]+$" input)
(string-to-number input 16))
((string-match-p "^#" input)
(read input))
(t
(or (and (string-match "^[A-Za-z]+" input)
(cdr (assoc input
(ucs-name-expand-table (match-string 0 input)))))
(error "Invalid character name: %s" input))))))
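A short sketch of the two non-name input forms handled by the `cond' in `read-char-by-name' above: bare hexadecimal digits, and read syntax starting with "#". The helper name `my-parse-char-code' is hypothetical, and using the string anchors \` and \' instead of ^ and $ is my own choice, not something taken from the attached file.

```elisp
;; Hypothetical helper mirroring the first two `cond' branches.
(defun my-parse-char-code (input)
  "Return the character code denoted by INPUT, or nil if it is not numeric."
  (cond
   ;; Bare hex digits, e.g. "2190".
   ((string-match-p "\\`[0-9a-fA-F]+\\'" input)
    (string-to-number input 16))
   ;; Lisp read syntax, e.g. "#x2190" or "#o101".
   ((string-match-p "\\`#" input)
    (read input))
   (t nil)))

(my-parse-char-code "2190")    ; => 8592
(my-parse-char-code "#x2190")  ; => 8592
(my-parse-char-code "#o101")   ; => 65
```

One consequence of checking hex digits first is that any input made up solely of the letters A-F and digits is taken as a code point rather than as a name.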
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: faster unicode character name completion
2009-12-07 2:00 ` Kenichi Handa
2009-12-07 8:13 ` Kenichi Handa
@ 2009-12-07 14:57 ` Stefan Monnier
2009-12-07 20:28 ` Juri Linkov
2009-12-08 1:45 ` Kenichi Handa
1 sibling, 2 replies; 24+ messages in thread
From: Stefan Monnier @ 2009-12-07 14:57 UTC (permalink / raw)
To: Kenichi Handa; +Cc: cyd, emacs-devel
>> > (defun ucs-name-completion (str)
>> > (when (string-match "^[A-Za-z]*" str)
>> > (let ((head (match-string 0 str))
>> > slot names)
>> > (if (and (= (length head) (length str))
>> > (not (assoc-string str ucs-name-head-table)))
>> > (ucs-name-filter str ucs-name-head-table)
>> > (ucs-name-filter str (ucs-name-expand-table head))))))
>> I don't understand what ucs-name-filter is trying to do.
> ?? It simply filters out elements that don't match
> STR from NAMES (an alist).
But then why is it needed?
Doesn't `completion-table-dynamic' take care of that already?
BTW, I tried the "precompute the table and autoload it" approach, and it
works fine in the sense that it's fast. The generated file is about
1.3MB. Basically, the main downside of this approach is that it doesn't
use the very strings we already have for the names, so after loading it
we have 2 copies of every char name in memory.
But I have a better idea: most of the time is not spent building the
completion table, but rather just weeding out all the "chars" that don't
have names, or should I say, looking for the few rare chars that do
have a name.
So the patch below seems to be a good compromise: it uses up just about
1000K cons cells (i.e. 16KB on 64bit systems) to keep the precomputed
set of ~34K chars that do have a name, so that building the completion
table takes only a couple seconds.
Stefan
--- mule-cmds.el.~1.383.~ 2009-11-11 21:01:36.000000000 -0500
+++ mule-cmds.el 2009-12-07 09:55:16.000000000 -0500
@@ -2883,6 +2883,31 @@
(defvar nonascii-insert-offset 0 "This variable is obsolete.")
(defvar nonascii-translation-table nil "This variable is obsolete.")
+(defvar ucs-named-char-ranges
+ (purecopy
+ (eval-when-compile
+ (let ((ranges ())
+ (first 0)
+ (last 0))
+ (dotimes-with-progress-reporter (c #xEFFFF)
+ "Loading Unicode character names..."
+ (unless (or
+ (and (>= c #x3400 ) (<= c #x4dbf )) ; CJK Ideograph Extension A
+ (and (>= c #x4e00 ) (<= c #x9fff )) ; CJK Ideograph
+ (and (>= c #xd800 ) (<= c #xfaff )) ; Private/Surrogate
+ (and (>= c #x20000) (<= c #x2ffff)) ; CJK Ideograph Extensions B, C
+ (null (get-char-code-property c 'name)))
+ ;; This char has a name.
+ (if (<= c (1+ last))
+ ;; Extend the current range.
+ (setq last c)
+ ;; We have to split the range.
+ (push (cons first last) ranges)
+ (setq first (setq last c)))))
+ (cons (cons first last) ranges))))
+ "List of ranges of chars that have names.
+Every range is of the form (FIRST . LAST).")
+
(defvar ucs-names nil
"Alist of cached (CHAR-NAME . CHAR-CODE) pairs.")
@@ -2891,18 +2916,16 @@
(or ucs-names
(setq ucs-names
(let (name names)
- (dotimes-with-progress-reporter (c #xEFFFF)
- "Loading Unicode character names..."
- (unless (or
- (and (>= c #x3400 ) (<= c #x4dbf )) ; CJK Ideograph Extension A
- (and (>= c #x4e00 ) (<= c #x9fff )) ; CJK Ideograph
- (and (>= c #xd800 ) (<= c #xfaff )) ; Private/Surrogate
- (and (>= c #x20000) (<= c #x2ffff)) ; CJK Ideograph Extensions B, C
- )
+ (dolist (range ucs-named-char-ranges)
+ (let ((c (car range))
+ (end (cdr range)))
+ (while (<= c end)
(if (setq name (get-char-code-property c 'name))
- (setq names (cons (cons name c) names)))
+ (setq names (cons (cons name c) names))
+ (error "Wrong range"))
(if (setq name (get-char-code-property c 'old-name))
- (setq names (cons (cons name c) names)))))
+ (setq names (cons (cons name c) names)))
+ (setq c (1+ c)))))
names))))
(defvar ucs-completions (lazy-completion-table ucs-completions ucs-names)
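The range-building step in the patch above can be sketched in isolation. This is an illustration under stated assumptions, not the patch itself: `my-collapse-into-ranges' is a hypothetical helper that takes an already sorted list of integers, whereas the patch walks the code points directly and tests each one for a name.

```elisp
;; Hypothetical helper: collapse sorted integers into (FIRST . LAST) ranges.
(defun my-collapse-into-ranges (codes)
  "Collapse the sorted list of integers CODES into (FIRST . LAST) ranges."
  (let (ranges first last)
    (dolist (c codes)
      (cond
       ;; First element: open the initial range.
       ((null first) (setq first c last c))
       ;; Adjacent to the current range: extend it.
       ((= c (1+ last)) (setq last c))
       ;; Gap: close the current range and start a new one.
       (t (push (cons first last) ranges)
          (setq first c last c))))
    (when first
      (push (cons first last) ranges))
    (nreverse ranges)))

(my-collapse-into-ranges '(32 33 34 40 41 100))
;; => ((32 . 34) (40 . 41) (100 . 100))
```

Storing one cons cell per run of named characters, rather than one per character, is what keeps the precomputed data small.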
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: faster unicode character name completion
2009-12-07 14:57 ` Stefan Monnier
@ 2009-12-07 20:28 ` Juri Linkov
2009-12-07 21:42 ` Stefan Monnier
2009-12-08 1:45 ` Kenichi Handa
1 sibling, 1 reply; 24+ messages in thread
From: Juri Linkov @ 2009-12-07 20:28 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel, Kenichi Handa
> But I have a better idea: most of the time is not spent building the
> completion table, but rather just weeding out all the "chars" that don't
> have names, or should I say, looking for the few rare chars that do
> have a name.
>
> So the patch below seems to be a good compromise: it uses up just about
> 1000K cons cells (i.e. 16KB on 64bit systems) to keep the precomputed
> set of ~34K chars that do have a name, so that building the completion
> table takes only a couple seconds.
Before this change building the completion table took 10s, now only 2s.
The size was 88,203, now 91,595. I think it's a reasonable price for
such a speedup.
BTW, a related problem: it would be better to hide old obsolete Unicode
names to not advertise them, but still allow completions on them.
For instance, duplicate names such as
name: LATIN CAPITAL LETTER A WITH ACUTE
old-name: LATIN CAPITAL LETTER A ACUTE
add too much noise. Maybe to use the same approach as used for
`completion-ignored-extensions', i.e. to ignore old names, but don't
ignore if all possible completions end in one of them.
--
Juri Linkov
http://www.jurta.org/emacs/
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: faster unicode character name completion
2009-12-07 20:28 ` Juri Linkov
@ 2009-12-07 21:42 ` Stefan Monnier
2009-12-08 1:59 ` Miles Bader
0 siblings, 1 reply; 24+ messages in thread
From: Stefan Monnier @ 2009-12-07 21:42 UTC (permalink / raw)
To: Juri Linkov; +Cc: emacs-devel, Kenichi Handa
>> So the patch below seems to be a good compromise: it uses up just about
>> 1000K cons cells (i.e. 16KB on 64bit systems) to keep the precomputed
^^^^^
should be 1000 or 1K, of course.
> BTW, a related problem: it would be better to hide old obsolete Unicode
> names to not advertise them, but still allow completions on them.
> For instance, duplicate names such as
> name: LATIN CAPITAL LETTER A WITH ACUTE
> old-name: LATIN CAPITAL LETTER A ACUTE
> add too much noise.
Note that for the codes 0-31, it seems that the old name is more useful
than the new one (which seems to just be "<control>" for all of them).
Not sure if there are others in the same situation.
> Maybe to use the same approach as used for
> `completion-ignored-extensions', i.e. to ignore old names, but don't
> ignore if all possible completions end in one of them.
It'd be easy to do, but I'm not sure it's worth the trouble.
Stefan
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: faster unicode character name completion
2009-12-07 14:57 ` Stefan Monnier
2009-12-07 20:28 ` Juri Linkov
@ 2009-12-08 1:45 ` Kenichi Handa
2009-12-08 2:29 ` Stefan Monnier
2009-12-09 0:12 ` Chong Yidong
1 sibling, 2 replies; 24+ messages in thread
From: Kenichi Handa @ 2009-12-08 1:45 UTC (permalink / raw)
To: Stefan Monnier; +Cc: cyd, emacs-devel
In article <jwvk4wyj22f.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:
>>> I don't understand what ucs-name-filter is trying to do.
> > ?? It simply filters out elements that don't match
> > STR from NAMES (an alist).
> But then why is it needed?
> Doesn't `completion-table-dynamic' take care of that already?
I don't know. The info says this:
-- Function: completion-table-dynamic function
This function is a convenient way to write a function that can act
as programmed completion function. The argument FUNCTION should be
a function that takes one argument, a string, and returns an alist
of possible completions of it. You can think of
`completion-table-dynamic' as a transducer between that interface
and the interface for programmed completion functions.
I thought that FUNCTION should return an alist that contains
ONLY valid completions.
> But I have a better idea: most of the time is not spent building the
> completion table, but rather just weeding out all the "chars" that don't
> have names, or should I say, looking for the few rare chars that do
> have a name.
> So the patch below seems to be a good compromise: it uses up just about
> 1000K cons cells (i.e. 16KB on 64bit systems) to keep the precomputed
> set of ~34K chars that do have a name, so that building the completion
> table takes only a couple seconds.
Ah, interesting approach. But, I've just found that
dotimes-with-progress-reporter of the original code didn't
exclude the big unused range U+30000..U+DFFFF (about 75% of
the range currently checked). Just excluding that part in
the original code achieves almost the same performance as
your patch. Attached is that simpler version.
---
Kenichi Handa
handa@m17n.org
(defun ucs-names ()
"Return alist of (CHAR-NAME . CHAR-CODE) pairs cached in `ucs-names'."
(or ucs-names
(let ((ranges
'((#x00000 . #x033FF)
;; (#x03400 . #x04DBF) CJK Ideograph Extension A
(#x04DC0 . #x04DFF)
;; (#x04E00 . #x09FFF) CJK Ideograph
(#x0A000 . #x0D7FF)
;; (#x0D800 . #x0FAFF) Surrogate/Private
(#x0FB00 . #x1FFFF)
;; (#x20000 . #xDFFFF) CJK Ideograph Extension B, C, etc., unused
(#xE0000 . #xE01EF)))
c end name names)
(dolist (range ranges)
(setq c (car range)
end (cdr range))
(while (<= c end)
(if (setq name (get-char-code-property c 'name))
(push (cons name c) names))
(if (setq name (get-char-code-property c 'old-name))
(push (cons name c) names))
(setq c (1+ c))))
(setq ucs-names names))))
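As a rough check of the "about 75%" figure, the size of U+30000..U+DFFFF can be compared with the #xEFFFF iterations of the original `dotimes-with-progress-reporter' loop. The function name `my-unused-share-percent' is hypothetical, and measuring against the raw loop count (rather than the count left after the CJK and private-use ranges were skipped) is an assumption on my part.

```elisp
;; Hypothetical helper: share of the original loop spent in U+30000..U+DFFFF.
(defun my-unused-share-percent ()
  "Return the size of U+30000..U+DFFFF as a percentage of #xEFFFF."
  (let ((scanned #xEFFFF)                     ; iterations of the original loop
        (unused (1+ (- #xDFFFF #x30000))))    ; code points in the unused block
    (/ (* 100.0 unused) scanned)))

(my-unused-share-percent)
```

This comes out at roughly 73; measured against only the code points the original loop actually looked up, the share would be somewhat higher.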
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: faster unicode character name completion
2009-12-07 21:42 ` Stefan Monnier
@ 2009-12-08 1:59 ` Miles Bader
0 siblings, 0 replies; 24+ messages in thread
From: Miles Bader @ 2009-12-08 1:59 UTC (permalink / raw)
To: Stefan Monnier; +Cc: Juri Linkov, Kenichi Handa, emacs-devel
Stefan Monnier <monnier@IRO.UMontreal.CA> writes:
>> BTW, a related problem: it would be better to hide old obsolete Unicode
>> names to not advertise them, but still allow completions on them.
>
> Note that for the codes 0-31, it seems that the old name is more useful
> than the new one (which seems to just be "<control>" for all of them).
> Not sure if there are others in the same situation.
I've noticed that some of the old names use more common terminology,
even if the new names are more consistent/logical/whatever.
So maybe it'd be better to hide only old names that are specifically
identified as being redundant.
-Miles
--
`To alcohol! The cause of, and solution to,
all of life's problems' --Homer J. Simpson
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: faster unicode character name completion
2009-12-08 1:45 ` Kenichi Handa
@ 2009-12-08 2:29 ` Stefan Monnier
2009-12-09 0:12 ` Chong Yidong
1 sibling, 0 replies; 24+ messages in thread
From: Stefan Monnier @ 2009-12-08 2:29 UTC (permalink / raw)
To: Kenichi Handa; +Cc: cyd, emacs-devel
> I don't know. The info says this:
> -- Function: completion-table-dynamic function
> This function is a convenient way to write a function that can act
> as programmed completion function. The argument FUNCTION should be
> a function that takes one argument, a string, and returns an alist
> of possible completions of it. You can think of
> `completion-table-dynamic' as a transducer between that interface
> and the interface for programmed completion functions.
> I thought that FUNCTION should return an alist that contains
> ONLY valid completions.
No, it can be (and often is) a superset.
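A small sketch of that point: the function given to `completion-table-dynamic' may ignore its argument and return a superset, and the usual completion primitives still narrow the result to what matches the input. The variable name `my-demo-table' and the three sample strings are hypothetical.

```elisp
;; Hypothetical table whose function always returns the same superset.
(defvar my-demo-table
  (completion-table-dynamic
   (lambda (_string)
     ;; Deliberately ignore the input string.
     '("LEFT ARROW" "LEFT BRACKET" "RIGHT ARROW"))))

(all-completions "LEFT" my-demo-table)
;; => ("LEFT ARROW" "LEFT BRACKET")
```

So a filter inside the function only matters as an optimization, not for correctness of the completion results.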
> Ah, interesting approach. But, I've just found that
> dotimes-with-progress-reporter of the original code didn't
> exclude the big unused range U+30000..U+DFFFF (about 75% of
> the range currently checked). Just excluding that part in
> the original code achieves almost the same performance as
> your patch. Attached is that simpler version.
Feel free to install that, then.
Stefan
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: faster unicode character name completion
2009-12-08 1:45 ` Kenichi Handa
2009-12-08 2:29 ` Stefan Monnier
@ 2009-12-09 0:12 ` Chong Yidong
2009-12-09 0:57 ` Kenichi Handa
1 sibling, 1 reply; 24+ messages in thread
From: Chong Yidong @ 2009-12-09 0:12 UTC (permalink / raw)
To: Kenichi Handa; +Cc: Stefan Monnier, emacs-devel
Kenichi Handa <handa@m17n.org> writes:
> Ah, interesting approach. But, I've just found that
> dotimes-with-progress-reporter of the original code didn't exclude the
> big unused range U+30000..U+DFFFF (about 75% of the range currently
> checked). Just excluding that part in the original code achieves
> almost the same performance as your patch. Attached is that simpler
> version.
Could you check this in ASAP? I'll hold the pretest for this. Thanks.
* Re: faster unicode character name completion
2009-12-09 0:12 ` Chong Yidong
@ 2009-12-09 0:57 ` Kenichi Handa
2009-12-09 9:02 ` Deniz Dogan
0 siblings, 1 reply; 24+ messages in thread
From: Kenichi Handa @ 2009-12-09 0:57 UTC (permalink / raw)
To: Chong Yidong; +Cc: monnier, emacs-devel
In article <87k4wx57kh.fsf@stupidchicken.com>, Chong Yidong <cyd@stupidchicken.com> writes:
> Kenichi Handa <handa@m17n.org> writes:
> > Ah, interesting approach. But, I've just found that
> > dotimes-with-progress-reporter of the original code didn't exclude the
> > big unused range U+30000..U+DFFFF (about 75% of the range currently
> > checked). Just excluding that part in the original code achieves
> > almost the same performance as your patch. Attached is that simpler
> > version.
> Could you check this in ASAP? I'll hold the pretest for this. Thanks.
Ok, just done.
---
Kenichi Handa
handa@m17n.org
* Re: faster unicode character name completion
2009-12-09 0:57 ` Kenichi Handa
@ 2009-12-09 9:02 ` Deniz Dogan
0 siblings, 0 replies; 24+ messages in thread
From: Deniz Dogan @ 2009-12-09 9:02 UTC (permalink / raw)
To: Kenichi Handa; +Cc: Chong Yidong, monnier, emacs-devel
2009/12/9 Kenichi Handa <handa@m17n.org>:
> In article <87k4wx57kh.fsf@stupidchicken.com>, Chong Yidong <cyd@stupidchicken.com> writes:
>
>> Kenichi Handa <handa@m17n.org> writes:
>> > Ah, interesting approach. But, I've just found that
>> > dotimes-with-progress-reporter of the original code didn't exclude the
>> > big unused range U+30000..U+DFFFF (about 75% of the range currently
>> > checked). Just excluding that part in the original code achieves
>> > almost the same performance as your patch. Attached is that simpler
>> > version.
>
>> Could you check this in ASAP? I'll hold the pretest for this. Thanks.
>
> Ok, just done.
>
> ---
> Kenichi Handa
> handa@m17n.org
>
>
>
Not that I have any idea about low-level stuff at all, but wouldn't it
be possible to implement ucs-names in C?
--
Deniz Dogan
Thread overview: 24+ messages
2009-11-30 19:55 Emacs 23.2 pretest freeze? Karl Fogel
2009-11-30 22:48 ` Chong Yidong
2009-11-30 23:05 ` Karl Fogel
2009-12-02 13:34 ` Alan Mackenzie
2009-12-03 21:35 ` Emacs 23.2 Pretest next week Chong Yidong
2009-12-04 11:23 ` faster unicode character name completion Kenichi Handa
2009-12-04 12:08 ` Deniz Dogan
2009-12-04 13:04 ` Juanma Barranquero
2009-12-04 13:26 ` Florian Beck
2009-12-04 15:07 ` Stefan Monnier
2009-12-04 22:38 ` Miles Bader
2009-12-07 2:00 ` Kenichi Handa
2009-12-07 8:13 ` Kenichi Handa
2009-12-07 14:57 ` Stefan Monnier
2009-12-07 20:28 ` Juri Linkov
2009-12-07 21:42 ` Stefan Monnier
2009-12-08 1:59 ` Miles Bader
2009-12-08 1:45 ` Kenichi Handa
2009-12-08 2:29 ` Stefan Monnier
2009-12-09 0:12 ` Chong Yidong
2009-12-09 0:57 ` Kenichi Handa
2009-12-09 9:02 ` Deniz Dogan
2009-12-04 19:04 ` Emacs 23.2 Pretest next week Dan Nicolaescu
2009-12-04 21:15 ` Chong Yidong