Noam Postavsky writes: > severity 32372 wishlist > quit > > Raimon Grau writes: > >> Subject: [PATCH] Add uuid as allowed thingatpt symbol >> >> * lisp/thingatpt.el (thing-at-point-uuid-regexp): Add regexp for uuid. > > I guess you should mention something about the ops as well here. Though > it's not 100% clear what kind of format you should use for those. Maybe > just (top-level): Add 'bounds-of-thing-at-point' operation for 'uuid'. Aha. Added it. > >> +;; UUID >> + >> +(defvar thing-at-point-uuid-regexp >> + (rx (and bow > > Using rx is okay, I think. There was some discussion about it on > emacs-devel a little time ago, with most people saying the increased > verbosity made them not want to use it, but I kind of like it myself. > However, Stefan made the point that `and' is potentially a bit > confusing, because it could be misread as intersection. It's better to > use one of the synonyms `seq' or `:'. > >> + (or >> + "00000000-0000-0000-0000-000000000000" >> + (and >> + (repeat 8 hex-digit) "-" >> + (repeat 4 hex-digit) "-" >> + (or "1" "2" "3" "4" "5") >> + (repeat 3 hex-digit) "-" >> + (or "8" "9" "a" "b" "A" "B") >> + (repeat 3 hex-digit) "-" >> + (repeat 12 hex-digit))) >> + eow)) >> + "A regular expression matching a UUID from versions 1 to 5. >> + >> + More info on uuid's format in >> + https://tools.ietf.org/html/rfc4122." ) > > So, in that RFC I see this grammar > > UUID = time-low "-" time-mid "-" > time-high-and-version "-" > clock-seq-and-reserved > clock-seq-low "-" node > time-low = 4hexOctet > time-mid = 2hexOctet > time-high-and-version = 2hexOctet > clock-seq-and-reserved = hexOctet > clock-seq-low = hexOctet > node = 6hexOctet > hexOctet = hexDigit hexDigit > hexDigit = > "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" / > "a" / "b" / "c" / "d" / "e" / "f" / > "A" / "B" / "C" / "D" / "E" / "F" > > It looks like you crafted a regexp which is a tighter match for just the > UUID versions currently in use. I think we're better off with the > looser definition though, that way it will continue to be correct even > as new versions come out. > > Furthermore, I would guess a human user is going to be surprised if > (thing-at-point 'uuid) picks up this > > 12345678-1234-1234-8123-123456789012 > > but not this: > > 12345678-1234-1234-5123-123456789012 > Completely agree. Now using a simpler version that will be more predictable for users. > >> +(put 'uuid 'thing-at-point >> + (lambda () >> + (let ((boundary-pair (bounds-of-thing-at-point 'uuid))) >> + (if boundary-pair >> + (buffer-substring-no-properties >> + (car boundary-pair) (cdr boundary-pair)))))) > > I think this isn't needed, because the `thing-at-point' function already > does this for you: > > (let ((text > (if (get thing 'thing-at-point) > (funcall (get thing 'thing-at-point)) > (let ((bounds (bounds-of-thing-at-point thing))) > (when bounds > (buffer-substring (car bounds) (cdr bounds))))))) Right. I removed it. Thanks for the review! I fixed all the points raised. Cheers, Raimon Grau