Noam Postavsky <npostavs@gmail.com> writes:

> severity 32372 wishlist
> quit
>
> Raimon Grau <raimon@konghq.com> writes:
>
>> Subject: [PATCH] Add uuid as allowed thingatpt symbol
>>
>> * lisp/thingatpt.el (thing-at-point-uuid-regexp): Add regexp for uuid.
>
> I guess you should mention something about the ops as well here.  Though
> it's not 100% clear what kind of format you should use for those.  Maybe
> just (top-level): Add 'bounds-of-thing-at-point' operation for 'uuid'.

Aha. Added it.

>
>> +;; UUID
>> +
>> +(defvar thing-at-point-uuid-regexp
>> +  (rx (and bow
>
> Using rx is okay, I think.  There was some discussion about it on
> emacs-devel a little time ago, with most people saying the increased
> verbosity made them not want to use it, but I kind of like it myself.
> However, Stefan made the point that `and' is potentially a bit
> confusing, because it could be misread as intersection.  It's better to
> use one of the synonyms `seq' or `:'.
>
>> +           (or
>> +            "00000000-0000-0000-0000-000000000000"
>> +            (and
>> +             (repeat 8 hex-digit) "-"
>> +             (repeat 4 hex-digit) "-"
>> +             (or "1" "2" "3" "4" "5")
>> +             (repeat 3 hex-digit) "-"
>> +             (or "8" "9" "a" "b" "A" "B")
>> +             (repeat 3 hex-digit) "-"
>> +             (repeat 12 hex-digit)))
>> +           eow))
>> +  "A regular expression matching a UUID from versions 1 to 5.
>> +
>> +  More info on uuid's format in
>> +  https://tools.ietf.org/html/rfc4122." )
>
> So, in that RFC I see this grammar
>
>       UUID                   = time-low "-" time-mid "-"
>                                time-high-and-version "-"
>                                clock-seq-and-reserved
>                                clock-seq-low "-" node
>       time-low               = 4hexOctet
>       time-mid               = 2hexOctet
>       time-high-and-version  = 2hexOctet
>       clock-seq-and-reserved = hexOctet
>       clock-seq-low          = hexOctet
>       node                   = 6hexOctet
>       hexOctet               = hexDigit hexDigit
>       hexDigit =
>             "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" /
>             "a" / "b" / "c" / "d" / "e" / "f" /
>             "A" / "B" / "C" / "D" / "E" / "F"
>
> It looks like you crafted a regexp which is a tighter match for just the
> UUID versions currently in use.  I think we're better off with the
> looser definition though, that way it will continue to be correct even
> as new versions come out.
>
> Furthermore, I would guess a human user is going to be surprised if
> (thing-at-point 'uuid) picks up this
>
>     12345678-1234-1234-8123-123456789012
>
> but not this:
>
>     12345678-1234-1234-5123-123456789012
>

Completely agree.  Now using a simpler version that will be more
predictable for users.

>
>> +(put 'uuid 'thing-at-point
>> +     (lambda ()
>> +       (let ((boundary-pair (bounds-of-thing-at-point 'uuid)))
>> +         (if boundary-pair
>> +             (buffer-substring-no-properties
>> +              (car boundary-pair) (cdr boundary-pair))))))
>
> I think this isn't needed, because the `thing-at-point' function already
> does this for you:
>
>   (let ((text
>          (if (get thing 'thing-at-point)
>              (funcall (get thing 'thing-at-point))
>            (let ((bounds (bounds-of-thing-at-point thing)))
>              (when bounds
>                (buffer-substring (car bounds) (cdr bounds)))))))

Right. I removed it.

Thanks for the review! I fixed all the points raised.

Cheers,


Raimon Grau