unofficial mirror of help-gnu-emacs@gnu.org
* Generate random char (and string) from unicode category (e.g: letter)
@ 2018-12-06 10:44 Alexandre Garreau
  2018-12-06 11:24 ` Eli Zaretskii
  0 siblings, 1 reply; 4+ messages in thread
From: Alexandre Garreau @ 2018-12-06 10:44 UTC (permalink / raw)
  To: help-gnu-emacs

Hi,

I clearly recall having written something in elisp to generate random,
more-or-less plausible input for the Gmail account creation form,
including ASCII chars for the login, statistical randomness for gender
(like, iirc, 48% “male”, 52% “female”, minus 2% “others”), and random
Unicode for the password, real name, etc.  I recall that I ended up with
a lot of ideograms in those.  So I know it’s doable in pure elisp (or
was it maybe guile? less likely…).

I really don’t recall how I did that, nor whether I took care to use a
single script for each form input, but I’m sure I was using something
less ugly than what I have now, that is, (random (max-char)) until it
matches [[:alpha:]] (but I clearly recall using something that would
work for all of Unicode, including foreign scripts I wouldn’t even know
about).

Do you have an idea for something cleaner?  Currently I have this:

#+BEGIN_SRC emacs-lisp
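(defun random-letter (&rest osef)
  "Return a random character that matches [[:alpha:]].
The OSEF arguments are ignored, so this can be passed to `mapcar'."
  (let ((num (random (max-char))))
    ;; Emacs Lisp has no built-in `until'; loop with `while' and `not'.
    (while (not (string-match-p "[[:alpha:]]" (string num)))
      (setq num (random (max-char))))
    num))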
#+END_SRC

and use it like this:

#+BEGIN_SRC emacs-lisp
(apply #'string (mapcar #'random-letter (make-list (1+ (random 190)) nil)))
#+END_SRC

The problem is that I get stuff like this: "䯩繩ꏴ跾ಾ𢉰𐎕𘓅𪆶矏ᄬ𣒈⳰𨜄𛰸𧅌煂𢙴𧐁𡚯
𦅉ᤋ𡇼钇꿚㱓㗧𩅍姵爠𣑽𠌤ꇊ𡘄𑄇𫲘𪯋𣊚𦉂𠦵𘕋𠈾ლ𨟇𦷕𤃻𫿡𢿟巙𩿊𥖠𒒈ባ𗆘𤧟𗲀𔗸𖼂뺔
𧸋𡠜𬶟咨발𬞗쏊紋䲁坮𠢥旼𗴟𬓏𤁍គ𩍏Ɉ𪅊𤙬𫪃𫴛𤶋𫴃𧐨䞪𩇨𡤦馲𨂧𡮃𓂅𒇵𤉴𥙯藣ბ솇
𨆬𦄎靔𐤒ඐ𒐕襋𬵝𥤄𪃝𫈹𨣼𘋹돃𪣞筛𣯿휈𥽊Ꭾ𣐧𥺒𠊆ꮩ闭ઉ𦻸𨔆𤛢𢮁𤟩𪊕𥫰𪢟𡻋𘅇ᶗ펙𣄽
玽쭻𩿎𗔥𪟉䪁ᣃ쒝𩅑𡞬넒煮ڒꫥ𥾴𣁫𬑼깙𣫖筁ᣯ𣱮𡡯𨐒ﶒ𤑳𤯼昵䊘㝓𣑼𐦥𥆤갛𤡇𠜠𥉡䋯𥫻
𪲩兀𖠹瀖𣊫𨘥𢍪ᴢ", where the majority of characters are non-displayable
and are shown as a square with numbers in it, indicating that no
installed font has a glyph for them.  I clearly recall that what I wrote
before produced no such characters, so something must be possible (maybe
using charsets?).

PS: is there a way to get something other than a uniform random
distribution from `random'?  Like a normal or logarithmic distribution?


* Re: Generate random char (and string) from unicode category (e.g: letter)
  2018-12-06 10:44 Generate random char (and string) from unicode category (e.g: letter) Alexandre Garreau
@ 2018-12-06 11:24 ` Eli Zaretskii
  0 siblings, 0 replies; 4+ messages in thread
From: Eli Zaretskii @ 2018-12-06 11:24 UTC (permalink / raw)
  To: help-gnu-emacs

> From: Alexandre Garreau <galex-713@galex-713.eu>
> Date: Thu, 06 Dec 2018 11:44:38 +0100
> 
> #+BEGIN_SRC emacs-lisp
> (defun random-letter (&rest osef)
>   (let ((num (random (max-char))))
>     ;; Emacs Lisp has no built-in `until'; loop with `while' and `not'.
>     (while (not (string-match-p "[[:alpha:]]" (string num)))
>       (setq num (random (max-char))))
>     num))
> #+END_SRC
> 
> and use it like this:
> 
> #+BEGIN_SRC emacs-lisp
> (apply #'string (mapcar #'random-letter (make-list (1+ (random 190)) nil)))
> #+END_SRC
> 
> The problem is that I get stuff like this: "䯩繩ꏴ跾ಾ𢉰𐎕𘓅𪆶矏ᄬ𣒈⳰𨜄𛰸𧅌煂𢙴𧐁𡚯
> 𦅉ᤋ𡇼钇꿚㱓㗧𩅍姵爠𣑽𠌤ꇊ𡘄𑄇𫲘𪯋𣊚𦉂𠦵𘕋𠈾ლ𨟇𦷕𤃻𫿡𢿟巙𩿊𥖠𒒈ባ𗆘𤧟𗲀𔗸𖼂뺔
> 𧸋𡠜𬶟咨발𬞗쏊紋䲁坮𠢥旼𗴟𬓏𤁍គ𩍏Ɉ𪅊𤙬𫪃𫴛𤶋𫴃𧐨䞪𩇨𡤦馲𨂧𡮃𓂅𒇵𤉴𥙯藣ბ솇
> 𨆬𦄎靔𐤒ඐ𒐕襋𬵝𥤄𪃝𫈹𨣼𘋹돃𪣞筛𣯿휈𥽊Ꭾ𣐧𥺒𠊆ꮩ闭ઉ𦻸𨔆𤛢𢮁𤟩𪊕𥫰𪢟𡻋𘅇ᶗ펙𣄽
> 玽쭻𩿎𗔥𪟉䪁ᣃ쒝𩅑𡞬넒煮ڒꫥ𥾴𣁫𬑼깙𣫖筁ᣯ𣱮𡡯𨐒ﶒ𤑳𤯼昵䊘㝓𣑼𐦥𥆤갛𤡇𠜠𥉡䋯𥫻
> 𪲩兀𖠹瀖𣊫𨘥𢍪ᴢ", where the majority of characters are non-displayable
> and are shown as a square with numbers in it, indicating that no
> installed font has a glyph for them.  I clearly recall that what I wrote
> before produced no such characters, so something must be possible (maybe
> using charsets?).

If you don't want characters that need fancy fonts, why do you use
max-char?  Why not a smaller value, like 255?
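
A minimal sketch of that idea, assuming a hypothetical LIMIT argument in
place of (max-char).  The names my-random-letter-below and
my-random-word are made up for illustration:

```emacs-lisp
(defun my-random-letter-below (limit)
  "Return a random character below LIMIT that matches [[:alpha:]].
LIMIT is a hypothetical parameter; 256 covers ASCII and Latin-1."
  ;; Guard against a range with no letters, which would loop forever.
  (when (<= limit ?A)
    (error "LIMIT must be greater than %d" ?A))
  (let ((ch (random limit)))
    (while (not (string-match-p "[[:alpha:]]" (string ch)))
      (setq ch (random limit)))
    ch))

(defun my-random-word (length limit)
  "Return a string of LENGTH random letters, each below LIMIT."
  (let ((chars nil))
    (dotimes (_ length)
      (push (my-random-letter-below limit) chars))
    (apply #'string chars)))
```

Something like (my-random-word 12 256) then stays within ASCII and
Latin-1, which most fonts cover.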




* Re: Generate random char (and string) from unicode category (e.g: letter)
       [not found] <mailman.5272.1544093503.1284.help-gnu-emacs@gnu.org>
@ 2018-12-06 13:46 ` Ben Bacarisse
  2018-12-06 14:29   ` Emanuel Berg
  0 siblings, 1 reply; 4+ messages in thread
From: Ben Bacarisse @ 2018-12-06 13:46 UTC (permalink / raw)
  To: help-gnu-emacs

Alexandre Garreau <galex-713@galex-713.eu> writes:

> I clearly recall having written something in elisp to generate random,
> more-or-less plausible input for the Gmail account creation form,
> including ASCII chars for the login, statistical randomness for gender
> (like, iirc, 48% “male”, 52% “female”, minus 2% “others”), and random
> Unicode for the password, real name, etc.  I recall that I ended up with
> a lot of ideograms in those.  So I know it’s doable in pure elisp (or
> was it maybe guile? less likely…).
>
> I really don’t recall how I did that, nor whether I took care to use a
> single script for each form input, but I’m sure I was using something
> less ugly than what I have now, that is, (random (max-char)) until it
> matches [[:alpha:]] (but I clearly recall using something that would
> work for all of Unicode, including foreign scripts I wouldn’t even know
> about).
>
> Do you have an idea for something cleaner?  Currently I have this:
>
> #+BEGIN_SRC emacs-lisp
> (defun random-letter (&rest osef)
>   (let ((num (random (max-char))))
>     ;; Emacs Lisp has no built-in `until'; loop with `while' and `not'.
>     (while (not (string-match-p "[[:alpha:]]" (string num)))
>       (setq num (random (max-char))))
>     num))
> #+END_SRC
>
>
> and use it like this:
>
> #+BEGIN_SRC emacs-lisp
> (apply #'string (mapcar #'random-letter (make-list (1+ (random 190)) nil)))
> #+END_SRC
>
> The problem is that I get stuff like this: "䯩繩ꏴ跾ಾ𢉰𐎕𘓅𪆶矏ᄬ𣒈⳰𨜄𛰸𧅌煂𢙴𧐁

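You might find char-displayable-p useful.  It returns non-nil when the
character should be displayable in the current frame: t for ASCII (which
includes control characters), a font on a graphical frame, or a charset
symbol such as 'unicode on a text terminal whose coding system can
encode the character.  It returns nil otherwise.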

describe-char-display gives the font being used and will exclude control
characters.  It needs two args -- a position and a char or number -- but
the position is ignored when there is an actual character.

Unlike char-displayable-p, it is not documented, so it may change or
vanish over time.

char-syntax returns ?w for word-like characters.  This might do instead
of the [[:alpha:]] match.  Thus

(let ((ch (random (max-char))))
  (and (char-displayable-p ch) (eq (char-syntax ch) ?w)))

might be what you want though there will be a relatively low density of
matching characters.  max-char is very big.
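
Wrapped in a retry loop, that test could look like the sketch below.
The name my-random-displayable-letter and the MAX-TRIES cap are
assumptions added for illustration.  The cap is there because matches
are sparse, and the draw is limited to #x10FFFF because (max-char) also
covers Emacs's internal non-Unicode range:

```emacs-lisp
(defun my-random-displayable-letter (&optional max-tries)
  "Return a random displayable word-constituent character, or nil.
Gives up and returns nil after MAX-TRIES draws (default 10000)."
  (let ((tries (or max-tries 10000))
        (found nil))
    (while (and (not found) (> tries 0))
      ;; Draw from the Unicode range 0..#x10FFFF only.
      (let ((ch (random (1+ #x10FFFF))))
        (when (and (char-displayable-p ch)
                   (eq (char-syntax ch) ?w))
          (setq found ch)))
      (setq tries (1- tries)))
    found))
```

Note that char-syntax consults the current buffer's syntax table, so
the result can depend on the major mode.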

Another strategy is to select characters randomly from a string of
acceptable options.
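
A minimal sketch of that strategy.  The pool contents and the names
my-letter-pool, my-random-char-from and my-random-string-from are
hypothetical; put whatever characters you know your fonts cover in the
pool:

```emacs-lisp
(defvar my-letter-pool "abcdefghijklmnopqrstuvwxyzàéîõüßñçαβγδεабвгд"
  "Hypothetical pool of acceptable characters to draw from.")

(defun my-random-char-from (pool)
  "Return one character picked uniformly at random from the string POOL."
  (aref pool (random (length pool))))

(defun my-random-string-from (pool length)
  "Return a string of LENGTH characters drawn at random from POOL."
  (let ((chars nil))
    (dotimes (_ length)
      (push (my-random-char-from pool) chars))
    (apply #'string chars)))
```

For example, (my-random-string-from my-letter-pool (1+ (random 190)))
gives a string of 1 to 190 characters, all taken from the pool.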

> PS: is there a way to get something other than a uniform random
> distribution from `random'?  Like a normal or logarithmic distribution?

Yes, but I am running out of time!  A cheap way to get an almost normal
distribution with mean n is to sum k uniform numbers between 0 and 2n/k
(each has mean n/k, so the sum has mean n).
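
A rough sketch of the summing trick, under the assumption that each
draw is uniform on [0, 2*MEAN/K), so that K draws sum to MEAN on
average.  The names my-random-uniform and my-random-approx-normal and
the default K of 12 are made up for illustration:

```emacs-lisp
(defun my-random-uniform (upper)
  "Return a uniform random float in [0, UPPER)."
  (* upper (/ (random 1000000) 1000000.0)))

(defun my-random-approx-normal (mean &optional k)
  "Return a float that is roughly normally distributed around MEAN.
Sums K (default 12) uniform draws from [0, 2*MEAN/K), so the result
lies in [0, 2*MEAN) and averages MEAN."
  (let ((k (or k 12))
        (sum 0.0))
    (dotimes (_ k)
      (setq sum (+ sum (my-random-uniform (/ (* 2.0 mean) k)))))
    sum))
```

A larger K gives a narrower, more bell-shaped result; the tails are
always cut off at 0 and 2*MEAN, so this is only an approximation.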

If you use the "select from a string" method, you can simply duplicate
those characters you want more of.
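
The duplication trick might be sketched like this, assuming the weights
are given as a hypothetical alist of (CHAR . COUNT); the names
my-weighted-pool and my-random-char-weighted are made up for
illustration:

```emacs-lisp
(defun my-weighted-pool (weights)
  "Build a pool string from WEIGHTS, an alist of (CHAR . COUNT).
Each CHAR is repeated COUNT times, so a uniform pick from the pool
returns it COUNT times as often as a character with COUNT 1."
  (mapconcat (lambda (entry)
               (make-string (cdr entry) (car entry)))
             weights
             ""))

(defun my-random-char-weighted (weights)
  "Return a random character from WEIGHTS, an alist of (CHAR . COUNT)."
  (let ((pool (my-weighted-pool weights)))
    (aref pool (random (length pool)))))
```

With (my-random-char-weighted '((?e . 12) (?t . 9) (?z . 1))), ?e comes
up twelve times as often as ?z.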

-- 
Ben.



* Re: Generate random char (and string) from unicode category (e.g: letter)
  2018-12-06 13:46 ` Ben Bacarisse
@ 2018-12-06 14:29   ` Emanuel Berg
  0 siblings, 0 replies; 4+ messages in thread
From: Emanuel Berg @ 2018-12-06 14:29 UTC (permalink / raw)
  To: help-gnu-emacs

Here [1] is a little something that can be
interesting, who knows?

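(defun scramble (beg end)
  "Shuffle chars in region from BEG to END."
  (interactive "r")
  (when (use-region-p)
    (save-excursion
      (let* ((str        (buffer-substring-no-properties beg end))
             (chars      (delete "" (split-string str "")))
             ;; Tag each char with a random key and sort on the keys; a
             ;; coin-flip predicate is not a consistent order for `sort'.
             (keyed      (mapcar (lambda (c) (cons (random) c)) chars))
             (rand-chars (mapcar #'cdr (sort keyed #'car-less-than-car))))
        (delete-region beg end)
        (goto-char beg)
        (dolist (c rand-chars)
          (insert c))))))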

[1] http://user.it.uu.se/~embe8573/emacs-init/sort-my.el

-- 
underground experts united
http://user.it.uu.se/~embe8573


