Hi everybody,

a few days ago I started thinking about a
fast and simple random number generator of only a few
lines of code, wondering about the use of
MUL and RDTSC. After some "deviations" from the theory  :D
I stumbled upon the following idea:

.randomB:
  rdtsc                           ; time-stamp counter -> EDX:EAX
  mov rcx,866AA7'3F0B'7F'0B'03h   ; 64-bit mixing constant
  movd mm0,eax                    ; low 32 TSC bits -> MM0
  movq mm1,rcx                    ; constant -> MM1
  pshufw mm0,mm0,00'10'01'00b     ; rearrange the 16-bit TSC words within MM0
  pmaddwd mm1,mm0                 ; multiply word pairs constant*TSC, add adjacent products
  pshufb mm1,mm0                  ; shuffle the bytes of MM1, controlled by MM0 (SSSE3)
  pmullw mm1,mm0                  ; word-wise multiply of the two mixes
  movq rax,mm0                    ; MM0 -> RAX
  movq rcx,mm1                    ; MM1 -> RCX
  mul rdx                         ; RAX * RDX (high TSC bits) -> RDX:RAX
  sub rax,rcx                     ; final mix, result in RAX
  ret


Then I used ENT to test it;
here is what I found:  :O

ent result.bin
Entropy = 7.841899 bits per byte.

Optimum compression would reduce the size
of this 130800 byte file by 1 percent.

Chi square distribution for 130800 samples is 29972.95, and randomly
would exceed this value less than 0.01 percent of the times.

Arithmetic mean value of data bytes is 129.7706 (127.5 = random).
Monte Carlo value for Pi is 3.139266055 (error 0.07 percent).
Serial correlation coefficient is -0.001012
(totally uncorrelated =0.0).

I don't know how to read these figures to evaluate my algo.
Could you please help me?
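For reference, as far as I understand from the ENT docs, the chi-square line is the classic Pearson statistic computed over the 256 possible byte values,

\chi^2 = \sum_{i=0}^{255} \frac{(O_i - E)^2}{E}, \qquad E = \frac{N}{256}

where O_i is how many times byte value i occurs and N is the number of bytes in the file; the percentage next to it says how often a truly random stream would exceed that value. What I cannot judge is which percentages should count as good or bad.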

The ENT link is at
http://www.fourmilab.ch/random/

And here are two screenshots made with gnuplot:

the total distribution of 65536 items, each value ANDed with 65535:
http://sites.google.com/site/x64lab/randomB_ganz.gif?attredirects=0

and a detail of it
http://sites.google.com/site/x64lab/randomB_detail.gif?attredirects=0

I have drawn some blue lines to mark the visible,
constant "holes" inside it.

Hints, general guidelines on the subject,
improvements, and your opinion on the results are all very welcome.

Thanx in advance,
Cheers,
Posted on 2011-12-19 07:15:58 by hopcode
hopcode, here's a rand I coded a long time ago for testing sorting algos.


mov esi, offset buffer        ;buffer for rand output
mov edi, 500000d              ;size in bytes
mov ebx, 49282048h            ;seed
add edi, esi
xor ecx, ecx

@@:
rdtsc
rol eax, 5
lea eax,
shr eax, 3
ror edx, 3
lea edx,
rol edx, 7
or eax, edx
add eax, esi
add cl, al
ror cl, 3
rol eax, cl
xor cl, al
add cl, ah
rol eax, cl
lea eax,
xor eax, ebx
mov cl, bh
mov ch, bl
ror cx, cl
rol ax, cl
ror eax, 15
lea eax,
and ecx, ebx
ror eax, cl
mov [esi], eax                ;store the random dword in the output buffer
rdtsc
mov ebx, eax
lea ebx,
lea ecx,
rol ebx, cl
lea ebx,
ror cx, cl
ror ebx, cl
add esi, 4
cmp edi, esi
ja @B                         ;keep looping until the output buffer is full


1 GB of data generated by this rand, tested with ENT:

Entropy = 8.000000 bits per byte.

Optimum compression would reduce the size
of this 1073741824 byte file by 0 percent.

Chi square distribution for 1073741824 samples is 268.19, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.4989 (127.5 = random).
Monte Carlo value for Pi is 3.141590272 (error 0.00 percent).
Serial correlation coefficient is 0.000092 (totally uncorrelated = 0.0).

It's fairly late, but I'll chime in on this some more tomorrow.


Posted on 2011-12-20 01:25:43 by paradox
Hi paradox,

> 1 GB of data generated by this rand, tested with ENT:
> Entropy = 8.000000 bits per byte.
> Optimum compression would reduce the size
> of this 1073741824 byte file by 0 percent.
> Chi square distribution for 1073741824 samples is 268.19, and randomly
> would exceed this value 50.00 percent of the times.
> Arithmetic mean value of data bytes is 127.4989 (127.5 = random).
> Monte Carlo value for Pi is 3.141590272 (error 0.00 percent).
> Serial correlation coefficient is 0.000092 (totally uncorrelated = 0.0).
> It's fairly late, but I'll chime in on this some more tomorrow.

That's very, very nice; with those values it deserves more than
passing attention. I must say I had been thinking of something like a delta between two or more
RDTSC readings, precisely because it is a non-serializing instruction, roughly as in the sketch below.
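Something like this, I mean (just a rough, untested sketch):

  rdtsc                           ; first reading
  mov r8d,eax
  rdtsc                           ; second reading, not serialized against the first
  sub eax,r8d                     ; keep only the jittery delta between the two reads
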
I need some time to run a diehard battery test on it, and eventually
to port it to 64-bit SSE. Stay tuned, and
thanks a lot,

Cheers
Posted on 2011-12-20 02:01:07 by hopcode
Well, after diehard testing, I must say that the algo performs very well:
all values are as expected, or around the average.
I can share the results if someone needs them; it is all quite simple and straightforward.
Now, calling ECX the "state" and EBX the seed, the really interesting part is part 2,
after the second RDTSC, where the seed gets reworked.
I find the following line amusing,

add eax, esi

because the fact that an offset in memory is added into the final random number
makes me guess that, with some small modifications,
this algo may be totally reversible, i.e. given a state and a seed
one should be able to rebuild the memory offset from the random number.

Bravo, paradox!  :)

OK, now a little modification of my algo and some explanation:

rdtsc                           ; time-stamp counter -> EDX:EAX
rol dx,4                        ; pre-rotate the low word of each TSC half
rol ax,3
mov rcx,866AA7'3F0B'7F'0B'03h   ; 64-bit mixing constant
movd mm0,eax                    ; low 32 TSC bits -> MM0
movq mm1,rcx                    ; constant -> MM1
pshufw mm0,mm0,01'00'01'00b     ; repeat the two TSC words across MM0
pmaddwd mm1,mm0                 ; multiply word pairs constant*TSC, add adjacent products
pshufb mm1,mm0                  ; shuffle the bytes of MM1, controlled by MM0 (SSSE3)
movq rax,mm0                    ; MM0 -> RAX
movq rcx,mm1                    ; MM1 -> RCX
mul rdx                         ; RAX * RDX (high TSC bits) -> RDX:RAX
ret 0

with the following results, still unstable on the chi-square:
it still swings roughly between 30% and 70%.
Entropy = 7.999309 bits per byte.
Optimum compression would reduce the size
of this 262144 byte file by 0 percent.

Chi square distribution for 262144 samples is 250.54, and randomly
would exceed this value 56.72 percent of the times.

Arithmetic mean value of data bytes is 127.7175 (127.5 = random).
Monte Carlo value for Pi is 3.140306706 (error 0.04 percent).
Serial correlation coefficient is -0.000857 (totally uncorrelated = 0.0).

I cut away the pmullw because it was too aggressive.
The pshufb is aggressive too, but I don't know how to resolve that at the moment.

I know it's nearly impossible to have a good RNG without using a seed + state,
but that is the challenge, no?

Cheers,
Posted on 2011-12-20 07:54:30 by hopcode
Don't forget to execute EMMS after using MMX, else the next FPU instruction will cause an exception because the FPU stack is in a bad state.
(Not required for SSE, which is one of the best reasons to use SSE registers instead of MMX for this kind of work.)
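For example, in hopcode's routine above it would be enough to put it just before returning, something like this (sketch):

  movq rax,mm0
  movq rcx,mm1
  emms                            ; clear the MMX/x87 state before any FPU code runs again
  mul rdx
  sub rax,rcx
  ret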
Posted on 2011-12-20 09:25:59 by Scali
Sorry for the late reply, work has been crazy lately.

Anyway, randomness is hard to establish with statistical testing; what we can do is check for glaring errors and repetitive data. As you have already found, images are a great way to look at the data.

Example: PHP's rand() under Windows:

http://imageshack.us/photo/my-images/843/randombogorand.png

If I remember correctly, diehard has been unsupported for quite some time; most people nowadays are using NIST's randomness testing suite:
http://csrc.nist.gov/groups/ST/toolkit/rng/documentation_software.html

If I have some time this weekend I'll run my code through the new NIST tests and see how it pans out.
Posted on 2011-12-22 01:20:57 by paradox

> If I remember correctly, diehard has been unsupported for quite some time; most people nowadays are using NIST's randomness testing suite:
> http://csrc.nist.gov/groups/ST/toolkit/rng/documentation_software.html

Yes, diehard is somewhat old and verbose, but good. I have had a look at the NIST package and
the code inside is clean and effective. I think it deserves a port to assembly,
because it is almost 100% optimizable. As soon as I have some time I will start
doing that as a coding exercise on the functions it contains.
Thanks for sharing.

I have managed to make my algo behave in a more stable way
on the chi-square output; at the moment it sits about 20% above the expected value,
while the other figures are OK.
The whole thing is acceptable, but somehow not fully satisfying.

OK, thank you again for the valuable link.
All the best,

Cheers,

Posted on 2011-12-22 15:07:12 by hopcode
I once fused Park-Miller with Mersenne Twister (B), and it's never let me down.
See SuperRandom.
Posted on 2011-12-23 02:22:29 by Homer
Hi, sorry for the late answer.
Merry Xmas and Happy New coming 2012  :)

> I once fused Park-Miller with Mersenne Twister (B), and it's never let me down.
> See SuperRandom.

Nuclear cold fusion, eh? :)
Just kidding...
but I cannot find the SuperRandom code. Could you provide it (in assembly)?

Here is a link to my Park-Miller-Carta implementation, which gets rid of Whittle's
final if-correction. It is now branchless and satisfies Park's minimal standard
requirements; the folding trick is roughly as sketched below the link.
http://sites.google.com/site/x64lab/home/reloaded-algorithms/park-miller-carta-random-lcg
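In short, the branchless reduction works roughly like this (a simplified x86-64 sketch of the idea, not the exact code from that page; ECX holds the current 31-bit seed, the new seed comes back in EAX):

  mov eax,ecx                     ; seed' = seed * 16807 mod (2^31 - 1)
  imul rax,rax,16807              ; 46-bit product in RAX
  mov rcx,rax
  and eax,7FFFFFFFh               ; low 31 bits of the product
  shr rcx,31                      ; high bits of the product
  add eax,ecx                     ; first fold; may carry into bit 31
  mov ecx,eax
  and eax,7FFFFFFFh
  shr ecx,31
  add eax,ecx                     ; second fold removes the carry, no branch needed
  ret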

BTW, out of curiosity: what is the goal in mixing Park-Miller with Mersenne?
I.e., for what purpose did you mix them?

Cheers,

Posted on 2011-12-27 07:31:13 by hopcode
Yeah I can provide the code, but essentially it works as follows:

For each step (i.e. each request for a random number), roughly as in the sketch after this list:
1. Generate a Park-Miller random and set it as the seed for Mersenne.
2. Generate a Mersenne random and set it as the seed for Park-Miller. This is the random number returned for the request.
3. At some count of iterations, periodically regenerate Mersenne's internal table of values using values generated by the above scheme.
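In pseudo-assembly one request then looks roughly like this (a sketch only; names such as pm_next, mt_reseed, mt_next, pm_reseed, mt_rebuild_table, refresh_counter and REFRESH_PERIOD are placeholders, not the actual SuperRandom symbols):

next_random:
  call pm_next                    ; step 1: Park-Miller output in EAX...
  mov ecx,eax
  call mt_reseed                  ; ...becomes the new Mersenne seed
  call mt_next                    ; step 2: Mersenne output in EAX is the result...
  mov ecx,eax
  call pm_reseed                  ; ...and also Park-Miller's next seed
  dec dword [refresh_counter]     ; step 3: every REFRESH_PERIOD requests,
  jnz .done
  call mt_rebuild_table           ;         rebuild Mersenne's internal table
  mov dword [refresh_counter],REFRESH_PERIOD
.done:
  ret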

If you still want the source, I'd be happy to dig it up.
Posted on 2012-01-16 11:36:06 by Homer
In order to properly evaluate the efficacy of an RNG, I analyze the numerical distribution of the output values in N dimensions, which is basically asking the question: over the bounded numerical domain of N bits, what does the distribution look like? Are there dense clusters, or is it well spread out?
Park-Miller is not very well distributed, and MT(B) has ribbon-like distribution properties; mixing them created a better distribution than either could accomplish alone. With very little execution overhead we get a better result in terms of distribution across the numerical keyspace.
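As a concrete illustration of the two-dimensional case, something like this can be used to build the picture (a sketch; next_rand and hist are placeholder names, hist being a zeroed buffer of 65536 dwords):

  mov ebp,1000000                 ; number of points to sample
@@:
  call next_rand                  ; pseudo-random value in RAX
  movzx ecx,al                    ; x = lowest output byte
  movzx edx,ah                    ; y = next output byte
  shl edx,8
  add edx,ecx                     ; cell index = y*256 + x
  inc dword [hist+rdx*4]          ; bump that cell's counter
  dec ebp
  jnz @B

Plotting the 256x256 grid of counters then shows dense clusters and empty bands at a glance.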
Posted on 2012-01-16 11:37:49 by Homer
Actually, I found it using the search on this site.
http://www.asmcommunity.net/board/index.php?topic=29222.msg206384#msg206384

Have fun :)
Posted on 2012-01-16 11:45:25 by Homer