Tests for Randomness (original) (raw)

ACiphers By Ritter Page

A discussion about distinguishing a random sequence from a non-random sequence.


Contents


Subject: Re: Tests for Randomness Date: Sat, 14 Nov 1998 17:41:05 GMT From: scott@helsbreth.org (Scott Nelson) Message-ID: 3652bce8.168156049@news.inreach.com References: 364C7F53.9F43FCA3@aspco.com Newsgroups: sci.crypt Lines: 30

On Fri, 13 Nov 1998 13:49:55 -0500, Andrew Andrew@aspco.com wrote:

Can anyone e-mail me some information on how one can best analyze a set of data to dtermine if it is a random sequence or non random one? Any help would be greatly appreciated.

Short answer: No.

Long answer: There is no known way to prove randomness. There's a pretty strong argument that randomness can't even be tested properly. And the (poor) tests we have are really only useful when applied to very large sets. As a rule of thumb, you need about n squared data points to achieve a confidence of 1/n in the results (which shouldn't be confused with confidence in the randomness of the generator) Some of the tests in Diehard use over 2 million 32 bit samples.

Still, applying the tests we have, as bad as they are, is better than nothing. Take a look at http://stat.fsu.edu/~geo/diehard.html (If you're a C programmer, you might prefer DiehardC - ftp://helsbreth.org/pub/helsbret/random )


Shameless plug for random web site: http://www.helsbreth.org/random Scott Nelson scott@helsbreth.org


Subject: Re: Tests for Randomness Date: Sun, 15 Nov 1998 07:26:57 GMT From: "Douglas A. Gwyn" DAGwyn@null.net Message-ID: 364E8229.5584A45@null.net References: 3652bce8.168156049@news.inreach.com Newsgroups: sci.crypt Lines: 13

Scott Nelson wrote:

There's a pretty strong argument that randomness can't even be tested properly.

What can be done (in many cases) is to measure the information in favor of some hypothesis (about the source) over some other hypothesis; for example, that the observed sequence was generated by a uniform-random source rather than a source with certain other specified characteristics.

That changes an ill-posed question "Is it random?" into an answerable question where the answer has useful mathematical properties.


Subject: Re: Tests for Randomness Date: Mon, 16 Nov 1998 19:33:47 +0100 From: Mok-Kong Shen mok-kong.shen@stud.uni-muenchen.de Message-ID: 3650700B.A1122213@stud.uni-muenchen.de References: 364C7F53.9F43FCA3@aspco.com Newsgroups: sci.crypt Lines: 11

Andrew wrote:

Can anyone e-mail me some information on how one can best analyze a set of data to dtermine if it is a random sequence or non random one? Any help would be greatly appreciated.

For random bit sequences I recommend Maurer's test which is described in Menezes et al., Handbook of Applied Cryptography. It is very simple to implement. I have a Fortran code for that.

M. K. Shen


Subject: Re: Tests for Randomness Date: Mon, 16 Nov 1998 14:17:33 -0500 From: Andrew Andrew@aspco.com Message-ID: 36507A4D.503FCABC@aspco.com References: 364C7F53.9F43FCA3@aspco.com Newsgroups: sci.crypt Lines: 54

I appreciate all those who have responded to my original post. While I have not had the opportunity to examine in detail all of the information and related web sites, I have noticed one thing. I will attempt to explain, to the best of my abilities, a question. Please forgive me for any incorrect use of terminology.

A majority of the tests for randomness which were recommended to me, dealt primarily with sequencing or one dimensional space. The tests I was most interested in would also include an amplitude (distance) or a two dimensional space. For example:

Suppose you have the following sequence: 5 -1 4 -2 6 -1 5 -2 4 0 5 -1

because it generally follows: positive, negative, positive, negative .... it might appear that this is a randomly distributed set of values. The graph would look like this:

Looking more closely, if you looked at the graph of the result of the data it would look like this:

With this graph you can clearly see a "trend" in the data that is present. Using the average "step size" and number of iterations one could calculate the randomness of where you ended up from where you started.

I was hoping that people on this list might know of additional tests which would take that set of data and indicate if it was random or not.

Again, Any advice or assistance is greatly appreciated. I look forward to hearing from you all. Thanks for your time.

Andrew

Andrew wrote:

Can anyone e-mail me some information on how one can best analyze a set of data to dtermine if it is a random sequence or non random one? Any

help would be greatly appreciated.

Thanks for your time.

Regards,

Andrew andrew@aspco.com


Subject: Re: Tests for Randomness Date: Mon, 16 Nov 1998 21:45:28 GMT From: scott@helsbreth.org (Scott Nelson) Message-ID: 36508f0f.11893416@news.inreach.com References: 36507A4D.503FCABC@aspco.com Newsgroups: sci.crypt Lines: 39

On Mon, 16 Nov 1998 14:17:33 -0500, Andrew Andrew@aspco.com wrote:

I appreciate all those who have responded to my original post. While I have not had the opportunity to examine in detail all of the information and related web sites, I have noticed one thing. I will attempt to explain, to the best of my abilities, a question. Please forgive me for any incorrect use of terminology.

A majority of the tests for randomness which were recommended to me, dealt primarily with sequencing or one dimensional space.

[example snipped]

I was hoping that people on this list might know of additional tests which would take that set of data and indicate if it was random or not.

Again, Any advice or assistance is greatly appreciated. I look forward to hearing from you all. Thanks for your time.

I think you need to learn a bit more of the nomenclature before you ask again. If I roll a die 10 times and get the sequence 1,2,3,4,5,6,6,6,6,6 that sequence is just as random as the sequence 5,2,3,5,1,2,4,6,2,2 or any other, but I suspect you would reject the first as "not random enough"

Some of the tests in the Diehard battery test in 2, 3, 4 and 5 dimensions. Any sequence generated by a pseudo random generator will fail to be random when viewed in enough dimensions, though "enough" might be larger than is practical. For instance, a cryptographically secure pseudo random number generator shouldn't fail until the number of dimensions is greater than the key space.

Not sure if that answers your question or not, but I hope it helps.


DiehardC 1.03 now available via ftp from ftp://helsbreth.org/pub/helsbret/random Scott Nelson scott@helsbreth.org


Subject: Re: Tests for Randomness Date: Tue, 17 Nov 1998 10:02:50 GMT From: "Douglas A. Gwyn" DAGwyn@null.net Message-ID: 365149A3.E3F4031F@null.net References: 36507A4D.503FCABC@aspco.com Newsgroups: sci.crypt Lines: 8

Andrew wrote:

Suppose you have the following sequence: 5 -1 4 -2 6 -1 5 -2 4 0 5 -1 because it generally follows: positive, negative, positive, negative .... it might appear that this is a randomly distributed set of values.

You need to distinguish between the set and the sequence. If that pattern continues as the sequence is extended, then the sequence is certainly not random.


Subject: Re: Tests for Randomness Date: Thu, 10 Dec 1998 23:39:02 GMT From: bob_jenkins@my-dejanews.com Message-ID: 74pm2m$9nn$1@nnrp1.dejanews.com References: 36507A4D.503FCABC@aspco.com Newsgroups: sci.crypt Lines: 17

In article 36507A4D.503FCABC@aspco.com, Andrew Andrew@aspco.com wrote:

Suppose you have the following sequence: 5 -1 4 -2 6 -1 5 -2 4 0 5 -1

because it generally follows: positive, negative, positive, negative .... it might appear that this is a randomly distributed set of values.

The test for that is called the "run test". It measures how long a run of strictly increasing values you have. It's one of the standard tests, not a particularly powerful one though.

-----------== Posted via Deja News, The Discussion Network ==---------- http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own


Subject: Re: Tests for Randomness Date: Fri, 11 Dec 1998 05:15:11 GMT From: ritter@io.com (Terry Ritter) Message-ID: 3670a993.3798156@news.io.com References: 74pm2m$9nn$1@nnrp1.dejanews.com Newsgroups: sci.crypt Lines: 37

On Thu, 10 Dec 1998 23:39:02 GMT, in 74pm2m$9nn$1@nnrp1.dejanews.com, in sci.crypt bob_jenkins@my-dejanews.com wrote:

In article 36507A4D.503FCABC@aspco.com, Andrew Andrew@aspco.com wrote:

Suppose you have the following sequence: 5 -1 4 -2 6 -1 5 -2 4 0 5 -1

because it generally follows: positive, negative, positive, negative .... it might appear that this is a randomly distributed set of values.

The test for that is called the "run test". It measures how long a run of strictly increasing values you have.

That sounds like "Runs-Up."

It's one of the standard tests, not a particularly powerful one though.

In my experience, Runs-Up and Runs-Down are far more powerful than they may appear. In contrast to many other randomness tests, Runs-Up and Runs-Down can expose correlations (errors of conditional probability), to some extent.

For small value ranges it is necessary to use the exact formulation for the expectations, which is not in Knuth II. See:

http://www.io.com/~ritter/ARTS/RUNSUP.HTM


Terry Ritter ritter@io.com http://www.io.com/~ritter/ Crypto Glossary http://www.io.com/~ritter/GLOSSARY.HTM


Subject: Re: Tests for Randomness Date: Fri, 11 Dec 1998 08:12:06 -0700 From: "Tony T. Warnock" u091889@cic-mail.lanl.gov Message-ID: 36713646.6494E470@cic-mail.lanl.gov References: 3670a993.3798156@news.io.com Newsgroups: sci.crypt Lines: 48

Terry Ritter wrote:.

In my experience, Runs-Up and Runs-Down are far more powerful than they may appear. In contrast to many other randomness tests, Runs-Up and Runs-Down can expose correlations (errors of conditional probability), to some extent.

For small value ranges it is necessary to use the exact formulation for the expectations, which is not in Knuth II. See:

http://www.io.com/~ritter/ARTS/RUNSUP.HTM

I've found the same thing. These are also fairly easy to implement.

Tony


Terry Ritter, hiscurrent address, and histop page.

Last updated: 1999-01-19