bpo-36018: Add another example for NormalDist() (#18191) · python/cpython@10355ed

`` @@ -772,6 +772,42 @@ Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_:

``

772

772

` >>> quantiles(map(model, X, Y, Z)) # doctest: +SKIP

`

773

773

`[1.4591308524824727, 1.8035946855390597, 2.175091447274739]

`

774

774

``

``

775

`` +

Normal distributions can be used to approximate `Binomial

``

``

776

`` +

distributions <http://mathworld.wolfram.com/BinomialDistribution.html>`_

``

``

777

`+

when the sample size is large and when the probability of a successful

`

``

778

`+

trial is near 50%.

`
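The condition stated above (large sample, success probability near 50%) can be checked numerically. The following is a minimal sketch, not part of the commit: the helper names `binomial_cdf`, `normal_approx_cdf`, and `approx_error` are hypothetical, and the two test cases are arbitrary choices made for illustration. It compares the exact binomial CDF with the normal approximation (using a continuity correction of 0.5) for a small, skewed case and a larger, balanced one.

```python
from math import comb, fsum, sqrt
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    q = 1.0 - p
    return fsum(comb(n, r) * p**r * q**(n - r) for r in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k), with a continuity correction."""
    q = 1.0 - p
    return NormalDist(mu=n * p, sigma=sqrt(n * p * q)).cdf(k + 0.5)


def approx_error(k, n, p):
    """Absolute gap between the exact CDF and the normal approximation."""
    return abs(binomial_cdf(k, n, p) - normal_approx_cdf(k, n, p))


# Illustrative cases (assumed, not from the commit):
err_skewed = approx_error(k=1, n=20, p=0.05)      # small n, p far from 50%
err_balanced = approx_error(k=100, n=200, p=0.5)  # larger n, p at 50%

print(f"small, skewed case:    {err_skewed:.5f}")
print(f"larger, balanced case: {err_balanced:.5f}")
```

Under these assumptions the balanced case should show a much smaller error than the skewed one, which matches the guidance in the paragraph above.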

``

779

+

``

780

`+

For example, an open source conference has 750 attendees and two rooms with a

`

``

781

`+

500 person capacity. There is a talk about Python and another about Ruby.

`

``

782

`+

In previous conferences, 65% of the attendees preferred to listen to Python

`

``

783

`+

talks. Assuming the population preferences haven't changed, what is the

`

``

784

`+

probability that the rooms will stay within their capacity limits?

`

``

785

+

``

786

`+

.. doctest::

`

``

787

+

``

788

`+

>>> n = 750             # Sample size

`

``

789

`+

>>> p = 0.65            # Preference for Python

`

``

790

`+

>>> q = 1.0 - p         # Preference for Ruby

`

``

791

`+

>>> k = 500             # Room capacity

`

``

792

+

``

793

`+

>>> # Approximation using the cumulative normal distribution

`

``

794

`+

>>> from math import sqrt

`

``

795

`+

>>> round(NormalDist(mu=n*p, sigma=sqrt(n*p*q)).cdf(k + 0.5), 4)

`

``

796

`+

0.8402

`

``

797

+

``

798

`+

>>> # Solution using the cumulative binomial distribution

`

``

799

`+

>>> from math import comb, fsum

`

``

800

`+

>>> round(fsum(comb(n, r) * p**r * q**(n-r) for r in range(k+1)), 4)

`

``

801

`+

0.8402

`

``

802

+

``

803

`+

>>> # Approximation using a simulation

`

``

804

`+

>>> from random import seed, choices

`

``

805

`+

>>> seed(8675309)

`

``

806

`+

>>> def trial():

`

``

807

`+

...     return choices(('Python', 'Ruby'), (p, q), k=n).count('Python')

`

``

808

`+

>>> mean(trial() <= k for i in range(10_000))

`

``

809

`+

0.8398

`
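The example asks whether both rooms stay within capacity, while the doctest evaluates P(X <= k) for the Python room only. The sketch below is an illustration rather than part of the commit. It assumes every attendee goes to exactly one of the two talks (not spelled out in the example), and the helper `pmf` is a hypothetical name. Under that assumption the Ruby room holds n - r people, so it fits whenever r >= n - k.

```python
from math import comb, fsum

n = 750          # Attendees
p = 0.65         # Preference for Python
q = 1.0 - p      # Preference for Ruby
k = 500          # Capacity of each room


def pmf(r):
    """Exact probability that exactly r attendees choose the Python talk."""
    return comb(n, r) * p**r * q**(n - r)


# Python room fits when r <= k.
python_room_ok = fsum(pmf(r) for r in range(k + 1))

# Both rooms fit when n - k <= r <= k (assumes one talk per attendee).
both_rooms_ok = fsum(pmf(r) for r in range(n - k, k + 1))

print(round(python_room_ok, 4))
print(round(both_rooms_ok, 4))
```

With these numbers the lower bound n - k = 250 lies far below the expected Python attendance of n * p = 487.5, so the Ruby room constraint should contribute almost nothing and the two probabilities should agree to many decimal places.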

``

810

+

775

811

`Normal distributions commonly arise in machine learning problems.

`

776

812

``

777

813

`` Wikipedia has a `nice example of a Naive Bayesian Classifier

``