## Monday 20 August 2018

### Probability

As you probably know, I try to keep my brain active (between two pinups) by studying maths and programming.

I have spent more than a year doing practically nothing, but today I'll show you how to compute the amounts I can expect to win or lose, and their probabilities, after playing several rounds of a money game whose odds I know.

I am no expert, so the calculations might be wrong: you should have them checked by somebody very rigorous.

A simple problem with a die:
-- I win 200€ if I roll a 6 (probability 1/6)
-- I win 10€ if I roll a 5 or a 4 (probability 2/6)
-- I win 0€ if I roll a 1, 2 or 3 (probability 3/6).

The outcomes of the game can be stored in a Python list of [probability, win] pairs:
[[1.0/6, 200], [2.0/6, 10], [3.0/6, 0]]

Now, if I play 100 times, I should win an amount close to:

100 times the mean win = 100 * E(X) = 100 * µ = 100 * (200 * 1/6 + 10 * 2/6 + 0 * 3/6) = 100 * 36.67 = approximately 3667€
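
The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `expected_win` is mine and does not come from the published program.

```python
# Each entry is [probability, win in euros], as in the list above.
outcomes = [[1.0 / 6, 200], [2.0 / 6, 10], [3.0 / 6, 0]]

def expected_win(outcomes):
    """Mean win of one throw: sum of probability * amount."""
    return sum(p * amount for p, amount in outcomes)

mean_win = expected_win(outcomes)
print(round(mean_win, 2))        # 36.67, mean win per throw
print(round(100 * mean_win, 2))  # 3666.67, mean total after 100 throws
```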

But what is the probability of getting only 500€, or only between 400€ and 500€, after 100 throws?

This is the purpose of the program I have published at this address:

https://pastebin.com/6S6Yh36j

The source is a little sloppy but, I hope, still readable enough.

In the "main" part we try (not hard enough) to prepare a list of possible wins from the expectation of the game (the mean win). We just want amount classes with which to build the distribution of wins.

The gen() and rec_gen() functions are used to build a list representing what happened during the N experiments (throwing the die N times).
Each position in the list gives the number of occurrences of the corresponding outcome. For example:

[50,20,35], in correspondence with the previous table [[1.0/6, 200], [2.0/6, 10], [3.0/6, 0]],
says that we won 200€ 50 times, 10€ 20 times and 0€ 35 times.
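
To illustrate the idea of enumerating such count lists, here is a small recursive generator. It is a sketch of the general technique only: the name `count_lists` is hypothetical, and this is not the gen() / rec_gen() code from the pastebin, which may work differently.

```python
def count_lists(n, k):
    """Yield every list of k non-negative integers that sum to n.

    Each list says how many times each of the k outcomes occurred
    during n throws.
    """
    if k == 1:
        # Only one outcome left: it must absorb all remaining throws.
        yield [n]
        return
    for first in range(n + 1):
        for rest in count_lists(n - first, k - 1):
            yield [first] + rest

configs = list(count_lists(3, 3))
print(configs[:4])   # [[0, 0, 3], [0, 1, 2], [0, 2, 1], [0, 3, 0]]
print(len(configs))  # 10 configurations for 3 throws and 3 outcomes
```

For 100 throws and 3 outcomes the same generator yields 5151 lists, which is far fewer than the 3^100 possible sequences of throws.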

The callback function hands control to the trial() function, which computes the amount won and its probability.

The amount won is easily calculated: gain = 50 * 200 + 20 * 10 + 35 * 0 = 10200€

A trial corresponding to a configuration such as [50,20,35] is a succession of independent experiments, so any one particular sequence of throws with these counts has this probability:
P(50,20,35) = 1/6 * 1/6 * ... * 1/6 (50 times) * 2/6 * 2/6 * ... * 2/6 (20 times) * 3/6 * 3/6 * ... * 3/6 (35 times) = (1/6)^50 * (2/6)^20 * (3/6)^35.

But there are many trials that correspond to [50,20,35].
I used the basic law of counting:
the number of possible arrangements of n elements, where groups of k1, k2, ... elements are indistinguishable, is n! / (k1! * k2! * ...).
In this case n = 50 + 20 + 35 = 105, so we compute a = 105! / (50! * 20! * 35!)

And the final probability for all the cases corresponding to [50,20,35] is P(50,20,35) * a
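
The two steps (probability of one sequence, then the number of arrangements) can be sketched as below. The function name `evaluate_config` is hypothetical and this is not the trial() function from the published program; it only follows the formulas described here. It relies on plain floats, which is fine for about a hundred throws but would need logarithms for much longer games.

```python
from math import factorial

# [probability, win in euros] for each outcome of one throw.
outcomes = [[1.0 / 6, 200], [2.0 / 6, 10], [3.0 / 6, 0]]

def evaluate_config(counts, outcomes):
    """Return (gain, probability) for a list of outcome counts."""
    n = sum(counts)
    gain = sum(k * amount for k, (_, amount) in zip(counts, outcomes))

    # Probability of ONE particular sequence having these counts.
    p_sequence = 1.0
    for k, (p, _) in zip(counts, outcomes):
        p_sequence *= p ** k

    # Number of distinct sequences: n! / (k1! * k2! * ...).
    arrangements = factorial(n)
    for k in counts:
        arrangements //= factorial(k)  # exact at every step

    return gain, arrangements * p_sequence

gain, probability = evaluate_config([50, 20, 35], outcomes)
print(gain)  # 10200
```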

The last function, storegain(), just adds the probability to the distribution class we have chosen, using the Python bisect library.
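
As an illustration of how bisect can pick the class, here is a minimal sketch. The class bounds and the name `store_gain` are assumptions chosen for the example; the real storegain() in the pastebin may use different bounds and conventions.

```python
import bisect

# Upper bounds of the amount classes (hypothetical values).
# Class i holds the gains g with bounds[i-1] < g <= bounds[i];
# one extra class at the end catches everything above the last bound.
bounds = [1000, 2000, 3000, 4000, 5000]
distribution = [0.0] * (len(bounds) + 1)

def store_gain(gain, probability):
    """Add the probability to the class that contains the gain."""
    index = bisect.bisect_left(bounds, gain)
    distribution[index] += probability

store_gain(500, 0.1)    # class 0 (up to 1000)
store_gain(1000, 0.2)   # class 0 as well, 1000 is an inclusive bound
store_gain(2500, 0.3)   # class 2 (2000 to 3000)
store_gain(9000, 0.4)   # overflow class
```

Because bisect does a binary search, finding the class stays fast even with many narrow classes.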

I hope this gives you a quick way to understand the program if you have a basic knowledge of Python and probability.

As you see, this method (generating lists such as [50,20,35]) makes it possible to evaluate very long trials without having to generate a lot of data.
It perhaps already exists in some library, but I didn't take the time to check, because what interests me is to invent and create.
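
Putting the pieces together, a compact end-to-end sketch for the dice game could look like this. It is written only for a game with exactly three outcomes, the name `gain_distribution` is mine, and it is not the published program; `math.comb` needs Python 3.8 or later.

```python
from collections import defaultdict
from math import comb

outcomes = [[1.0 / 6, 200], [2.0 / 6, 10], [3.0 / 6, 0]]

def gain_distribution(n, outcomes):
    """Exact distribution of the total gain after n throws.

    Assumes exactly three outcomes. Returns a dict {gain: probability}.
    """
    (p1, w1), (p2, w2), (p3, w3) = outcomes
    dist = defaultdict(float)
    for k1 in range(n + 1):
        for k2 in range(n - k1 + 1):
            k3 = n - k1 - k2
            # Number of sequences with these counts: n! / (k1! k2! k3!).
            ways = comb(n, k1) * comb(n - k1, k2)
            gain = k1 * w1 + k2 * w2 + k3 * w3
            # Different count lists can give the same gain, so accumulate.
            dist[gain] += ways * p1 ** k1 * p2 ** k2 * p3 ** k3
    return dist

dist = gain_distribution(100, outcomes)
p_400_500 = sum(p for gain, p in dist.items() if 400 <= gain <= 500)
print(p_400_500)  # probability of ending between 400 and 500 euros
```

The double loop visits only the 5151 count lists for 100 throws, so it runs almost instantly.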

It should also be possible to modify the rec_gen() function easily to make it compatible with threads.

I just found CLT (Central Limit Theorem) in a list of suggestions while searching my email.
It is probably a way to warn me that the program I just proposed is not necessary.
I have even been warned in the street about this: "he made a fool of himself, it can be computed very easily with the help of the CLT".

The distribution I am trying to compute is indeed approximately Normal when the die is thrown n times (n sufficiently large).
So you can find the parameters of the Normal distribution, as an approximation, with the help of Var(Xn) = n * Var(X) and
E(Xn) = n * E(X), where
Xn = sum of the Xi
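
A sketch of this Normal approximation, using only the standard library, is given below. The helper names `mean_and_variance` and `normal_cdf` are mine. Note that the 400€ to 500€ range lies more than four standard deviations below the mean, a region where a Normal approximation is typically at its least reliable.

```python
from math import erf, sqrt

outcomes = [[1.0 / 6, 200], [2.0 / 6, 10], [3.0 / 6, 0]]

def mean_and_variance(outcomes):
    """E(X) and Var(X) for a single throw."""
    mean = sum(p * x for p, x in outcomes)
    variance = sum(p * (x - mean) ** 2 for p, x in outcomes)
    return mean, variance

def normal_cdf(x, mu, sigma):
    """P(Z <= x) for a Normal variable with mean mu and std dev sigma."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

n = 100
mean, variance = mean_and_variance(outcomes)
mu_n = n * mean               # E(Xn) = n * E(X)
sigma_n = sqrt(n * variance)  # Var(Xn) = n * Var(X)

# Approximate probability of ending between 400 and 500 euros.
p_approx = normal_cdf(500, mu_n, sigma_n) - normal_cdf(400, mu_n, sigma_n)
print(round(mu_n, 2), round(sigma_n, 1))  # 3666.67 731.8
print(p_approx)
```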

But depending on the precision you need, you might prefer the program I propose. As an illustration, I will soon show you a graph of the distribution using moderately narrow gain-amount classes for N=100 throws.
With the basic dice distribution I used in the program, the result looks like a positive sinusoidal signal modulated by a Normal (bell-shaped) curve.
To demonstrate this I have a new version of the program that depends on the matplotlib.pyplot Python module (which must be installed).

https://pastebin.com/qxdBTjf8

And a new graph using the "raw" data (approximately 5000 points), produced with https://pastebin.com/sAFJR65x:
I need to study this one because it is kind of mysterious. Why so many perfect curves?
These are all the gains (x) and probabilities (y) generated by the program, not grouped into classes (amount ranges).
At least not very feminine!