The life of an affiliate is full of A/B tests. One day you need to split-test creatives and find out which one has the better CTR; the next you want to put a lead form on a pre-lander and see whether it improves conversion; then the affiliate program adds new landing pages that look fine visually, but will they bring in leads?

And what happens next? The affiliate takes two or three creatives, throws them into FB and, let's say, gets the following results after running them:

Creo1: 2800 impressions - 100 clicks = 3.6% CTR

Creo2: 3000 impressions - 100 clicks = 3.3% CTR

Creo3: 3700 impressions - 100 clicks = 2.7% CTR

"It's as clear as day!" cries our affiliate excitedly, "Creo1 has the highest CTR, fuck all the others!"

Or, let's say he has a couple of pre-landers. He sends 100 clicks to each one and watches the click-through rate:

The first pre-lander gets 25% and the second 37%.

"Ahaaaah," yells the affiliate, "the first pre-lander is shit!"

And everything would be fine, but for some reason, once all the elements tested this way are put together, the funnel doesn't convert.

Or it converts, but the final numbers do not match the ones from the tests. "Affiliate marketing is one big lottery," decides the affiliate, and goes off to work at a factory.

While he is on his way, let's see where he went wrong. For that we would do well to dive into probability theory and statistics, but we won't, because it's boring, tedious and abstruse, and we have traffic to run.

Anyone interested can head to Wikipedia and follow the links from there. For the time being, let us understand one simple thing: the data we get from a test may not be enough to draw an unequivocal conclusion as to whether value A is better than value B.
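To get a feel for how little 100 clicks can tell us, here is a minimal Python sketch that puts a 95% interval around each rate from the second example (25 and 37 click-throughs out of 100 clicks). The helper name `wilson_interval` is made up for illustration, and the Wilson score interval is my choice of method, not necessarily what any particular calculator uses.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, trials, confidence=0.95):
    """Wilson score interval for a rate (hypothetical helper, for illustration)."""
    # z is the normal quantile for the chosen two-sided confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z * z / (4 * trials ** 2)) / denom
    return center - half, center + half


low_a, high_a = wilson_interval(25, 100)  # first pre-lander
low_b, high_b = wilson_interval(37, 100)  # second pre-lander

print(f"first:  {low_a:.1%} .. {high_a:.1%}")
print(f"second: {low_b:.1%} .. {high_b:.1%}")

# The ranges overlap when the lower end of the better variant
# sits below the upper end of the worse one.
intervals_overlap = low_b < high_a
print("overlap:", intervals_overlap)
```

The two ranges overlap, so the plausible "true" rates of the two variants are not cleanly separated yet. Overlap is only a rough visual cue, not a formal test, but it shows why 100 clicks per variant is thin.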

So how do we know whether we've sent enough traffic or not? For exactly such cases, clever people long ago came up with online statistical significance calculators, and we are going to work through one of them.

Open one, go to the "Test Results" tab, enter the data from the first example with the creatives, and look at the result:

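If you look closely, there is a slider at the bottom that defaults to 95%. That is the confidence level: a difference counts as significant only if a gap this large would show up by chance less than 5% of the time when the creatives actually perform the same. A gap that does not clear this bar, like ours, may well shrink or flip once more traffic is sent!

The second example gives the same picture; you can check it yourself.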
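For those who prefer to check by hand, here is a rough sketch of the kind of calculation such a calculator might run: a pooled two-proportion z-test. This is an assumption on my part, since the exact method behind any given calculator is not stated here, and the function name `two_proportion_p_value` is invented for the example. It compares the two leading creatives (Creo1 and Creo2) and the two variants from the second example.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test (illustrative helper)."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both variants perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Creo1 vs Creo2: 100 clicks on 2800 and on 3000 impressions.
p_creatives = two_proportion_p_value(100, 2800, 100, 3000)

# Second example: 25 vs 37 click-throughs on 100 clicks each.
p_prelanders = two_proportion_p_value(25, 100, 37, 100)

for name, p in (("creatives", p_creatives), ("second example", p_prelanders)):
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{name}: p = {p:.3f} -> {verdict} at the 95% level")
```

Both p-values come out above 0.05, so at the 95% level neither comparison separates the variants.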

So what would the sample size have to be for the difference to be significant?

Let's take our second example with the pre-landers. Recall that one has a click-through rate of 25% and the other 37% on 100 clicks each, so the difference between them is 12 percentage points. Go to the first tab, "Sample Size", and enter 25% as the baseline and 12% as the difference.

We see that the sample size should be 2 times larger! To check, go back to "Test Results" and simply multiply our numbers by 2 (that is, imagine that after sending another 100 clicks the click-through rates stay the same, which is of course only an assumption that further tests would have to confirm).
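The "2 times larger" figure can be sanity-checked with the textbook sample-size formula for comparing two proportions. The sketch below assumes a two-sided 5% significance level and 80% power; the power setting is my assumption (a common default) and may differ from what a given calculator uses. The function name `sample_size_per_variant` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(p_base, p_new, alpha=0.05, power=0.80):
    """Approximate clicks needed per variant to detect p_base vs p_new.

    alpha and power are assumed defaults, not values taken from the article.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_base + p_new) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_base * (1 - p_base) + p_new * (1 - p_new))
    ) ** 2
    return ceil(numerator / (p_base - p_new) ** 2)


clicks_needed = sample_size_per_variant(0.25, 0.37)
print("clicks per variant:", clicks_needed)
```

Under these assumptions the estimate lands a little over 200 clicks per variant, roughly double the 100 we had.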

Only now can we say, at the 95% confidence level, that the test has actually separated the pre-landers.

That wraps up this brief introduction to statistics. Count everything and run your traffic in the plus, gentlemen!
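The doubling trick can be reproduced in a few lines. This sketch holds the rates fixed at 25% and 37% (the same guess as in the text) and only changes the number of clicks per variant; it uses a pooled two-proportion z-test, which is my assumption about the method, and `p_value_at` is a made-up helper name.

```python
from math import sqrt
from statistics import NormalDist


def p_value_at(clicks, rate_a=0.25, rate_b=0.37):
    """Two-sided p-value if both variants get `clicks` clicks at fixed rates."""
    # With equal clicks per variant the pooled rate is the plain average.
    pooled = (rate_a + rate_b) / 2
    std_err = sqrt(pooled * (1 - pooled) * 2 / clicks)
    z = abs(rate_a - rate_b) / std_err
    return 2 * (1 - NormalDist().cdf(z))


for clicks in (100, 200):
    p = p_value_at(clicks)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{clicks} clicks each: p = {p:.4f} -> {verdict} at the 95% level")
```

The same 12-point gap that is not significant on 100 clicks becomes significant on 200, provided the rates really do hold.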