One of the key calculations underpinning our model is that we need to be able to add together probabilities of winning a game, so that we can combine results from different sets of champions to make a team. But you can’t simply add probabilities, because they are bounded by 0 and 1: you cannot have a probability greater than 1 or less than 0.
Let’s say the probability of winning in one game with a specific set of champions (set A) was 0.7, and the probability with another set of champions (set B) was 0.6. If we add 0.6 + 0.7 we get 1.3…which is not a valid probability!
| Champion 1 (lane) | Champion 2 (lane) | Champion 3 (lane) | p Win |
|---|---|---|---|
| Garen (TOP) | Kalista (BOTTOM) | Annie (BOTTOM) | 0.7 |
| Champion 1 (lane) | Champion 2 (lane) | p Win |
|---|---|---|
| Sejuani (JUNGLE) | Katarina (MIDDLE) | 0.6 |
| Champion 1 (lane) | Champion 2 (lane) | Champion 3 (lane) | Champion 4 (lane) | Champion 5 (lane) | p Win |
|---|---|---|---|---|---|
| Garen (TOP) | Kalista (BOTTOM) | Annie (BOTTOM) | Sejuani (JUNGLE) | Katarina (MIDDLE) | 1.3 |
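The problem in the tables above can be shown in a couple of lines of Python (the probabilities are the ones from the example; the variable names are just for illustration):

```python
# Naively adding win probabilities produces an invalid result.
p_set_a = 0.7  # Garen / Kalista / Annie
p_set_b = 0.6  # Sejuani / Katarina

naive_sum = p_set_a + p_set_b
print(naive_sum)                 # about 1.3 -- not a valid probability
print(0 <= naive_sum <= 1)       # False
```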
The problem is caused by the non-linearity of the probability scale. The magnitude of the difference between 0.5 and 0.6 is smaller than that between 0.8 and 0.9. On the graph shown below, a greater range of X values (2 to 4) falls between 0.8 and 0.9 than between 0.5 and 0.6 (0 to 0.25) on the X axis.
Odds not probabilities
An improvement on this is to consider a win/loss ratio. This is a form of odds – the probability of you winning, versus the probability of you not winning. Winning half of your games gives you a win/loss ratio of 1. If you lose more you go below 1 but stay above 0. If you win more you go above 1, with no upper limit. So on this scale the boundaries are 0 and infinity, which means that we could keep adding odds together without ever reaching a limit. But the problem is that half the distribution is squeezed between 0 and 1 and the other half is stretched out between 1 and infinity – things here are also non-linear, so they cannot just be added together.
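The conversion from probability to odds is a one-liner (a minimal sketch; the function name is just for illustration):

```python
def odds(p):
    """Win/loss ratio: probability of winning versus probability of not winning."""
    return p / (1 - p)

print(odds(0.5))   # 1.0 -- winning half your games
print(odds(0.25))  # about 0.33 -- losing more squeezes you between 0 and 1
print(odds(0.8))   # about 4 -- winning more stretches you out towards infinity
```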
So how do we fix this? Well, if we take the natural logarithm of the win/loss ratio (the odds), then the scale becomes linear! The natural logarithm is the logarithm of a number to the base e (also written as ln). This might seem a little confusing, and it isn’t particularly relevant exactly what it means here, but if you’re interested then have a Google around for it. In this instance, and in most of statistics, statisticians use the term “log” to mean the natural logarithm (unlike in conventional mathematics).
This handy little fix can be shown by considering the distribution across our new scale. The log of 1 is 0, the log of 0 is negative infinity, and the log of infinity is infinity. So now half of our distribution is between negative infinity and 0, and the other half is between 0 and infinity, meaning that things are evenly spread.
This change from probability to log-odds then changes the relationship with X values from a non-linear to a linear one. Because the distribution is now linear, the gradient is constant, so an increase in the log-odds of 1 is the same at any point along the graph, meaning that we can add and subtract different log-odds and still get valid and meaningful results.
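Putting it all together for the earlier example: convert each probability to log-odds, add them, and convert the sum back to a probability with the logistic (inverse log-odds) function. This is a sketch of the idea rather than the full model – the function names are just for illustration, and simply summing the two log-odds assumes the champion sets contribute independently:

```python
import math

def log_odds(p):
    """Natural log of the win/loss ratio."""
    return math.log(p / (1 - p))

def inv_log_odds(lo):
    """Logistic function: maps log-odds back to a probability."""
    return 1 / (1 + math.exp(-lo))

combined = log_odds(0.7) + log_odds(0.6)  # valid on the log-odds scale
p_team = inv_log_odds(combined)
print(round(p_team, 3))  # about 0.778 -- a valid probability this time
```

Adding log-odds is the same as multiplying odds (2.33 × 1.5 = 3.5, and 3.5 / 4.5 ≈ 0.778), which is why the result always lands back between 0 and 1.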
Hugo – Chinchillarama – Pedder
Natural logarithms – http://www.purplemath.com/modules/logs3.htm