What we’ve wanted to do for a long time (as maybe you all have) is to quantify and assess how different members of a team interact, to use these interactions to predict how well a team would perform on average, and to suggest possible new combinations of champions that might play well together. What we really wanted to pin down was synergy.
Synergy: The interaction of two or more agents or forces so that their combined effect is greater than the sum of their individual effects.
We’ve spent the past six months developing and implementing a model to do this. Countless nights poring over messy diagrams, confusing algorithms and (as a contrast) reasonably elegant code. Sometimes we loved it, sometimes it made our heads hurt, but we finally have a product that we believe achieves our objectives.
To outline some of the benefits of our model over other LoL team statistical analyses: we use indirect evidence on synergistic relationships
- to strengthen estimates of champion relationships where data are available
- to give estimates of champion relationships where there are as yet no data available
- to account for relationships within team compositions which might hide specific effects that we want to know about
All this means we can predict the success of specific team comps that do not appear anywhere in the dataset.
We’ve used one thousand solo queue games (mixed tiers), pulled from Riot’s API, as the input for the model. Ideally we’d like a larger dataset, but we’re happy to make do with what we’ve got for now! We look at the synergy between every available champion in every lane on every available team, and we inform these relationships from our thousand games of data.
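As a rough sketch of the pairwise estimation step, here is the simplest possible version: score each champion pair by its win rate when the two appear on the same team. The function and data layout below are illustrative, not our actual implementation (our model does considerably more pooling than this):

```python
from collections import defaultdict

def pairwise_synergy(matches):
    """Naive pairwise synergy: the win rate of each unordered champion
    pair when both champions are on the same team."""
    wins = defaultdict(int)
    games = defaultdict(int)
    for match in matches:
        for team, won in ((match["blue"], match["blue_won"]),
                          (match["red"], not match["blue_won"])):
            # Count every unordered champion pair on this team.
            for i in range(len(team)):
                for j in range(i + 1, len(team)):
                    pair = tuple(sorted((team[i], team[j])))
                    games[pair] += 1
                    wins[pair] += won
    return {pair: wins[pair] / games[pair] for pair in games}

# Two toy matches (hypothetical data, not from the real dataset):
matches = [
    {"blue": ["Ashe", "Leona"], "red": ["Jinx", "Thresh"], "blue_won": True},
    {"blue": ["Ashe", "Leona"], "red": ["Jinx", "Braum"], "blue_won": False},
]
print(pairwise_synergy(matches)[("Ashe", "Leona")])  # 0.5
```

A raw win rate like this is exactly the kind of estimate that then needs weighting by how much data sits behind it, which is where the confidence intervals below come in.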
We then pool these synergistic effects to identify the team with the most synergy, based on the data. Doing this properly means accounting for various uncertainties and dependencies. For instance, if there were fewer data informing a particular team composition, we had to give our estimate less “weight” in the analysis. This is best expressed by calculating a confidence interval, which indicates the range within which we expect the “true” value to lie: the less data behind an estimate, the wider the interval.
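To make the “weight” idea concrete, a simple normal-approximation (Wald) interval for a win rate widens as the number of games shrinks. This is a textbook sketch of the principle, not our exact weighting scheme:

```python
import math

def win_rate_ci(wins, games, z=1.96):
    """95% normal-approximation confidence interval for a win rate.
    Fewer games -> wider interval -> less weight in the analysis."""
    p = wins / games
    half_width = z * math.sqrt(p * (1 - p) / games)
    return max(0.0, p - half_width), min(1.0, p + half_width)

print(win_rate_ci(6, 10))      # same 60% win rate, wide interval
print(win_rate_ci(600, 1000))  # same 60% win rate, narrow interval
```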
The tricky part of assessing this confidence is that, to calculate it accurately, we need to consider which of our inputs can be thought of as independent and which cannot. An easy illustration: repeated measurements from the same source (e.g. one person’s blood pressure taken at several time points) are likely to be more similar to each other than measurements from different people. These dependencies need to be properly quantified so that they can be accounted for when calculating the confidence of the estimates.
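One standard way to quantify this kind of dependence is the design effect: if observations come in clusters of size m with within-cluster correlation rho, the effective sample size shrinks by a factor of 1 + (m − 1) × rho. The numbers below (rho = 0.3, clusters of 5) are made up for illustration:

```python
def effective_sample_size(n, cluster_size, rho):
    """Effective sample size of n observations grouped into correlated
    clusters, using the design effect 1 + (m - 1) * rho."""
    design_effect = 1 + (cluster_size - 1) * rho
    return n / design_effect

# 1000 games, but games sharing a team composition are correlated,
# so they carry less information than 1000 independent observations:
print(effective_sample_size(1000, 5, 0.3))  # about 455
```

This is why ignoring the dependencies would make our confidence intervals look deceptively narrow.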
Once we’d managed this, all that remained was to rank the teams in order of predicted performance, and to look at how much confidence we had in each prediction. These two things, the ranking and its confidence, were the results of our model.
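Putting the pieces together, the ranking step amounts to sorting compositions by their point estimate while carrying the interval along. All the compositions and numbers here are hypothetical, just to show the shape of the output:

```python
# Hypothetical (team_comp, estimate, (lo, hi)) tuples; the numbers are
# made up for illustration, not real model output.
predictions = [
    ("Ashe/Leona", 0.58, (0.51, 0.65)),
    ("Jinx/Thresh", 0.61, (0.44, 0.78)),  # less data, so a wider interval
    ("Lux/Braum", 0.55, (0.52, 0.58)),
]

# Rank by point estimate, highest first, keeping the interval attached.
ranked = sorted(predictions, key=lambda t: t[1], reverse=True)
for comp, est, (lo, hi) in ranked:
    print(f"{comp}: predicted win rate {est:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

Note that the top-ranked comp here also has the widest interval, which is exactly the trade-off between the ranking and its confidence that the model reports.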