Predicting football outcomes requires more than knowing which team has won the most games. You need to understand how confident you can be in each team's measured strength. A newly promoted side that has played only a handful of matches carries far more uncertainty than a consistent top-four contender with dozens of results on record. This is where our Glicko-2 rating system provides a decisive advantage.
Developed by Professor Mark Glickman at Boston University, Glicko-2 is a natural evolution of both the original ELO system and the first-generation Glicko algorithm. While ELO assigns a single number to represent strength, Glicko-2 adds two critical dimensions: Rating Deviation (how certain the rating is) and Volatility (how consistently the team performs). Together, these three parameters give our prediction models a far richer understanding of each team's true quality.
How Glicko-2 Differs from ELO
The standard ELO system, originally created by Arpad Elo for chess, produces a single rating value. When two teams meet, points move toward whichever side outperforms its expected result: a win over a stronger opponent earns more than a win over a weaker one, and a draw shifts points from the favorite to the underdog. This approach has served sports analytics well for decades, but it has a fundamental limitation: it treats every rating as equally trustworthy.
Consider two teams, both rated at 1600 ELO. One has played 30 matches this season with steady results; the other has played only 5 matches with wildly varying performances. ELO treats these ratings as identical. Glicko-2 recognizes that the first rating is far more reliable and encodes this distinction directly into the system through Rating Deviation and Volatility.
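To make the limitation concrete, here is a minimal sketch of the textbook ELO calculation in Python. The K-factor of 20 and the function names are assumptions chosen for illustration, not the values used on our platform. Note that nothing in the formula knows how many matches sit behind a rating.

```python
def elo_expected(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard logistic ELO curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating: float, expected: float, score: float, k: float = 20.0) -> float:
    """Return the new rating; score is 1 for a win, 0.5 for a draw, 0 for a loss."""
    return rating + k * (score - expected)


# Two teams on 1600: one with 30 matches played, one with 5.
# ELO has no input for sample size, so both receive the same adjustment.
expected = elo_expected(1600, 1600)      # 0.5 for equal ratings
print(elo_update(1600, expected, 1.0))   # 1610.0 with the assumed K of 20
```

Whether the 1600 rating rests on 30 matches or 5, a win against an equally rated opponent moves it by the same amount.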
Understanding Rating Deviation (RD)
Rating Deviation Explained
Rating Deviation measures the uncertainty surrounding a team's rating. Think of it as the basis for a confidence interval: a team's true strength lies within about two RDs of its rating roughly 95% of the time. A team with a rating of 1700 and an RD of 30 is therefore very likely between 1640 and 1760 in true strength. A team rated 1700 with an RD of 90 could realistically be anywhere from 1520 to 1880. The lower the RD, the more you can trust the rating.
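The ranges quoted above are the rating plus or minus two RDs, which is the usual approximate 95% interval for a Glicko-style rating. A short Python sketch of that arithmetic follows; the function name and the default width of 2 are illustrative choices.

```python
def rating_interval(rating: float, rd: float, width: float = 2.0) -> tuple[float, float]:
    """Return (low, high) bounds at `width` rating deviations either side of the rating."""
    return rating - width * rd, rating + width * rd


print(rating_interval(1700, 30))  # (1640.0, 1760.0)
print(rating_interval(1700, 90))  # (1520.0, 1880.0)
```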
RD naturally increases over time when a team is inactive. If a club has not played for several weeks, the system acknowledges that its true strength may have shifted and widens the uncertainty accordingly. As new match results come in, the RD decreases again, reflecting renewed confidence in the updated rating. This self-correcting mechanism makes Glicko-2 particularly well suited to football, where international breaks and cup schedules create irregular gaps between league fixtures.
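In Glickman's published Glicko-2 procedure, a rating period with no matches leaves the rating and volatility unchanged and grows the deviation by adding the squared volatility to the squared RD on the system's internal scale. The Python sketch below applies that step repeatedly. What counts as a rating period for a football club (a matchweek, for instance) and the 350 ceiling are assumptions made for the example, and the function name is hypothetical.

```python
import math

GLICKO2_SCALE = 173.7178  # converts between display-scale RD and Glicko-2's internal scale
MAX_RD = 350.0            # assumed ceiling: the RD conventionally given to an unrated team


def inflate_rd(rd: float, volatility: float, idle_periods: int = 1) -> float:
    """Return the RD after `idle_periods` rating periods with no matches played."""
    phi = rd / GLICKO2_SCALE
    for _ in range(idle_periods):
        # Each idle period adds the squared volatility to the squared deviation.
        phi = math.sqrt(phi ** 2 + volatility ** 2)
    return min(phi * GLICKO2_SCALE, MAX_RD)


# RD for a team starting at 50 after 0, 1, and 4 idle periods.
for periods in (0, 1, 4):
    print(periods, round(inflate_rd(50.0, 0.06, periods), 2))
```

Because the growth depends on volatility, an erratic team's rating loses reliability faster during a break than a steady team's does.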
Understanding Volatility
Volatility Explained
Volatility measures the consistency of a team's performance relative to expectations. A team with low volatility performs predictably, whether they win or lose. A team with high volatility produces frequent surprises, such as beating top sides one week and losing to relegation candidates the next. High volatility signals that the team's true strength is harder to pin down.
Volatility is especially valuable for identifying teams in transition. A squad that has recently changed managers, made significant transfer window signings, or suffered a string of injuries will typically show elevated volatility. Our prediction models use this signal to appropriately widen the range of likely outcomes when such teams are involved.
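For readers who want to see where volatility enters the arithmetic, the following is a compact Python sketch of the update steps in Glickman's published Glicko-2 description. It is not our production code. The system constant `TAU = 0.5`, the convergence tolerance, and all function names are assumptions for illustration, and football-specific choices such as how matches are grouped into rating periods are left out.

```python
import math

SCALE = 173.7178   # Glicko-2 internal scale factor
TAU = 0.5          # system constant limiting volatility change (assumed value)


def _g(phi):
    """Down-weight a result according to the opponent's rating deviation."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi * phi / math.pi ** 2)


def _expected(mu, mu_j, phi_j):
    return 1.0 / (1.0 + math.exp(-_g(phi_j) * (mu - mu_j)))


def glicko2_update(rating, rd, sigma, results, tau=TAU, eps=1e-6):
    """results: non-empty list of (opponent_rating, opponent_rd, score) for one rating period."""
    mu, phi = (rating - 1500.0) / SCALE, rd / SCALE
    v_inv = 0.0      # accumulates the inverse of the estimated variance v
    surprise = 0.0   # accumulates weighted (actual - expected) scores
    for opp_rating, opp_rd, score in results:
        mu_j, phi_j = (opp_rating - 1500.0) / SCALE, opp_rd / SCALE
        e = _expected(mu, mu_j, phi_j)
        v_inv += _g(phi_j) ** 2 * e * (1.0 - e)
        surprise += _g(phi_j) * (score - e)
    v = 1.0 / v_inv
    delta = v * surprise

    # Solve for the new volatility with the iterative scheme from the Glicko-2 description.
    a = math.log(sigma ** 2)

    def f(x):
        ex = math.exp(x)
        num = ex * (delta ** 2 - phi ** 2 - v - ex)
        den = 2.0 * (phi ** 2 + v + ex) ** 2
        return num / den - (x - a) / tau ** 2

    big_a = a
    if delta ** 2 > phi ** 2 + v:
        big_b = math.log(delta ** 2 - phi ** 2 - v)
    else:
        k = 1
        while f(a - k * tau) < 0:
            k += 1
        big_b = a - k * tau
    f_a, f_b = f(big_a), f(big_b)
    while abs(big_b - big_a) > eps:
        big_c = big_a + (big_a - big_b) * f_a / (f_b - f_a)
        f_c = f(big_c)
        if f_c * f_b <= 0:
            big_a, f_a = big_b, f_b
        else:
            f_a /= 2.0
        big_b, f_b = big_c, f_c
    new_sigma = math.exp(big_a / 2.0)

    # Widen the deviation by the new volatility, then shrink it using the period's evidence.
    phi_star = math.sqrt(phi ** 2 + new_sigma ** 2)
    new_phi = 1.0 / math.sqrt(1.0 / phi_star ** 2 + 1.0 / v)
    new_mu = mu + new_phi ** 2 * surprise
    return new_mu * SCALE + 1500.0, new_phi * SCALE, new_sigma
```

As a check, Glickman's own worked example, `glicko2_update(1500, 200, 0.06, [(1400, 30, 1), (1550, 100, 0), (1700, 300, 0)])`, should come out at approximately a rating of 1464.06, an RD of 151.52, and a volatility of 0.05999.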
How Our Football Glicko-2 System Works
We do not compute a single Glicko-2 rating per team. Instead, our system tracks multiple rating dimensions, each capturing a different aspect of team performance. Every dimension carries its own Rating Deviation and Volatility values. These dimensional ratings feed our prediction models as features; the public Power Rankings table surfaces the headline overall rating and RD alongside the ELO comparison.
Overall Rating
The composite measure of a team's general strength, updated after every match based on the result and the quality of the opponent. This is the primary Glicko-2 rating and serves as the anchor for all other dimensions.
Attack Rating
Measures a team's offensive capability based on goals scored relative to expected output against each opponent's defensive quality. A high attack rating with low RD indicates a reliably potent attacking side.
Defense Rating
Captures defensive solidity based on goals conceded relative to the attacking strength of opponents faced. Teams that consistently keep clean sheets against strong attacks will see their defense rating climb with decreasing RD.
Home Rating
Tracks performance specifically in home fixtures. Some teams gain a substantial advantage from their home ground and supporters, while others show little difference. This dimension quantifies that effect.
Away Rating
Measures how a team performs on the road. Certain squads travel well and maintain their quality away from home, whereas others see significant drops. The away rating captures this pattern independently.
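One way to picture the five dimensions described above is as a small record per team, with each dimension holding its own rating, RD, and volatility. The Python sketch below is a hypothetical data layout rather than our actual schema. The class names, the initial RD of 350, and the initial volatility of 0.06 are assumptions (the latter two are commonly used Glicko-2 defaults); only the 1500 baseline comes from this article.

```python
from dataclasses import dataclass, field

DIMENSIONS = ("overall", "attack", "defense", "home", "away")


@dataclass
class Glicko2Rating:
    rating: float = 1500.0     # starting baseline
    rd: float = 350.0          # assumed initial Rating Deviation
    volatility: float = 0.06   # assumed initial Volatility


@dataclass
class TeamRatings:
    team: str
    dimensions: dict = field(
        default_factory=lambda: {name: Glicko2Rating() for name in DIMENSIONS}
    )

    def feature_vector(self) -> list:
        """Flatten to [rating, rd, volatility] per dimension, in DIMENSIONS order."""
        features = []
        for name in DIMENSIONS:
            r = self.dimensions[name]
            features.extend([r.rating, r.rd, r.volatility])
        return features


club = TeamRatings("Example FC")
club.dimensions["attack"].rating = 1640.0
print(len(club.feature_vector()))  # 15 values: 5 dimensions x 3 parameters
```

Flattening the record this way shows how five dimensions turn into fifteen numeric features that a prediction model could consume.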
Uncertainty-Adjusted Match Predictions
One of the most powerful aspects of Glicko-2 is how Rating Deviation and Volatility directly influence prediction confidence. When two teams with low RD and low volatility face each other, our models can produce tighter, more confident probability estimates. When one or both teams carry high RD or elevated volatility, the system appropriately widens the predicted outcome distribution.
This means our predictions are not just about who is more likely to win, but about how sure we can be about that assessment. A 65% win probability backed by low-RD ratings is fundamentally different from a 65% win probability where both teams have high uncertainty, and our models treat these situations accordingly.
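A simple way to see uncertainty shaping a forecast is the expected-score formula from Glickman's Glicko work, which combines both teams' RDs and pulls the estimate toward 50% as they grow. The Python sketch below illustrates that idea only. Our models consume RD and Volatility as features and may use them differently, the function names are hypothetical, and the result is an expected score (a draw counts as half) rather than a three-way win, draw, or loss probability.

```python
import math

SCALE = 173.7178  # Glicko-2 internal scale factor (400 / ln 10)


def g(phi: float) -> float:
    """Shrinkage factor: 1.0 with no uncertainty, smaller as uncertainty grows."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi * phi / math.pi ** 2)


def expected_score(rating_a: float, rd_a: float, rating_b: float, rd_b: float) -> float:
    """Expected score of A against B, discounted by the combined rating deviation."""
    mu_diff = (rating_a - rating_b) / SCALE
    phi_combined = math.sqrt(rd_a ** 2 + rd_b ** 2) / SCALE
    return 1.0 / (1.0 + math.exp(-g(phi_combined) * mu_diff))


# Same 100-point gap, different levels of certainty about both ratings.
confident = expected_score(1700, 30, 1600, 30)
uncertain = expected_score(1700, 150, 1600, 150)
print(round(confident, 3), round(uncertain, 3))
```

With the same 100-point rating gap, the low-RD pairing yields an estimate further from 50% than the high-RD pairing, which is the behaviour described above.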
Glicko-2 vs ELO: The Comparison
What Each System Captures
Traditional ELO
- Single rating number per team
- No measure of rating reliability
- No tracking of performance consistency
- Fixed K-factor adjustments
- All ratings treated as equally certain
- No distinction between active and inactive teams
Our Glicko-2 System
- Rating plus Rating Deviation plus Volatility
- Built-in confidence measure for every rating
- Tracks consistency of team performance
- Adaptive rating changes based on uncertainty
- Uncertain ratings change more after each result
- Inactivity increases RD automatically
- Multi-dimensional: attack, defense, home, away
How to Read the Power Rankings Table
Each league's Power Rankings page presents a unified table showing both ELO and Glicko-2 ratings side by side. From Glicko-2's perspective, the most important columns are those that expose uncertainty — information a plain ELO number cannot convey. Here is what each column means:
| Column | What It Represents |
|---|---|
| ELO | The team's overall ELO rating, displayed alongside G2 so you can compare the two systems' strength estimates at a glance. The table is sorted by ELO rank by default. |
| G2 | The team's overall Glicko-2 strength score. Higher is stronger; the starting baseline is 1500. When G2 and ELO imply different ranks, the Rank Δ column quantifies the disagreement. |
| RD | Rating Deviation — Glicko-2's built-in confidence measure. Below 80 means the rating is reliable. Above 120 signals significant uncertainty, typically caused by inconsistent recent form or a limited run of recent matches. |
| ELO± | Change in ELO from the most recent match. Positive = rating gained; negative = rating lost. |
| G2± | Change in Glicko-2 from the most recent match. Because Glicko-2 factors in RD when computing updates — uncertain ratings shift more — the magnitude and occasionally the sign of G2± can differ from ELO± for the same result. This is expected behaviour, not an inconsistency. |
| Rank Δ | Glicko-2 rank minus ELO rank. Positive means Glicko-2 is more skeptical (ranks the team lower). Negative means Glicko-2 is more bullish. This is the fastest way to identify where the two models fundamentally disagree on a team's true strength. |
Note on Volatility and dimensional ratings: Glicko-2 also computes Volatility and separate attack, defense, home, and away dimensions for every team. These parameters are used internally as features in our neural network prediction models. The public Power Rankings table surfaces the headline metrics — overall rating, RD, and the ELO comparison — to keep the view actionable without overwhelming detail. While Glicko-2 provides richer information than ELO or league standings, it remains one input among many in our prediction pipeline. Football outcomes are inherently uncertain, and these metrics enhance analytical understanding rather than guarantee results.
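The Rank Δ column from the table can be reproduced with a few lines of Python. This is a minimal sketch with made-up team names and ratings. The function names are hypothetical, and ties are broken by input order, which is an assumption rather than a documented rule of our tables.

```python
def rank_positions(ratings: dict) -> dict:
    """Map each team to its 1-based rank, highest rating first."""
    ordered = sorted(ratings, key=ratings.get, reverse=True)
    return {team: position for position, team in enumerate(ordered, start=1)}


def rank_delta(elo: dict, g2: dict) -> dict:
    """Glicko-2 rank minus ELO rank; positive means Glicko-2 ranks the team lower."""
    elo_rank = rank_positions(elo)
    g2_rank = rank_positions(g2)
    return {team: g2_rank[team] - elo_rank[team] for team in elo}


# Illustrative numbers only.
elo = {"Team A": 1650, "Team B": 1620, "Team C": 1580}
g2 = {"Team A": 1640, "Team B": 1575, "Team C": 1600}
print(rank_delta(elo, g2))  # {'Team A': 0, 'Team B': 1, 'Team C': -1}
```

Here Glicko-2 is more skeptical of Team B (positive value) and more bullish on Team C (negative value), matching the sign convention in the table.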
Explore Glicko-2 Ratings Across Europe's Top Leagues
View the latest Glicko-2 standings for all five major European football leagues. Each table shows every team's overall Glicko-2 rating and Rating Deviation alongside its ELO rating, with the most recent rating changes and the rank difference between the two systems.
Conclusion
Glicko-2 represents a significant advancement over traditional ELO for football team rating. By encoding how certain we are about each rating through Rating Deviation, and how consistently a team performs through Volatility, the system provides our prediction models with information that a single rating number simply cannot convey.
When you see a Glicko-2 rating on our platform, you are seeing not just an estimate of team strength, but a complete statistical profile: how strong the team is, how confident that assessment is, and how predictably the team has been performing. Combined with attack, defense, home, and away dimensions, these ratings give our neural network models a comprehensive and nuanced view of every team in Europe's top five leagues.
This depth of information is what allows our system to generate predictions that go beyond surface-level analysis, capturing the subtleties of form, reliability, and situational context that influence real match outcomes.