Remember the final table of the 2009 World Series of Poker's Main Event? Phil Ivey, the consensus best player in poker, put his tournament on the line with Ace-King. His opponent—Darvin Moon, the consensus best logger in the Western Maryland panhandle—held a dominated Ace-Queen. Ivey was the overwhelming favorite. And then ... luck happened.
The poker gods granted Moon his Queen (a 3:1 long shot), and the most dangerous player at the table was out in seventh place.
With five players remaining, Joe Cada risked his tournament with pocket Threes against Jeff Shulman's pocket Jacks, and managed to find the third Three (a 4:1 long shot). With three players remaining, Cada risked his tournament again—this time with pocket Twos against Antoine Saout's pocket Queens—and again found a way to win (another 4:1 long shot).
The last two players standing? Moon and Cada.
Advanced metrics in baseball, basketball, and football are now an expected part of ESPN's broadcast. But for more than 10 years, poker has lacked the measurements to answer viewers' most basic questions: Who's playing the best? Who's gotten lucky?
To understand luck and skill, the yin and yang forces of poker, you need to understand their foundational metric: the Situation Score. A Situation Score captures the amount that a player in a given situation usually ends up winning (or losing) in the hand.
To generate a Situation Score, you take the situation you're interested in, find a gaggle of similarly situated players in a relevant historical database, and calculate their average end-of-hand outcome. Here are some soft intuition-builders:
- If a player is dealt pocket Aces at the same time an opponent is dealt pocket Kings, his Situation Score will be large and positive. In other words, he is really lucky. If his opponent is dealt pocket Queens instead, his Situation Score will still be positive, but not quite as large.
- If a player flops middle set at the same time an opponent flops top set, his Situation Score will be hugely negative, i.e., he is really unlucky to find his good hand edged by a marginally better one. If instead he flops middle pair, his Situation Score will be slightly negative.
Think of the Situation Score as a baseline, "replacement level" in the parlance of sports analytics.
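For readers who think in code, the recipe above can be sketched in a few lines of Python. This is only an illustration, not the project's actual method: the situation labels, the tiny `HISTORY` list, and the outcome numbers are all made up, and a real database would need a far richer notion of "similarly situated."

```python
from statistics import mean

# Hypothetical historical records: (situation label, end-of-hand result).
# Labels, units, and numbers are invented for illustration only.
HISTORY = [
    ("AA_vs_KK_preflop", 45.0),
    ("AA_vs_KK_preflop", 60.0),
    ("AA_vs_KK_preflop", -30.0),
    ("AA_vs_QQ_preflop", 40.0),
    ("AA_vs_QQ_preflop", -20.0),
]

def situation_score(situation, history):
    """Average end-of-hand outcome of every past player in this situation."""
    outcomes = [result for label, result in history if label == situation]
    if not outcomes:
        raise ValueError(f"no historical hands match {situation!r}")
    return mean(outcomes)

print(situation_score("AA_vs_KK_preflop", HISTORY))  # 25.0
print(situation_score("AA_vs_QQ_preflop", HISTORY))  # 10.0
```

With these invented numbers, Aces against Kings scores higher than Aces against Queens, matching the intuition in the first bullet.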
The most natural application of Situation Scores is measuring luck. The basic idea is to see how much the dealing of the cards changes a player's Situation Score. Slightly more formally, a Luck Score is the difference between a player's Situation Score immediately after some cards are dealt and the same score immediately before (after minus before, so good luck comes out positive). Since the only event separating those two calculations was the dealing of some cards, the difference can only be attributed to luck.
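In code, the definition is a single subtraction. The sketch below assumes the sign convention that good luck is positive, and the two Situation Scores are hypothetical values chosen only to show the mechanics.

```python
def luck_score(score_before, score_after):
    """Change in Situation Score caused by one deal of cards.

    Positive means the new cards helped the player; negative means they hurt.
    """
    return score_after - score_before

# Hypothetical Situation Scores around a single flop (made-up numbers).
before_flop = 4.0
after_flop = -6.5
print(luck_score(before_flop, after_flop))  # -10.5
```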
Take those hands from that 2009 final table. We applied a database of over a billion hands of online poker played mostly in early 2011—and collected for a project I work on called One Billion Hands—to generate Situation Scores and measure luck. (The hands were thoroughly anonymized before they ever reached us, and the suits, and flop- and hole-card order have been randomized.)
Let's return to Darvin Moon busting Phil Ivey in 2009. Ivey was the short stack at the table with about 6.5 million in chips remaining—and about to be hit with a round of blinds that would take nearly a tenth of his remaining stack. First to act, he went all in with his Ace-King. The table folded to Moon, and he called Ivey.
Because there's a new luck score every time cards come out, we know that here at the start Darvin Moon and his A-Q had a luck score of -3.17. This represents the bad luck of being dealt a dominated A-Q when an A-K is in the hand. However, when the flop came out Q-6-6, this represented a massive stroke of luck: +13.45 for the flop. The turn and river returned luck scores of +2.23 and +2.02, respectively, for Moon, which represents the luck of Ivey not spiking his King on either street.
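To see how the hand nets out for Moon, the four luck scores quoted above can simply be added up. Summing per-street scores into a hand total is an assumption made here for illustration; the figures themselves are the ones reported for this hand.

```python
# Moon's luck scores on the Ivey hand, as quoted in the text.
moon_luck = {
    "hole cards": -3.17,
    "flop": 13.45,
    "turn": 2.23,
    "river": 2.02,
}

# Assumed aggregation: add the per-street scores to get a hand total.
total = round(sum(moon_luck.values()), 2)
print(total)  # 14.53
```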
You'll see bigger spikes for unlikelier hands, like Joe Cada squeezing out flops worth +53.55 and +27.33 against Antoine Saout and Jeff Shulman. Or you'll see them swing back and forth on absolute cooler hands, like Jennifer Harman losing a hand in 2005 in which her opponent flopped a straight, she turned a full house, and he rivered a straight flush.
Hand by hand, this might not seem so different from the percentages you'll see on televised poker. You would have seen, for example, that Ivey was a 75 percent favorite to win the hand pre-flop, but fell to just a 14 percent chance to win after the flop, then 7 percent after the turn. But luck scores show a much fuller picture than that.
There are really three types of bad luck in poker. There is getting your money in with the best hand, only to see the cards screw you (a bad beat); getting your money in with the worst hand because your hand was so strong you couldn't possibly fold (a cooler); and seeing the texture of the board change in a way that means you can no longer extract value (an action-freezer). An example of that last type: you have the K-high flush, I have the A-high flush, and I'm one betting round away from stacking you when the river pairs the board. I might slow down because I'm newly scared of a full house, and lose the opportunity to take the rest of your stack.
The ESPN percentages only give you a picture of the first type of luck, the backwards-looking kind. But luck scores can account for the future outcomes of the bad luck you just encountered—the money you're now forced to dump into a pot, or the action you can't draw out of your opponent because of a board thrown into chaos. You can be 100 percent to win or lose a hand, and still hit an unlucky card that will cost you money. That's what luck score gets across.
Another, broader use of the scores is the accumulation of good or bad luck: the ability to tell the story of how lucky a player got over the course of entire levels, sessions, tournaments, or final tables. It's one thing to know that Phil Ivey got unlucky on the hand that knocked him out; it's another to know whether a player rode an overwhelming and unrelenting wave of luck to a tournament win (Cada), or ground out good results with bad cards.
Those stories have always been told through hand-by-hand anecdotes or through observers' intuition. But with luck scores, we can say, for example, that 2013 Main Event winner Ryan Riess had a positive luck score on every single level of the final table, and was the luckiest on all but one. That's the kind of story you can tell when you have a firmer grasp on what luck looks like over the long view.
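A level-by-level tally like that one can be sketched as a simple aggregation. The player names, levels, and scores below are hypothetical placeholders, not real final-table data, and adding luck scores within a level is an assumed way of accumulating them.

```python
from collections import defaultdict

# Hypothetical (player, level, luck score) records; not real tournament data.
RECORDS = [
    ("Player A", 1, 2.5),
    ("Player B", 1, -2.5),
    ("Player A", 1, 1.0),
    ("Player B", 2, 4.0),
    ("Player A", 2, -0.5),
]

def luck_by_level(records):
    """Sum each player's luck scores within each level."""
    totals = defaultdict(float)
    for player, level, score in records:
        totals[(player, level)] += score
    return dict(totals)

def luckiest_per_level(totals):
    """Return the player with the highest summed luck in each level."""
    best = {}
    for (player, level), total in totals.items():
        if level not in best or total > best[level][1]:
            best[level] = (player, total)
    return {level: player for level, (player, _) in best.items()}

totals = luck_by_level(RECORDS)
print(luckiest_per_level(totals))  # {1: 'Player A', 2: 'Player B'}
```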
Although luck played a larger-than-usual role in the 2009 final table, it's not uncommon for it to be the dominant factor. The team at 1BH dug into the numbers, and found that it commonly takes over 1,000 hands before player performance (as opposed to luck) has the dominant effect on outcomes.
At this year's final table, new skill and luck measurements found their way into poker broadcasts. These measurements are taken straight from the advanced metrics playbook: start with some fresh perspective, apply cutting-edge algorithms to a boatload of data, and voila, you have new numbers that capture the essence of the game.
From the Moneymaker boom through the Black Friday bust, we've been treated to a decade of semi-ubiquitous televised poker. But now, with metrics exposing poker's foundational yin (skill) and yang (luck), we have the opportunity to see that decade for what it was: prelude.
The team behind OneBillionHands.com is using its billion-hand database to bring Moneyball to poker.