A Simple Form of Poker

“Essentially” Solved

Posted on January 9, 2015 by Sean Carroll
You know it’s a good day when there are refereed articles in Science about poker. (Enthusiasm slightly dampened by the article being behind a paywall, but some detailshere.)
Poker, of course, is a game of incomplete information. You don’t know your opponent’s cards, they don’t know yours. Part of your goal should be to keep it that way: you don’t want to give away information that would let your opponent figure out what you have.
As a result, the best way to play poker (against a competent opponent) is to use a mixed strategy: in any given situation, you want to have different probabilities for taking various actions, rather than a deterministic assignment of the best thing to do. If, for example, you always raise with certain starting hands, and always call with others, an attentive player will figure that out, and thereby gain a great deal of information about your hand. It’s much better to sometimes play weak hands as if they are strong (bluffing) and strong hands as if they are weak (slow-playing). The question is: how often should you be doing that?
Now researchers at a University of Alberta group that studies computerized poker has offered and “essentially” perfect strategy for a very simple form of poker: Heads-Up Limit Hold’em. In Hold’em, each player has two “hole” cards face down, and there are five “board” cards face-up in the middle of the table; your hand is the best five-card combination you can form from your hole cards and the board. “Heads-up” means that only two players are playing (much simpler than a multi-player game), and “limit” means that there is any bet comes in a single pre-specified amount (much simpler than “no-limit,” where you can bet anything from a fixed minimum up to the size of your stack or your opponent’s, whichever is smaller).
A simple game, but not very simple. Bets occur after each player gets their hole cards, and again after three cards (the “flop”) are put on the board, again after a fourth card (the turn), and finally after the last board card (the river) is revealed. If one player bets, the other can raise, and then the initial better can re-raise, up to a number of bets (typically four) that “caps” the betting.
So a finite number of things can possibly happen, which makes the game amenable to computer analysis. But it’s still a large number. There are about 3×1017 “states” that one can reach in the game, where a “state” is defined by a certain number of bets having been made as well as the configuration of cards that have already been dealt. Not easy to analyze! Fortunately (or not), as a player with incomplete information you won’t be able to distinguish between all of those states — i.e. you don’t know your opponent’s hole cards. So it turns out that there are about 3×1014 distinct “decision points” from which a player might end up having to act.
So all you need to do is: for each of those 300 trillion possibilities, assign the best possible mixed strategy — your probability to bet/check if there hasn’t already been a bet, fold/call/raise if there has — and act accordingly. Hey, nobody ever said being a professional poker player would be easy. (As you might know, human beings are very bad at randomness, so many professionals use the second hand on a wristwatch to generate pseudo-random numbers and guide their actions.)
Nobody is going to do that, of course. So there might be a worry that advances like this will render human beings superfluous at playing poker. It’s not a realistic worry quite yet; despite it’s apparent complexity, Heads-up Limit is a sufficiently simple game that it’s almost never played by real people. Heads-up No-Limit, and multiplayer versions of both limit and no-limit, are played quite frequently; but they’re also much harder games to solve.
Which shouldn’t be taken as disparagement of the amazing things the Alberta group has achieved. To put it in context, they provide a plot of the sizes of imperfect-information games that have previously been essentially solved. Before this work, the most complicated solved game had less than 1011 decision points, whereas Heads-Up Limit Hold’em has over 1013 (after accounting for some symmetries in game configurations, which I presume means cards of equal value but different suits, etc.).
An impressive leap forward, relying on a technique known as “counterfactual regret minimization.” Essentially, the computer acts randomly at first, but keeps track of how much it regrets making a decision when things go badly, and adjusts its mixed strategy appropriately. Eventually an equilibrium is reached in which regrets are minimized. (If only we were allowed to live our lives this way…)
And, despite the simplicity of the game, you can certainly learn something from the strategy that the computer has worked out. Here is a plot with suggested actions for a couple of early decisions: on the left, what the dealer (who acts first) should do on their opening move, depending on what their hole cards are; on the right, what the other player should do if the dealer chose to raise. Red is fold, blue is call, and green is raise. (The dealer can fold as a first move, since the other player is forced to make a “blind” bet to start the action.)
Note that there is no blue in the left diagram. That means that the dealer should never simply call the blind bet without raising — a move known in the trade as “limping.” That fits well with received poker wisdom, which generally looks down on limping as an overly tentative and ultimately counter-productive strategy. Of course, no-limit is a different game, and one might still imagine a place for limping with a strong hand, since you can hope to get raised and then come over the top with a huge bet. But it’s food for thought for poker experts. (Also, note the lack of red in the right diagram — folding to a single bet is a bad idea with all but the worst hole cards. Hmm…)
As I said, this doesn’t affect real players in casinos very much, or at least not yet. Online poker is a different table of fish, of course. In the US, online poker is essentially dead, having been squelched by the federal government on what’s known in the community asBlack Friday. But before that happened, there was already serious worry about online “bots” that could play many hands and beat all but the best human players. And the curve showing the growth in complexity of solved games is pretty impressive; it’s not impossible to imagine that No-Limit will be “essentially solved” in the foreseeable future.
Fortunately, like chess, knowing that there are computers better than you at poker doesn’t take away the pleasure of playing with ordinary human opponents. And if they tend to limp a lot, your pleasure might be even higher.

No hay comentarios:

Publicar un comentario