Maven (Scrabble)

src: images.sftcdn.net

Maven is an artificial intelligence Scrabble player, created by Brian Sheppard. It has been used in officially licensed Hasbro Scrabble games.

Video Maven (Scrabble)

Algoritma

Fase permainan

Maven game play is divided into three phases: "mid-game" phase, "pre-endgame" phase and "endgame" phase.

The "mid-game" phase lasts from the beginning of the game until there are nine or fewer tiles left in the bag. The program uses a fast algorithm to find all possible games from a given shelf, and then part of a program called "kibitzer" uses a simple heuristic to sort it into a rough quality sequence. The most promising steps are then evaluated by "simming", in which the program simulates a random image of a tile, spins forward a set number of plays, and compares the spread of points from moving results. By simulating thousands of random images, the program can provide highly accurate quantitative evaluation of different dramas. (While searching for Monte Carlo, Maven does not use the Monte Carlo tree search because it evaluates game trees only 2-layered, rather than playing out to the end of the game, and not reallocating rollouts to branches that are more promising for deeper exploration; in terms of learning reinforcement, the Maven quest can be considered "Monte Carlo simulation truncated." The correct MCT strategy is not necessary because the endgame can be solved.The superficial search is because the author of Maven argues that, due to the fast rotation of letters in a person's bag, it is usually useless to see more than 2 layers, because if one sees, for example 4-ply, the reward variance will be larger and the simulation will take several times longer, while it only helps in some exotic situations: "We maintain that if it requires extreme situations like CACIQUE to see the value of a four-ply simulation then they are not worth doing. "Seb if the board value can be evaluated with very high accuracy in Scr abble, unlike games like Go, a deeper simulation is not possible to change the initial evaluation.)

The "pre-endgame" phase works in much the same way as the "mid-game" phase, except that it's designed to try to produce a good game ending situation.

The "endgame" phase takes over as soon as no tiles are left in the bag. In a two-player game, this means that the players can now deduce from the initial tile initial letter distribution on their respective shelves. Maven uses the B-star search algorithm to analyze game trees during the endgame phase.

Moving generation

Maven has used several algorithms to move the generation, but that has stuck is the DAWG algorithm. The GADDAG algorithm is faster, but DAWG for North American English is only 0.5 MB, compared to about 2.5 MB for GADDAG. It makes a significant difference to download the game, while the speed advantage is not important. (Note that it's not important does not mean that the difference is small, just that the user can not tell the difference.GADDAG may be twice as fast, but both algorithms are pretty fast.)

Shelf evaluation

The first version (1986) of Maven uses a set of about 100 patterns to judge the rack. Each tile has a value (27 patterns). Each duplicate has a value (22 patterns). There is a pattern for triplicate and quads for letters that have enough representation in the bag. Finally, the combination of QU is a pattern.

Immediately after the first version, Maven derives the term evaluation of the shelf for vocal/consonant and Q/U distribution. The vowel/consonant balance is a table search based on the count of vowels and the remaining consonants on the shelf. The Q/U distribution varies the Q and U values â€‹â€‹using indexed indexed searches by how many of each remain in the bag.

Shortly after, Maven acquired tile duplicator evaluators. The idea is to vary the shelf depending on the possibility of drawing duplicates. For example, A is generally better than mine as a tile, but if there are 7 A and only 2 I remaining in the bag, then maybe we should prefer to keep I.

The parameter installation is done by setting the value to predict the total score in the future. Now this will be called the Learning of Temporal Differences.

This rack evaluation design is native to Maven. It was very successful in competing with today's human champions.

The design was later extended by other researchers. Mark Watkins championed what he called "the pattern of tiled synergies." This is a combination like ADES that forms the basis of many high-valued words. This is a natural extension of the design, which significantly improves the game. The Maven pattern is set gradually expanded from the base set 100 to over 400.

Maven has since turned to a completely different architecture, proposed by John O'Laughlin and implemented in Quackle. This is a "complete" architecture, where the program has different rack evaluation parameters for each of the 3 million possible combinations of 0 to 7 tiles. With advances in computer power over the last decade, it became possible to adjust the set of such large parameters.

The disadvantage of using a complete approach is that Maven loses the ability to vary the evaluation as a function of the remaining tiles in the bag. The bottom line is that the complete rack evaluator does not have the corresponding term shelf value to pull perhaps from the bag.

The full evaluation version of Maven's shelf has added that capability. In Maven, each rack has its own liner evaluator, where the rack value varies as a function of the opportunity to draw duplicates, the opportunity to draw vocals, and the opportunity to draw Q and U. The system has 5 parameters per rack, for about 15 million parameters.

Simulation

The great human champion Ron Tiekert has learned Scrabble by playing the individual positions dozens of times, and tabulating the results. He suggested that with Maven's speed, it should be possible to automate the process overnight. Brian Sheppard named this process "simulation", though it goes by the name "rollout" in backgammon, and "playout" on Go.

The process is to choose N candidates who move using heuristic rack scores. Then play those who move hundreds or thousands of times to see which candidates are performing best. You can vary the depth of playout according to your purpose; play two or four moves ahead to get a good idea of â€‹â€‹the differential point, or play to the end of the game to gauge the odds of winning.

By the mid-1990s, computers had become fast enough that Maven used simulations to choose to move in competitive games under control of the tournament time. Improved algorithms are important to scale simulations for this purpose. The most important innovation is to vary the number of trials given to the candidate so that more successful candidates receive more effort. It also helps to control the rack so that all moving candidates are sampled against the same and unbiased distribution.

The game analysis played by the Maven simulation engine shows that Maven has surpassed the skill level of the human champion.

Endgame

Playing strong in the Scrabble endbook is much harder than it looks. In theory, endgames are perfect information games, so the Alfa-beta pruning algorithm should work. But in practice Alpha Beta works poorly on Scrabble.

The problem with Alpha Beta is that some Scrabble suffixes require 14 moves to play out, and it is impossible to search it in depth. This is not just a theoretical possibility. When a player is "stuck" with a tile, it is impossible for him to play all his tiles. In that situation the optimal strategy for both sides usually plays one plot at each turn.

Maven uses a different approach. The B * search algorithm is a selective, progressive-widening algorithm that ensures to find the optimal solution for a two-player game when one can calculate the upper and lower limits on the value of each position.

It turns out that it is possible to estimate the upper and lower limits on the endgame position. These limits are true (that is, the true value lies within the interval) for the majority of positions. Since B * is strong enough with a small percentage of errors in the boundary, Maven can solve endgames that other approaches can not do.

Further enhancements make the Asymptotic Maven EndGame solution optimal even in the presence of errors. When search B * ends with proof that one step is best, and there is still time left, then Maven extends its forecast by 1 point and searches again. This retrieval is usually very fast, because the tree from the previous search is mostly still valid. Repeated use of this policy will increasingly identify errors, starting with the smallest (and most likely) error.

Exhausting Pre-endgame

When only 1 or 2 tiles remain in the bag ("PEG-1" or "PEG-2"), it is possible to perform a complete search of the state space.

The PEG-1 case is important, as nearly half of all games pass through it. Maven can play such countries in depth in almost all cases. That is, for all legal actions, Maven can play the resulting endgames (up to 8 for each legal step), and calculate which side will win the game in each case. Because there are some situations (eg, two blanks, jammed with Q) that require extra effort, the calculations are done progressively. That is, Maven expands his analysis first where the decision is near and relevant.

In PEG-2 it is usually not possible to thoroughly examine all moving sequences, so Maven goes as far as it can be done at the available time.

One feature of this low-tile situation is that it is very difficult to safely trim the list of legal actions. For example, the optimal game is ranked behind more than 50 other movements with a heuristic rack score of more than 1% of the time.

This policy does not produce a game that is theoretically perfect, because it is impossible to know the true initial distribution of the unseen tiles should be. Assuming a uniform distribution goes well, and it is possible to compute the conclusions about the invisible tiles which slightly increase on the assumption.

Another limitation is that Maven does not address the "hidden information" aspect of such situations. That is, in theory there are situations in which the player maximizes expectations by choosing a random move according to the probability distribution. Maven chooses a pure strategy in every node.

Maps Maven (Scrabble)

References

(PDF) . Artificial Intelligence . 134 (1-2): 241-275. doi: 10.1016/S0004-3702 (01) 00166-7.

Amazon.com: Scrabble Champion Edition JC: Software

src: images-na.ssl-images-amazon.com

External links

Hasbro
Funkitron

Source of the article : Wikipedia

Minggu, 15 Juli 2018