CMPSCI 383: Artificial Intelligence

Fall 2014 (archived)

Assignment 03

This assignment is due at 1700 on Wednesday, 01 October (originally Friday, 26 September).

The goal of this problem is to write a program that, given a tic-tac-toe board, determines which player will win or whether the game will be a draw.

I will be updating the assignment with questions (and their answers) as they are asked.

Tic-tac-toe

Tic-tac-toe is a simple, deterministic, adversarial two-player game with no hidden knowledge.

Traditionally, it is played on a 3 x 3 board, where the X player (who goes first) and the O player alternate turns, claiming spaces on the board. The first player to claim three spaces in a row wins the game. The game concludes in a draw if neither player is able to do so.

To quote Wikipedia:

The game can be generalized to an m,n,k-game in which two players alternate placing stones of their own color on an m×n board, with the goal of getting k of their own color in a row. Tic-tac-toe is the (3,3,3)-game.

In this assignment, we will be examining boards where m = n = k, and k >= 3. Specifically, you’ll implement a program to compute the Minimax value of the game board provided as input. We will define the X player as the MAX player, and an X winning board as having value 1. O is the MIN player, with a winning board value of -1. A draw is represented with a value of 0.
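As a rough illustration of these definitions, the following Python sketch computes the minimax value by exhaustive search. It assumes the board is already a string of n^2 characters (X, O, .) in row-major order. The names `winner` and `minimax` are hypothetical, and the search uses none of the optimizations discussed under Limitations, so it is only practical for small boards.

```python
def winner(board, n):
    """Return 'X' or 'O' if that player owns a complete line, else None.

    Because m = n = k, the only winning lines are the n rows,
    the n columns, and the two main diagonals.
    """
    lines = []
    for i in range(n):
        lines.append([i * n + j for j in range(n)])       # row i
        lines.append([j * n + i for j in range(n)])       # column i
    lines.append([i * n + i for i in range(n)])           # main diagonal
    lines.append([i * n + (n - 1 - i) for i in range(n)]) # anti-diagonal
    for line in lines:
        first = board[line[0]]
        if first != '.' and all(board[c] == first for c in line):
            return first
    return None


def minimax(board, n):
    """Exact minimax value: 1 if X wins, -1 if O wins, 0 for a draw."""
    w = winner(board, n)
    if w == 'X':
        return 1
    if w == 'O':
        return -1
    if '.' not in board:
        return 0
    # X moves first, so equal counts mean it is X's turn.
    player = 'X' if board.count('X') == board.count('O') else 'O'
    values = [
        minimax(board[:i] + player + board[i + 1:], n)
        for i, c in enumerate(board) if c == '.'
    ]
    return max(values) if player == 'X' else min(values)
```

For example, `minimax("OXX.XO.O.", 3)` evaluates the sample board used in the next section; X is to move there and can complete the anti-diagonal.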

Input data format

We will test your programs using data in the following text-based format.

Boards will be represented as a string of characters. Permissible characters are uppercase X, uppercase O, and . (a period), representing tiles claimed by the first player, the second player, and unclaimed tiles, respectively. All other characters are ignored (including spaces, newlines, dashes, and bars).

Any of the following are valid inputs, and each represents the same board:

O X X
. X O
. O .

O | X | X
- + - + -
. | X | O
- + - + -
. | O | .

OXX.XO.O.

Inputs will always contain exactly n^2 permissible characters, where n is the width and height of the board.

Inputs will always contain either an equal number of Xs and Os, or one more X than the number of Os.
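The input rules above could be handled along the following lines. This is a minimal Python sketch, not a required design: `parse_board` and `player_to_move` are hypothetical names, and the row-major string representation is an assumption.

```python
import math


def parse_board(text):
    """Keep only X, O and '.', and return (board_string, n)."""
    cells = [c for c in text if c in "XO."]
    n = math.isqrt(len(cells))
    if n * n != len(cells):
        raise ValueError("expected a square number of board characters")
    return "".join(cells), n


def player_to_move(board):
    """X goes first, so equal counts mean X is next; otherwise O is next."""
    return 'X' if board.count('X') == board.count('O') else 'O'
```

All three sample inputs shown earlier reduce to the same pair, `("OXX.XO.O.", 3)`, and X is the player to move on that board.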

Inputs will always represent game states reachable through legal play. Boards such as:

XOXO
XOXO
XOXO
XOXO

are not reachable, and will never be provided to your program.

Output data format

Your program should output a single line. The line should be in one of the following two forms.

  • If your program computed the exact minimax value for the board, the line should consist of only that value as an integer, for example:

-1

or

0

  • If your program computed an estimated minimax value, the line should consist of only that value, expressed as a floating-point number, for example:

1.0

or

-0.3

Normalize your output of estimates to be within the interval [-1.0, 1.0].

Even if the estimated minimax value is an integer, format it with an explicit decimal point to disambiguate it from an exact value.

Finally, if you are unable to compute the exact value, and unable to develop heuristics, then output a heuristic value of exactly 0.0:

0.0

which we will interpret as your program not being able to decide upon a minimax value (see Grading below).
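One way to keep exact and estimated outputs distinguishable is sketched below in Python. The function name `format_result` is hypothetical, and clamping is only one possible reading of "normalize"; a heuristic that already produces values in [-1.0, 1.0] would not need it.

```python
def format_result(value, exact):
    """Format a minimax value as the single output line.

    Exact values are printed as integers (1, 0, -1). Estimates are
    clamped into [-1.0, 1.0] (an assumption about how to normalize)
    and always printed with an explicit decimal point.
    """
    if exact:
        return str(int(value))
    clamped = max(-1.0, min(1.0, float(value)))
    return f"{clamped:.6f}"
```

With this sketch, an exact draw prints as `0`, while an undecided estimate prints as `0.000000`, which Python's `float()` parses as 0.0.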

Limitations

You will find that the game tree explored by a naive depth-first search grows very quickly with n. To improve things, you might consider each of the following:

  • Alpha-Beta pruning can reduce the effective size of the search tree, though node expansion order matters. Consider using a heuristic to decide which node to expand first as your DFS search proceeds.
  • Transposition tables can cut down on node expansions as well. You needn’t cache all transpositions, and if memory becomes an issue, consider evicting items from the cache.
  • Related to transposition tables: You can dramatically shrink the state space if you consider that many boards are actually equivalent. For example:
XO.
...
...

is the same as

.OX
...
...

when reflected, as is:

...
...
XO.  

If you can come up with a canonical way to represent boards that collapses rotations and reflections to a single state, and use it in conjunction with transposition tables, then you’ll cut way down on the amount of the game tree your program will have to explore.

  • As we discussed in class, you can apply heuristics to evaluate states when they’re not end-game states. You may want to do some benchmarking of your program, and decide that for a board with u empty (unplaced) spaces, you’ll only expand to a depth of depthLimit(u). The branching factor at the first level is u, and at the next is u-1, etc., so your time bound will still be roughly consistent for a given value of u across different values of n. depthLimit is a function whose details you’ll have to determine for yourself based upon your benchmarking.
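The canonical-representation idea from the list above can be sketched as follows. This Python example assumes boards are row-major strings and picks the lexicographically smallest of the eight rotations and reflections as the canonical form; that choice, and the names `symmetries` and `canonical`, are illustrative rather than prescribed.

```python
def symmetries(board, n):
    """Return the 8 rotations/reflections of an n x n board string."""
    grid = [list(board[r * n:(r + 1) * n]) for r in range(n)]
    out = []
    for _ in range(4):
        # Rotate 90 degrees clockwise: reverse the rows, then transpose.
        grid = [list(row) for row in zip(*grid[::-1])]
        out.append("".join("".join(row) for row in grid))
        # Left-right mirror image of the rotated grid.
        out.append("".join("".join(row[::-1]) for row in grid))
    return out


def canonical(board, n):
    """Pick one representative per symmetry class (smallest string)."""
    return min(symmetries(board, n))
```

Using `canonical(board, n)` as the transposition-table key means that the three equivalent boards shown in the list all map to a single entry.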

What to submit

You should submit two things: a tic-tac-toe game state analyzer and a readme.txt.

  • Your analyzer should take the path to an input file as its first command-line argument. If, for example, your solver’s main method is in a Java class named TicTacToe, we should be able to use java TicTacToe /Users/liberato/testcase to direct your program to read the input in the file located at /Users/liberato/testcase.
  • Your analyzer should print the minimax value it computes to standard output, as described above.
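For a Python submission, reading the input file named by the first command-line argument might look like the sketch below. The function name `read_input` and the usage message are assumptions; the assignment only fixes that the path is the first argument and that the result goes to standard output.

```python
import sys


def read_input(argv):
    """Return the contents of the file named by the first argument."""
    if len(argv) < 2:
        raise SystemExit("usage: TicTacToe.py <path-to-board-file>")
    with open(argv[1]) as f:
        return f.read()


# Typical use at program start (not executed here):
#     text = read_input(sys.argv)
#     ... analyze the board, then print the single result line ...
```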

Submit the source code of your analyzer, written in the language of your choice. Name the file containing the main() method TicTacToe.java or your language’s equivalent. If the file you submit depends upon other files, be sure to submit these other files as well.

As in the previous assignment, while you may use library calls for data structures and the like, you must implement the search method you use yourself. Do not use a library for search or optimization. We will consider it plagiarism if you do. Check with us if you think there’s any ambiguity.

Your readme.txt should contain the following items:

  • your name
  • if the language of your choice is not Java, Python, Ruby, node.js-compatible JavaScript, ANSI C or C++ (or if you’re concerned it’s not completely obvious to us how to compile and execute it), a description of how to compile and execute the submitted files
  • a description of which optimizations, if any, you implemented; in particular, if you use a heuristic, describe the intuition(s) behind it
  • a description of what you got working, what is partially working and what is completely broken

If you’re using language features that require a specific version of your language or runtime, check for that version at program start and fail if it’s not present, emitting an understandable error message indicating this fact. Your program must compile and execute on the Edlab Linux machines.
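A version check at program start could be as small as the following Python sketch. The minimum version (3, 8) and the name `require_version` are placeholders; substitute whatever your program actually needs.

```python
import sys

MINIMUM_VERSION = (3, 8)  # placeholder: set to the version you rely on


def require_version(current=None, minimum=MINIMUM_VERSION):
    """Exit with a readable message if the interpreter is too old."""
    if current is None:
        current = sys.version_info
    if tuple(current[:2]) < tuple(minimum):
        sys.stderr.write(
            "error: this program requires Python %d.%d or newer, "
            "but is running on %d.%d\n"
            % (minimum[0], minimum[1], current[0], current[1])
        )
        sys.exit(1)
```

Calling `require_version()` before any other work keeps the failure message clear instead of surfacing later as a syntax or import error.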

If your program does not compile or execute, you will receive no credit. Check with us in advance if you’re concerned.

Grading

We will run your program on a variety of test cases. The test cases will not be available to you before grading. You are welcome to write and distribute your own test cases (or test case generators).

If your program is able to correctly compute the exact minimax value for all 3x3 boards, then you will receive 75% of the possible points. If it cannot do this much, you will receive no points, and the grader will stop.

The remaining 25% of the points will be awarded or penalized on the basis of your program’s handling of larger boards (test cases), both empty and partially played, as follows.

  • For each board where it returns the correct exact minimax value, it will be awarded a fraction of the remaining points. An incorrect exact minimax value will be penalized the same fraction of points.
  • For each board where it returns an estimated minimax value of the correct sign, it will be awarded a slightly smaller fraction of the points. If it returns an estimated value of the incorrect sign, it will be penalized a fraction of the remaining points. Explicitly, this penalty is to dissuade you from choosing heuristic values at random to get half of the available points.
  • For each board where it returns an estimated minimax value of 0.0, it will neither gain nor lose points.
  • For each board where it does not produce output and exit in under twenty seconds, it will be penalized a fraction of the remaining points.

If your readme.txt is missing or judged insufficient, your overall score may be penalized by up to ten percent.

We’re not going to feed your program incorrectly formatted input, so you need only concern yourself with handling input in the format described in the assignment. We expect valid output. Generating output that is not in the format described in the assignment will result in a failed test case.

Some of the test cases may push at the boundaries of what your program will be capable of, depending upon your choice of search strategy. If your program exceeds available heap memory (which we’ll set to 1 GB in Java, using the -Xmx1024M argument if necessary), or if it does not terminate in twenty seconds, we will consider the test case failed.

Advice

Negamax is a formulation of minimax for two-player zero-sum games that applies to tic-tac-toe. While it doesn’t provide any advantage in time/space complexity, some people find it simpler to implement.
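To make the negamax formulation concrete, here is a minimal Python sketch combined with alpha-beta pruning. It assumes row-major board strings and m = n = k. The function names are hypothetical, and the sketch has no move ordering or transposition table, so it is a starting point rather than a tuned solver.

```python
def winner(board, n):
    """Return 'X' or 'O' if that player owns a full row, column or diagonal."""
    lines = [[i * n + j for j in range(n)] for i in range(n)]      # rows
    lines += [[j * n + i for j in range(n)] for i in range(n)]     # columns
    lines.append([i * n + i for i in range(n)])                    # diagonal
    lines.append([i * n + (n - 1 - i) for i in range(n)])          # anti-diagonal
    for line in lines:
        first = board[line[0]]
        if first != '.' and all(board[c] == first for c in line):
            return first
    return None


def negamax(board, n, player, alpha=-1, beta=1):
    """Value of the board from the viewpoint of `player`, who moves next."""
    w = winner(board, n)
    if w is not None:
        return 1 if w == player else -1
    if '.' not in board:
        return 0
    other = 'O' if player == 'X' else 'X'
    best = -1
    for i, c in enumerate(board):
        if c != '.':
            continue
        child = board[:i] + player + board[i + 1:]
        # The opponent's value is the negation of ours; the window flips too.
        best = max(best, -negamax(child, n, other, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # the opponent will never allow this line of play
    return best


def minimax_value(board, n):
    """Convert the negamax result back to X's (the MAX player's) viewpoint."""
    player = 'X' if board.count('X') == board.count('O') else 'O'
    value = negamax(board, n, player)
    return value if player == 'X' else -value
```

Because the root window is the full range [-1, 1], the value returned at the root is exact; the final sign flip is needed only when O is the player to move.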

You get 75% of the points if your program can correctly handle 3x3 boards, and no points if it cannot. First things first: Make sure your program can do this much.

Use the grading criteria and your own estimate of the difficulty of implementing each possible optimization to guide your work.

Questions and answers

Is something such as 6.4e-05 considered valid output?

Yes. The autograders are generally written in Python3, so your floating-point output should be a string parseable by its built-in float() function. Relevant links: https://docs.python.org/3/library/functions.html#float and https://docs.python.org/3/reference/lexical_analysis.html#floating
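If you want to sanity-check your own output against this rule, something like the following Python sketch would do. It is not the autograder's code; `is_valid_estimate` is a hypothetical helper that only checks that a line parses with `float()` and lies in [-1.0, 1.0].

```python
def is_valid_estimate(line):
    """True if `line` parses with float() and lies within [-1.0, 1.0]."""
    try:
        value = float(line)
    except ValueError:
        return False
    # NaN fails both comparisons, so it is rejected here as well.
    return -1.0 <= value <= 1.0
```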

I think I’ve finished my implementation: minmax w/ alpha beta pruning and transposition table.

It solves most everything I throw at it (including a blank 4x4 board) nearly instantaneously and with minimal amount of memory.

My question: for boards where m/n/k > 4 how much is going to be filled in? Memory usage explodes very quickly.

Your program will see both empty and partially filled boards where n > 4, but not many of that size (at most 10% of the points’ worth of the assignment’s test cases). So if you are correct about the above functionality of your program, you will score at least a 90% on this assignment. But if you want to polish further, read on.

I assume your program is using a depth-first search, and that the memory usage you’re seeing is due to transposition tables. If memory usage is a problem, you might consider limiting the size of your program’s transposition tables. One approach is to treat them as a cache rather than a table, and have an eviction policy. You may use a third-party library to make this easier, such as Google’s Guava library (see also https://code.google.com/p/guava-libraries/wiki/CachesExplained). If you do this, be sure to note it in your readme.txt so that we can configure the autograder to run your submission with minimal trouble.

Another approach is to limit the depth of your program’s search so that the tables don’t expand past the memory limit, and to heuristically evaluate the nodes at the chosen maxdepth if they are not leaves.
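A depth-limited search of this kind might be sketched in Python as follows. The heuristic used here (lines still open for X minus lines still open for O, divided by the number of lines) is only an illustrative assumption, as are all the function names. The second element of the returned pair records whether the value is exact, which determines the output format.

```python
def all_lines(n):
    """Index lists for the n rows, n columns and 2 diagonals."""
    lines = [[i * n + j for j in range(n)] for i in range(n)]
    lines += [[j * n + i for j in range(n)] for i in range(n)]
    lines.append([i * n + i for i in range(n)])
    lines.append([i * n + (n - 1 - i) for i in range(n)])
    return lines


def winner(board, n):
    for line in all_lines(n):
        first = board[line[0]]
        if first != '.' and all(board[c] == first for c in line):
            return first
    return None


def heuristic(board, n):
    """Illustrative estimate in [-1, 1]: open lines for X minus open lines for O."""
    lines = all_lines(n)
    x_open = sum(1 for l in lines if all(board[c] != 'O' for c in l))
    o_open = sum(1 for l in lines if all(board[c] != 'X' for c in l))
    return (x_open - o_open) / len(lines)


def limited_minimax(board, n, depth):
    """Return (value, exact); exact is False if any cutoff was needed."""
    w = winner(board, n)
    if w == 'X':
        return 1, True
    if w == 'O':
        return -1, True
    if '.' not in board:
        return 0, True
    if depth == 0:
        return heuristic(board, n), False
    player = 'X' if board.count('X') == board.count('O') else 'O'
    results = [
        limited_minimax(board[:i] + player + board[i + 1:], n, depth - 1)
        for i, c in enumerate(board) if c == '.'
    ]
    values = [v for v, _ in results]
    value = max(values) if player == 'X' else min(values)
    # Conservative rule: only call the result exact if every child was exact.
    return value, all(e for _, e in results)
```

The exactness rule is deliberately conservative. A sharper rule could, for instance, treat a MAX node as exact as soon as one child is an exact win for X.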

I’m considering using a Java Map object for my transposition table. The directions seem to discourage such library use. What are your thoughts on this?

It depends.

The problem with the general contract of Maps in this context is that they never forget. So as you add more and more objects to them, they will continue to consume more and more heap space. Depending upon the size of the object you’re storing in them, and the amount of state space that you explore (which will vary based upon whether you canonicalize symmetric boards), your program may exceed the 1 GB heap limit I’m placing on it.

On one hand, Maps are a fine fit for this problem, and will speed up the execution of your solver (until they don’t, when your program crashes). On the other hand, you can use a related data structure, a Cache, which is like a Map, but has a policy by which it evicts old entries. This will still save your program time (on average, assuming it contains enough entries that the hit rate justifies its overhead), and won’t crash your program, since you can cap its size. The drawback is that you must learn about and either implement a Cache or use an existing implementation, such as the one in Google’s Guava library for Java, described above.
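The answer above is phrased in terms of Java, but the same idea can be sketched in a few lines of Python using the standard library's `OrderedDict`. The class name `BoundedTable` and the least-recently-used eviction policy are assumptions chosen for illustration; other eviction policies may suit a transposition table better.

```python
from collections import OrderedDict


class BoundedTable:
    """A size-capped map that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        """Return the stored value, or None on a miss."""
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the oldest entry

    def __len__(self):
        return len(self._data)
```

Since a stored minimax value can legitimately be 0, callers should test the result of `get` against `None` rather than relying on truthiness.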

In HW3, it said that if the output is “0.0” then there will be no point deducted or awarded. What if it is the case where the optimal solution is a draw and the estimated minimax value is actually meant to be 0.0?

I’m wondering because in bigger test cases, it is quite often that the game resolves in a draw. Please let me know what you think. Thank you!

Yes, that’s correct: a 0.0 will award you no points. Tic-tac-toe is a simple enough game that I expect your heuristics to be able to decide one way or the other (+ vs -). If your heuristics can’t, then I expect you to either: convince yourself that the game is a tie, and output 0; or improve your heuristics to a non-zero signed number.

Test cases involving boards bigger than 4x4 will carry 10% or less of the total weight; I expect you won’t have this issue except possibly in those cases. In other words, it’s quite possible to do A-level work by just outputting 0.0 in those (rare) cases, assuming you handle 4x4 boards without using heuristics.