Assignment #1: Dynamic Connect-4

Due: Qualifying submission by September 26; final submission by September 28

This assignment is to be done individually.

Academic Integrity

The following is offered with apologies to the vast majority of students who do their work honestly and take their university learning seriously:

Your instructor takes academic integrity seriously and has no tolerance for plagiarism or any other form of academic misconduct. Failure to respect these guidelines will result in you receiving a grade of zero on this assignment.

Acceptable collaboration between students, provided it is acknowledged explicitly in your report and code, might include:

  1. discussing some aspect of the assignment specifications in attempting to understand a particular point
  2. discussing high-level features of your heuristic
  3. discussing a problem you encountered while implementing the alpha-beta pruning algorithm as described in the course text

Sharing of any computer code between students, or re-using any code from a third party (e.g., open source) is acceptable, provided that you indicate this explicitly at the start of your report and (as comments) in the source code. In this case, only the portion of work that you did individually will be considered toward your grade.

Unacceptable collaboration and violations of academic integrity include, but are not limited to:

  1. including any code that was not your own and failing to indicate so
  2. copying part of another student's report

If you are uncertain about any of these guidelines, please discuss with your instructor as soon as possible.

Introduction

The game of "dynamic connect-4" is played on a 7x7 grid as follows: Starting from the initial position illustrated below, players take turns moving one piece of their colour by one square, either horizontally or vertically. White plays first. Pieces can only move to unoccupied squares. The winner is the first player to line up four of its pieces either horizontally, vertically, or diagonally.
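
For illustration, the win condition could be checked as in the following sketch. The 7x7 character-array board (with O for white, X for black, and a space for an empty square, anticipating the text format described in the next section) and the function name has_four are assumptions, not part of the specification.

    #include <stdbool.h>

    #define N 7

    /* Return true if the given player has four pieces in a row,
     * horizontally, vertically, or diagonally. */
    static bool has_four(char board[N][N], char piece)
    {
        /* Direction vectors E, S, SE, SW cover all four line orientations. */
        const int dx[] = {1, 0, 1, -1};
        const int dy[] = {0, 1, 1,  1};

        for (int y = 0; y < N; y++) {
            for (int x = 0; x < N; x++) {
                if (board[y][x] != piece)
                    continue;
                for (int d = 0; d < 4; d++) {
                    int count = 1;
                    int cx = x + dx[d], cy = y + dy[d];
                    while (cx >= 0 && cx < N && cy >= 0 && cy < N &&
                           board[cy][cx] == piece) {
                        count++;
                        cx += dx[d];
                        cy += dy[d];
                    }
                    if (count >= 4)
                        return true;
                }
            }
        }
        return false;
    }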

Game Agent

Implement a game-playing agent to play the game of dynamic connect-4. Your agent must be capable of playing either white or black. The time limit for your agent to output its move is 20 seconds. It is strongly suggested that you display the game state as a matrix of comma-separated characters, with O denoting a white piece, X denoting a black piece, and suitably formatted whitespace denoting an empty square. For example, the starting board configuration would be represented as follows:

    1 2 3 4 5 6 7
  1  , , , , , ,X
  2 X, , , , , ,O
  3 O, , , , , ,X
  4 X, , , , , ,O
  5 O, , , , , ,X
  6 X, , , , , ,O
  7 O, , , , , ,
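
As a sketch only (assuming the same 7x7 character-array representation as in the earlier win-check sketch, which is one possible choice rather than a requirement), the board could be printed in this format as follows:

    #include <stdio.h>

    #define N 7

    /* Print the board as comma-separated characters, with a column-index
     * header and a row index at the start of each line, matching the
     * format shown above. */
    static void print_board(char board[N][N])
    {
        printf("    1 2 3 4 5 6 7\n");
        for (int y = 0; y < N; y++) {
            printf("  %d ", y + 1);
            for (int x = 0; x < N; x++) {
                putchar(board[y][x]);        /* 'O', 'X' or ' ' */
                if (x < N - 1)
                    putchar(',');
            }
            putchar('\n');
        }
    }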

Each square in the grid can be labeled by its <x,y> position, with <1,1> corresponding to the top-left square. Moves are entered by specifying the position of the piece to move in the form <x,y> followed by one of the cardinal directions N, S, E, W.
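
A sketch of applying such a move command to the board is shown below, again assuming the 7x7 character-array representation; the function name apply_move is hypothetical.

    #include <stdbool.h>
    #include <string.h>

    #define N 7

    /* Apply a command such as "71W": the first two characters give the
     * <x,y> position (1-based) and the third gives the direction.
     * Returns false if the command is malformed, moves off the board,
     * starts from an empty square, or ends on an occupied square. */
    static bool apply_move(char board[N][N], const char *cmd)
    {
        if (strlen(cmd) < 3)
            return false;

        int x = cmd[0] - '1';                /* convert to 0-based indices */
        int y = cmd[1] - '1';
        int nx = x, ny = y;

        switch (cmd[2]) {
        case 'N': ny--; break;
        case 'S': ny++; break;
        case 'E': nx++; break;
        case 'W': nx--; break;
        default:  return false;
        }

        if (x < 0 || x >= N || y < 0 || y >= N ||
            nx < 0 || nx >= N || ny < 0 || ny >= N)
            return false;
        if (board[y][x] == ' ' || board[ny][nx] != ' ')
            return false;

        board[ny][nx] = board[y][x];
        board[y][x] = ' ';
        return true;
    }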

For example, to move the black piece at the top right of the board one square to the left, the command would be 71W. These commands will be exchanged, either with a human opponent or an AI competitor, through the game server, for which a sample client is provided in C. Note that each string sent to the game server must terminate with a newline character ("\n"); this is done to facilitate testing by hand, e.g., using telnet <server host address> 12345. For human readability, your agent should echo to the screen each of the moves being played.
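
The following is a minimal connection sketch, not the provided sample client: it opens a TCP connection and sends one newline-terminated move, echoing it to the screen. The host name is a placeholder, and any handshake the server may require beyond newline-terminated move strings is not shown; consult the provided C client for the actual protocol.

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <netdb.h>
    #include <sys/socket.h>

    /* Connect to the game server, e.g.
     * connect_to_server("server.example.org", "12345");
     * the host name here is a placeholder. */
    static int connect_to_server(const char *host, const char *port)
    {
        struct addrinfo hints = {0}, *res;
        hints.ai_family   = AF_UNSPEC;
        hints.ai_socktype = SOCK_STREAM;

        if (getaddrinfo(host, port, &hints, &res) != 0)
            return -1;

        int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (fd >= 0 && connect(fd, res->ai_addr, res->ai_addrlen) == 0) {
            freeaddrinfo(res);
            return fd;                       /* keep the connection open */
        }
        if (fd >= 0)
            close(fd);
        freeaddrinfo(res);
        return -1;
    }

    /* Send one move, terminated by "\n", and echo it for readability. */
    static void send_move(int fd, const char *move)
    {
        char line[16];
        snprintf(line, sizeof line, "%s\n", move);   /* e.g. "71W\n" */
        if (write(fd, line, strlen(line)) < 0)
            perror("write");
        printf("sent: %s", line);
    }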

Testing

Your program should have an option for specifying the initial game state (either through an input file or manually, as you wish). This will be used to test your program with specific game scenarios.
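
One way to support this, as a sketch, is to read the board from a file containing seven lines of seven comma-separated characters (the same layout as the display format, without the index row and column). The exact file format is left to you; load_board is a hypothetical name.

    #include <stdio.h>

    #define N 7

    /* Load a board from a text file: 7 lines, each with 7 cells
     * ('O', 'X', or a space) separated by commas. Returns 0 on success. */
    static int load_board(const char *path, char board[N][N])
    {
        FILE *fp = fopen(path, "r");
        if (!fp)
            return -1;

        for (int y = 0; y < N; y++) {
            for (int x = 0; x < N; x++) {
                int c = fgetc(fp);
                board[y][x] = (c == 'O' || c == 'X') ? (char)c : ' ';
                fgetc(fp);                   /* skip the ',' or newline */
            }
        }
        fclose(fp);
        return 0;
    }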

Suggested Development Strategy

Start with minimax, and verify your implementation is working correctly. Initially, you should consider only the simple evaluation function of +1 for a win, -1 for a loss at terminal nodes. Next, incorporate alpha-beta pruning to reduce the search space.
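
A minimax sketch along these lines is given below. It assumes the 7x7 character-array board used in the earlier sketches; struct move, generate_moves, do_move, and undo_move are hypothetical helpers you would write yourself, shown here only as prototypes.

    #include <stdbool.h>

    #define N 7
    #define MAX_MOVES 96

    struct move { int x, y, nx, ny; };

    /* Hypothetical helpers: legal-move generation, apply/undo a move,
     * and the four-in-a-row test from the earlier sketch. */
    int  generate_moves(char board[N][N], char piece, struct move *out);
    void do_move(char board[N][N], const struct move *m);
    void undo_move(char board[N][N], const struct move *m);
    bool has_four(char board[N][N], char piece);

    /* Plain minimax with the simple +1/-1 terminal evaluation; returns 0
     * when the depth limit is reached without a decided outcome. */
    static int minimax(char board[N][N], int depth, bool white_to_move)
    {
        if (has_four(board, 'O')) return +1;         /* white has won */
        if (has_four(board, 'X')) return -1;         /* black has won */
        if (depth == 0) return 0;                    /* horizon reached */

        struct move moves[MAX_MOVES];
        int n = generate_moves(board, white_to_move ? 'O' : 'X', moves);

        int best = white_to_move ? -2 : +2;          /* worse than any outcome */
        for (int i = 0; i < n; i++) {
            do_move(board, &moves[i]);
            int v = minimax(board, depth - 1, !white_to_move);
            undo_move(board, &moves[i]);
            if (white_to_move ? (v > best) : (v < best))
                best = v;
        }
        return n > 0 ? best : 0;                     /* no legal move: treat as neutral */
    }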

Obviously, this simple evaluation function allows your agent to make intelligent moves only when its look-ahead horizon reaches a sequence leading to a guaranteed win. However, since your agent typically won't be able to search that deeply in the tree, it is necessary to cut off the search at non-terminal nodes and use a heuristic evaluation function to provide a meaningful value representing the "goodness" of the corresponding state. The quality of your heuristic function will determine how well your agent plays the game.
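
A corresponding alpha-beta sketch with a depth cutoff is shown below, under the same assumed representation and placeholder helpers as the minimax sketch; evaluate() stands in for whatever heuristic you design, and the node counter is only a convenience for measuring the number of states explored.

    #include <stdbool.h>

    #define N 7
    #define MAX_MOVES 96
    #define WIN 100000

    struct move { int x, y, nx, ny; };
    int  generate_moves(char board[N][N], char piece, struct move *out);
    void do_move(char board[N][N], const struct move *m);
    void undo_move(char board[N][N], const struct move *m);
    bool has_four(char board[N][N], char piece);
    int  evaluate(char board[N][N]);       /* heuristic: positive favours white */

    static long nodes_visited;             /* states explored so far */

    static int alphabeta(char board[N][N], int depth, int alpha, int beta,
                         bool white_to_move)
    {
        nodes_visited++;
        if (has_four(board, 'O')) return  WIN + depth;   /* prefer faster wins */
        if (has_four(board, 'X')) return -WIN - depth;
        if (depth == 0) return evaluate(board);          /* cutoff: heuristic value */

        struct move moves[MAX_MOVES];
        int n = generate_moves(board, white_to_move ? 'O' : 'X', moves);

        if (white_to_move) {
            int best = -2 * WIN;
            for (int i = 0; i < n; i++) {
                do_move(board, &moves[i]);
                int v = alphabeta(board, depth - 1, alpha, beta, false);
                undo_move(board, &moves[i]);
                if (v > best) best = v;
                if (best > alpha) alpha = best;
                if (alpha >= beta) break;                /* beta cut-off */
            }
            return n > 0 ? best : 0;
        } else {
            int best = 2 * WIN;
            for (int i = 0; i < n; i++) {
                do_move(board, &moves[i]);
                int v = alphabeta(board, depth - 1, alpha, beta, true);
                undo_move(board, &moves[i]);
                if (v < best) best = v;
                if (best < beta) beta = best;
                if (alpha >= beta) break;                /* alpha cut-off */
            }
            return n > 0 ? best : 0;
        }
    }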

Additional notes

You may program your assignment in any computer language of your preference, provided it can be compiled and run on the Trottier Engineering Linux machines (156TRLinux.ece.mcgill.ca). Note that, to ensure a "level playing field", we will use these machines for the class tournament, so you should tune your program's search depth to respect the time limit per move (see above) on them. To log in, use your McGill UEA (first.last@mail.mcgill.ca).

Report

Your report should address the following questions:

  1. Consider the three game states shown below:
     , , , , , , 
     , , , , , , 
    O, ,X, , , , 
     , , ,O, , ,X
     , , , ,O,X,X
     , ,O, , ,O,X
     , , ,X,O, , 
    
         (a)
    
     
    O,O, , , , , 
    X, , , , , , 
     , ,X, , ,O, 
     , ,X,O, , ,X
     , , , ,O,X,
     ,X, , , , , 
    O, , , , , , 
    
         (b)
    
     
     , ,O, ,X, , 
     , ,O, ,X, , 
     , ,X, ,O, , 
     , ,X, ,O, , 
     , , , , , , 
     , ,O, ,X, , 
     , ,X, ,O, , 
    
         (c)
    
    For each of these configurations, graph the total number of states explored by your program when using depth cutoffs of 3, 4, 5 and 6, both with minimax and alpha-beta. Assume it is white's turn to play.
  2. Estimate a formula that relates the depth cutoff to the number of states visited for each of minimax and alpha-beta algorithms.
  3. Explain whether the number of states explored depends on the order in which you generate new states during the search. Justify your response using results from your program.
  4. Explain the heuristic evaluation function you used and provide a clear rationale for the factors you included.
  5. A more complex evaluation function will increase computation time. Since each move is time-constrained, explain whether this favours the use of simpler evaluation functions so that your agent can evaluate nodes deeper in the game tree.

Submitting your assignment

Your assignment must be submitted through Moodle to allow for peer- and self-assessment. The submission must contain:

Marking scheme

Each question/criterion is assessed on four levels (Unsatisfactory, Bare minimum, Satisfactory, and Above and beyond); the point values for each level are listed below.

1. For each of the configurations given in the assignment specifications, graph the total number of states explored by your program when using depth cutoffs of 3, 4, 5 and 6, both with minimax and alpha-beta. Assume it is white's turn to play.
   0: No graph provided.
   1: Graph shows results for one or two depth cutoffs.
   3: Graph shows results for at least three depth cutoffs and is clearly labelled, indicating the number of nodes expanded for each cutoff.
   5: Graph shows results for all requested depth cutoffs and is clearly labelled, indicating the number of nodes expanded for each cutoff. The report explains how the values were computed (averaged) from multiple moves.

2. Estimate a formula that relates the depth cutoff to the number of states visited for each of the minimax and alpha-beta algorithms.
   0: No formula given.
   1: Report includes a formula with minimal explanation and no comparison between the minimax and alpha-beta algorithms.
   2: Report includes a formula and a comparison allowing for an obvious appreciation of the performance benefits of alpha-beta pruning.

3. Explain whether the number of states explored depends on the order in which you generate new states during the search. Justify your response using results from your program.
   0: No discussion.
   1: Report includes a minimal description of how the order in which states are generated can affect the number of states explored, without graphs or tables.
   3: Report includes a meaningful graph comparing two different methods in which new states can be generated, and analyzes their performance.
   5: The report additionally describes how, in theory, the states could be ordered to achieve optimal savings from alpha-beta pruning.

4. Explain the heuristic evaluation function you used and provide a clear rationale for the factors you included.
   0: No discussion.
   1: An explanation of the heuristic evaluation function used, without a clear rationale.
   3: An explanation of the heuristic evaluation function used, with a reasonable rationale for the factors included.
   5: An explanation of the heuristic evaluation function used, with a thorough explanation of how each of the factors was developed.

5. Discuss the computational trade-offs with respect to the use of a more complex evaluation function and the depth of the game tree that can be evaluated.
   0: No discussion provided.
   1: Trivial explanation provided, for example, noting only that computing the more complex evaluation function may decrease the maximum depth of the game tree that can be evaluated within a given time limit.
   2: Answer includes an explanation of memory/time tradeoffs -- for example, that the deeper search from a simpler (but faster) evaluation function requires more memory.
   4: Answer includes the previous elements and provides test results showing the tree depth and/or number of nodes explored in a given time limit for different choices of evaluation function, as well as an analysis of the improved capability of play resulting from a "shallower" exploration of the game tree but using a more competent heuristic.

6. Code and documentation.
   0: Invoking the main program without correct arguments does not provide meaningful feedback.
   1: Agent program provides meaningful feedback when invoked. Program is organized, making it easy to find the alpha-beta pruning code and heuristic evaluation function. Comments delineate major blocks of code.
   2: In addition to meaningful feedback and well-organized, commented code, the documentation provided allows the assessor to easily launch the game agent from a manually specified board state.

7. Tournament performance.
   0: Agent did not compile or run correctly.
   1: Agent played the game of Dynamic Connect-4 correctly, but did not win any games through its choice of heuristics.
   2: Top third of agents in class competition.
   3: Top-performing agent in class competition.

Original specifications by Stéphane Pelletier and Jeremy Cooperstock
Last updated on 19 September 2017