Problem Statement

ESPN Winning Formula Challenge :: Powered by TopCoder - Problem Statement

Introduction

Your task in this problem is to predict the outcome of college football games. You must design an algorithm which predicts the final scores of each week's games based on historical data from games already played. For each game since 2004, you will be given data about every play that occurred during the game.

Implementation Details

Your algorithm will be given historical data one week at a time, in chronological order. After each week of historical data, you may be asked to predict the outcome of the next week's games. During example testing, you will always be asked to predict the outcomes of all of the next week's games, to provide the maximum sample size. Furthermore, during example testing you will be asked to provide predictions for all the upcoming games in the 2008 season, not just the next week's games. During the actual testing, however, you will only be asked to provide predictions for one week's games -- the ones which have not occurred yet and are coming up next.



To perform these predictions, you must implement two methods. The first, data, takes three inputs. A String[], plays, holds one element for each play in each game of the week. A second String[], players, gives summary statistics for all the players involved in the week's games. Finally, a third String[], summary, gives the results of each game. The second method, predict, takes the names of the two teams, the date of the game, and the location of the stadium they will play in, along with the week number and the numeric ids of the home and away teams. It should return an int[] with two elements, the first of which is the home team score and the second of which is the away team score.
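As a starting point, the two methods described above can be sketched as a minimal skeleton. This is only an illustration, not a reference solution. The statement does not say what the int returned by data means, so returning 0 is an assumption. The constant prediction of 24 to 21 and the gamesSeen helper are hypothetical placeholders. The class is declared without the public modifier so the snippet compiles in any file; the submission system may require a public class.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal skeleton of the required class (a sketch, not a reference solution).
class WinningFormula {
    // Raw summary rows kept for later modelling. The field layout is given
    // in the data descriptions and is not parsed here.
    private final List<String> summaries = new ArrayList<>();

    // Called once per week of historical data.
    // Assumption: the meaning of the return value is not specified, so 0 is returned.
    public int data(String[] plays, String[] players, String[] summary) {
        for (String row : summary) {
            summaries.add(row);
        }
        return 0;
    }

    // Returns {home score, away score}.
    // Placeholder: a fixed guess that ignores all inputs; replace with a real model.
    public int[] predict(String home, String away, String stadium, String date,
                         int week, int homeid, int awayid) {
        return new int[] {24, 21};
    }

    // Hypothetical helper, not part of the required interface.
    public int gamesSeen() {
        return summaries.size();
    }
}
```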

Scoring

This prediction competition will be scored in four parts as shown in the competition schedule. Submissions received prior to each mini-season deadline will be evaluated on all the games in that mini-season. Your solution will be run multiple times in each mini-season -- once per week -- using the most up to date statistics. For each game, you can receive a maximum of 100 points for predicting the exact score of the game. These 100 points are broken down as follows:
  • Picking the correct winner: 60 points
  • Picking the correct winner and being off by D on the spread:
    • D = 0, 20 points
    • D = 1, 18 points
    • D = 2, 16 points
    • D = 3, 13 points
    • D = 4, 10 points
    • D = 5, 7 points
    • D = 6, 4 points
    • D = 7, 1 point
    • D > 7, 0 points
  • For each team, if you are off by X points in your prediction of that team's score, you will receive 10-X points with a minimum of 0
For example, let's say that you predict team A will beat team B 10 to 9, but in reality the score is 15 to 10, with team A winning. You will get 60 points for predicting the correct winner. Your prediction had a spread of 1, while the true spread was 5, so you get 10 points for D=4. You were off by 5 points in your prediction of one team's score, and off by 1 in your prediction of the other team's score, so you get 5+9=14 points for that. Thus your overall score will be 60+10+14=84. If, on the other hand, team B had beaten team A by a score of 10 to 9, you would get 0 points for predicting the winner and the spread (you only get spread points if you predict the right winner). Your score predictions would be close, however, with each team's score off by 1, so you'd get 9+9=18 points in total. Your overall score will simply be the sum of your individual game scores.
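To evaluate predictions offline, the scoring rules above can be written as a small function. This is a sketch of the rubric as described here, not the official scorer. The class and method names are hypothetical. The statement does not say how a predicted tie is treated, so this sketch assumes it never earns winner or spread points.

```java
// Sketch of the per-game scoring rubric (hypothetical names, not the official scorer).
final class GameScorer {
    // Spread points for D = 0..7; any D above 7 earns 0.
    private static final int[] SPREAD_POINTS = {20, 18, 16, 13, 10, 7, 4, 1};

    static int score(int predHome, int predAway, int realHome, int realAway) {
        int predSpread = predHome - predAway;
        int realSpread = realHome - realAway;
        int total = 0;

        // Assumption: a predicted tie (spread 0) never counts as a correct winner.
        boolean correctWinner = predSpread != 0
                && Integer.signum(predSpread) == Integer.signum(realSpread);
        if (correctWinner) {
            total += 60;
            int d = Math.abs(predSpread - realSpread);
            if (d < SPREAD_POINTS.length) {
                total += SPREAD_POINTS[d];
            }
        }

        // Per-team points: 10 minus the error, with a minimum of 0.
        total += Math.max(0, 10 - Math.abs(predHome - realHome));
        total += Math.max(0, 10 - Math.abs(predAway - realAway));
        return total;
    }
}
```

Taking team A as the home team, GameScorer.score(10, 9, 15, 10) gives 84 and GameScorer.score(10, 9, 9, 10) gives 18, matching the two worked examples.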

Data

The data files for all previous games are available for download in CSV format. Descriptions of the fields are available for the play-by-play data, the player data, and the summary data. Feel free to ask further questions related to the data in the forums.
 

Definition

    
Class:WinningFormula
Method:data
Parameters:String[], String[], String[]
Returns:int
Method signature:int data(String[] plays, String[] players, String[] summary)
 
Method:predict
Parameters:String, String, String, String, int, int, int
Returns:int[]
Method signature:int[] predict(String home, String away, String stadium, String date, int week, int homeid, int awayid)
(be sure your methods are public)
    
 

Notes

-The time limit for processing all the weekly data and making all the predictions is 9 minutes.
-The memory limit is 1024M.
-An invalid prediction for a game (invalid return, etc.) will result in 0 points for that game.
 

Constraints

-The time limit is 9 minutes. This includes receiving data and making all the predictions.
-The memory limit is 1024M.
 

Examples

0)
    
0
When testing the example, your code will be asked to make predictions for every game since 2004, including those in the 2008 season which have not been played. So, you will be given the statistics for one week's worth of data, and then asked to predict all the games in the next week. This will start with your code being asked to predict each game in week one (before which you have no data), then you will be given the statistics for the week one games and asked to predict each week two game, and so on. Finally, after you have received all the statistical data, you will be asked to predict each upcoming game in the 2008 season.



It is important to note that this is different from the final testing, where your code will only predict one week of games. In the final testing, you will still be given all the statistical data, one week at a time, up to the current time. You will then be asked to predict the outcomes of the upcoming week's games, for just one week. If you want to simulate the actual testing, simply return an empty array for all calls to predict except for one week (currently 68). Be sure to remove this code when you make your final submission though!
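The single-week simulation described above could be sketched as a small gate applied to the result of predict. The class and method names are hypothetical. The target week of 68 is taken from the text and may change. As noted, this code should be removed before the final submission.

```java
// Sketch of the single-week simulation (hypothetical helper, remove before final submission).
final class SingleWeekGate {
    // Week to keep predictions for; 68 is the value quoted in the statement.
    static final int TARGET_WEEK = 68;

    // Returns the prediction unchanged for the target week, and an empty array otherwise.
    static int[] gate(int week, int[] prediction) {
        if (week != TARGET_WEEK) {
            return new int[0];
        }
        return prediction;
    }
}
```

Inside predict, the last line would then read something like return SingleWeekGate.gate(week, prediction); where prediction is whatever your model produced.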

This problem statement is the exclusive and proprietary property of TopCoder, Inc. Any unauthorized use or reproduction of this information without the prior written consent of TopCoder, Inc. is strictly prohibited. (c)2006, TopCoder, Inc. All rights reserved.