Problem Domains

This year's competition will feature six problem domains: Mountain Car, Tetris, Helicopter Hovering, Real-Time Strategy, Keepaway Soccer, and Polyathlon. We have attempted to provide a mix of well-known hard problems, challenging new domains and classic problems. Below we describe each domain and why it was selected for this year's competition. More information about how agents will be evaluated can be found on the Rules Page.

 

Mountain Car

Creator: Adam White, University of Alberta

In the Mountain Car problem, an agent must drive an underpowered car up a steep mountain road. Since gravity is stronger than the car’s engine, even at full throttle the car cannot simply accelerate up the steep slope; it must first back away from the goal to build up enough momentum. The car’s movement is described by two continuous output variables, position and velocity, and one discrete input representing the acceleration of the car.
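
The dynamics can be sketched in a few lines of Python. This is a minimal sketch that assumes the constants of the classic textbook formulation (position in [-1.2, 0.5], speed capped at 0.07, three discrete actions); the competition software may use different values or add noise. The names `step`, `run`, `full_throttle` and `rock` are illustrative and are not part of the competition code.

```python
import math

# Assumed constants from the classic textbook formulation,
# not taken from the competition software.
MIN_POS, GOAL_POS = -1.2, 0.5
MAX_SPEED = 0.07


def step(position, velocity, action):
    """Advance one time step. action: 0 = reverse, 1 = coast, 2 = forward."""
    # Engine thrust (0.001) is weaker than the peak pull of gravity (0.0025).
    velocity += 0.001 * (action - 1) - 0.0025 * math.cos(3 * position)
    velocity = max(-MAX_SPEED, min(MAX_SPEED, velocity))
    position += velocity
    if position < MIN_POS:
        # Hitting the left wall stops the car.
        position, velocity = MIN_POS, 0.0
    return position, velocity, position >= GOAL_POS


def run(policy, max_steps=1000):
    """Return the number of steps needed to reach the goal, or None."""
    position, velocity = -0.5, 0.0  # start at rest near the valley floor
    for t in range(max_steps):
        action = policy(position, velocity)
        position, velocity, done = step(position, velocity, action)
        if done:
            return t + 1
    return None


def full_throttle(position, velocity):
    return 2  # always accelerate towards the goal


def rock(position, velocity):
    return 2 if velocity >= 0 else 0  # push in the direction of travel


if __name__ == "__main__":
    print("full throttle:", run(full_throttle))
    print("rocking:", run(rock))
```

Under these assumed constants, always accelerating forward leaves the car oscillating near the valley floor, while pushing in the current direction of travel adds energy on every step until the car crests the hill.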

 

Tetris

Creator: Brian Tanner, University of Alberta

Tetris is a falling-blocks puzzle video game originally designed and programmed by Alexey Pajitnov in 1984. In Tetris, a pseudorandom sequence of tetrominoes (sometimes called "tetrads" in older versions), shapes composed of four square blocks each, falls down the playing field. The object of the game is to manipulate these tetrominoes by moving each one sideways and rotating it in 90-degree increments, with the aim of creating a horizontal line of blocks without gaps. When such a line is created, it disappears, and the blocks above (if any) fall. The game ends when the player "tops out", that is, when the stack of tetrominoes reaches the top of the playing field and no new tetrominoes are able to enter. Despite its simple rules, playing Tetris well requires a complex strategy and lots of experience.
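
The line-clearing rule described above can be sketched as follows. This is a minimal sketch, assuming the board is stored as a list of rows from top to bottom with 1 marking an occupied cell; the function name `clear_lines` and this board encoding are illustrative and are not taken from the competition software.

```python
def clear_lines(board):
    """Remove every full row and let the rows above drop down.

    board: list of rows, top row first; each row is a list of 0/1 cells.
    Returns (new_board, number_of_cleared_rows).
    """
    width = len(board[0])
    # Keep only rows that still contain at least one gap.
    kept = [row for row in board if not all(row)]
    cleared = len(board) - len(kept)
    # Refill from the top with empty rows so the board keeps its height.
    empty = [[0] * width for _ in range(cleared)]
    return empty + kept, cleared


if __name__ == "__main__":
    board = [
        [0, 0, 0, 0],
        [1, 0, 0, 1],
        [1, 1, 1, 1],  # full row: will be cleared
        [1, 1, 0, 1],
    ]
    new_board, cleared = clear_lines(board)
    for row in new_board:
        print(row)
    print("cleared:", cleared)
```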

 

Real-Time Strategy

Creator: Marc Lanctot, University of Alberta

The RTS problem is a simplified real-time strategy game domain with two types of units: workers and marines. Workers help gather minerals by finding mineral patches, and minerals are used to train marines or more workers. Marines are used for combat. There is a single base controlled by each player. The goal is to destroy the opponent's base. The opponent will play a strategy chosen from a set of fixed strategies. The domain is continuous. At each step, the action taken by the learning agent is the composition of all actions taken by the workers and marines, so the dimensionality of the action space can grow or shrink over time. Reward is based purely on the outcome of the game.  In the case of a tie, scores are assigned based on the relative accomplishments of both players.
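
To illustrate why the size of the action changes over time, here is a minimal sketch of composing per-unit choices into one joint action. The encoding (a command name plus a target cell per unit) and the names `UnitAction` and `compose_action` are hypothetical; the actual competition interface may represent actions differently.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: (command, (target_x, target_y)) for one unit.
UnitAction = Tuple[str, Tuple[int, int]]


def compose_action(per_unit: Dict[int, UnitAction]) -> List[Tuple[int, str, int, int]]:
    """Flatten per-unit choices into one joint action, ordered by unit id."""
    return [(uid, cmd, x, y) for uid, (cmd, (x, y)) in sorted(per_unit.items())]


if __name__ == "__main__":
    # Two workers alive: the joint action has two entries.
    early = compose_action({1: ("gather", (3, 4)), 2: ("gather", (3, 5))})
    # A marine has been trained: the joint action grows to three entries.
    later = compose_action({
        1: ("gather", (3, 4)),
        2: ("gather", (3, 5)),
        7: ("attack", (9, 9)),
    })
    print(len(early), len(later))
```

The joint action has one entry per living unit, so it lengthens when units are trained and shortens when they are destroyed.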

 

Helicopter Hovering

Creators: Pieter Abbeel, Adam Coates, Andrew Y. Ng, Stanford University.

Autonomous helicopter flight represents a challenging control problem with high-dimensional, asymmetric, noisy, nonlinear, non-minimum-phase dynamics. Though helicopters are significantly harder to control than fixed-wing aircraft, they are uniquely suited to many applications requiring either low-speed flight or stable hovering. The control of autonomous helicopters thus provides an important and challenging testbed for learning and control algorithms.

 

Keepaway Soccer 

Creators: Peter Stone, University of Texas at Austin and Rich Sutton, University of Alberta. (Adapted to RL-Glue by Matthew Taylor, University of Texas at Austin)

RoboCup simulated soccer is a well-understood domain, as it has been the basis of multiple international competitions and research challenges. The multiagent domain incorporates noisy sensors and actuators and enforces hidden state, so that agents have only a partial view of the world at any given time. Since late 2002, the Keepaway task has been part of the official release of the open source RoboCup Soccer Server used at RoboCup. In Keepaway, one team (the keepers) attempts to maintain possession of the ball within a limited region while another team (the takers) attempts to steal the ball or force it out of bounds, either of which ends the episode.

 
Polyathlon

Creators: Adam White and Brian Tanner, University of Alberta; Shimon Whiteson, Universiteit van Amsterdam

In the future, robots will be used in many homes, offices and construction sites. It would be useful if these robots could learn to perform new tasks on the job with little dependence on human guidance and training. The Polyathlon is meant to simulate this scenario: the agent faces a series of unknown tasks. The agent must learn, online, how to solve each task without any prior task knowledge or pretraining. The Polyathlon raises a number of interesting algorithmic challenges, such as transfer learning, feature construction, adaptive representations and parameter-free learning.


 

 

 
