Metroid-II-RL

Reinforcement learning plays Metroid II. 'Nuff said.

Current state

PyBoy provides "game wrappers" for various games to make AI work easier. I am working on implementing one for Metroid II and will eventually get my code pulled into the project.
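
For context, a wrapper is driven roughly like this. A minimal sketch, assuming the PyBoy 2.x API; the ROM path is a placeholder, and start_game/game_over are the methods this wrapper is meant to provide (see the TODO list below).

from pyboy import PyBoy

pyboy = PyBoy("metroid2.gb", window="null")  # headless; ROM path is a placeholder
wrapper = pyboy.game_wrapper                 # the Metroid II wrapper this repo adds
wrapper.start_game()                         # skip through the menus

while not wrapper.game_over():
    pyboy.button("a")                        # queue an A press/release for the next tick
    pyboy.tick()                             # advance the emulator one frame

pyboy.stop()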

Currently, a pixel-based observation approach is used. A tile-based approach would be much more efficient and train much faster, but it is not implemented yet.
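
The difference between the two observation styles, continuing from the sketch above (again assuming PyBoy 2.x names):

pixels = pyboy.screen.ndarray  # (144, 160, 4) RGBA array: large, expensive to learn from
tiles = pyboy.game_area()      # small matrix of tile identifiers: much cheaper per step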

Some incredibly simple training has been done, but, unsurprisingly, the results are terrible. This was mostly done to prove that the environment works and that some learning could happen.

main.py is mostly used for testing purposes.

train.py is the script used to start a training run.

view.py is the script used to view a training result.

Tangent about timing info

Through testing and some math, roughly 216,000 iterations correspond to one hour of "real life" game time. This was calculated by timing 1,000 iterations; the output of the measurement script was as follows.

Human Time: 16.650566339492798
Machine time: 0.2655789852142334
Machine is 62.69534589139785 times faster
time per step HUMAN: 0.016650566339492797
time per step FAST: 0.0002655789852142334
One hour of human gameplay = 216208.8620049702

The math for this is simply seconds_per_hour / (measured_time / iterations), i.e. 3600 / (16.65 / 1000) ≈ 216,209.
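
The same calculation, written out in Python with the measured numbers from the output above:

SECONDS_PER_HOUR = 3600
measured_time = 16.650566339492798  # seconds for 1000 iterations at human (real-time) speed
iterations = 1000

steps_per_hour = SECONDS_PER_HOUR / (measured_time / iterations)
print(steps_per_hour)  # 216208.8620049702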

TODO

At this point, a pull request has been opened against PyBoy, and it still needs many changes, but before making them I want to focus on the training and AI side. I'll be modifying this code to use Pufferlib and CleanRL soon, and heavily modifying the environment to give better observations such as missiles, health, etc.

Agent Milestones

  • Explore starting area
  • Stop shooting randomly and "spazzing out"
  • Get out of starting area relatively quickly
  • Avoid/kill enemies in starting area
  • Drop down through first major shaft (requires downward jump shooting)
  • Find first Metroid
  • Kill first Metroid

Model

  • Add frame stacking to observation space (added an LSTM instead)
  • Improve environment observations to be tiles rather than pixels
  • Change action space so the agent can hold buttons (the SML code has something similar)
  • Change to a Pufferlib-native environment rather than a gym wrapper (?)
  • Define a baseline test reward function
  • Make environment observations the pixels of the screen
  • Write a simple exploration function using game coordinate hashing (see the sketch after this list)
  • Do some kind of extremely bare-bones training to just explore
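
A bare-bones sketch of the coordinate-hashing idea: reward the agent the first time it reaches each coarse (area, x, y) cell. The names area/x/y are hypothetical; the real values would come from the wrapper's RAM mappings.

visited = set()

def exploration_reward(area, x, y, bucket=16):
    cell = (area, x // bucket, y // bucket)  # coarsen so nearby positions count as one place
    if cell in visited:
        return 0.0
    visited.add(cell)
    return 1.0

Bucketing the coordinates keeps the visited set small and stops the agent from farming reward by wiggling back and forth one pixel at a time.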

Custom Environment

The environment is pretty much set; it's plenty good enough to start training.

  • Verify ROM integrity (the ROM works)
  • Take random actions in the environment (verify the PyBoy interface works)
  • Implement a game_wrapper for Metroid II (makes AI stuff easier)
    • Potentially use RAM mappings to calculate more useful info (particularly health)
    • Improve game_over to check health; may be slightly faster than using the "GAME OVER" screen (see the sketch after this list)
    • Make a pull request for PyBoy to merge my code in
      • Fix code to comply with PyBoy coding standards
    • Implement the start_game function to skip through the menu
    • Implement the game_over function to check if the agent is dead
    • Integrate and verify the RAM mappings
  • Determine and implement all possible button combos for "actions" (may need minor improvements)
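
For the health-based game_over item above, the check would reduce to a single RAM read. A sketch under stated assumptions: HEALTH_ADDR is a made-up placeholder, NOT a verified Metroid II address (the real one comes from the RAM mappings), and pyboy.memory indexing is the PyBoy 2.x way to read raw memory.

HEALTH_ADDR = 0xD000  # hypothetical placeholder address

def game_over(pyboy):
    return pyboy.memory[HEALTH_ADDR] == 0  # one byte read per step, no screen scan

A one-byte read per step should be cheaper than detecting the "GAME OVER" screen, which is the motivation for that TODO item.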

Misc

  • Containerize the program to make running on other machines easy (either with conda or Docker)
