(v0.2) Using the Sweeper
This page covers the hyperparameter sweeper interface, which is a wrapper around the normal doodad interface. At the moment it only supports grid searches.
The first step is to set up a Sweeper object for your code repository. This mainly requires pointing doodad to all code dependencies, and needs to be reimplemented for each project.
This is an example for a project that uses rllab and its mujoco environments. Your configuration will vary; if your project has no external dependencies, you generally only need to include the project folder as a dependency.
from doodad.easy_sweep import DoodadSweeper
import doodad.mount as mount
MOUNTS = [
    mount.MountLocal(local_dir='~/path/to/project/repo', pythonpath=True),  # Code project folder
    mount.MountLocal(local_dir='~/path/to/rllab', pythonpath=True),  # RLLAB
    mount.MountLocal(local_dir='~/path/to/.mujoco', pythonpath=True),  # Mujoco
]
SWEEPER = DoodadSweeper(mounts=MOUNTS,
    docker_img='dementrock/rllab3_shared',
    docker_output_dir='/data',
    local_output_dir='data/docker_test_run',
)
For a more detailed explanation of "mount" objects, see the tutorial page.
Note! If running sweeps inside docker, persistent output files must be written to the docker_output_dir folder (in this case, '/data'). Otherwise, they will not be synced and you will not be able to access them once the job has finished.
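As a minimal sketch of the note above, an experiment function can write its results under the output directory. The helper name write_results, its output_dir parameter, and the JSON file name are hypothetical and not part of doodad; the default '/data' simply mirrors the docker_output_dir used in this example.

```python
import json
import os
import tempfile


def write_results(results, output_dir='/data', filename='results.json'):
    """Write a dict of results as JSON under output_dir and return the file path."""
    # Hypothetical helper: inside docker, output_dir should match docker_output_dir.
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, filename)
    with open(path, 'w') as f:
        json.dump(results, f)
    return path


if __name__ == '__main__':
    # Outside docker, point the helper at a scratch directory to try it out.
    scratch_dir = tempfile.mkdtemp()
    print(write_results({'param1': 0, 'param2': 'a'}, output_dir=scratch_dir))
```

Files written anywhere else inside the container would not be covered by the syncing described above.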
That's it! Now we can start running jobs.
The DoodadSweeper object provides several methods for running a function with different parameter settings. As a simple example, let's run an "experiment" locally.
from <my_sweeper_file> import SWEEPER
def example_function(param1=0, param2='c'):
    print(param1, param2)
sweep_params = {
    'param1': [0, 1, 2],
    'param2': ['a', 'b'],
}
SWEEPER.run_sweep_serial(example_function, sweep_params, repeat=1)
We can also run this experiment in parallel locally using Python's multiprocessing module. (Be careful with things that cannot be pickled.)
SWEEPER.run_sweep_parallel(example_function, sweep_params, repeat=1)
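Python's multiprocessing module uses pickle to pass work to worker processes, so it can help to check objects up front. The helper below is a hypothetical sketch built only on the standard library; exactly which objects doodad pickles is not documented here.

```python
import pickle


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Plain data such as sweep values pickles without trouble.
print(is_picklable({'param1': [0, 1, 2], 'param2': ['a', 'b']}))
# A lambda cannot be pickled by the standard library pickler.
print(is_picklable(lambda x: x))
```

Functions defined at module level, like example_function above, are usually the safest choice for parallel runs.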
Your output should look similar to:
0 a
1 a
2 a
0 b
1 b
2 b
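To see why six lines are printed, here is a minimal sketch of what a grid search over sweep_params enumerates. It uses itertools.product and is not doodad's implementation; the helper name expand_grid is hypothetical, and the order in which the sweeper visits settings may differ from the order produced here.

```python
import itertools


def expand_grid(sweep_params):
    """Return one kwargs dict per combination of the listed values."""
    keys = list(sweep_params)
    value_lists = [sweep_params[k] for k in keys]
    return [dict(zip(keys, combo)) for combo in itertools.product(*value_lists)]


sweep_params = {
    'param1': [0, 1, 2],
    'param2': ['a', 'b'],
}
grid = expand_grid(sweep_params)
print(len(grid))  # 3 values x 2 values = 6 settings
for kwargs in grid:
    print(kwargs)
```

The number of runs grows multiplicatively with each added parameter, which is worth keeping in mind before launching a large sweep.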
If you want to use EC2, make sure you run the setup script first (scripts/setup_ec2.py).
Before trying to launch an EC2 job, it is helpful to test your configuration locally through docker. If your code writes output files (such as logs), make sure they appear in the local_output_dir folder.
SWEEPER.run_test_docker(example_function, sweep_params, repeat=1)
Finally, we can run our experiments on EC2!
SWEEPER.run_sweep_ec2(example_function, sweep_params,
    bucket_name='my.bucket.name.here',
    s3_log_name='my_example_run',
    region='us-east-2',
    instance_type='c4.large',
    repeat=1)
To pull your logs from S3, you can use the pull_s3_logs.py script located in the scripts folder, or the aws s3 sync command (part of aws-cli).
WARNING: Not tested yet.
By default, the sweeper object performs a grid search over the given parameters. Depending on your use case, you may want to implement a custom sweep scheme or simply run an experiment once.
The sweeper object also has methods that call a function once, given the function and its arguments.
# EC2 docker
SWEEPER.run_single_ec2(run_method, kwargs, bucket_name)
# Local docker
SWEEPER.run_single_docker(run_method, kwargs)
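One possible custom sweep scheme is random search: draw a few settings at random and launch each as a single run. The sketch below only generates the settings. The helper name sample_settings is hypothetical, and it assumes that the kwargs argument of the single-run methods is a dict of keyword arguments, which this page does not state explicitly.

```python
import random


def sample_settings(search_space, num_samples, seed=0):
    """Draw num_samples kwargs dicts, picking each value uniformly at random."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in search_space.items()}
        for _ in range(num_samples)
    ]


search_space = {
    'param1': [0, 1, 2],
    'param2': ['a', 'b'],
}
for kwargs in sample_settings(search_space, num_samples=4):
    # Each kwargs dict could then be handed to one of the single-run methods
    # above, e.g. SWEEPER.run_single_docker(example_function, kwargs).
    print(kwargs)
```

Fixing the seed keeps the sampled settings reproducible between launches.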