allennlp-optuna is an AllenNLP plugin for hyperparameter optimization using Optuna.
Machine \ Device | Single GPU | Multi GPUs |
---|---|---|
Single Node | ✅ | Partial |
Multi Nodes | ✅ | Partial |
AllenNLP supports distributed training (https://medium.com/ai2-blog/c4d7c17eb6d6).
Unfortunately, allennlp-optuna doesn't fully support this feature.
With multiple GPUs, you can run hyperparameter optimization, but you cannot enable pruning.
(For more details, please see himkt/allennlp-optuna#20 and optuna/optuna#1990.)
Alternatively, allennlp-optuna supports distributed optimization with multiple machines.
Please read the tutorial about distributed optimization in allennlp-optuna.
You can also learn about the mechanism of Optuna in the paper or documentation.
You can read the documentation on readthedocs.
pip install allennlp_optuna
# Create .allennlp_plugins at the top of your repository or $HOME/.allennlp/plugins
# For more information, please see https://github.com/allenai/allennlp#plugins
echo 'allennlp_optuna' >> .allennlp_plugins
The model configuration is written in Jsonnet.
You have to replace the values of hyperparameters with the Jsonnet function std.extVar.
Remember to cast external variables to the desired types with std.parseInt or std.parseJson.
local lr = 0.1; // before
↓↓↓
local lr = std.parseJson(std.extVar('lr')); // after
For more information, please refer to the AllenNLP Guide.
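The cast matters because external variables reach the Jsonnet config as strings. The following Python sketch mimics that behavior with the standard library only: json.loads stands in for std.parseJson and int stands in for std.parseInt. The ext_vars dictionary and its values are hypothetical and are not part of allennlp-optuna.

```python
import json

# Hypothetical external variables, held as strings (the form std.extVar returns).
ext_vars = {"lr": "0.1", "embedding_dim": "64", "dropout": "0.25"}

# Without a cast, the value is still a string and cannot be used as a number.
raw_lr = ext_vars["lr"]

# json.loads plays the role of std.parseJson: it handles both ints and floats.
lr = json.loads(ext_vars["lr"])
dropout = json.loads(ext_vars["dropout"])

# int plays the role of std.parseInt: it handles integers only.
embedding_dim = int(ext_vars["embedding_dim"])

print(type(raw_lr).__name__, type(lr).__name__, type(embedding_dim).__name__)
```

Using std.parseJson for every hyperparameter is a reasonable default, since it accepts both integer and floating-point values.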
You can define the search space in JSON.
Each hyperparameter config must have type and attributes.
You can see which parameters are available for each hyperparameter in the Optuna API reference.
[
{
"type": "int",
"attributes": {
"name": "embedding_dim",
"low": 64,
"high": 128
}
},
{
"type": "int",
"attributes": {
"name": "max_filter_size",
"low": 2,
"high": 5
}
},
{
"type": "int",
"attributes": {
"name": "num_filters",
"low": 64,
"high": 256
}
},
{
"type": "int",
"attributes": {
"name": "output_dim",
"low": 64,
"high": 256
}
},
{
"type": "float",
"attributes": {
"name": "dropout",
"low": 0.0,
"high": 0.5
}
},
{
"type": "float",
"attributes": {
"name": "lr",
"low": 5e-3,
"high": 5e-1,
"log": true
}
}
]
Parameters for suggest_#{type} are available for a config of type=#{type} (e.g. when type=float, you can see the available parameters in suggest_float).
Please see the example for details.
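The mapping from type to suggest_#{type} can be sketched in Python as follows. This is an assumption about how such a config could be dispatched, not allennlp-optuna's actual implementation. RecordingTrial is a hypothetical stand-in for optuna.trial.Trial so that the sketch runs without Optuna installed; it simply returns the lower bound of each range.

```python
import json

# Two entries in the same shape as the search space above.
HPARAMS_JSON = """
[
  {"type": "int", "attributes": {"name": "embedding_dim", "low": 64, "high": 128}},
  {"type": "float", "attributes": {"name": "lr", "low": 5e-3, "high": 5e-1, "log": true}}
]
"""


def suggest_all(trial, hparams):
    """Call trial.suggest_<type>(**attributes) for every entry and collect the values."""
    values = {}
    for spec in hparams:
        suggest = getattr(trial, "suggest_" + spec["type"])
        values[spec["attributes"]["name"]] = suggest(**spec["attributes"])
    return values


class RecordingTrial:
    """Hypothetical stand-in for optuna.trial.Trial (assumption, for illustration)."""

    def __init__(self):
        self.calls = []

    def suggest_int(self, name, low, high, *, step=1, log=False):
        # Record the call and return the lower bound instead of sampling.
        self.calls.append(("suggest_int", name))
        return low

    def suggest_float(self, name, low, high, *, step=None, log=False):
        # Record the call and return the lower bound instead of sampling.
        self.calls.append(("suggest_float", name))
        return low


trial = RecordingTrial()
params = suggest_all(trial, json.loads(HPARAMS_JSON))
print(params)
print(trial.calls)
```

Because the attributes are forwarded as keyword arguments, any key in attributes has to be a parameter accepted by the corresponding suggest method.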
allennlp tune \
config/imdb_optuna.jsonnet \
config/hparams.json \
--serialization-dir result/hpo \
--study-name test
Optionally, you can specify the metrics and direction you are optimizing for:
allennlp tune \
config/imdb_optuna.jsonnet \
config/hparams.json \
--serialization-dir result/hpo \
--study-name test \
--metrics best_validation_accuracy \
--direction maximize
You can choose a pruner/sampler implemented in Optuna. To specify a pruner/sampler, create a JSON config file.
An example optuna.json looks like:
{
"pruner": {
"type": "HyperbandPruner",
"attributes": {
"min_resource": 1,
"reduction_factor": 5
}
},
"sampler": {
"type": "TPESampler",
"attributes": {
"n_startup_trials": 5
}
}
}
Then add an epoch callback to your configuration (https://guide.allennlp.org/hyperparameter-optimization#6).
callbacks: [
{
type: 'optuna_pruner',
}
],
- config/imdb_optuna.jsonnet is a simple configuration for allennlp-optuna
- config/imdb_optuna_with_pruning.jsonnet is a configuration using an Optuna pruner (and TPESampler)
$ diff config/imdb_optuna.jsonnet config/imdb_optuna_with_pruning.jsonnet
32d31
< datasets_for_vocab_creation: ['train'],
58a58,62
> callbacks: [
> {
> type: 'optuna_pruner',
> }
> ],
Then, you can use the pruning callback by running the following:
allennlp tune \
config/imdb_optuna_with_pruning.jsonnet \
config/hparams.json \
--optuna-param-path config/optuna.json \
--serialization-dir result/hpo_with_optuna_config \
--study-name test_with_pruning
allennlp best-params \
--study-name test
allennlp retrain \
config/imdb_optuna.jsonnet \
--serialization-dir retrain_result \
--study-name test
You can run optimizations in parallel.
You can easily run distributed optimization by adding the --skip-if-exists option to the allennlp tune command.
allennlp tune \
config/imdb_optuna.jsonnet \
config/hparams.json \
--optuna-param-path config/optuna.json \
--serialization-dir result \
--study-name test \
--skip-if-exists
allennlp-optuna uses SQLite as the default storage for results. You can easily run distributed optimization across machines by using MySQL or PostgreSQL as the storage.
For example, if you want to use MySQL as the storage, the command looks like the following:
allennlp tune \
config/imdb_optuna.jsonnet \
config/hparams.json \
--optuna-param-path config/optuna.json \
--serialization-dir result \
--study-name test \
--storage mysql://<user_name>:<passwd>@<db_host>/<db_name> \
--skip-if-exists
You can run the above command on each machine to run multi-node distributed optimization.
If you want to know about the mechanism of Optuna's distributed optimization, please see the official documentation: https://optuna.readthedocs.io/en/latest/tutorial/10_key_features/004_distributed.html
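When the same command has to be launched on several machines, it can help to build the storage URL and the argument list in one place. The Python sketch below does that using only the flags shown above. The helper names, credentials, and host are hypothetical. Percent-encoding the user name and password is an assumption based on how SQLAlchemy-style database URLs treat special characters.

```python
import shlex
from urllib.parse import quote


def mysql_storage_url(user, password, host, db_name):
    """Build a mysql:// storage URL, percent-encoding the user name and password."""
    return "mysql://{}:{}@{}/{}".format(
        quote(user, safe=""), quote(password, safe=""), host, db_name
    )


def tune_command(storage, study_name="test"):
    """Return the argument list for the allennlp tune invocation shown above."""
    return [
        "allennlp", "tune",
        "config/imdb_optuna.jsonnet",
        "config/hparams.json",
        "--optuna-param-path", "config/optuna.json",
        "--serialization-dir", "result",
        "--study-name", study_name,
        "--storage", storage,
        "--skip-if-exists",
    ]


# Hypothetical credentials and host, for illustration only.
url = mysql_storage_url("optuna", "p@ss/word", "db.example.com", "optuna_db")
print(shlex.join(tune_command(url)))
```

Every machine must use the same storage URL and study name so that all workers contribute trials to the same study.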
- Cookpad Techlife (in Japanese): https://techlife.cookpad.com/entry/2020/11/06/110000
  - allennlp-optuna is used for optimizing the hyperparameters of an NER model