forked from microsoft/qlib
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
support optimization based strategy (microsoft#754)
* support optimization based strategy
* fix riskdata not found & update doc
* refactor signal_strategy
* add portfolio example
* Update examples/portfolio/prepare_riskdata.py
  Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>
* fix typo
  Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>
* fix typo
  Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>
* update doc
* fix riskmodel doc

Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>
Showing 14 changed files with 667 additions and 261 deletions.
@@ -0,0 +1,46 @@
# Portfolio Optimization Strategy

## Introduction

In `qlib/examples/benchmarks` we have various **alpha** models that predict
stock returns. We also use a simple rule-based `TopkDropoutStrategy` to
evaluate the investment performance of these models. However, such a strategy
is too simple to control portfolio risks such as correlation and volatility.

To this end, an optimization-based strategy should be used to manage the
trade-off between return and risk. In this doc, we show how to use
`EnhancedIndexingStrategy` to maximize portfolio return while minimizing
tracking error relative to a benchmark.
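
To make the trade-off concrete, here is a minimal NumPy sketch of a return-versus-tracking-error objective. It is an illustration under assumptions, not the code inside `EnhancedIndexingStrategy`: the function name `active_objective`, the `risk_aversion` parameter, and the toy numbers are all hypothetical.

```python
import numpy as np


def active_objective(w, w_bench, exp_ret, cov, risk_aversion=1.0):
    """Score a portfolio by active return minus a penalty on active variance.

    Hypothetical helper for illustration only; it is not part of qlib.
    """
    w = np.asarray(w, dtype=float)
    w_bench = np.asarray(w_bench, dtype=float)
    exp_ret = np.asarray(exp_ret, dtype=float)
    cov = np.asarray(cov, dtype=float)

    d = w - w_bench                  # active weights relative to the benchmark
    active_ret = float(d @ exp_ret)  # expected return in excess of the benchmark
    active_var = float(d @ cov @ d)  # squared tracking error
    objective = active_ret - risk_aversion * active_var
    return objective, np.sqrt(active_var)


# Toy two-stock example with made-up numbers.
objective, tracking_error = active_objective(
    w=[0.6, 0.4],
    w_bench=[0.5, 0.5],
    exp_ret=[0.02, 0.01],
    cov=[[0.04, 0.0], [0.0, 0.01]],
)
print(objective, tracking_error)
```

An optimizer would search over `w`, subject to constraints such as full investment, to maximize the first returned value. A larger `risk_aversion` pulls the solution toward the benchmark weights.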

## Preparation

We use China stock market data for our example.

1. Prepare CSI300 weight:

```bash
wget http://fintech.msra.cn/stock_data/downloads/csi300_weight.zip
unzip -d ~/.qlib/qlib_data/cn_data csi300_weight.zip
rm -f csi300_weight.zip
```

2. Prepare risk model data:

```bash
python prepare_riskdata.py
```

Here we use a **Statistical Risk Model** implemented in `qlib.model.riskmodel`.
However, users are strongly recommended to use other risk models for better quality:
* **Fundamental Risk Model** like MSCI BARRA
* [Deep Risk Model](https://arxiv.org/abs/2107.05201)
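
`prepare_riskdata.py` stores the risk model in decomposed form: factor exposures, factor covariance, and specific risk (saved as volatility). As a rough sketch of how these pieces relate to a full covariance matrix, assuming the usual factor-model identity $\Sigma = F \Sigma_b F^\top + \mathrm{diag}(\sigma_u^2)$; the helper name `assemble_covariance` and the toy inputs are hypothetical:

```python
import numpy as np


def assemble_covariance(factor_exp, factor_cov, specific_risk):
    """Rebuild a full stock covariance matrix from factor-model components.

    Hypothetical helper for illustration only; it is not part of qlib.
    """
    F = np.asarray(factor_exp, dtype=float)          # shape (n_stocks, n_factors)
    cov_b = np.asarray(factor_cov, dtype=float)      # shape (n_factors, n_factors)
    vol_u = np.asarray(specific_risk, dtype=float)   # shape (n_stocks,), volatility
    # Specific risk is stored as volatility, so square it to get variance.
    return F @ cov_b @ F.T + np.diag(vol_u ** 2)


# Toy example: two stocks, one factor, made-up numbers.
cov = assemble_covariance(
    factor_exp=[[1.0], [2.0]],
    factor_cov=[[0.5]],
    specific_risk=[0.1, 0.2],
)
print(cov)
```

Keeping the components separate avoids materializing the full stock-by-stock matrix, which matters when the universe is large and the number of factors is small.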

## End-to-End Workflow

You can run the whole workflow with `EnhancedIndexingStrategy` by executing
`qrun config_enhanced_indexing.yaml`.

In this config, we mainly changed the strategy section compared to
`qlib/examples/benchmarks/workflow_config_lightgbm_Alpha158.yaml`.
@@ -0,0 +1,71 @@
qlib_init:
    provider_uri: "~/.qlib/qlib_data/cn_data"
    region: cn
market: &market csi300
benchmark: &benchmark SH000300
data_handler_config: &data_handler_config
    start_time: 2008-01-01
    end_time: 2020-08-01
    fit_start_time: 2008-01-01
    fit_end_time: 2014-12-31
    instruments: *market
port_analysis_config: &port_analysis_config
    strategy:
        class: EnhancedIndexingStrategy
        module_path: qlib.contrib.strategy
        kwargs:
            model: <MODEL>
            dataset: <DATASET>
            riskmodel_root: ./riskdata
    backtest:
        start_time: 2017-01-01
        end_time: 2020-08-01
        account: 100000000
        benchmark: *benchmark
        exchange_kwargs:
            limit_threshold: 0.095
            deal_price: close
            open_cost: 0.0005
            close_cost: 0.0015
            min_cost: 5
task:
    model:
        class: LGBModel
        module_path: qlib.contrib.model.gbdt
        kwargs:
            loss: mse
            colsample_bytree: 0.8879
            learning_rate: 0.2
            subsample: 0.8789
            lambda_l1: 205.6999
            lambda_l2: 580.9768
            max_depth: 8
            num_leaves: 210
            num_threads: 20
    dataset:
        class: DatasetH
        module_path: qlib.data.dataset
        kwargs:
            handler:
                class: Alpha158
                module_path: qlib.contrib.data.handler
                kwargs: *data_handler_config
            segments:
                train: [2008-01-01, 2014-12-31]
                valid: [2015-01-01, 2016-12-31]
                test: [2017-01-01, 2020-08-01]
    record:
        - class: SignalRecord
          module_path: qlib.workflow.record_temp
          kwargs:
            model: <MODEL>
            dataset: <DATASET>
        - class: SigAnaRecord
          module_path: qlib.workflow.record_temp
          kwargs:
            ana_long_short: False
            ann_scaler: 252
        - class: PortAnaRecord
          module_path: qlib.workflow.record_temp
          kwargs:
            config: *port_analysis_config
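
In this config the three dataset segments run in chronological order without overlap, and the backtest window coincides with the `test` segment. A small standard-library sketch of that sanity check follows, with the dates copied by hand from the config above; the helper `check_segments` is hypothetical and not part of qlib:

```python
from datetime import date

# Dates copied by hand from the `segments` and `backtest` sections of the config.
segments = {
    "train": (date(2008, 1, 1), date(2014, 12, 31)),
    "valid": (date(2015, 1, 1), date(2016, 12, 31)),
    "test": (date(2017, 1, 1), date(2020, 8, 1)),
}
backtest = (date(2017, 1, 1), date(2020, 8, 1))


def check_segments(segments):
    """Return True if every segment is well-formed and starts after the previous one ends.

    Relies on dict insertion order (guaranteed since Python 3.7).
    """
    ordered = list(segments.values())
    well_formed = all(start <= end for start, end in ordered)
    disjoint = all(prev[1] < nxt[0] for prev, nxt in zip(ordered, ordered[1:]))
    return well_formed and disjoint


segments_ok = check_segments(segments)
backtest_matches_test = backtest == segments["test"]
print(segments_ok, backtest_matches_test)
```

Keeping the backtest inside the `test` segment means the strategy is only evaluated on dates the model did not see during training or validation.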
@@ -0,0 +1,55 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
import os
import numpy as np
import pandas as pd

from qlib.data import D
from qlib.model.riskmodel import StructuredCovEstimator


def prepare_data(riskdata_root="./riskdata", T=240, start_time="2016-01-01"):

    universe = D.features(D.instruments("csi300"), ["$close"], start_time=start_time).swaplevel().sort_index()

    price_all = (
        D.features(D.instruments("all"), ["$close"], start_time=start_time).squeeze().unstack(level="instrument")
    )

    # StructuredCovEstimator is a statistical risk model
    riskmodel = StructuredCovEstimator()

    for i in range(T - 1, len(price_all)):

        date = price_all.index[i]
        ref_date = price_all.index[i - T + 1]

        print(date)

        codes = universe.loc[date].index
        price = price_all.loc[ref_date:date, codes]

        # calculate return and remove extreme return
        ret = price.pct_change()
        ret.clip(ret.quantile(0.025), ret.quantile(0.975), axis=1, inplace=True)

        # run risk model
        F, cov_b, var_u = riskmodel.predict(ret, is_price=False, return_decomposed_components=True)

        # save risk data
        root = riskdata_root + "/" + date.strftime("%Y%m%d")
        os.makedirs(root, exist_ok=True)

        pd.DataFrame(F, index=codes).to_pickle(root + "/factor_exp.pkl")
        pd.DataFrame(cov_b).to_pickle(root + "/factor_cov.pkl")
        # for specific_risk we follow the convention to save volatility
        pd.Series(np.sqrt(var_u), index=codes).to_pickle(root + "/specific_risk.pkl")


if __name__ == "__main__":

    import qlib

    qlib.init(provider_uri="~/.qlib/qlib_data/cn_data")

    prepare_data()
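
The return-cleaning step in `prepare_data` clips each stock's returns to its own 2.5% and 97.5% quantiles before the risk model sees them. Below is a self-contained sketch of that step on synthetic prices. The random data and ticker names are made up for illustration, and the clipping is written without `inplace=True`:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2016-01-04", periods=240, freq="B")

# Synthetic random-walk prices for three made-up tickers.
price = pd.DataFrame(
    100 * np.exp(np.cumsum(rng.normal(0, 0.02, size=(240, 3)), axis=0)),
    index=dates,
    columns=["AAA", "BBB", "CCC"],
)

ret = price.pct_change()      # first row is NaN by construction
lower = ret.quantile(0.025)   # per-column lower bound
upper = ret.quantile(0.975)   # per-column upper bound

# axis=1 aligns the bounds with the columns, so each stock gets its own limits.
clipped = ret.clip(lower=lower, upper=upper, axis=1)

print(clipped.describe().loc[["min", "max"]])
```

Clipping per column rather than with one global threshold keeps a volatile stock from setting the bounds for a quiet one.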
File renamed without changes.
File renamed without changes.