detailed_workflow.ipyb collector.py not working #1875

Open
vishalkhialani opened this issue Dec 26, 2024 · 3 comments
Labels
question Further information is requested

Comments

@vishalkhialani

vishalkhialani commented Dec 26, 2024

❓ Questions and Help

Thank you for making this tool.
I am learning the ropes and was going through [\examples\tutorial\detailed_workflow.ipynb](https://github.com/microsoft/qlib/blob/main/examples/tutorial/detailed_workflow.ipynb).

I am stuck at the cell where it is supposed to download extra data:

if not p.exists():
    !cd ../../scripts/data_collector/pit/ && pip install -r requirements.txt
    !cd ../../scripts/data_collector/pit/ && python collector.py download_data --source_dir ~/.qlib/stock_data/source/pit --start 2000-01-01 --end 2020-01-01 --interval quarterly --symbol_regex "^(600519|000725).*"
    !cd ../../scripts/data_collector/pit/ && python collector.py normalize_data --interval quarterly --source_dir ~/.qlib/stock_data/source/pit --normalize_dir ~/.qlib/stock_data/source/pit_normalized
    !cd ../../scripts/ && python dump_pit.py dump --csv_path ~/.qlib/stock_data/source/pit_normalized --qlib_dir ~/.qlib/qlib_data/cn_data --interval quarterly

It just stalls for a couple of hours and does not download the data. You can see the output here:
https://gist.github.com/vishalkhialani/1b15eb9a50511f67e05cf2d3d7835f75

I am on Windows 11 and tried it on Linux too, but that did not help.
I tried to debug it, but it is taking very long as I am new to the tool. Could someone please guide me on this?

@vishalkhialani vishalkhialani added the question Further information is requested label Dec 26, 2024
@vishalkhialani
Author

I spent some time on this and have narrowed the issue down to this call in \scripts\data_collector\utils.py:

resp = requests.get(HS_SYMBOLS_URL.format(s_type=_k), timeout=None)

The URL http://app.finance.ifeng.com/hq/list.php?type=stock_a&class=ha returns a 404.

The API has either been deprecated or moved. I can't fix this myself, as the authors will need to find an alternative endpoint or the updated one.
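
For anyone who wants to confirm this kind of failure quickly, here is a minimal sketch of a check that fails fast instead of waiting indefinitely. It uses the standard-library `urllib` rather than the `requests` call in `utils.py`. The function name `check_endpoint` and its `opener` parameter are hypothetical, introduced only for illustration. The finite timeout is my own choice, in contrast to the `timeout=None` in the line above.

```python
import socket
import urllib.error
import urllib.request


def check_endpoint(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return a short status string for `url`.

    `opener` is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib.request.urlopen.
    """
    try:
        # A finite timeout makes a dead server fail quickly instead of hanging.
        with opener(url, timeout=timeout) as resp:
            return f"ok ({resp.status})"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404.
        return f"http error {exc.code}"
    except urllib.error.URLError as exc:
        # DNS failure, refused connection, connect timeout, and similar.
        return f"request failed: {exc.reason}"
    except (socket.timeout, TimeoutError):
        # Read timeout after the connection was established.
        return "timeout"


# Example call (performs a real network request):
# print(check_endpoint("http://app.finance.ifeng.com/hq/list.php?type=stock_a&class=ha"))
```

Reporting the HTTP status explicitly is the point of the sketch: a 404 is surfaced immediately rather than showing up only as a stalled download.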

@SunsetWolf
Collaborator

Hi @vishalkhialani, this issue has been resolved in PR 1758. Please pull the latest code and retry.

@vishalkhialani
Author

Thanks, @SunsetWolf. I thought I had already pulled the latest code. I will check later.
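
One way to check whether a local checkout still references the old host is to search the collector scripts for it. This is a rough sketch. The helper `find_references` is hypothetical, and it assumes the fix changed the endpoint, which this thread does not confirm.

```python
from pathlib import Path


def find_references(root, needle, pattern="*.py"):
    """Yield (path, line_number, line) for each line under `root` containing `needle`."""
    for path in sorted(Path(root).rglob(pattern)):
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            # Skip files that cannot be read.
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                yield path, lineno, line.strip()


# Example call, run from the repository root:
# for hit in find_references("scripts/data_collector", "app.finance.ifeng.com"):
#     print(*hit)
```

If the search still finds the host, the checkout may predate the fix. If it finds nothing, the cause is probably elsewhere.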
