In this short tutorial, we show how to run our model on arbitrary videos and visualize the predictions. Note that this feature is only provided for experimentation/research purposes and presents some limitations, as this repository is meant to provide a reference implementation of the approach described in the paper (not production-ready code for inference in the wild).
Our scripts perform single- and two-person 3D pose estimation on custom videos. The repository provides an API to generate 3D joint coordinates and to render the animation. You can plug other detectors and 2D pose estimation methods into our scripts, and you can feed the results into higher-level tasks such as skeleton-based action recognition.
Pipeline: We adopt YOLOv3 and SORT for human detection and tracking, HRNet for 2D pose estimation, and GAST-Net for 2D-to-3D pose reconstruction.
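For orientation, here is a minimal sketch of the data flow between the three stages. The function names, the 17-joint layout, and the array shapes are assumptions made for exposition, not the repository's actual API (see gen_skes.py for the real entry point):

import numpy as np

def detect_and_track(frames, num_people=1):
    """Stage 1 stand-in (YOLOv3 + SORT): one tracked box per person per frame."""
    T = len(frames)
    return np.zeros((T, num_people, 4))          # (x1, y1, x2, y2) per box

def estimate_2d(frames, boxes):
    """Stage 2 stand-in (HRNet): 2D keypoints per tracked person."""
    T, num_people = boxes.shape[:2]
    return np.zeros((T, num_people, 17, 2))      # 17 joints, (x, y) each

def lift_to_3d(keypoints_2d):
    """Stage 3 stand-in (GAST-Net): temporal 2D-to-3D reconstruction."""
    depth = np.zeros(keypoints_2d.shape[:-1] + (1,))
    return np.concatenate([keypoints_2d, depth], axis=-1)

def run_pipeline(frames, num_people=1):
    boxes = detect_and_track(frames, num_people) # detection + tracking
    kps_2d = estimate_2d(frames, boxes)          # per-person 2D poses
    return lift_to_3d(kps_2d)                    # (T, num_people, 17, 3)

Swapping in another detector or 2D pose estimator then amounts to replacing the corresponding stage, provided the output shapes are kept compatible.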
- Prepare the YOLOv3 pretrained model:
cd checkpoint
mkdir yolov3
cd yolov3
wget https://pjreddie.com/media/files/yolov3.weights
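If wget is unavailable (e.g. on Windows), the same file can be fetched with a few lines of Python instead; the URL is the one above, and the target path matches the directory layout shown further down. Run this from ${root_path}:

import os
import urllib.request

# Fetch the YOLOv3 weights into checkpoint/yolov3/ (run from ${root_path}).
os.makedirs("checkpoint/yolov3", exist_ok=True)
urllib.request.urlretrieve(
    "https://pjreddie.com/media/files/yolov3.weights",
    "checkpoint/yolov3/yolov3.weights",
)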
- Prepare the HRNet pretrained model:
cd ../
mkdir hrnet
cd hrnet
mkdir pose_coco
Download the HRNet pretrained model [pose_hrnet_w48_384x288.pth] and place it in the pose_coco directory.
- Prepare the GAST-Net pretrained model:
cd ../
mkdir gastnet
Download the GAST-Net pretrained model [27_frame_model.bin] and place it in the gastnet directory.
${root_path}
|-- checkpoint
|   |-- yolov3
|   |   |-- yolov3.weights
|   |-- hrnet
|   |   |-- pose_coco
|   |   |   |-- pose_hrnet_w48_384x288.pth
|   |-- gastnet
|   |   |-- 27_frame_model.bin
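A quick sanity check, run from ${root_path}, that all three pretrained models landed where the scripts expect them (the paths are copied from the layout above):

from pathlib import Path

# Expected checkpoint files, as in the directory layout above.
expected = [
    Path("checkpoint/yolov3/yolov3.weights"),
    Path("checkpoint/hrnet/pose_coco/pose_hrnet_w48_384x288.pth"),
    Path("checkpoint/gastnet/27_frame_model.bin"),
]

for path in expected:
    status = "OK" if path.is_file() else "MISSING"
    print(f"{status:7s} {path}")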
- Single-person 3D pose estimation:
python gen_skes.py -v baseball.mp4 -np 1 --animation
- Two-person 3D pose estimation:
python gen_skes.py -v apart.avi -np 2 --animation
- The resulting animations can be found in the output directory.
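To process a whole folder of clips, the command line above can be driven from Python. Only the documented flags (-v, -np, --animation) are used below; the videos folder name is an example, and whether -v expects a bare filename (as with baseball.mp4 above) or a full path depends on where gen_skes.py looks for its input:

import subprocess
from pathlib import Path

# Run single-person estimation with animation for every clip in ./videos
# (example folder; adjust the -v argument if gen_skes.py expects full paths).
for video in sorted(Path("videos").glob("*.mp4")):
    subprocess.run(
        ["python", "gen_skes.py", "-v", video.name, "-np", "1", "--animation"],
        check=True,
    )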
- To generate only the 3D joint coordinates (without rendering the animation), omit the --animation flag:
python gen_skes.py -v baseball.mp4 -np 1
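A minimal sketch for inspecting the exported coordinates afterwards, assuming they are written to the output directory as a NumPy archive; the filename below is hypothetical, so check what the script actually produces on your machine:

import numpy as np

# Hypothetical output filename -- check the output directory for the real one.
data = np.load("output/baseball.npz", allow_pickle=True)
for key in data.files:
    arr = data[key]
    print(key, getattr(arr, "shape", type(arr)))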