This repository contains the code for our paper, Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks. Our approach leverages foundation models and graph neural networks to perform few-shot segmentation of machinery parts, even in complex and low-data scenarios.
To help you get started, we provide a small set of synthetic truck images in data/test_data, which you can use for quick testing. For a detailed explanation of the methodology, implementation, and results, please refer to our paper.
For a streamlined and reproducible setup, we provide a Docker Devcontainer. This ensures a properly configured environment with all necessary dependencies. Prerequisites:
- Docker (including NVIDIA Docker for GPU acceleration)
- VS Code with the Dev Containers extension
To get started, simply open the repository in VS Code and launch the Devcontainer. The setup process will handle all required installations automatically, including downloading the necessary weights.
If you prefer running the code directly on your machine, follow these steps:
- Install the required dependencies:

  pip install -r requirements.txt

- Install PyTorch with CUDA 12.1 support (recommended for GPU acceleration):

  pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
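After installing, you can sanity-check that the core packages resolve before running any scripts. This is a small stdlib-only helper sketch, not part of the repository:

```python
import importlib.util

def missing_packages(packages):
    """Return the subset of package names that cannot be imported."""
    return [p for p in packages if importlib.util.find_spec(p) is None]

# Sanity-check the core dependencies before running train/test:
missing = missing_packages(["torch", "torchvision", "torchaudio"])
if missing:
    print("Please install:", ", ".join(missing))
```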
SuperPoint Weights (Local Installation Only)
For local setups, you need to manually download the three required SuperPoint weight files from the SuperGlue repository and place them in:
foundation_graph_segmentation/interest_point_detectors/superpoint/weights/
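To confirm the weights landed in the right place, a quick check like the following can help. The `.pth` extension is an assumption here; verify the actual filenames against the SuperGlue repository:

```python
from pathlib import Path

# Path from the README's local-setup instructions.
WEIGHTS_DIR = "foundation_graph_segmentation/interest_point_detectors/superpoint/weights"

def found_weights(weights_dir):
    """Return the sorted names of all .pth files in the weights directory."""
    return sorted(p.name for p in Path(weights_dir).glob("*.pth"))

if __name__ == "__main__":
    names = found_weights(WEIGHTS_DIR)
    print(f"Found {len(names)} weight file(s): {names}")
```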
For further details, please refer to our paper.
The repository includes three trained checkpoints stored in the checkpoints folder. Each checkpoint corresponds to a different segmentation granularity:
- TRUCK
- TRUCK CRANE
- LOW
These checkpoints were trained on the synthetic truck dataset.
Each granularity has a corresponding configuration file in the config folder. The configuration files define the parameters for training and testing. Feel free to experiment with them!
To run inference on the test images, use the following command, replacing the config file with the one matching the granularity you want to test:
python3.10 test.py --config_file config/parameters_test_TRUCK.yaml
python3.10 test.py --config_file config/parameters_test_TRUCK_CRANE.yaml
python3.10 test.py --config_file config/parameters_test_LOW.yaml
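To run all three granularities in one go, the commands above can be wrapped in a short driver script. This is a convenience sketch (not part of the repository); the config paths are taken from the README:

```python
from pathlib import Path
import subprocess

# Config files listed in the README, one per granularity.
CONFIGS = [
    "config/parameters_test_TRUCK.yaml",
    "config/parameters_test_TRUCK_CRANE.yaml",
    "config/parameters_test_LOW.yaml",
]

def build_commands(configs, script="test.py"):
    """Build one command line per config, matching the README invocation."""
    return [["python3.10", script, "--config_file", c] for c in configs]

if __name__ == "__main__" and Path("test.py").exists():
    for cmd in build_commands(CONFIGS):
        subprocess.run(cmd, check=True)  # stops on the first failure
```

The same helper works for training by passing `script="train.py"` with the train configs.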
The results will be saved in the results folder.
To train a model from scratch, use the following command with the desired configuration file:
python3.10 train.py --config_file config/parameters_train_TRUCK.yaml
python3.10 train.py --config_file config/parameters_train_TRUCK_CRANE.yaml
python3.10 train.py --config_file config/parameters_train_LOW.yaml
The trained model will be saved in the checkpoints folder.
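After a training run, you may want to pick up the newest checkpoint programmatically. A minimal sketch, assuming checkpoints are written as files into the checkpoints folder (the `*.pth` pattern is an assumption; adjust it to the actual saved filenames):

```python
from pathlib import Path

def latest_checkpoint(checkpoint_dir="checkpoints", pattern="*.pth"):
    """Return the most recently modified checkpoint file, or None if none exist."""
    files = sorted(Path(checkpoint_dir).glob(pattern),
                   key=lambda p: p.stat().st_mtime)
    return files[-1] if files else None
```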
A combination of SuperPoint, CLIPSeg, Segment Anything, and Graph Neural Networks.
Using Blender to create synthetic images by randomizing the environment, perspective, and crane articulation.
Rendered video of the synthetic truck with changing perspective, background, lighting and articulation. Right side shows rendering, left side shows segmentation overlay.
Different granularity and sample sizes. Qualitative results on synthetic truck dataset.
Training on 10 synthetic images. Although the synthetic truck-mounted loading crane differs from the real one, the model transfers its knowledge to the real world.
Using the DAVIS 2017 dataset, trained on the first, middle, and last frames.
| Segmentation Classes | Image |
|---|---|
| One Class | ![]() |
| Two Classes | ![]() |
| Multiple Classes | ![]() |
This work was conducted at the AIT Austrian Institute of Technology 🇦🇹 in the Center for Vision, Automation & Control 🏗️.
| Name & Email | AIT Research Profile | Google Scholar |
|---|---|---|
| 👨🔬 Michael Schwingshackl 📧 Michael.Schwingshackl@ait.ac.at | 🔗 Profile | 🔗 Scholar |
| 👨🔬 Fabio Francisco Oberweger 📧 Fabio.Oberweger@ait.ac.at | 🔗 Profile | 🔗 Scholar |
| 👨🔬 Markus Murschitz 📧 Markus.Murschitz@ait.ac.at | 🔗 Profile | 🔗 Scholar |
The provided test images allow you to evaluate the approach on a small sample of our synthetic truck dataset. If you require access to the full dataset for research or further experiments, please reach out to us.
For inquiries, feel free to contact us.
If you use Hopomop in your research, please use the following BibTeX entry.
@InProceedings{Schwingshackl_2025_WACV,
author = {Schwingshackl, Michael and Oberweger, Fabio F. and Murschitz, Markus},
title = {Few-Shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks},
booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
month = {February},
year = {2025},
pages = {1989-1998}
}