TabPFN

TabPFN is a foundation model for tabular data that, according to the paper cited below, outperforms traditional methods on small datasets while being dramatically faster. This repository contains the core PyTorch implementation with CUDA optimization.

⚠️ Major Update: Version 2.0: Complete codebase overhaul with new architecture and features. The previous version is available at v1.0.0 and via pip install "tabpfn<2" (the quotes keep the shell from treating < as a redirect).

📚 For detailed usage examples and best practices, check out Interactive Colab Tutorial

🌐 TabPFN Ecosystem

Choose the right TabPFN implementation for your needs:

TabPFN Client: Easy-to-use API client for cloud-based inference
TabPFN Extensions: Community extensions and integrations
TabPFN (this repo): Core implementation for local deployment and research
TabPFN UX: No-code TabPFN usage

Try our Interactive Colab Tutorial to get started quickly.

🏁 Quick Start

Installation

# Simple installation
pip install tabpfn

# Local development installation
git clone https://github.com/PriorLabs/TabPFN.git
pip install -e "TabPFN[dev]"

Offline Usage

TabPFN automatically downloads model weights when first used. For offline usage:

Manual Download

Download the model files manually from HuggingFace:
- Classifier: tabpfn-v2-classifier.ckpt
- Regressor: tabpfn-v2-regressor.ckpt
Then make the file available to TabPFN in one of these ways:
- Specify directly: TabPFNClassifier(model_path="/path/to/model.ckpt")
- Set environment variable: os.environ["TABPFN_MODEL_CACHE_DIR"] = "/path/to/dir"
- Default OS cache directory:
  - Windows: %APPDATA%\tabpfn\
  - macOS: ~/Library/Caches/tabpfn/
  - Linux: ~/.cache/tabpfn/
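
Before going offline, it can help to confirm that the checkpoints are where TabPFN is expected to look. The following is a minimal sketch using only the standard library. `default_cache_dir` and `missing_checkpoints` are hypothetical helper names, not part of the TabPFN API; the paths mirror the list above, and giving the environment variable precedence over the OS default is an assumption, so TabPFN's own resolution logic may differ in detail.

```python
import os
import sys
from pathlib import Path
from typing import List

# Checkpoint file names as listed in the Manual Download section.
CHECKPOINTS = ("tabpfn-v2-classifier.ckpt", "tabpfn-v2-regressor.ckpt")


def default_cache_dir(platform: str = sys.platform) -> Path:
    """Return the documented cache directory (hypothetical helper)."""
    # Assumption: an explicit override wins over the OS default.
    override = os.environ.get("TABPFN_MODEL_CACHE_DIR")
    if override:
        return Path(override)
    if platform.startswith("win"):
        # Assumption: fall back to the home directory if APPDATA is unset.
        return Path(os.environ.get("APPDATA", str(Path.home()))) / "tabpfn"
    if platform == "darwin":
        return Path.home() / "Library" / "Caches" / "tabpfn"
    return Path.home() / ".cache" / "tabpfn"


def missing_checkpoints(cache_dir: Path) -> List[str]:
    """List checkpoints that are absent or empty in cache_dir."""
    missing = []
    for name in CHECKPOINTS:
        path = cache_dir / name
        # A zero-byte file usually indicates an interrupted download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


cache_dir = default_cache_dir()
print("Cache directory:", cache_dir)
print("Missing checkpoints:", missing_checkpoints(cache_dir))
```

An empty list of missing checkpoints only means the files exist and are non-empty; it does not prove that they are valid model files.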

Quick Download Script

import requests
from tabpfn.utils import _user_cache_dir
import sys

# Get default cache directory using TabPFN's internal function
cache_dir = _user_cache_dir(platform=sys.platform)
cache_dir.mkdir(parents=True, exist_ok=True)

# Define models to download
models = {
    "tabpfn-v2-classifier.ckpt": "https://huggingface.co/Prior-Labs/TabPFN-v2-clf/resolve/main/tabpfn-v2-classifier.ckpt",
    "tabpfn-v2-regressor.ckpt": "https://huggingface.co/Prior-Labs/TabPFN-v2-reg/resolve/main/tabpfn-v2-regressor.ckpt",
}

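# Download each model, failing loudly on HTTP errors so that an error page
# is never saved as a checkpoint
for name, url in models.items():
    path = cache_dir / name
    print(f"Downloading {name} to {path}")
    response = requests.get(url, timeout=60)
    response.raise_for_status()
    with open(path, "wb") as f:
        f.write(response.content)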

print(f"Models downloaded to {cache_dir}")

Basic Usage

Classification

from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

from tabpfn import TabPFNClassifier

# Load data
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=42)

# Initialize a classifier
clf = TabPFNClassifier()
clf.fit(X_train, y_train)

# Predict probabilities
prediction_probabilities = clf.predict_proba(X_test)
print("ROC AUC:", roc_auc_score(y_test, prediction_probabilities[:, 1]))

# Predict labels
predictions = clf.predict(X_test)
print("Accuracy:", accuracy_score(y_test, predictions))

Regression

from sklearn.datasets import fetch_openml
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# TabPFNRegressor is the regression counterpart of TabPFNClassifier
from tabpfn import TabPFNRegressor

# Load Boston Housing data
df = fetch_openml(data_id=531, as_frame=True)  # Boston Housing dataset
X = df.data
y = df.target.astype(float)  # Ensure target is float for regression

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=42)

# Initialize the regressor
regressor = TabPFNRegressor()  
regressor.fit(X_train, y_train)

# Predict on the test set
predictions = regressor.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, predictions)
r2 = r2_score(y_test, predictions)

print("Mean Squared Error (MSE):", mse)
print("R² Score:", r2)

Best Results

For optimal performance, use the AutoTabPFNClassifier or AutoTabPFNRegressor for post-hoc ensembling. These can be found in the TabPFN Extensions repository. Post-hoc ensembling combines multiple TabPFN models into an ensemble.
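
To illustrate the general idea of combining several models, here is a simplified sketch that averages the class probabilities of multiple models, with optional weights. It is a generic illustration only and not the algorithm implemented by AutoTabPFNClassifier; the helper names `average_probabilities` and `predict_labels` and the example numbers are made up for this sketch.

```python
from typing import List, Optional, Sequence

# One probability matrix per model: rows are samples, columns are classes.
Matrix = Sequence[Sequence[float]]


def average_probabilities(
    model_probs: Sequence[Matrix],
    weights: Optional[Sequence[float]] = None,
) -> List[List[float]]:
    """Weighted average of per-model class probabilities (illustrative helper)."""
    if not model_probs:
        raise ValueError("at least one model is required")
    if weights is None:
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs):
        raise ValueError("one weight per model is required")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_samples = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    ensemble = []
    for i in range(n_samples):
        row = [
            sum(w * probs[i][c] for w, probs in zip(weights, model_probs)) / total
            for c in range(n_classes)
        ]
        ensemble.append(row)
    return ensemble


def predict_labels(probs: Matrix) -> List[int]:
    """Pick the class index with the highest probability for each sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


# Two hypothetical models scoring the same two samples.
model_a = [[0.9, 0.1], [0.4, 0.6]]
model_b = [[0.7, 0.3], [0.2, 0.8]]
combined = average_probabilities([model_a, model_b])
print(predict_labels(combined))  # → [0, 1]
```

Averaging tends to smooth out the errors of individual models, which is the intuition behind ensembling in general.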

Steps for Best Results:

Install the extensions:

git clone https://github.com/priorlabs/tabpfn-extensions.git
pip install -e tabpfn-extensions

Then fit the auto-ensembling classifier in Python:

from tabpfn_extensions.post_hoc_ensembles.sklearn_interface import AutoTabPFNClassifier

clf = AutoTabPFNClassifier(max_time=120, device="cuda") # 120 seconds tuning time
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)

See our Colab

🤝 Join Our Community

We're building the future of tabular machine learning and would love your involvement:

Connect & Learn:
- Join our Discord Community
- Read our Documentation
- Check out GitHub Issues
Contribute:
- Report bugs or request features
- Submit pull requests
- Share your research and use cases
Stay Updated: Star the repo and join Discord for the latest updates

📜 License

Prior Labs License (Apache 2.0 with additional attribution requirement): here

📚 Citation

You can read our paper explaining TabPFN here.

@article{hollmann2025tabpfn,
 title={Accurate predictions on small data with a tabular foundation model},
 author={Hollmann, Noah and M{\"u}ller, Samuel and Purucker, Lennart and
         Krishnakumar, Arjun and K{\"o}rfer, Max and Hoo, Shi Bin and
         Schirrmeister, Robin Tibor and Hutter, Frank},
 journal={Nature},
 year={2025},
 month={01},
 day={09},
 doi={10.1038/s41586-024-08328-6},
 publisher={Springer Nature},
 url={https://www.nature.com/articles/s41586-024-08328-6},
}

@inproceedings{hollmann2023tabpfn,
  title={TabPFN: A transformer that solves small tabular classification problems in a second},
  author={Hollmann, Noah and M{\"u}ller, Samuel and Eggensperger, Katharina and Hutter, Frank},
  booktitle={International Conference on Learning Representations 2023},
  year={2023}
}

❓ FAQ

Python Version Compatibility

Q: Why can't I use TabPFN with Python 3.8?
A: TabPFN v2 requires Python 3.9 or newer as specified in our pyproject.toml. This is due to our use of newer Python features and type annotations. We recommend updating to Python 3.9+ to use TabPFN v2.

Q: I'm getting pickle errors when loading the model. What could be wrong?
A: First check that you're using Python 3.9+ and PyTorch 2.1+. If you've manually downloaded the model files, ensure they weren't corrupted during download. Try using the download script in the Offline Usage section above.
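
The version requirements mentioned in the answers above can be checked with a few lines of standard-library Python. This is a minimal sketch: `check_environment` and `parse_major_minor` are hypothetical helpers, the minimum versions are taken from the FAQ text above, and the version parsing is deliberately simplistic (it only reads the leading major.minor numbers).

```python
import re
import sys
from importlib import metadata
from typing import List, Tuple

# Minimum versions as stated in the FAQ answers above.
MIN_PYTHON = (3, 9)
MIN_TORCH = (2, 1)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '2.1.0+cu121'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_environment() -> List[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        found = ".".join(str(part) for part in sys.version_info[:2])
        problems.append(f"Python {found} found, 3.9 or newer is required")
    try:
        # Reads the installed package metadata without importing torch.
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        problems.append("PyTorch is not installed")
    else:
        if parse_major_minor(torch_version) < MIN_TORCH:
            problems.append(f"PyTorch {torch_version} found, 2.1 or newer is required")
    return problems


for problem in check_environment():
    print("Problem:", problem)
```

If the script prints nothing and pickle errors persist, a corrupted checkpoint file is the more likely cause.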

🛠️ Development

Setup environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
git clone https://github.com/PriorLabs/TabPFN.git
cd TabPFN
pip install -e ".[dev]"
pre-commit install

Before committing:

pre-commit run --all-files

Run tests:

pytest tests/
