Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments.
Updated Feb 11, 2025 - Jupyter Notebook
Training and exploration of linear probes on Othello-GPT (Li et al., 2022).
A Flax-based library for examining transformers, based on TransformerLens.
Implementation and analysis of Sparse Autoencoders for neural network interpretability research. Features an interactive visualization dashboard and W&B integration.