Mechanistic-Interpretability

Base76 Research Lab — Mechanistic Interpretability

Research-first repository for mechanistic interpretability, residual-state analysis, sparse autoencoders, runtime observability, and intervention-aware analysis.

Start Here

Repository README
Current Status
Findings Surface
Research Index
Reproducibility Guide
Latest Release

Claim Boundary

Current claims are scoped to the active GPT-2 Small setup. Cross-model generalization is not yet established. Read-only observer traces and write-back interventions are treated as distinct evidence classes.
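
As a rough illustration of why the two evidence classes are kept apart, the sketch below uses a hypothetical toy residual stack in pure Python. It is not this repository's code and not GPT-2 Small. The names `toy_forward`, `Observer`, and `zero_first_dim_at_layer_1` are assumptions made up for the example. A read-only observer records intermediate states and leaves the output untouched, while a write-back intervention replaces a state and changes everything downstream.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

Vector = List[float]
# A hook sees (layer_index, state). Returning None means "observe only";
# returning a vector means "write this back into the forward pass".
Hook = Callable[[int, Vector], Optional[Vector]]


def toy_forward(x: Vector, hooks: List[Hook]) -> Vector:
    """Hypothetical 3-layer residual stack: each layer adds 0.5 * state."""
    state = list(x)
    for layer in range(3):
        state = [v + 0.5 * v for v in state]  # toy residual update
        for hook in hooks:
            replacement = hook(layer, list(state))  # hooks receive a copy
            if replacement is not None:
                state = list(replacement)  # write-back path
    return state


@dataclass
class Observer:
    """Read-only: records each state and never returns a replacement."""
    trace: List[Vector] = field(default_factory=list)

    def __call__(self, layer: int, state: Vector) -> None:
        self.trace.append(state)
        return None


def zero_first_dim_at_layer_1(layer: int, state: Vector) -> Optional[Vector]:
    """Write-back: ablates dimension 0 right after layer 1."""
    if layer == 1:
        return [0.0] + state[1:]
    return None


x = [1.0, 2.0]
baseline = toy_forward(x, hooks=[])
observer = Observer()
observed = toy_forward(x, hooks=[observer])
intervened = toy_forward(x, hooks=[zero_first_dim_at_layer_1])

print(observed == baseline)    # the observer does not perturb the run
print(intervened == baseline)  # the intervention does
```

In this toy setting the observer trace describes the unperturbed computation, whereas the intervened run describes a different computation, which is one reason to report the two separately.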

Lab

Base76 Research Lab (Sweden)

Web: https://base76.se/en/
ORCID: https://orcid.org/0009-0000-4015-2357
GitHub: https://github.com/base76-research-lab
