Software

Open-source tools for AI evaluation.

We build open-source PyTorch tools for psychometric modeling, adaptive testing, and rigorous evaluation of AI systems.

01

Composable by default

Metrics, datasets, and uncertainty estimates work together without requiring custom pipelines.

02

Measurement-aware outputs

Every output makes its assumptions visible. Results always include uncertainty and comparability information.

03

Built for community use

Researchers across institutions can adopt, extend, and contribute back to shared infrastructure.

Libraries

Library

torch_measure

A PyTorch library for measurement science. It includes item response theory (IRT) models (Rasch, 2PL, 3PL), computerized adaptive testing, psychometric metrics, and GPU-accelerated estimation.
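
For readers new to these model families, here is a minimal sketch of the ideas in plain Python. It is not torch_measure's API: the function names (`p_correct`, `item_information`, `pick_next_item`) and the `(a, b, c)` tuple layout are assumptions made for illustration only. It shows the standard 3PL response function, which reduces to 2PL when the guessing parameter `c` is 0 and to Rasch when additionally the discrimination `a` is 1, plus the maximum-information rule commonly used to choose the next item in adaptive testing.

```python
import math


def p_correct(theta, a=1.0, b=0.0, c=0.0):
    """3PL probability of a correct response at ability theta.

    a: discrimination, b: difficulty, c: guessing floor.
    c=0 gives the 2PL model; a=1 and c=0 gives the Rasch model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a=1.0, b=0.0, c=0.0):
    """Fisher information of a 3PL item at ability theta.

    With c=0 this simplifies to a**2 * p * (1 - p).
    """
    p = p_correct(theta, a, b, c)
    return a ** 2 * ((p - c) / (1.0 - c)) ** 2 * (1.0 - p) / p


def pick_next_item(theta, items, administered):
    """Return the index of the unused item with maximum information.

    items: list of (a, b, c) tuples (hypothetical layout).
    administered: set of item indices already shown.
    """
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))


# Hypothetical item bank: three Rasch-like items of increasing difficulty.
bank = [(1.0, -1.0, 0.0), (1.0, 0.1, 0.0), (1.0, 2.0, 0.0)]
next_item = pick_next_item(0.0, bank, administered=set())
```

With a current ability estimate of 0.0, the rule selects the item whose difficulty is closest to that estimate, because a Rasch item is most informative where the probability of a correct response is 0.5. A GPU-accelerated library would typically vectorize these computations over tensors of persons and items rather than loop in Python.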