Differentially private synthetic-data / DP-pretraining task on private tabular dataset family: we
Subject outcomes
- surrogate:gen-claude-MIX-MAX: score -0.084
- surrogate:gen-claude-MIX-UNIF: score -0.087
- surrogate:baseline_domain: score -0.282
ML Engineering & Research
Surrogate Public Data for DP on tabular data: tabular datasets synthesized by LLMs from schema-level specs, at zero privacy budget, are used as drop-in public data for differential privacy mechanisms. The benchmark compares surrogate generation methods (direct CSV vs. SCM agents, across LLMs) on DP pretraining utility, measured as AUC Advantage, across privacy budgets.
Response matrix
Each row is an AI model and each column an item, ordered so the strongest models and easiest items gather toward one corner. 21 subjects × 3 items, 86% of cells evaluated.

Scale: AUC Advantage in [−1, 1], reported per privacy budget ε. It measures how distinguishable a differentially private model trained on LLM-surrogate data is from one trained on real data; 0 means indistinguishable.
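The ordering and coverage figure described for the response matrix can be sketched as follows. This is a minimal illustration, not the page's actual implementation: it assumes rows and columns are sorted by descending mean over evaluated cells, that unevaluated cells are stored as `None`, and the function names and toy values are hypothetical.

```python
def mean_of_evaluated(values):
    """Mean over evaluated cells; None marks a cell that was not evaluated."""
    present = [v for v in values if v is not None]
    # An all-missing row or column gets -inf so that it sorts last.
    return sum(present) / len(present) if present else float("-inf")


def order_response_matrix(scores):
    """Sort rows and columns by descending mean and report cell coverage."""
    n_rows, n_cols = len(scores), len(scores[0])
    row_order = sorted(range(n_rows), key=lambda r: -mean_of_evaluated(scores[r]))
    col_order = sorted(
        range(n_cols),
        key=lambda c: -mean_of_evaluated([row[c] for row in scores]),
    )
    ordered = [[scores[r][c] for c in col_order] for r in row_order]
    evaluated = sum(v is not None for row in scores for v in row)
    return ordered, row_order, col_order, evaluated / (n_rows * n_cols)


# Hypothetical 3 subjects x 3 items; these values are made up for illustration.
toy = [
    [-0.10, -0.30, None],
    [-0.05, -0.20, -0.15],
    [None, -0.40, -0.25],
]
ordered, row_order, col_order, coverage = order_response_matrix(toy)
print(row_order, col_order, round(coverage, 2))
```

By the same arithmetic, the 21 × 3 matrix has 63 cells, so 86% coverage corresponds to roughly 54 evaluated cells (54/63 ≈ 0.857).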
Sample items
A spread of items across the difficulty range. This benchmark does not publish per-answer traces, so each item shows which subjects succeeded.
Differentially private synthetic-data / DP-pretraining task on private tabular dataset family: we
Subject outcomes
Differentially private synthetic-data / DP-pretraining task on private tabular dataset family: edad
Subject outcomes
Differentially private synthetic-data / DP-pretraining task on private tabular dataset family: acs
Subject outcomes
Subjects
21 subjects, ranked by mean response across this benchmark's items.
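As a rough sketch of ranking by mean response, the snippet below uses only the three scores listed on this page for the "we" item. It assumes that a higher mean ranks first and that unevaluated items are skipped; the `rank_subjects` helper is hypothetical, and the full 21-subject ranking would need the scores for the other items and subjects, which are not reproduced here.

```python
from statistics import mean

# Scores for the "we" item as listed on this page; None would mark an
# item that was not evaluated for a subject (assumed convention).
responses = {
    "surrogate:baseline_domain": [-0.282],
    "surrogate:gen-claude-MIX-UNIF": [-0.087],
    "surrogate:gen-claude-MIX-MAX": [-0.084],
}


def rank_subjects(responses):
    """Return (subject, mean score) pairs, highest mean first."""
    scored = [
        (name, mean(v for v in values if v is not None))
        for name, values in responses.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_subjects(responses)
for position, (name, score) in enumerate(ranking, start=1):
    print(position, name, score)
```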