Understanding Ability-Risk Tuning
Why this vignette exists | Key Intuition | The three response matrices | 1. Observed human responses: $O$ | 2. Paired LLM-predicted human responses: $P$ | 3. Additional LLM-generated responses: $G$ | The mixed-subjects IRT objective | What lambda is learning | Ability-risk tuning | Why row alignment matters | Case A: perfect paired prediction | Case B: row-shuffled perfect predictions | Case C: same DGP, fresh Bernoulli draw | What kind of LLM data produces higher lambda? | One approach to row alignment: leave-one-item-out prediction | Another approach: covariate-based prediction | Something that probably won't work: item-text-only generation | How to generate $G$ | Summary | Technical Explanation | Overview: four objects, one objective | 1. The estimator and its estimating equation | 2. The sandwich covariance of $\hat\gamma$ | 3. Ability scoring and the implicit gradient | 4. Delta-method propagation and the risk | 5. Why this differs from the PPI++ trace objective
mixedsubjectsirt 1.0.0 | Klint Kanopka | understanding-ability-risk.Rmd