R-universe - klintkanopka (Klint Kanopka)

Calibrating with a Weakly-Informative, Biased LLM (2 days ago)

mixedsubjectsirt 1.0.0 | Klint Kanopka | weakly-informative-llm.Rmd

Choosing Lambda in Mixed-Subjects IRT (2 days ago)

Two objectives, two estimators | Example data | Ability-risk tuning: Minimizing $\mathbb{E}[g'\Sigma_\gamma g]$ | Cross-fit $\lambda$ tuning (recommended workflow) | Frozen expected-count estimator (fast approximation) | Minimizing $\text{Tr}\big[\Sigma_\gamma\big]$ (diagnostic only) | Choosing a procedure

mixedsubjectsirt 1.0.0 | Klint Kanopka | lambda-tuning.Rmd

IRT Linking and Gradient Asymmetry: Diagnostic Guide (2 days ago)

mixedsubjectsirt 1.0.0 | Klint Kanopka | linking-comparison.Rmd

Mixed-Subjects IRT Calibration (2 days ago)

mixedsubjectsirt 1.0.0 | Klint Kanopka | mixed-subjects-workflow.Rmd

Per-Item Lambda (Experimental) (2 days ago)

mixedsubjectsirt 1.0.0 | Klint Kanopka | lambda-tuning-item.Rmd

Simulation Validation of the Mixed-Subjects MML Estimator (2 days ago)

mixedsubjectsirt 1.0.0 | Klint Kanopka | simulation-validation.Rmd

Understanding Ability-Risk Tuning (2 days ago)

Why this vignette exists | Key Intuition | The three response matrices | 1. Observed human responses: $O$ | 2. Paired LLM-predicted human responses: $P$ | 3. Additional LLM-generated responses: $G$ | The mixed-subjects IRT objective | What lambda is learning | Ability-risk tuning | Why row alignment matters | Case A: perfect paired prediction | Case B: row-shuffled perfect predictions | Case C: same DGP, fresh Bernoulli draw | What kind of LLM data produces higher lambda? | One approach to row alignment: leave-one-item-out prediction | Another approach: covariate-based prediction | Something that probably won't work: item-text-only generation | How to generate $G$ | Summary | Technical Explanation | Overview: four objects, one objective | 1. The estimator and its estimating equation | 2. The sandwich covariance of $\hat\gamma$ | 3. Ability scoring and the implicit gradient | 4. Delta-method propagation and the risk | 5. Why this differs from the PPI++ trace objective

mixedsubjectsirt 1.0.0 | Klint Kanopka | understanding-ability-risk.Rmd

Mixed-Subjects 1PL Calibration (2 days ago)

mixedsubjectsirt 1.0.0 | Klint Kanopka | mixed-subjects-1pl.Rmd

Mixed-Subjects 1PL Calibration (17 days ago)

mixedsubjectsirt 1.0.0 | Klint Kanopka | mixed-subjects-1pl.Rmd