olmo_tap.experiments.robustness

Modules

amplegcg

AmpleGCG wrapper class.

build_attack_bank

Build a portable attack bank of transferable GCG suffixes on MedMCQA.

data

Data loading for robustness head supervised finetuning on MedMCQA.

engine

Robustness finetuning protocol.

eval

Evaluate robustness: replay the attack bank against a model and compare to the security baseline recorded at bank-construction time.

precompute_gcg

Precompute GCG adversarial suffixes for MedMCQA shards.

training

HydraTransformer Robustness Finetuning Pipeline.
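
The eval module is described as replaying the attack bank against a model and comparing the result to the baseline recorded when the bank was built. The following is a minimal sketch of that idea only, not the package's actual API: `AttackRecord`, its field names, `replay_attack_bank`, and the `predict` callable are all hypothetical, and it assumes an attack "succeeds" whenever the model's answer differs from the gold label.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class AttackRecord:
    """One hypothetical attack-bank entry (field names are assumptions)."""
    prompt: str            # MedMCQA-style question text
    suffix: str            # adversarial suffix appended to the prompt
    correct_answer: str    # gold answer label, e.g. "A"
    baseline_fooled: bool  # did the suffix fool the baseline model at bank-construction time?


def replay_attack_bank(
    bank: Sequence[AttackRecord],
    predict: Callable[[str], str],
) -> Tuple[float, float]:
    """Return (baseline attack success rate, replayed attack success rate)."""
    if not bank:
        raise ValueError("attack bank is empty")
    # Replay every stored suffix against the model under evaluation.
    fooled_now = sum(
        predict(f"{r.prompt} {r.suffix}") != r.correct_answer for r in bank
    )
    # Baseline outcomes were recorded when the bank was built.
    fooled_before = sum(r.baseline_fooled for r in bank)
    return fooled_before / len(bank), fooled_now / len(bank)


def toy_predict(text: str) -> str:
    """Stand-in model: answers "B" only when the text contains "!!"."""
    return "B" if "!!" in text else "A"


toy_bank = [
    AttackRecord("q1", "!!", "A", baseline_fooled=True),
    AttackRecord("q2", "zz", "A", baseline_fooled=True),
]
baseline_rate, replay_rate = replay_attack_bank(toy_bank, toy_predict)
print(f"baseline={baseline_rate:.2f} replay={replay_rate:.2f}")
```

A replayed rate below the baseline rate would suggest the evaluated model resists more of the stored suffixes. Note that this sketch counts any wrong answer as a successful attack, so a real evaluation would likely also account for questions the model answers incorrectly without a suffix.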