olmo_tap.experiments.robustness.eval
Evaluate robustness: replay the attack bank against a model and compare the results to the security baseline recorded at bank-construction time.
Usage:
# Raw OLMo-7B, no LoRA (sanity check only -- the base model is an always-A classifier)
pixi run -e cuda python -m olmo_tap.experiments.robustness.eval --base
# Prod security LoRA only -- with --shard-id 0 this round-trips the bank's
# stored security_flip_rate; with --shard-id N != 0 it probes cross-shard
# transfer.
pixi run -e cuda python -m olmo_tap.experiments.robustness.eval \
    --security --shard-id 1
# Full stack: prod security + robustness checkpoint
pixi run -e cuda python -m olmo_tap.experiments.robustness.eval \
    --checkpoint path/to/checkpoint_final.pt --shard-id 0
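
The baseline comparison described above can be illustrated with a minimal sketch. It assumes that a flip rate is the fraction of bank entries whose prediction changes under attack; the module's actual definition may differ. `AttackRecord`, `flip_rate`, and `compare_to_baseline` are hypothetical names chosen for illustration, not part of this module's API.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AttackRecord:
    """Hypothetical bank entry: the model's answer before and after the attack."""
    clean_prediction: str
    attacked_prediction: str


def flip_rate(records: Sequence[AttackRecord]) -> float:
    """Fraction of records whose prediction changed under attack (assumed definition)."""
    if not records:
        raise ValueError("attack bank is empty")
    flips = sum(r.clean_prediction != r.attacked_prediction for r in records)
    return flips / len(records)


def compare_to_baseline(observed: float, baseline: float) -> float:
    """Signed difference; a negative value means fewer flips than the baseline."""
    return observed - baseline


# Toy bank with four entries, two of which flip under attack.
records = [
    AttackRecord("A", "B"),
    AttackRecord("A", "A"),
    AttackRecord("B", "B"),
    AttackRecord("B", "A"),
]
observed = flip_rate(records)
# 0.75 stands in for a stored baseline such as the bank's security_flip_rate.
delta = compare_to_baseline(observed, baseline=0.75)
print(f"flip rate {observed:.2f}, delta vs. baseline {delta:+.2f}")
# prints "flip rate 0.50, delta vs. baseline -0.25"
```

Under this reading, evaluating the model that built the bank on its own shard should give a delta near zero, while other shards or checkpoints may drift in either direction.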
Functions