olmo_tap.experiments.robustness.build_attack_bank
Build a portable attack bank of transferable GCG suffixes on MedMCQA.
Three resumable phases:
1. Seed selection: pick --num-seeds validation examples, selected according to the random seed.
2. Suffix generation: run AmpleGCG on each seed example, producing --num-return-seq candidate suffixes each.
3. Transfer scoring: test every candidate against all seed examples (its own and the others) on OLMo-7B with the production security LoRA, then tier-filter the survivors.
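The transfer-scoring phase amounts to a cross-product of candidate suffixes and seed examples. The following is a minimal sketch of that shape only, not the module's code: `transfer_scores`, `tier_filter`, the record keys, and the single `min_rate` threshold are all hypothetical, and `attack_succeeds` stands in for the real check against the target model. The actual tiering rule is not described in this page.

```python
from typing import Callable


def transfer_scores(
    candidates: dict[int, list[str]],
    seeds: list[int],
    attack_succeeds: Callable[[int, str], bool],
) -> list[dict]:
    """Score each candidate suffix against every seed, including its own.

    `candidates` maps the seed a suffix was generated on to its suffixes.
    `attack_succeeds(seed, suffix)` is a placeholder for querying the target.
    """
    if not seeds:
        raise ValueError("at least one seed is required")
    scored = []
    for origin, suffixes in candidates.items():
        for suffix in suffixes:
            # Seeds on which this suffix works (own seed and the others).
            hits = [s for s in seeds if attack_succeeds(s, suffix)]
            scored.append(
                {
                    "origin": origin,
                    "suffix": suffix,
                    "hits": hits,
                    "transfer_rate": len(hits) / len(seeds),
                }
            )
    return scored


def tier_filter(scored: list[dict], min_rate: float) -> list[dict]:
    """Keep records whose transfer rate reaches a (hypothetical) threshold."""
    return [r for r in scored if r["transfer_rate"] >= min_rate]
```

A suffix that only works on the seed it was optimised against gets a low transfer rate here, which is the property a portable bank would want to filter on.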
Each phase persists its results incrementally, so a re-run resumes each phase from its last cached progress. Intended usage:
# smoke test (minutes)
pixi run -e cuda python -m olmo_tap.experiments.robustness.build_attack_bank \
--num-seeds 3 --num-return-seq 2
# real run (hours)
pixi run -e cuda python -m olmo_tap.experiments.robustness.build_attack_bank
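The resume-on-re-run behaviour described above can be pictured with a small append-only cache. This is a sketch under assumptions, not the module's implementation: the JSONL file layout, the `key` field, and the `run_resumable` helper are hypothetical and chosen only to illustrate incremental persistence.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def run_resumable(
    cache: Path,
    keys: Iterable[str],
    work: Callable[[str], dict],
) -> list[dict]:
    """Run `work` once per key, appending each result to a JSONL cache.

    On a later call, keys already present in the cache are skipped, so an
    interrupted run loses at most the item that was in flight.
    """
    done: dict[str, dict] = {}
    if cache.exists():
        with cache.open() as f:
            for line in f:
                line = line.strip()
                if line:
                    rec = json.loads(line)
                    done[rec["key"]] = rec

    with cache.open("a") as f:
        for key in keys:
            if key in done:
                continue  # finished in a previous run
            rec = {"key": key, **work(key)}
            f.write(json.dumps(rec) + "\n")
            f.flush()  # persist after every item, not at the end
            done[key] = rec
    return list(done.values())
```

Writing one record per line and flushing after each item is what makes the smoke test and the multi-hour run share the same restart path in this sketch.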
Functions
Score every candidate against all seeds on the target; resumable.
- olmo_tap.experiments.robustness.build_attack_bank.filter_and_save_bank(out_dir: Path, scored: list[dict], args: Namespace) → None
- olmo_tap.experiments.robustness.build_attack_bank.phase_1_select_seeds(out_dir: Path, seed: int, num_seeds: int) → list[int]
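For orientation, a function with a signature like `phase_1_select_seeds` could be sketched as a seeded, cached sample of dataset indices. The body below is an assumption, not the module's source: the helper name `select_seed_indices`, the extra `pool_size` parameter (standing in for the size of the MedMCQA validation split), and the `seeds.json` cache file name are all hypothetical.

```python
import json
import random
from pathlib import Path


def select_seed_indices(
    out_dir: Path, seed: int, num_seeds: int, pool_size: int
) -> list[int]:
    """Pick `num_seeds` distinct indices from range(pool_size), reproducibly.

    The result is cached in `out_dir` so a re-run returns the same selection
    without sampling again.
    """
    cache = out_dir / "seeds.json"  # hypothetical cache file name
    if cache.exists():
        return json.loads(cache.read_text())

    rng = random.Random(seed)  # local RNG, leaves global state untouched
    picked = sorted(rng.sample(range(pool_size), num_seeds))

    out_dir.mkdir(parents=True, exist_ok=True)
    cache.write_text(json.dumps(picked))
    return picked
```

Note that this sketch returns whatever is cached even if `seed` or `num_seeds` has changed since the cache was written; a real implementation may need to guard against that.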