olmo_tap.experiments.robustness.engine

Robustness finetuning protocol. See https://www.overleaf.com/read/kpnzybhdvwnh#a3aa13 for theory details.

Functions

train(model, exp_config, optimizer, scheduler)

Performs supervised robustness finetuning on a HydraTransformer model.

olmo_tap.experiments.robustness.engine.train(model: HydraTransformer, exp_config: ExperimentConfig, optimizer: Optimizer, scheduler: LRScheduler, stagnant_thresh: int = 100)[source]

Performs supervised robustness finetuning on a HydraTransformer model. Assumes that only one head (at index 0 by convention) is loaded and being trained.

Parameters:
  • model – HydraTransformer LLM model being finetuned.

  • optimizer – Any torch optim object.

  • scheduler – Any torch scheduler object.

  • stagnant_thresh – If no successful adversarial attack occurs within this many consecutive steps, training stops early.

  • exp_config – Global config object storing experiment details.
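To illustrate the stagnant_thresh early-stop rule, here is a minimal, self-contained sketch of the control flow. The helper name and the attack_succeeded callback are hypothetical illustrations, not part of the olmo_tap API; the real train loop operates on the model, optimizer, and scheduler described above.

```python
def run_with_early_stop(max_steps, attack_succeeded, stagnant_thresh=100):
    """Sketch of the early-stop rule used by train():

    the stagnant-step counter resets whenever an adversarial attack
    succeeds; if it reaches `stagnant_thresh`, training halts early.
    Returns the number of steps actually executed.
    """
    stagnant = 0
    for step in range(max_steps):
        # In the real loop this would be a training step plus an
        # adversarial attack attempt against the current model.
        if attack_succeeded(step):
            stagnant = 0  # a successful attack resets the counter
        else:
            stagnant += 1
        if stagnant >= stagnant_thresh:
            return step + 1  # early stop: no successes for stagnant_thresh steps
    return max_steps
```

For example, if attacks only succeed during the first 5 steps, training stops 100 stagnant steps later rather than running to completion.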