olmo_tap.experiments.security.engine

Security Finetuning protocol. See https://www.overleaf.com/read/kpnzybhdvwnh#a3aa13 for theory details.

Functions

train(model, exp_config, optimizer, scheduler)

Performs supervised security finetuning on a HydraTransformer model.

olmo_tap.experiments.security.engine.train(model: HydraTransformer, exp_config: ExperimentConfig, optimizer: Optimizer, scheduler: LRScheduler)

Performs supervised security finetuning on a HydraTransformer model. Assumes that only one head (at index 0 by convention) is loaded and being trained.

Parameters:
  • model – HydraTransformer LLM being finetuned.

  • optimizer – Any torch.optim optimizer instance.

  • scheduler – Any torch learning-rate scheduler instance.

  • exp_config – Global config object storing experiment details.
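
As a rough illustration of the optimizer and scheduler arguments, the sketch below builds both with standard PyTorch classes. It makes several assumptions: the nn.Linear module is only a hypothetical stand-in for a HydraTransformer with a single head loaded, the choice of AdamW with a linear warmup via LambdaLR is arbitrary, and the final call to train is left commented out because constructing the model and ExperimentConfig is not covered here.

```python
import torch
from torch import nn

# Hypothetical stand-in for the real model; in practice this would be a
# HydraTransformer with one head loaded at index 0.
model = nn.Linear(8, 2)

# Any torch.optim optimizer should fit the `optimizer` parameter;
# AdamW and this learning rate are illustrative choices only.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Any torch scheduler should fit the `scheduler` parameter; this one
# scales the learning rate linearly over the first 10 steps (assumed schedule).
warmup_steps = 10
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps),
)

# With a real model and an ExperimentConfig instance (exp_config), the call
# would follow the documented argument order:
# from olmo_tap.experiments.security.engine import train
# train(model, exp_config, optimizer, scheduler)
```

Note the argument order in the signature: exp_config comes second, before optimizer and scheduler.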