olmo_tap.experiments.security.data
Data loading for security head supervised finetuning on MedMCQA.
Functions
| format_question | Wrap a raw MedMCQA question with preamble. |
| load_shard | Load a MedMCQA shard, tokenize prompts, return train_dl. |
| | Tokenize the question prompt and store the ground-truth answer token ID. |
- olmo_tap.experiments.security.data.format_question(question: str, mcq_options: list[str]) -> str
Wrap a raw MedMCQA question and its answer options with a preamble.
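As an illustration only, the sketch below shows one way such a wrapper could look. The preamble text, the A/B/C/D lettering, the trailing "Answer:" cue, and the name `format_question_sketch` are all assumptions made for this example; the actual preamble and layout used by `format_question` are not shown on this page.

```python
from string import ascii_uppercase

# Hypothetical preamble; the real one used by format_question is not documented here.
PREAMBLE = "Answer the following multiple-choice medical question with a single letter."


def format_question_sketch(question: str, mcq_options: list[str]) -> str:
    """Illustrative stand-in: prepend a preamble and letter the options."""
    if len(mcq_options) > len(ascii_uppercase):
        raise ValueError("too many options to letter")
    lines = [PREAMBLE, "", f"Question: {question.strip()}"]
    # Pair each option with a letter: A, B, C, ...
    for letter, option in zip(ascii_uppercase, mcq_options):
        lines.append(f"{letter}. {option.strip()}")
    # End with a cue so the next generated token is the answer letter.
    lines.append("Answer:")
    return "\n".join(lines)


prompt = format_question_sketch(
    "Which vitamin deficiency causes scurvy?",
    ["Vitamin A", "Vitamin B12", "Vitamin C", "Vitamin D"],
)
print(prompt)
```

Ending the prompt on an answer cue is one common design for multiple-choice finetuning, since it lets the target be a single token; whether this module does the same is not confirmed by the source.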
- olmo_tap.experiments.security.data.load_shard(config: TrainingConfig) -> tuple[DataLoader, int, int, int, int]
Load a MedMCQA shard, tokenize its prompts, and return the training DataLoader (train_dl) together with four integers, as given in the return annotation.
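The tokenization step (encode the prompt, keep the ground-truth answer as a single token ID) can be sketched roughly as follows. This is a toy example under stated assumptions: `toy_encode`, `VOCAB`, `Example`, and `tokenize_example` are hypothetical names, and the whitespace tokenizer stands in for whatever tokenizer the module actually uses, which this page does not show.

```python
from dataclasses import dataclass

# Toy vocabulary standing in for a real tokenizer (assumption for illustration).
VOCAB = {"<unk>": 0, "A": 1, "B": 2, "C": 3, "D": 4}


def toy_encode(text: str) -> list[int]:
    """Whitespace tokenizer that assigns the next free ID to each unseen word."""
    return [VOCAB.setdefault(tok, len(VOCAB)) for tok in text.split()]


@dataclass
class Example:
    input_ids: list[int]  # token IDs of the formatted prompt
    answer_id: int        # token ID of the ground-truth answer letter


def tokenize_example(prompt: str, answer_letter: str) -> Example:
    """Encode the prompt and store the answer letter as one token ID."""
    return Example(input_ids=toy_encode(prompt), answer_id=VOCAB[answer_letter])


example = tokenize_example("Question: pick one A B C D Answer:", "C")
print(example.input_ids, example.answer_id)
```

Storing the answer as a single token ID means a training loss can compare the model's prediction at the final prompt position directly against `answer_id`; this is a plausible reading of the summary line, not a description of the module's confirmed behavior.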