olmo_tap.final_evals.elo.prompts.build_prompt_set
Materialise the Tournament 1 prompt bank.
Source A — 80 MedMCQA validation items recast as open-ended clinical
questions (stratified-random by subject_name).
Source B — 70 MedQA-USMLE vignettes from the test split, used in their
native open-ended form.
Source C — Hand-curated prompts, appended later by separate tooling.
This script is deterministic given --seed; it must be re-runnable to
regenerate bank.jsonl byte-for-byte.
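The determinism requirement can be sketched as follows. This is a minimal illustration, not the script's actual code: stratified_sample is a hypothetical helper, items are assumed to be dicts carrying a subject_name key, and the proportional allocation with largest-remainder rounding is an assumption, since the allocation rule is not stated here.

```python
import random
from collections import defaultdict


def stratified_sample(items, key, n, seed):
    """Hypothetical helper: draw n items spread across the groups of item[key]."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    groups = defaultdict(list)
    for item in items:
        groups[item[key]].append(item)

    # Visit groups in sorted order so the draw order is fixed.
    names = sorted(groups)
    total = len(items)

    # Assumed allocation: proportional quota (floored), with the remainder
    # given to the largest fractional parts, ties broken by group name.
    exact = {g: n * len(groups[g]) / total for g in names}
    quota = {g: int(exact[g]) for g in names}
    leftover = n - sum(quota.values())
    by_remainder = sorted(names, key=lambda g: (-(exact[g] - quota[g]), g))
    for g in by_remainder[:leftover]:
        quota[g] += 1

    picked = []
    for g in names:
        picked.extend(rng.sample(groups[g], quota[g]))
    return picked
```

Given the same input order and the same seed, two calls return the same items in the same order, which is the property a byte-for-byte reproducible bank depends on.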
The output is a JSON-Lines file at --output (default
olmo_tap/final_evals/elo/prompts/bank.jsonl):
{
    "prompt_id": "srcA_00001",
    "source": "medmcqa_open" | "medqa" | "curated",
    "subject": "endocrinology" | ...,
    "text": "...",
    "gold_answer": "...",          # Sources A and B
    "expected_behavior": "...",    # Source C only
    "generation_method": "...",    # Source C only
    "tags": ["..."]
}
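A record can be checked against this layout with a small sketch like the one below. missing_fields is a hypothetical helper, not part of this module, and it assumes that every field without a source annotation is required for all sources; the schema above does not state that explicitly.

```python
# Fields assumed to be present on every record (an assumption, see above).
COMMON_FIELDS = {"prompt_id", "source", "subject", "text", "tags"}

# Extra fields per source, following the comments in the schema.
SOURCE_FIELDS = {
    "medmcqa_open": {"gold_answer"},
    "medqa": {"gold_answer"},
    "curated": {"expected_behavior", "generation_method"},
}


def missing_fields(record):
    """Hypothetical check: return the sorted field names the record lacks."""
    source = record.get("source")
    if source not in SOURCE_FIELDS:
        raise ValueError(f"unknown source: {source!r}")
    required = COMMON_FIELDS | SOURCE_FIELDS[source]
    return sorted(required - set(record))
```

An empty list means the record carries every field expected for its source.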
This script produces only Sources A and B (150 prompts). The curated
Source C is appended later by separate tooling; the loader downstream is
expected to json.loads each line of bank.jsonl and merge any
additional source: curated entries seamlessly.
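A minimal sketch of such a downstream loader is shown below. read_bank and count_by_source are hypothetical names chosen for illustration and are not the functions exported by this module; skipping blank lines is likewise an assumption about how tolerant the loader should be.

```python
import json
from collections import Counter
from pathlib import Path


def read_bank(path):
    """Hypothetical reader: parse one JSON object per non-blank line."""
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. a trailing newline
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_no}: invalid JSON") from exc
    return records


def count_by_source(records):
    """Tally records per "source" value, curated entries included."""
    return dict(Counter(r["source"] for r in records))
```

Because every line is parsed the same way, curated entries appended after the 150 generated prompts need no special handling; they differ only in their source value and field set.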
Usage:
    python -m olmo_tap.final_evals.elo.prompts.build_prompt_set \
        --output olmo_tap/final_evals/elo/prompts/bank.jsonl \
        --seed 20260425
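The two flags could be parsed with argparse along these lines. This is a sketch under assumptions: make_parser is a hypothetical name, and using 20260425 as the command-line default for --seed is inferred from the build_bank signature rather than stated for the CLI.

```python
import argparse

DEFAULT_OUTPUT = "olmo_tap/final_evals/elo/prompts/bank.jsonl"
DEFAULT_SEED = 20260425  # assumed CLI default, taken from build_bank's signature


def make_parser():
    """Hypothetical parser exposing the documented --output and --seed flags."""
    parser = argparse.ArgumentParser(
        description="Materialise the Tournament 1 prompt bank."
    )
    parser.add_argument("--output", default=DEFAULT_OUTPUT,
                        help="path of the JSON-Lines file to write")
    parser.add_argument("--seed", type=int, default=DEFAULT_SEED,
                        help="seed controlling the stratified-random sampling")
    return parser
```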
Functions
build_bank: Build the full prompt bank from Sources A and B.
Read a previously-written bank.jsonl back into a list of records.
Write
olmo_tap.final_evals.elo.prompts.build_prompt_set.build_bank(seed: int = 20260425) → list[dict[str, Any]]
Build the full prompt bank from Sources A and B.
Source C (curated) is appended later by separate tooling; the loader downstream is expected to
json.loads each line of bank.jsonl and merge any additional source: curated entries seamlessly.