olmo_tap.final_evals.elo.prompts.build_prompt_set

Materialise the Tournament 1 prompt bank.

Source A: 80 MedMCQA validation items recast as open-ended clinical questions (stratified-random by subject_name).
Source B: 70 MedQA-USMLE vignettes from the test split, used in their native open-ended form.
Source C: hand-curated prompts, appended later by separate tooling.

This script is deterministic given --seed; re-running it with the same seed must regenerate bank.jsonl byte-for-byte.
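The stratified-random draw for Source A could look roughly like the sketch below. It is an illustration only: the function name stratified_sample, the proportional (largest-remainder) quota rule, and the record layout are assumptions, since this page does not say how quotas per subject_name are set.

```python
import random
from collections import defaultdict
from typing import Any


def stratified_sample(
    items: list[dict[str, Any]], key: str, n: int, seed: int
) -> list[dict[str, Any]]:
    """Draw n items, giving each stratum a quota proportional to its size.

    Hypothetical helper: the proportional allocation is an assumption.
    """
    if n > len(items):
        raise ValueError("n exceeds the number of available items")
    rng = random.Random(seed)  # private RNG, independent of global state

    strata: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for item in items:
        strata[item[key]].append(item)

    # Iterate strata in sorted order so the draw does not depend on input order
    # of the stratum names.
    names = sorted(strata)
    exact = {s: n * len(strata[s]) / len(items) for s in names}
    quota = {s: int(exact[s]) for s in names}

    # Largest-remainder step: hand out the seats lost to rounding down.
    leftover = n - sum(quota.values())
    by_remainder = sorted(names, key=lambda s: (-(exact[s] - quota[s]), s))
    for s in by_remainder[:leftover]:
        quota[s] += 1

    picked: list[dict[str, Any]] = []
    for s in names:
        picked.extend(rng.sample(strata[s], quota[s]))
    return picked
```

Using a dedicated random.Random(seed) instance rather than the module-level functions keeps the draw reproducible even if other code touches the global generator.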

The output is a JSON-Lines file at --output (default olmo_tap/final_evals/elo/prompts/bank.jsonl):

{
  "prompt_id":         "srcA_00001",
  "source":            "medmcqa_open" | "medqa" | "curated",
  "subject":           "endocrinology" | ...,
  "text":              "...",
  "gold_answer":       "...",         # Sources A and B
  "expected_behavior": "...",         # Source C only
  "generation_method": "...",         # Source C only
  "tags":              ["..."]
}
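
A record can be checked against this schema with a few lines of Python. The sketch below is not part of the module: validate_record is a hypothetical helper, and the per-source required keys are inferred from the comments in the schema above. Whether the keys belonging to other sources are absent or set to null is not stated here, so the check only looks for missing keys.

```python
from typing import Any

# Keys every record is assumed to carry.
COMMON_KEYS = {"prompt_id", "source", "subject", "text", "tags"}

# Extra keys per source, inferred from the schema comments (assumption).
SOURCE_KEYS = {
    "medmcqa_open": {"gold_answer"},
    "medqa": {"gold_answer"},
    "curated": {"expected_behavior", "generation_method"},
}


def validate_record(record: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; empty means the record passes."""
    problems: list[str] = []
    source = record.get("source")
    if source not in SOURCE_KEYS:
        problems.append(f"unknown source: {source!r}")
        required = set(COMMON_KEYS)
    else:
        required = COMMON_KEYS | SOURCE_KEYS[source]
    for key in sorted(required - set(record)):
        problems.append(f"missing key: {key}")
    if not isinstance(record.get("tags", []), list):
        problems.append("tags must be a list")
    return problems
```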

This script produces only Sources A and B (150 prompts). Source C is appended later by separate tooling; the downstream loader is expected to json.loads each line of bank.jsonl and to merge any additional entries whose source is "curated" without special handling.
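
A downstream merge along these lines is one possibility. The names parse_jsonl and merge_curated are hypothetical, and the uniqueness check on prompt_id is an assumption about what a clean merge should guarantee rather than documented behaviour.

```python
import json
from typing import Any, Iterable


def parse_jsonl(lines: Iterable[str]) -> list[dict[str, Any]]:
    """json.loads each non-blank line."""
    return [json.loads(line) for line in lines if line.strip()]


def merge_curated(
    bank_lines: Iterable[str], curated_lines: Iterable[str]
) -> list[dict[str, Any]]:
    """Append curated records to the A/B bank, rejecting id collisions."""
    bank = parse_jsonl(bank_lines)
    seen = {record["prompt_id"] for record in bank}
    for record in parse_jsonl(curated_lines):
        if record.get("source") != "curated":
            raise ValueError(f"not a curated record: {record.get('prompt_id')!r}")
        if record["prompt_id"] in seen:
            raise ValueError(f"duplicate prompt_id: {record['prompt_id']!r}")
        seen.add(record["prompt_id"])
        bank.append(record)
    return bank
```

Both arguments accept any iterable of strings, so an open file handle works as well as a list of lines.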

Usage:

python -m olmo_tap.final_evals.elo.prompts.build_prompt_set \
    --output olmo_tap/final_evals/elo/prompts/bank.jsonl \
    --seed 20260425

Functions

build_bank([seed])

Build the full prompt bank from Sources A and B.

load_bank(path)

Read a previously-written bank.jsonl back into a list of records.

main()

write_bank(bank, path)

Write bank to path as JSON-Lines (one record per line).

olmo_tap.final_evals.elo.prompts.build_prompt_set.build_bank(seed: int = 20260425) → list[dict[str, Any]]

Build the full prompt bank from Sources A and B.

Source C (curated) is appended later by separate tooling; the downstream loader is expected to json.loads each line of bank.jsonl and to merge any additional entries whose source is "curated" without special handling.

olmo_tap.final_evals.elo.prompts.build_prompt_set.load_bank(path: Path) → list[dict[str, Any]]

Read a previously-written bank.jsonl back into a list of records.

olmo_tap.final_evals.elo.prompts.build_prompt_set.main() → None

olmo_tap.final_evals.elo.prompts.build_prompt_set.write_bank(bank: Iterable[dict[str, Any]], path: Path) → None

Write bank to path as JSON-Lines (one record per line).
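
For a JSON-Lines writer to reproduce a file byte-for-byte, the encoding, line endings, and key order all need to be pinned. The sketch below shows one way to do that together with a matching reader. write_jsonl and read_jsonl are hypothetical stand-ins, not the module's write_bank and load_bank, and sort_keys=True is an assumption; the real writer may instead rely on insertion order.

```python
import json
from pathlib import Path
from typing import Any, Iterable


def write_jsonl(records: Iterable[dict[str, Any]], path: Path) -> None:
    """Write one JSON object per line with fixed encoding and newlines."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # newline="\n" stops Windows from translating line endings to \r\n.
    with path.open("w", encoding="utf-8", newline="\n") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False, sort_keys=True))
            fh.write("\n")


def read_jsonl(path: Path) -> list[dict[str, Any]]:
    """Read the file back, skipping blank lines."""
    with path.open("r", encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Comparing Path.read_bytes() of two runs with the same seed is a quick way to confirm the byte-for-byte requirement.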