olmo_tap.benchmarksΒΆ
Modules
Inference latency / throughput benchmark for vanilla OLMo, naive-averaging Hydra, and Hydra + PoE (Product of Experts speculative decode). |
|
Plot the three-config benchmark output produced by |
Modules
Inference latency / throughput benchmark for vanilla OLMo, naive-averaging Hydra, and Hydra + PoE (Product of Experts speculative decode). |
|
Plot the three-config benchmark output produced by |