olmo_tap.experiments.utils.config¶
Config classes to support training and inference.
Classes

| ExperimentConfig | Master config to store the HydraLoRAConfig and TrainingConfig used in training. |
| HydraLoRAConfig | Supports loading the Hydra model for inference or training. |
| TrainingConfig | Config to store training-specific parameters. |
- class olmo_tap.experiments.utils.config.ExperimentConfig(seed: int, model: HydraLoRAConfig = <factory>, train: TrainingConfig = <factory>, wandb_project: str = 'hydra', wandb_run_name: str | None = None, device: str = 'cuda')[source]¶
Bases: object

Master config to store the HydraLoRAConfig and TrainingConfig used in training.
- model: HydraLoRAConfig¶
- train: TrainingConfig¶
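The `<factory>` defaults in the signature above suggest these are dataclasses whose nested sub-configs are built with `default_factory`. A minimal, self-contained sketch of that composition pattern (the classes below are illustrative stand-ins showing only a few fields, not the library code):

```python
from dataclasses import dataclass, field

# Illustrative stand-ins for the real configs; only a subset of fields is shown.
@dataclass
class HydraLoRAConfig:
    model_size: str = "7b"
    lora_r: int = 16
    lora_alpha: int = 32

@dataclass
class TrainingConfig:
    learning_rate: float = 1e-4
    batch_size: int = 16
    num_epochs: int = 1

@dataclass
class ExperimentConfig:
    seed: int
    # default_factory gives each ExperimentConfig its own sub-config instance,
    # which is why the rendered signature shows "<factory>".
    model: HydraLoRAConfig = field(default_factory=HydraLoRAConfig)
    train: TrainingConfig = field(default_factory=TrainingConfig)
    wandb_project: str = "hydra"
    device: str = "cuda"

cfg = ExperimentConfig(seed=42)
print(cfg.model.lora_r)      # 16
print(cfg.train.batch_size)  # 16
```

Using `default_factory` (rather than a shared default instance) means mutating `cfg.model` on one experiment cannot leak into another.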
- class olmo_tap.experiments.utils.config.HydraLoRAConfig(weights_dir: str = '/vol/bitbucket/tjt25/olmo2-7b-instruct-weights', model_size: str = '7b', n_heads_final: int = 5, n_heads_training: int = 1, heads_depth: int = 3, vocab_size: int = 100352, lora_r: int = 16, lora_alpha: int = 32, target_modules: list[str] = <factory>, device: str = 'cuda')[source]¶
Bases: object

Supports loading the Hydra model for inference or training.

NOTE: n_heads_final records the number of heads the final Hydra model is intended to have (book-keeping only); n_heads_training is the number of heads actually loaded at training time.
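Assuming `lora_r` and `lora_alpha` follow standard LoRA semantics (an assumption; the doc does not spell this out), the adapted weight is W + (lora_alpha / lora_r) · B @ A, with B zero-initialized so the adapter starts as a no-op. A dependency-free sketch with the defaults above:

```python
# Standard LoRA semantics (an assumption about how lora_r / lora_alpha are
# used here): adapted weight = W + (lora_alpha / lora_r) * (B @ A), where
# A is r x in_features and B is out_features x r.
lora_r, lora_alpha = 16, 32    # defaults from HydraLoRAConfig
scaling = lora_alpha / lora_r  # 2.0

def matmul(X, Y):
    """Tiny pure-Python matmul so the sketch has no dependencies."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

in_f, out_f, r = 4, 3, 2                  # toy sizes, not the model's
A = [[0.1] * in_f for _ in range(r)]      # r x in_f, normally random-init
B = [[0.0] * r for _ in range(out_f)]     # out_f x r, normally zero-init
delta = matmul(B, A)                      # out_f x in_f
update = [[scaling * v for v in row] for row in delta]
# With B zero-initialized, the initial update is exactly zero:
assert all(v == 0.0 for row in update for v in row)
```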
- class olmo_tap.experiments.utils.config.TrainingConfig(learning_rate: float = 0.0001, batch_size: int = 16, num_epochs: int = 1, max_seq_len: int = 256, num_workers: int = 4, shard_id: int = 0, weights_dir: str = '/vol/bitbucket/tjt25/olmo2-7b-instruct-weights', warmup_steps: int = 100, lr_schedule: str = 'cosine', output_dir: str = 'experiments/uncertainty/outputs', checkpoint_every_n_steps: int = 250)[source]¶
Bases: object

Config to store training-specific parameters.
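A common reading of `lr_schedule='cosine'` with `warmup_steps` is linear warmup to `learning_rate` followed by cosine decay to zero; the trainer's exact schedule is an assumption. A sketch using the defaults above (`learning_rate=0.0001`, `warmup_steps=100`), with a hypothetical total step count:

```python
import math

def lr_at(step, total_steps, base_lr=1e-4, warmup_steps=100):
    """Linear warmup then cosine decay to 0 -- one common interpretation of
    lr_schedule='cosine' with warmup_steps; not necessarily the trainer's
    exact schedule."""
    if step < warmup_steps:
        # Linear ramp from base_lr/warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

total = 1000  # hypothetical; depends on dataset size and batch_size
print(lr_at(99, total))     # end of warmup: reaches base_lr
print(lr_at(total, total))  # fully decayed: ~0.0
```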