towhee.trainer.training_config.TrainingConfig

class towhee.trainer.training_config.TrainingConfig(output_dir: str = './output_dir', overwrite_output_dir: bool = True, eval_strategy: str = 'epoch', eval_steps: ~typing.Optional[int] = None, batch_size: ~typing.Optional[int] = 8, val_batch_size: ~typing.Optional[int] = -1, seed: int = 42, epoch_num: int = 2, dataloader_pin_memory: bool = True, dataloader_drop_last: bool = True, dataloader_num_workers: int = 0, lr: float = 5e-05, metric: ~typing.Optional[str] = 'Accuracy', print_steps: ~typing.Optional[int] = None, load_best_model_at_end: ~typing.Optional[bool] = False, early_stopping: ~typing.Union[dict, str] = <factory>, model_checkpoint: ~typing.Union[dict, str] = <factory>, tensorboard: ~typing.Optional[~typing.Union[dict, str]] = <factory>, loss: ~typing.Union[str, ~typing.Dict[str, ~typing.Any]] = 'CrossEntropyLoss', optimizer: ~typing.Union[str, ~typing.Dict[str, ~typing.Any]] = 'Adam', lr_scheduler_type: str = 'linear', warmup_ratio: float = 0.0, warmup_steps: int = 0, device_str: ~typing.Optional[str] = None, sync_bn: bool = False, freeze_bn: bool = False)[source]

Bases: object

The training configuration. It can also be defined in a YAML file.

Parameters:
  • output_dir (str) – The output directory where the model predictions and checkpoints will be written.

  • overwrite_output_dir (bool) – Overwrite the content of the output directory.

  • eval_strategy (str) – The evaluation strategy.

  • eval_steps (int) – Run an evaluation every X steps.

  • batch_size (int) – Batch size for training.

  • val_batch_size (int) – Batch size for evaluation. If set to -1, the training batch_size is used.

  • seed (int) – Random seed that will be set at the beginning of training.

  • epoch_num (int) – Total number of training epochs to perform.

  • dataloader_pin_memory (bool) – Whether to pin memory in the data loaders; pinned memory can speed up host-to-GPU transfers.

  • dataloader_drop_last (bool) – Drop the last incomplete batch if it is not divisible by the batch size.

  • dataloader_num_workers (int) – Number of subprocesses to use for data loading.

  • lr (float) – The initial learning rate for the optimizer.

  • metric (str) – The metric to use to compare two different models.

  • print_steps (int) – If None, use the tqdm progress bar; otherwise, print logs to the screen every print_steps steps.

  • load_best_model_at_end (bool) – Whether or not to load the best model found during training at the end of training.

  • early_stopping (Union[dict, str]) – Early stopping configuration.

  • model_checkpoint (Union[dict, str]) – Model checkpoint configuration.

  • tensorboard (Union[dict, str]) – TensorBoard configuration.

  • loss (Union[str, Dict[str, Any]]) – PyTorch loss in the torch.nn package.

  • optimizer (Union[str, Dict[str, Any]]) – PyTorch optimizer class name in the torch.optim package.

  • lr_scheduler_type (str) – The scheduler type to use.

  • warmup_ratio (float) – Linear warmup over warmup_ratio fraction of the total training steps.

  • warmup_steps (int) – Linear warmup over warmup_steps steps.

  • device_str (str) – Device string, e.g. 'cpu' or 'cuda'.

  • sync_bn (bool) – Only takes effect when device_str is cuda. Setting sync_bn to True synchronizes BatchNorm statistics across devices, which makes training slower but can improve accuracy.

  • freeze_bn (bool) – If True, completely freeze all BatchNorm layers during training.
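As a concrete illustration of the val_batch_size = -1 convention above, here is a minimal, self-contained sketch (not the actual towhee source) of how an evaluation batch size is typically resolved from these two fields:

```python
def resolve_eval_batch_size(batch_size: int, val_batch_size: int) -> int:
    """Fall back to the training batch size when no separate
    evaluation batch size is configured (signaled by -1)."""
    return batch_size if val_batch_size == -1 else val_batch_size

print(resolve_eval_batch_size(8, -1))   # no explicit value: reuse the training batch size -> 8
print(resolve_eval_batch_size(8, 32))   # explicit evaluation batch size wins -> 32
```

Using -1 as a sentinel keeps the common case (same batch size for training and evaluation) free of duplicated configuration.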

Methods

load_from_yaml

Load training configuration from yaml.

save_to_yaml

Save training configuration to yaml.

to_dict

to_json_string

Attributes

batch_size

dataloader_drop_last

dataloader_num_workers

dataloader_pin_memory

device

device_str

epoch_num

eval_batch_size

eval_steps

eval_strategy

freeze_bn

load_best_model_at_end

loss

lr

lr_scheduler_type

metric

optimizer

output_dir

overwrite_output_dir

print_steps

seed

sync_bn

train_batch_size

val_batch_size

warmup_ratio

warmup_steps

early_stopping

model_checkpoint

tensorboard

__init__(output_dir: str = './output_dir', overwrite_output_dir: bool = True, eval_strategy: str = 'epoch', eval_steps: ~typing.Optional[int] = None, batch_size: ~typing.Optional[int] = 8, val_batch_size: ~typing.Optional[int] = -1, seed: int = 42, epoch_num: int = 2, dataloader_pin_memory: bool = True, dataloader_drop_last: bool = True, dataloader_num_workers: int = 0, lr: float = 5e-05, metric: ~typing.Optional[str] = 'Accuracy', print_steps: ~typing.Optional[int] = None, load_best_model_at_end: ~typing.Optional[bool] = False, early_stopping: ~typing.Union[dict, str] = <factory>, model_checkpoint: ~typing.Union[dict, str] = <factory>, tensorboard: ~typing.Optional[~typing.Union[dict, str]] = <factory>, loss: ~typing.Union[str, ~typing.Dict[str, ~typing.Any]] = 'CrossEntropyLoss', optimizer: ~typing.Union[str, ~typing.Dict[str, ~typing.Any]] = 'Adam', lr_scheduler_type: str = 'linear', warmup_ratio: float = 0.0, warmup_steps: int = 0, device_str: ~typing.Optional[str] = None, sync_bn: bool = False, freeze_bn: bool = False) None
__repr__()

Return repr(self).

load_from_yaml(path2yaml: Optional[str] = None) TrainingConfig[source]

Load training configuration from yaml.

Parameters:

path2yaml (str) – The path to yaml.

Returns:

(TrainingConfig) – The TrainingConfig instance itself.

Example

>>> from towhee.trainer.training_config import TrainingConfig
>>> from pathlib import Path
>>> conf = Path(__file__).parent / "config.yaml"
>>> ta = TrainingConfig()
>>> ta.save_to_yaml(conf)
>>> ta.load_from_yaml(conf)
>>> ta.epoch_num
2
save_to_yaml(path2yaml: Optional[str] = None)[source]

Save training configuration to yaml.

Parameters:

path2yaml (str) – The path to yaml.

Example

>>> from towhee.trainer.training_config import TrainingConfig
>>> from pathlib import Path
>>> conf = Path(__file__).parent / 'config.yaml'
>>> ta = TrainingConfig()
>>> ta.save_to_yaml(conf)
>>> ta.load_from_yaml(conf)
>>> ta.epoch_num
2
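For orientation only, a hypothetical sketch of what a saved configuration file could contain. The exact layout written by save_to_yaml depends on towhee's serializer (it may group keys into sections); the field names below simply mirror the dataclass attributes documented above:

```yaml
# Hypothetical sketch only -- not the verbatim output of save_to_yaml.
epoch_num: 2
batch_size: 8
lr: 5.0e-05
optimizer: Adam
loss: CrossEntropyLoss
metric: Accuracy
```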