routines module
Full Documentation for the hippynn.experiment.routines module.
Routines for setting up and performing training
- class SetupParams(device: ~torch.device | str | None = None, controller: ~hippynn.experiment.controllers.Controller | ~typing.Callable | None = None, stopping_key: str | None = None, optimizer: ~torch.optim.optimizer.Optimizer | ~typing.Callable = <class 'torch.optim.adam.Adam'>, learning_rate: float | None = None, scheduler: ~torch.optim.lr_scheduler._LRScheduler | ~typing.Callable = None, batch_size: int | None = None, eval_batch_size: int | None = None, max_epochs: int | None = None, fraction_train_eval: float | None = 0.1)[source]
Bases: object
- Parameters:
stopping_key – name of the validation metric used for stopping
controller – Optional – Controller object for LR scheduling and ending the experiment. If not provided, one will be constructed from the parameters below.
device – Where to train the model. Defaults to CUDA if available. Specify a tuple of device numbers to use DataParallel.
max_epochs – Optional – maximum number of epochs to train; mandatory if a controller is not provided.
batch_size – training batch size; only used, and mandatory, if a controller is not provided.
eval_batch_size – evaluation batch size; only used if a controller is not provided.
scheduler – scheduler passed to the controller
optimizer – PyTorch optimizer or optimizer class. Defaults to Adam.
learning_rate – If an optimizer class is provided, the learning rate is used to construct the optimizer.
fraction_train_eval – fraction of the training dataset to evaluate during the evaluation phase
All parameters after stopping_key, controller, and device are optional and can be built into a controller (see the construction sketch below).
Note
Multi-GPU training is an experimental feature that is still being debugged.
- optimizer: alias of Adam
- batch_size: int | None = None
- controller: Controller | Callable | None = None
- device: device | str | None = None
- eval_batch_size: int | None = None
- fraction_train_eval: float | None = 0.1
- learning_rate: float | None = None
- max_epochs: int | None = None
- scheduler: _LRScheduler | Callable = None
- stopping_key: str | None = None
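For example, a controller-free setup might look like the following sketch; the stopping-key name and numeric values are illustrative placeholders, not recommendations:

    from hippynn.experiment.routines import SetupParams

    setup_params = SetupParams(
        stopping_key="Loss",      # validation metric watched for early stopping
        learning_rate=1e-3,       # used to construct the default Adam optimizer
        batch_size=32,            # training batch size for the default controller
        eval_batch_size=64,
        max_epochs=100,           # mandatory because no controller is given
        fraction_train_eval=0.1,  # evaluate 10% of the training set each epoch
    )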
- setup_and_train(training_modules: TrainingModules, database: Database, setup_params: SetupParams, store_all_better=False, store_best=True, store_every=0)[source]
- Parameters:
training_modules – see setup_training()
database – see train_model()
setup_params – see setup_training()
store_all_better – Save the state dict for each model that improves on a previous best
store_best – Save a checkpoint for the best model
store_every – Save a checkpoint every store_every epochs (if nonzero)
- Returns:
See train_model()
Shortcut for setup_training followed by train_model.
Note
The training loop will capture KeyboardInterrupt exceptions to abort the experiment early. If you would like to gracefully stop training programmatically, see train_model() with the callbacks argument.
Note
Saves files in the current running directory; it is recommended to switch to a fresh directory with a descriptive name for your experiment.
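A usage sketch follows; training_modules, database, and setup_params are assumed to have been built already (setup_params e.g. as in the SetupParams example above):

    from hippynn.experiment.routines import setup_and_train

    metric_tracker = setup_and_train(
        training_modules=training_modules,
        database=database,
        setup_params=setup_params,
        store_best=True,  # checkpoint the best model
        store_every=10,   # additionally checkpoint every 10 epochs
    )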
- setup_training(training_modules: TrainingModules, setup_params: SetupParams)[source]
Prepares training_modules for training according to setup_params.
- Parameters:
training_modules – Tuple of model, training loss, and evaluation losses (can be built from a graph using graphs.assemble_training_modules)
setup_params – parameters controlling how training is performed (see SetupParams)
Roughly, this function:
- sets devices for the training modules
- if no controller is given:
  - instantiates an optimizer and links it to the model's learnable parameters
  - instantiates a scheduler and links it to the optimizer
  - builds a default controller from the setup params
- creates a MetricTracker for storing the training metrics
- Returns:
(optimizer, evaluator, controller, metrics, callbacks)
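A minimal usage sketch, assuming training_modules and setup_params already exist, and unpacking the return tuple documented above:

    from hippynn.experiment.routines import setup_training

    optimizer, evaluator, controller, metrics, callbacks = setup_training(
        training_modules,
        setup_params,
    )
    # controller and the metric tracker ("metrics") can then be handed
    # on to train_model().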
- test_model(database, evaluator, batch_size, when, metric_tracker=None)[source]
Tests the model on the database according to the model evaluator's metrics. If a plot_maker is attached to the model evaluator, it will make plots; the plots go in a sub-folder named by the when argument, which indicates at what stage the testing takes place. The results are then printed.
- Parameters:
database – The database to test the model on.
evaluator – The evaluator containing model and evaluation losses to measure.
batch_size – The batch size to use while evaluating.
when – A string specifying when the testing takes place; used to label the plot sub-folder.
metric_tracker – (Optional) metric tracker to save metrics on. If not provided, a blank one will be constructed.
- Returns:
metric tracker
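A usage sketch, assuming database and evaluator already exist; the "final_test" label is arbitrary and only names the output sub-folder:

    from hippynn.experiment.routines import test_model

    metric_tracker = test_model(
        database,
        evaluator,
        batch_size=64,
        when="final_test",
    )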
- train_model(training_modules, database, controller, metric_tracker, callbacks, batch_callbacks, store_all_better=False, store_best=True, store_every=0, store_structure_file=True, store_metrics=True, quiet=False)[source]
Performs the training loop and allows keyboard interrupt. When done, reinstates the best model, makes plots of the metrics over time, and tests the model.
- Parameters:
training_modules – tuple-like of model, loss, and evaluator
database – Database
controller – Controller
metric_tracker – MetricTracker for storing model performance
callbacks – callbacks to perform after every epoch.
batch_callbacks – callbacks to perform after every batch
store_best – Save a checkpoint for the best model
store_all_better – Save the state dict for each model that improves on a previous best
store_every – Save a checkpoint every store_every epochs (if nonzero)
store_structure_file – Save the structure file for this experiment
store_metrics – Save the metric tracker for this experiment.
quiet – If True, disable printing during training (still prints testing results).
- Returns:
metric_tracker
Note
callbacks take the form of an iterable of callables; each will be called as cb(epoch, new_best), where epoch is the epoch number and new_best indicates whether the model is a new best model.
Note
batch_callbacks take the form of an iterable of callables and will each be called with cb(batch_inputs, batch_model_outputs, batch_targets)
Note
You may want your callbacks to store other state; if so, an easy way is to make them callable objects, as in the sketch after these notes.
Note
callback state is not managed by hippynn. If you wish to save or load callback state, you will have to manage that manually (possibly with a callback itself).
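As a sketch of the callable-object suggestion above, the hypothetical EpochLogger below records which epochs produced a new best model; training_modules, database, controller, and metric_tracker are assumed to come from setup_training():

    from hippynn.experiment.routines import train_model

    class EpochLogger:
        """Hypothetical epoch callback that keeps its own state."""
        def __init__(self):
            self.best_epochs = []

        def __call__(self, epoch, new_best):
            # Called after every epoch with the epoch number and whether
            # the model is a new best.
            if new_best:
                self.best_epochs.append(epoch)

    logger = EpochLogger()

    metric_tracker = train_model(
        training_modules,
        database,
        controller,
        metric_tracker,
        callbacks=[logger],
        batch_callbacks=None,
    )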
- training_loop(training_modules: TrainingModules, database, controller: Controller, metric_tracker: MetricTracker, callbacks, batch_callbacks, store_all_better, store_best, store_every, quiet)[source]
Performs a high-level training loop.
- Parameters:
training_modules – training modules from assemble_modules
database – database to train on
controller – controller for early stopping and/or learning rate decay
metric_tracker – the metric tracker
callbacks – list of callbacks for each epoch
batch_callbacks – list of callbacks for each batch
store_best – Save a checkpoint for the best model
store_all_better – Save the state dict for each model that improves on a previous best
store_every – Save a checkpoint every store_every epochs (if nonzero)
quiet – if True, suppress printed information (progress bars are still shown)
- Returns:
metrics – the state of the experiment after training
Note
Saves files in the current running directory; it is recommended to switch to a fresh directory with a descriptive name for each experiment.
Rough structure:
Loop over epochs, performing:
- Loop over batches:
  - Make predictions
  - Calculate loss and perform the backwards pass
  - Optimizer step
  - Batch callbacks
- Perform validation and print results
- Controller/scheduler step
- Epoch callbacks
- If new best, save the model state_dict and a checkpoint
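The self-contained toy loop below (plain PyTorch, not hippynn's actual implementation) illustrates this structure; every name and value is illustrative only:

    import torch

    model = torch.nn.Linear(4, 1)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.MSELoss()
    inputs, targets = torch.randn(64, 4), torch.randn(64, 1)
    batches = [(inputs[i:i + 16], targets[i:i + 16]) for i in range(0, 64, 16)]

    best_loss = float("inf")
    for epoch in range(5):
        for x, y in batches:                  # loop over batches
            optimizer.zero_grad()
            pred = model(x)                   # make predictions
            loss = loss_fn(pred, y)           # calculate loss
            loss.backward()                   # backwards pass
            optimizer.step()                  # optimizer step
            # batch callbacks would run here: cb(x, pred, y)
        with torch.no_grad():                 # validation phase
            val_loss = loss_fn(model(inputs), targets).item()
        new_best = val_loss < best_loss       # controller/scheduler decision point
        best_loss = min(best_loss, val_loss)
        # epoch callbacks would run here: cb(epoch, new_best)
        if new_best:
            torch.save(model.state_dict(), "best_model.pt")  # checkpoint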