assembly module
Full documentation for the hippynn.experiment.assembly module.
Assembling graphs for a training run
- class TrainingModules(model, loss, evaluator)
Bases:
tuple
- Parameters:
model – assembled torch.nn.Module of the model
loss – assembled torch.nn.Module for the loss
evaluator – assembled evaluator for validation losses
- evaluator
Alias for field number 2
- loss
Alias for field number 1
- model
Alias for field number 0
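Because TrainingModules is a named tuple, its fields can be unpacked positionally or accessed by name. A minimal sketch, assuming training_modules is an instance returned by assemble_for_training:
>>> model, loss, evaluator = training_modules
>>> model is training_modules.model
True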
- assemble_for_training(train_loss, validation_losses, validation_names=None, plot_maker=None)[source]
- Parameters:
train_loss – LossNode
validation_losses – dict of {name: loss_node} or list of LossNodes. If a list of loss nodes is given, the name of each node is used when printing the loss; this can be overridden with validation_names.
validation_names (optional) – list of names for the loss nodes; only used if validation_losses is a list.
plot_maker – optional PlotMaker for model evaluation
- Returns:
training_modules, db_info. db_info is a dict giving the inputs (to the model) and targets (to the loss) in terms of their db_name.
assemble_for_training computes:
- what inputs are needed by the model
- what outputs of the model are needed for the loss
- what targets are needed from the database for the loss
It then uses this information to create GraphModule objects for the model and loss, and an Evaluator based on the validation losses (and names), early stopping, and the plot maker.
Note
The model and training loss are always evaluated on the active device, but the validation losses reside on the CPU by default. This helps compute statistics over large datasets. To accomplish this, the modules associated with the loss are copied for use in the validation loss. Thus, after assembling the modules for training, changes to the loss nodes will not affect the model evaluator. In all likelihood you aren't planning to do something too fancy like changing the loss nodes during training, but if you do plan to do something like that with callbacks, know that you would probably need to construct a new evaluator.
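A minimal usage sketch; the loss nodes (train_loss, rmse_node, mae_node) and the database object are assumed to have been created earlier when building the graph and loading data:
>>> validation_losses = {'RMSE': rmse_node, 'MAE': mae_node}
>>> training_modules, db_info = assemble_for_training(train_loss, validation_losses)
>>> database.inputs = db_info['inputs']
>>> database.targets = db_info['targets']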
- determine_out_in_targ(*nodes_required_for_loss)[source]
- Parameters:
nodes_required_for_loss – train nodes and validation nodes
- Returns:
lists of needed network inputs, network outputs, and database targets
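A sketch of how this might be called, assuming train_loss and validation_loss_nodes were defined when building the loss graph:
>>> inputs, outputs, targets = determine_out_in_targ(train_loss, *validation_loss_nodes)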
- generate_database_info(inputs, targets, allow_unfound=False)[source]
Construct db info from input nodes and target nodes.
- Parameters:
inputs – list of input nodes
targets – list of target nodes
allow_unfound – don't check if names are valid
Builds a list of the db names for the nodes. If allow_unfound is False, the names are checked for validity.
- Returns:
db_info: dict giving the db_names of the inputs and targets.
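A minimal sketch of direct use, assuming input_nodes and target_nodes are lists of graph nodes (for example, those found by determine_out_in_targ):
>>> db_info = generate_database_info(input_nodes, target_nodes)
>>> db_info['inputs'], db_info['targets']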
- precompute_pairs(model, database, batch_size=10, device=None, make_dense=False, n_images=1)[source]
- Parameters:
model – Assembled GraphModule involving a PairIndexer
database – database that precomputation should be supplied with
batch_size – batch size to do pre-computation with
device – where to do the precomputation.
make_dense – return a dense array of pairs. Warning: this can be memory-expensive, but it is necessary if you are going to use num_workers > 0 in your dataloaders. If False, the cache is stored as a sparse array.
n_images – number of images for cache storage; increase this if it fails. However, large values can incur a very large memory cost if make_dense is True.
- Returns:
None – changes the model graph.
Note
After running pre-compute pairs, your model will expect to load pairs directly from the database, and your database will contain cached pair entries.
Note that the model needs to be re-assembled with the new graph for the cache to take effect.
Example usage:
>>> precompute_pairs(training_modules.model, database, device='cuda')
>>> training_modules, db_info = assemble_for_training(train_loss, validation_losses)
>>> database.inputs = db_info['inputs']
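If the cached pairs will be consumed by dataloaders with num_workers > 0, the cache must be stored densely (see make_dense above); a variant of the call above under that assumption:
>>> precompute_pairs(training_modules.model, database, device='cuda', make_dense=True)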