pretraining module
Full documentation for the hippynn.pretraining module.
Things to do before training, i.e. initialization of network and diagnostics.
- calculate_max_system_force(array_dict: dict, species_name: str, force_name: str, device: device = None, batch_size: int = 50)
Calculates the maximum force magnitude in each system in array_dict.
Example usage for unsplit data:
>>> db = Database(...)
>>> max_force = calculate_max_system_force(db.arr_dict,"Z","F")
If the database has been split:
>>> max_force_train = calculate_max_system_force(db.splits['train'],"Z","F")
Example usage to prune out high-force data:
>>> db = Database(...)
>>> force_threshold = ...
>>> max_force = calculate_max_system_force(db.arr_dict,"Z","F")
>>> high_force_system = max_force > force_threshold
>>> db.arr_dict = {k:v[~high_force_system] for k,v in db.arr_dict.items()}
- Parameters:
array_dict – dictionary mapping strings to tensors/numpy arrays
species_name – dictionary key for species
force_name – dictionary key for forces
device – Where to perform the computation.
batch_size – batch size to perform evaluation over.
- Returns:
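As a rough illustration of the quantity being computed, the following NumPy-only sketch takes the norm of each atom's force vector and keeps the largest value per system, then applies the same boolean-mask pruning shown in the usage example. It is not the library's implementation: the helper name max_force_per_system is hypothetical, and the sketch assumes padded arrays in which a species value of 0 marks a padding atom.

```python
import numpy as np

def max_force_per_system(species, forces):
    """Largest per-atom force magnitude in each system (illustrative only).

    species: (n_systems, n_atoms) integers; 0 is ASSUMED to mark padding.
    forces:  (n_systems, n_atoms, 3) floats.
    """
    species = np.asarray(species)
    forces = np.asarray(forces, dtype=float)
    magnitudes = np.linalg.norm(forces, axis=-1)          # (n_systems, n_atoms)
    magnitudes = np.where(species != 0, magnitudes, 0.0)  # ignore padded atoms
    return magnitudes.max(axis=1)

# Toy stand-in for db.arr_dict: two systems, up to three atoms each.
arr_dict = {
    "Z": np.array([[1, 1, 0], [8, 1, 1]]),
    "F": np.array([
        [[3.0, 4.0, 0.0], [0.0, 0.0, 1.0], [9.0, 9.0, 9.0]],   # last atom is padding
        [[0.0, 0.0, 2.0], [1.0, 0.0, 0.0], [0.0, 12.0, 5.0]],
    ]),
}

max_force = max_force_per_system(arr_dict["Z"], arr_dict["F"])
high_force_system = max_force > 10.0  # hypothetical threshold
pruned = {k: v[~high_force_system] for k, v in arr_dict.items()}
```

Because every array in the dictionary shares the same leading system axis, a single boolean mask can be applied to all entries at once.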
- calculate_min_dists(array_dict: dict, species_name: str, positions_name: str, dist_hard_max: float, cell_name: str = None, device: device = None, pair_finder_class: _BaseNode = 'auto', batch_size: int = 50)
Calculates the minimum distance found in each system in array_dict.
Example usage for unsplit data:
>>> db = Database(...)
>>> min_dists = calculate_min_dists(db.arr_dict,"Z","R",5.0)
If the database has been split:
>>> min_dists_train = calculate_min_dists(db.splits['train'],"Z","R",5.0)
Example usage to prune out low-distance data:
>>> db = Database(...)
>>> dist_threshold = ...
>>> min_dist = calculate_min_dists(db.arr_dict,"Z","R",5.0)
>>> low_distance_system = min_dist < dist_threshold
>>> db.arr_dict = {k:v[~low_distance_system] for k,v in db.arr_dict.items()}
Note
The cutoff radius dist_hard_max should be set large enough that each atom is expected to have at least one neighbor. If an atom has no neighbors, its min_dist will be set to the largest distance found in the current batch. If an entire system has no neighbors, the minimum distance will be set to zero.
- Parameters:
array_dict – dictionary mapping strings to tensors/numpy arrays
species_name – dictionary key for species
positions_name – dictionary key for positions
dist_hard_max – maximum distance to search
cell_name – dictionary key for the cell (periodic boundary conditions). If the cell is not specified, open boundaries are used.
pair_finder_class – if 'auto', choose automatically; otherwise, build this kind of pair finder.
device – Where to perform the computation.
batch_size – batch size to perform evaluation over.
- Returns:
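To make the role of dist_hard_max concrete, here is a brute-force NumPy sketch of a per-system minimum distance for open boundaries. It is an illustration under stated assumptions, not the library's pair-finder code: the helper name min_dist_per_system is hypothetical, a species value of 0 is assumed to mark padding, and pairs at exactly the cutoff are assumed to count as neighbors. It returns zero for a system with no pair inside the cutoff, as the note describes, but does not reproduce the per-atom fallback to the largest distance in the batch.

```python
import numpy as np

def min_dist_per_system(species, positions, dist_hard_max):
    """Smallest pair distance within dist_hard_max per system (illustrative only).

    Open boundaries only; species value 0 is ASSUMED to mark padding.
    """
    species = np.asarray(species)
    positions = np.asarray(positions, dtype=float)
    n_atoms = species.shape[1]

    # All pairwise displacement vectors and distances within each system.
    diff = positions[:, :, None, :] - positions[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)                   # (n_sys, n_atoms, n_atoms)

    # Keep pairs of two distinct, real atoms that lie inside the cutoff.
    real = species != 0
    valid = real[:, :, None] & real[:, None, :] & ~np.eye(n_atoms, dtype=bool)
    valid &= dist <= dist_hard_max

    dist = np.where(valid, dist, np.inf)
    out = dist.min(axis=(1, 2))
    # No pair inside the cutoff: report zero, as in the note above.
    return np.where(np.isfinite(out), out, 0.0)

Z = np.array([[1, 1, 0], [1, 1, 1]])
R = np.array([
    [[0.0, 0.0, 0.0], [0.0, 0.0, 0.74], [0.0, 0.0, 0.0]],   # last atom is padding
    [[0.0, 0.0, 0.0], [0.0, 0.0, 10.0], [0.0, 10.0, 0.0]],  # all pairs beyond cutoff
])
min_dists = min_dist_per_system(Z, R, dist_hard_max=5.0)
```

The second system shows why the cutoff matters: a reported minimum distance of zero can mean "no neighbors found" rather than overlapping atoms, so a low-distance filter may prune such systems unless dist_hard_max is raised.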
- hierarchical_energy_initialization(energy_module, database=None, trainable_after=False, decay_factor=0.01, encoder=None, energy_name=None, species_name=None, peratom=False)
Computes values for the non-interacting energy using the training data.
- Parameters:
energy_module – HEnergyNode or torch module for energy prediction
database – InterfaceDB object to get training data, required if model contains E0 term
trainable_after – Determines if it should change .requires_grad attribute for the E0 parameters
decay_factor – change initialized weights of further energy layers by df**N for layer N, where df is the decay factor
encoder – species encoder, can be auto-identified from energy node
energy_name – name for the energy variable, can be auto-identified from energy node
species_name – name for the species variable, can be auto-identified from energy node
peratom
- Returns:
None
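One common way to obtain non-interacting energies from training data is a linear least-squares fit of total energy against per-species atom counts. The NumPy sketch below illustrates that idea along with the df**N scaling described for decay_factor. It is an assumption-laden illustration rather than the library's procedure: the helper name fit_e0 is hypothetical, a species value of 0 is assumed to mark padding, and the layer index N is assumed to start at 1.

```python
import numpy as np

def fit_e0(species, energies, species_set):
    """Least-squares energy per species from composition counts (illustrative only)."""
    species = np.asarray(species)
    # counts[i, j] = number of atoms of species_set[j] in system i.
    counts = np.stack([(species == z).sum(axis=1) for z in species_set], axis=1)
    e0, *_ = np.linalg.lstsq(
        counts.astype(float), np.asarray(energies, dtype=float), rcond=None
    )
    return dict(zip(species_set, e0))

# Toy data: three systems built from species 1 and 8; 0 is ASSUMED padding.
Z = np.array([[1, 1, 0], [8, 1, 1], [8, 8, 0]])
E = np.array([-1.0, -76.0, -150.0])
e0 = fit_e0(Z, E, species_set=[1, 8])

# Scale factors df**N for successive layers (N starting at 1 is an assumption).
decay_factor = 0.01
layer_scales = [decay_factor ** n for n in (1, 2, 3)]
```

With the default decay_factor of 0.01, each further layer starts out contributing far less than the one before it, so the initial prediction is dominated by the fitted per-species energies.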