custom_kernels
This is summary documentation for the hippynn.custom_kernels module. See the full module documentation for complete details.
Custom Kernels for hip-nn interaction sum.
This module provides implementations in pytorch, numba, cupy, and triton.
Pytorch implementations take extra memory, but launch faster than numba kernels. Numba kernels use far less memory, but come with some launch overhead on GPUs. Cupy kernels only work on the GPU, but are faster than numba; they still require numba for CPU operations. Triton custom kernels only work on the GPU and are generally faster than cupy. If triton is not usable, the kernels fall back to numba or pytorch, whichever is available, at module import.
On import, this module attempts to set the custom kernels as specified by the user in hippynn.settings.
See the Custom Kernels section of the documentation for more information.
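The implementation can be selected before hippynn is imported through the settings system, or inspected afterwards. Below is a minimal sketch; the environment variable name HIPPYNN_USE_CUSTOM_KERNELS and the settings attribute USE_CUSTOM_KERNELS are assumptions for illustration, so check the Custom Kernels documentation for the exact names in your version.

    import os

    # Assumed environment variable: request an implementation before import,
    # so it is activated when the module is loaded.
    os.environ["HIPPYNN_USE_CUSTOM_KERNELS"] = "triton"

    import hippynn

    # Assumed settings attribute: inspect the resolved setting after import.
    print(hippynn.settings.USE_CUSTOM_KERNELS)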
Depending on your available packages, you may have the following options:
“pytorch”: dense pytorch operations.
“sparse”: sparse pytorch operations. Can be faster than pure pytorch for large enough systems, and does not require as much memory. May require a recent pytorch version. It does not cover all circumstances, and should raise an error when it encounters a case it cannot handle.
“numba”: numba implementation of the custom kernels; typically faster than the pytorch-based kernels.
“cupy”: cupy implementation of the custom kernels; typically faster than numba.
“triton”: triton-based custom kernels, using auto-tuning and the triton compiler. This is usually the best option.
The available implementations are stored in the variable hippynn.custom_kernels.CUSTOM_KERNELS_AVAILABLE.
The active implementation is stored in hippynn.custom_kernels.CUSTOM_KERNELS_ACTIVE.
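For example, a quick check of what is installed and what is in use (a minimal sketch; the exact contents depend on which packages are installed):

    from hippynn import custom_kernels

    # Implementations that can be activated on this machine,
    # e.g. ["pytorch", "sparse", "numba", "triton"].
    print(custom_kernels.CUSTOM_KERNELS_AVAILABLE)

    # The implementation currently in use.
    print(custom_kernels.CUSTOM_KERNELS_ACTIVE)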
For more information, see Custom Kernels.
Module Attributes
CUSTOM_KERNELS_AVAILABLE: List of available kernel implementations based on currently installed packages.
CUSTOM_KERNELS_ACTIVE: Which custom kernel implementation is currently active.
Functions
Check available imports and populate the list of available custom kernels.
Activate or deactivate custom kernels for interaction.
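As a sketch of switching implementations at runtime, assuming the activation function described above is hippynn.custom_kernels.set_custom_kernels and that it accepts either a boolean or an implementation name (verify the exact signature in the full documentation):

    from hippynn.custom_kernels import set_custom_kernels, CUSTOM_KERNELS_AVAILABLE

    # Deactivate custom kernels entirely, falling back to pure pytorch operations.
    set_custom_kernels(False)

    # Request a specific implementation only if it is available here.
    if "numba" in CUSTOM_KERNELS_AVAILABLE:
        set_custom_kernels("numba")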
Exceptions
Modules
Wraps non-pytorch implementations for use with pytorch autograd.
Pure pytorch implementation of envsum operations.
Pure pytorch implementation of envsum operations.
This module implements a version of converting from pytorch tensors to numba DeviceNDArrays that skips much of the indirection that takes place in the numba implementation.
Tools for converting between torch and numba-compatible arrays.
Utilities for the custom kernels, including pre-sorting the indices.