dimension_wise_rescaling
Dimension-wise rescaling is one of the enhancement modules for coping with vast gradient ranges, as described in the second paragraph of Section 4.3 of [M-OFDFT].
Rescaling takes place after the application of local frames and natural reparametrization. Dimension-wise rescaling handles the scale trade-off between density coefficients and gradients that remains after those steps. Because the number of dimensions varies between molecules, density coefficients are centered and rescaled by atomic-number-specific biases \(\bar{\mathbf{p}}_{Z, \tau}\) and scales \(\lambda_{Z, \tau}\), where \(Z\) is the atomic number and \(\tau\) the dimension. The scaling factor is found by simultaneously upscaling the coefficients and downscaling the gradient labels until the gradient reaches an appropriate scale or the coefficient exceeds its chosen scale. The target gradient scale is set to \(s_{\text{grad}} = 0.05\) and the maximal coefficient scale to \(s_{\text{coeff}} = 50\), as larger coefficient scales can later be compressed by the Shrink Gate module. As the maximum gradient scale indicates how capable a model is of fitting the gradient label, the maximum gradient is used as a proxy for the label scale.
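The centering and rescaling described above can be sketched with NumPy as follows. This is a minimal illustration and not the module's implementation: the per-element dictionaries `biases` and `scales`, the helper `rescale_molecule`, and all numerical values are hypothetical, and each atom is assumed to own a contiguous block of coefficients whose length depends only on its atomic number.

```python
import numpy as np

# Hypothetical per-element statistics, one entry per basis dimension tau.
# Hydrogen (Z=1) has 2 dimensions here, oxygen (Z=8) has 3; values are made up.
biases = {1: np.array([0.5, 0.1]), 8: np.array([2.0, 0.3, 0.0])}
scales = {1: np.array([4.0, 1.0]), 8: np.array([10.0, 2.0, 1.0])}


def rescale_molecule(atomic_numbers, coeffs, grads):
    """Center and upscale coefficients, downscale gradients, atom by atom."""
    # Build molecule-sized vectors by concatenating the per-element blocks.
    bias = np.concatenate([biases[z] for z in atomic_numbers])
    lam = np.concatenate([scales[z] for z in atomic_numbers])
    return (coeffs - bias) * lam, grads / lam


# Water-like toy molecule: O, H, H -> 3 + 2 + 2 = 7 coefficients.
atomic_numbers = [8, 1, 1]
coeffs = np.array([2.1, 0.3, 0.2, 0.6, 0.1, 0.4, 0.3])
grads = np.array([1.0, 0.2, 0.01, 0.4, 0.02, 0.4, 0.02])

new_coeffs, new_grads = rescale_molecule(atomic_numbers, coeffs, grads)
```

Because the statistics are indexed by atomic number rather than by molecule, the same two dictionaries serve molecules of any size and composition.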
- class DimensionWiseRescaling(coeff_biases: Tensor | ndarray, coeff_stds: Tensor | ndarray, max_grads: Tensor | ndarray, s_coeff: float = 50, s_grad: float = 0.05, epsilon: float = 1e-08)[source]
Center and rescale density coefficients and gradients dimension-wise.
- __init__(coeff_biases: Tensor | ndarray, coeff_stds: Tensor | ndarray, max_grads: Tensor | ndarray, s_coeff: float = 50, s_grad: float = 0.05, epsilon: float = 1e-08) None[source]
Initialize the DimensionWiseRescaling module.
- Parameters:
coeff_biases (Tensor | np.ndarray) – Concatenated tensor of dimensionwise coeff average per atomic number. Can be split using basis information.
coeff_stds (Tensor | np.ndarray) – Concatenated tensor of dimensionwise coeff standard deviation per atomic number. Can be split using basis information.
max_grads (Tensor | np.ndarray) – Concatenated tensor of dimensionwise max gradient per atomic number. Can be split using basis information.
s_coeff (float) – Maximal coefficient scale. Defaults to 50 (compare [M-OFDFT]).
s_grad (float) – Target gradient scale. Defaults to 0.05 (compare [M-OFDFT]).
epsilon (float) – Small number to avoid division by zero. Defaults to 1e-8.
- classmethod apply_to_dataset_statistics(dataset_statistics: DatasetStatistics, output_path: str, weigher_key: str, **kwargs) DatasetStatistics[source]
Apply dimension-wise rescaling to a DatasetStatistics object, i.e. compute what the statistics would be after applying dimension-wise rescaling to the dataset.
- Parameters:
dataset_statistics (DatasetStatistics) – DatasetStatistics object to be transformed, and on which the parameters of the dimension-wise rescaling are based.
output_path (str) – Path where the transformed statistics will be saved.
weigher_key (str) – Selects the sample weigher for the dataset statistics.
**kwargs – Additional arguments to be passed to the constructor, see from_dataset_statistics().
- Returns:
The transformed DatasetStatistics object.
- Return type:
DatasetStatistics
- forward(data_sample: OFData) Tensor[source]
Compute shifted and rescaled coeffs.
- Parameters:
data_sample (OFData) – A sample from the OFData dataset. From this we extract density coefficients which are then rescaled by lambda_z.
- Returns:
The rescaled density coefficients.
- Return type:
Tensor
- classmethod from_dataset_statistics(dataset_statistics: DatasetStatistics, basis_info: BasisInfo = None, weigher_key: str = 'has_energy_label', equivariant: bool = False, s_coeff: int = 50, s_grad: float = 0.05, epsilon: float = 1e-08) DimensionWiseRescaling[source]
Instantiate class from a DatasetStatistics object holding the required maximum gradients, coefficient biases and coefficient standard deviations.
- Parameters:
dataset_statistics (DatasetStatistics) – DatasetStatistics object containing the needed coefficient mean (bias), their standard deviations and the maximum gradient.
basis_info – Basis information object. Required if equivariant is True.
equivariant – Whether to use the same scale factors for all components (different m’s) of non-scalar fields.
weigher_key (str) – Selects the sample weigher for the dataset statistics.
s_coeff (float) – Maximal coefficient scale. Defaults to 50 (compare [M-OFDFT]).
s_grad (float) – Target gradient scale. Defaults to 0.05 (compare [M-OFDFT]).
epsilon (float) – Small number to avoid division by zero. Defaults to 1e-8.
- Returns:
A new DimensionWiseRescaling object with the loaded data.
- Return type:
DimensionWiseRescaling
- init_lambda_z() None[source]
Initialize the coefficient and gradient scaling \(\lambda_Z\).
This is done according to eq (4) in [M-OFDFT]:
\[\lambda_{Z, \tau} = \begin{cases} \min \left\{ \frac{\mathrm{max\_grad}_{Z, \tau}}{s_{\text{grad}}}, \frac{s_{\text{coeff}}}{\mathrm{std\_coeff}_{Z,\tau}} \right\} & \text{if } \mathrm{max\_grad}_{Z, \tau} > s_{\text{grad}}, \\ 1 & \text{otherwise,} \end{cases}\]
where \(\tau\) refers to the dimension in the given basis and \(Z\) is the atomic number.
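As a rough illustration of the piecewise definition, the following NumPy sketch evaluates the scale factor element-wise. The function name `compute_lambda` is hypothetical, and where exactly `epsilon` enters is an assumption (here it guards the division by the coefficient standard deviation); the actual method may differ.

```python
import numpy as np


def compute_lambda(max_grads, coeff_stds, s_coeff=50.0, s_grad=0.05, epsilon=1e-8):
    """Evaluate the piecewise scale factor for every (Z, tau) entry."""
    max_grads = np.asarray(max_grads, dtype=float)
    coeff_stds = np.asarray(coeff_stds, dtype=float)
    # Factor that would bring the maximum gradient down to s_grad.
    grad_ratio = max_grads / s_grad
    # Factor that would bring the coefficient std up to s_coeff.
    # Assumption: epsilon protects against a vanishing standard deviation.
    coeff_ratio = s_coeff / (coeff_stds + epsilon)
    # Only rescale dimensions whose maximum gradient exceeds the target.
    return np.where(max_grads > s_grad, np.minimum(grad_ratio, coeff_ratio), 1.0)


lam = compute_lambda(max_grads=[0.5, 0.5, 0.01], coeff_stds=[1.0, 10.0, 1.0])
```

In this toy input the first entry is limited by the gradient target (factor of about 10), the second by the coefficient cap (factor of about 5), and the third is left at 1 because its maximum gradient is already below `s_grad`.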
- rescale_coeffs_and_gradient(data_sample: OFData) tuple[Tensor, Tensor][source]
Compute shifted and rescaled coeffs and inversely scaled gradients.
- Parameters:
data_sample (OFData) – A sample from the OFData dataset. From this we extract density coefficients and gradient labels which are then rescaled by lambda_z.
- Returns:
A tuple containing the rescaled density coefficients and the rescaled gradients.
- Return type:
tuple[Tensor, Tensor]
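The inverse scaling of the gradients follows from the chain rule: if the coefficients are transformed as \(p' = \lambda (p - \bar{p})\), the gradient of the energy with respect to \(p'\) is the original gradient divided by \(\lambda\). The sketch below checks this numerically on a toy quadratic energy. Everything in it (the matrix `A`, the `bias` and `lam` values, and the helper functions) is hypothetical and only assumes the transform stated in this sentence; it does not use the module itself.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.diag([1.0, 2.0, 3.0])       # toy quadratic energy E(p) = 0.5 * p^T A p
bias = np.array([0.5, -0.2, 0.1])  # hypothetical per-dimension biases
lam = np.array([10.0, 5.0, 1.0])   # hypothetical per-dimension scales


def energy(p):
    return 0.5 * p @ A @ p


def energy_rescaled(p_new):
    # Undo the transform p_new = lam * (p - bias) before evaluating the energy.
    return energy(p_new / lam + bias)


p = rng.normal(size=3)
grad = A @ p                       # analytic gradient w.r.t. the original coefficients
p_new = lam * (p - bias)           # shifted and rescaled coefficients
grad_new = grad / lam              # inversely scaled gradient label

# Central finite differences of the energy w.r.t. the rescaled coefficients.
h = 1e-6
fd = np.array([
    (energy_rescaled(p_new + h * e) - energy_rescaled(p_new - h * e)) / (2 * h)
    for e in np.eye(3)
])
```

Comparing `fd` with `grad_new` (for example via `np.allclose`) shows that the downscaled labels remain consistent gradients of the energy in the rescaled coordinates.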