run_density_optimization

class SampleGenerator(model_config: DictConfig, model: MLDFTLitModule, negative_integrated_density_penalty_weight: float = 0.0, transform_device: str | device = 'cpu')[source]

Class to generate samples from the model configuration.

model_config: The model configuration.

model: The model.

transforms: The transforms.

basis_info: The basis information.

negative_integrated_density_penalty_weight: The weight for the negative integrated density penalty.

__init__(model_config: DictConfig, model: MLDFTLitModule, negative_integrated_density_penalty_weight: float = 0.0, transform_device: str | device = 'cpu') → None[source]

Initialize the SampleGenerator.

Parameters:

model_config – The model configuration.
model – The model.
negative_integrated_density_penalty_weight – The weight for the negative integrated density penalty.
transform_device – The device to apply the transforms on.

classmethod from_run_path(run_path: str | Path, device: str | device = 'cuda', transform_device: str | device = 'cpu', negative_integrated_density_penalty_weight: float = 0.0, use_last_ckpt: bool = True) → SampleGenerator[source]

Create a SampleGenerator from a run path.

Parameters:

run_path – The run path.
device – The device to load the model on.
transform_device – The device to apply the transforms on.
negative_integrated_density_penalty_weight – The weight for the negative integrated density penalty.
use_last_ckpt – Whether to use the last checkpoint.

Returns:

The instantiated SampleGenerator

get_functional_factory(xc_functional: str | None = None) → FunctionalFactory[source]

Get a functional factory for the model and its config.

Parameters: xc_functional – The XC functional to use.
Returns: The functional factory.

get_sample_from_mol(mol: Mole) → OFData[source]

Get a sample from a molecule with the appropriate transforms applied.

Parameters: mol – The molecule.
Returns: The OFData sample.

add_density_optimization_trajectories_to_sample(sample: OFData, callback: ConvergenceCallback, energies_label: Energies, basis_info: BasisInfo, save_coeff_interval: int = 100)[source]: Add the density optimization trajectories of energies and coefficients to the sample.

calculate_basis_size(mol: Mole, basis_info: BasisInfo) → int[source]

Calculate the size of the basis set for a given molecule.

Parameters:

mol – The molecule.
basis_info – The basis information object.

Returns:

The number of basis functions.
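To picture what this count represents, the following is a minimal sketch, assuming an atom-centered basis in which the total size is the sum of per-atom contributions. It is not the library's implementation: the list of element symbols stands in for the Mole object, the dictionary stands in for BasisInfo, and the per-element numbers are made up for illustration.

```python
from collections import Counter

# Hypothetical per-element basis-function counts. Real values depend on the
# basis set described by BasisInfo and are NOT taken from the library.
ASSUMED_BASIS_FUNCTIONS_PER_ELEMENT = {"H": 5, "C": 14, "O": 14}


def sketch_basis_size(atom_symbols, functions_per_element):
    """Sum the per-element basis-function counts over all atoms."""
    counts = Counter(atom_symbols)
    missing = sorted(set(counts) - set(functions_per_element))
    if missing:
        raise KeyError(f"No basis information for elements: {missing}")
    return sum(functions_per_element[el] * n for el, n in counts.items())


# Water as a plain list of element symbols (stand-in for a Mole object).
water = ["O", "H", "H"]
print(sketch_basis_size(water, ASSUMED_BASIS_FUNCTIONS_PER_ELEMENT))
```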

configure_dataset_indices(dataset_size: int, n_molecules: int, molecule_choice: str | list, seed: int | None, start_idx: int = 0) → ndarray[source]: Configure the indices of the dataset to optimize.
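The selection modes can be illustrated with a small sketch. It assumes the following semantics, none of which are confirmed by this page: "first_n" takes consecutive indices beginning at start_idx, "random" draws without replacement from an unseeded generator, "seeded_random" does the same with the given seed, and an explicit list is passed through after a range check. The function name is hypothetical, and it returns a plain list rather than an ndarray so that it runs with the standard library alone.

```python
import random


def sketch_dataset_indices(dataset_size, n_molecules, molecule_choice,
                           seed=None, start_idx=0):
    """Return the dataset indices to optimize (assumed semantics)."""
    # An explicit list of indices is validated and returned unchanged.
    if isinstance(molecule_choice, list):
        bad = [i for i in molecule_choice if not 0 <= i < dataset_size]
        if bad:
            raise IndexError(f"Indices out of range: {bad}")
        return list(molecule_choice)

    # Never request more molecules than remain after start_idx.
    n = max(0, min(n_molecules, dataset_size - start_idx))
    candidates = range(start_idx, dataset_size)

    if molecule_choice == "first_n":
        return list(candidates[:n])
    if molecule_choice == "random":
        return sorted(random.Random().sample(candidates, n))
    if molecule_choice == "seeded_random":
        return sorted(random.Random(seed).sample(candidates, n))
    raise ValueError(f"Unknown molecule_choice: {molecule_choice!r}")


print(sketch_dataset_indices(10, 3, "first_n", start_idx=2))
```

With a fixed seed, "seeded_random" yields the same selection on every call, which is what makes runs comparable across checkpoints.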

main(cfg: DictConfig)[source]

Main function, used as the Hydra entry point.

Keeping it separate from run_ofdft makes it possible to also call run_ofdft directly from code.

parse_run_path(run_path: Path | str) → Path[source]: Parse the run path, making it absolute if necessary.
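The behavior described here can be sketched as follows. This is an illustration under assumptions, not the library's code: the page does not say which directory relative paths are resolved against, so the sketch takes a hypothetical base_dir argument, and the function name is invented.

```python
from pathlib import Path


def sketch_parse_run_path(run_path, base_dir):
    """Return run_path unchanged if absolute, else resolve it against base_dir.

    base_dir is a hypothetical parameter; the real function may use a
    configured or environment-derived directory instead.
    """
    run_path = Path(run_path)
    if run_path.is_absolute():
        return run_path
    return Path(base_dir) / run_path


# A relative run path is anchored at the assumed base directory.
print(sketch_parse_run_path("runs/example", Path.cwd()))
```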

plotting_worker(plot_queue: Queue, save_dir: Path, basis_info: BasisInfo, enable_grid_plots: bool, save_individual_plots: bool = False, num_threads: int = 8, fail_fast: bool = False)[source]: Worker process for handling plotting.

run_ofdft(run_path: Path | str, optimizer: Optimizer, guess_path: Path | str | None = None, use_last_ckpt: bool = True, device: device | str = 'cpu', transform_device: device | str = 'cpu', num_processes_per_device: int = 1, num_devices: int = 1, num_workers: int = 1, num_threads_per_process: int = 8, model_dtype: dtype = torch.float64, xc_functional: str = 'PBE', initialization: str = 'minao', n_molecules: int = 1, molecule_choice: str | list[int] = 'first_n', seed: int = 0, log_file: str | None = None, save_individual_plots: bool = False, save_denop_samples: bool = False, plot_every_n: int = 10, swarm_plot_subsample: float = 1.0, ofdft_kwargs: dict = None, split: str = 'val', split_file_path: str | None = None, plot_l1_norm: bool = True, enable_grid_operations: bool = True, l1_grid_level: int = 3, l1_grid_prune: str = 'nwchem_prune', negative_integrated_density_penalty_weight: float = 0.0, convergence_criterion: str = 'last_iter', fail_fast: bool = False, save_coeff_interval: int = 100)[source]

Script to run ofdft using a model checkpoint on multiple molecules.

Note: Right now this only supports density optimizations using a checkpoint.

Parameters:

run_path (Path | str) – The path to the run directory.
guess_path (Path | str, optional) – The path to the guess directory. Defaults to None, in which case the same model is used for the proj_minao guess.
use_last_ckpt (bool, optional) – Whether to use the last checkpoint. Defaults to True.
device (torch.device | str, optional) – The device to run on. Defaults to “cpu”.
model_dtype (torch.dtype, optional) – The dtype of the model. Defaults to torch.float64.
xc_functional (str, optional) – The XC functional to use. Defaults to “PBE”. Irrelevant if the xc functional is part of the model prediction.
initialization (str, optional) – The initialization to use. Defaults to “minao”. Other possible values include “hueckel”, “proj_minao”, “label”.
optimizer (Optimizer) – The optimizer to use, e.g. one configured as “gradient_descent” or “slsqp”. This argument is required; the signature defines no default.
n_molecules (int, optional) – The number of molecules to optimize. Defaults to 1.
molecule_choice (str | list[int], optional) – The choice of molecules to optimize. Options are “first_n”, “random”, “seeded_random”, or a list of indices. Defaults to “first_n”.
log_file (str | None, optional) – The path to the log file. Defaults to None, in which case no log file is created.
save_individual_plots (bool, optional) – Whether to keep individual plots for each molecule. Defaults to False.
save_denop_samples (bool, optional) – Whether to save the density optimization trajectories of the samples. Defaults to False.
swarm_plot_subsample (float, optional) – The subsample factor for the swarm plots. Defaults to 1.0.
ofdft_kwargs (dict, optional) – Additional keyword arguments for the OFDFT class. Defaults to None.
split (str, optional) – The split to use, i.e. “val” or “train”. Defaults to “val”.
plot_l1_norm (bool, optional) – Whether to plot the L1 norm of the density error, for which the integration grid is required. Defaults to True.
l1_grid_level (int, optional) – The grid level of the integration grid for the L1 norm. Defaults to 3.
l1_grid_prune (str, optional) – The pruning method for the integration grid for the L1 norm. Defaults to “nwchem_prune”.
negative_integrated_density_penalty_weight (float, optional) – The weight of the negative integrated density penalty. Defaults to 0.0.
convergence_criterion (str, optional) – The convergence criterion for the density optimization. Defaults to “last_iter”.
fail_fast (bool, optional) – Whether to raise an error immediately if a molecule fails. Defaults to False, in which case errors are logged and the script continues. Setting it to True is useful for debugging.
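The difference between the two fail_fast settings can be shown with a generic loop. This is a minimal sketch of the log-and-continue pattern, not the code used by run_ofdft; optimize_all, toy_optimize and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("denop_sketch")


def optimize_all(molecule_ids, optimize_one, fail_fast=False):
    """Run optimize_one per molecule and return (results, failed_ids)."""
    results, failed = {}, []
    for mol_id in molecule_ids:
        try:
            results[mol_id] = optimize_one(mol_id)
        except Exception:
            if fail_fast:
                # Debugging mode: surface the first error immediately.
                raise
            # Default mode: record the failure and move on.
            logger.exception("Molecule %s failed; continuing", mol_id)
            failed.append(mol_id)
    return results, failed


def toy_optimize(mol_id):
    """Stand-in for a density optimization; molecule 1 always fails."""
    if mol_id == 1:
        raise RuntimeError("optimization failed")
    return -float(mol_id)


results, failed = optimize_all([0, 1, 2], toy_optimize)
print(sorted(results), failed)
```

With fail_fast=True the same call would re-raise the RuntimeError at molecule 1 instead of collecting it, which gives a full traceback at the point of failure.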

run_singlepoint_ofdft(mol: Mole, sample_generator: SampleGenerator, func_factory: FunctionalFactory, optimizer: Optimizer = <mldft.ofdft.optimizer.TorchOptimizer object>, initial_guess_str: str = 'minao', callback: ConvergenceCallback | None = None, ofdft_kwargs=None, return_sample: bool = False) → tuple[Energies, Tensor, bool] | tuple[Energies, Tensor, bool, OFData][source]

Run a single-point OFDFT calculation for the given molecule.

Parameters:

mol – The molecule.
sample_generator – The sample generator.
func_factory – The functional factory.
optimizer – The optimizer.
initial_guess_str – The initial guess.
callback – The callback.
ofdft_kwargs – Additional keyword arguments for density optimization.
return_sample – Whether to return the sample as well.

Returns:

The final energies, coefficients and whether the calculation converged. If return_sample is True, also returns the OFData sample.
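Because the length of the returned tuple depends on return_sample, calling code may want to normalize it. The helper below is a hedged sketch of one way to do that; SinglepointResult and unpack_singlepoint_result are hypothetical names that are not part of the library, and the strings in the demo stand in for the Energies, Tensor and OFData objects.

```python
from typing import Any, NamedTuple, Optional


class SinglepointResult(NamedTuple):
    """Uniform view of the 3- or 4-element return value."""
    energies: Any
    coeffs: Any
    converged: bool
    sample: Optional[Any] = None


def unpack_singlepoint_result(result):
    """Wrap the returned tuple; sample stays None if it was not requested."""
    if len(result) not in (3, 4):
        raise ValueError(f"Expected 3 or 4 elements, got {len(result)}")
    return SinglepointResult(*result)


# Placeholder values stand in for real return values.
without_sample = unpack_singlepoint_result(("energies", "coeffs", True))
with_sample = unpack_singlepoint_result(("energies", "coeffs", False, "sample"))
print(without_sample.sample, with_sample.sample)
```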

run_to_checkpoint_path(run_path: Path | str, use_last_ckpt: bool = True) → Path[source]: Get the path to the checkpoint of a run.
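As a rough illustration of the lookup, the sketch below assumes a directory layout that this page does not confirm: a checkpoints subdirectory inside the run directory, a file named last.ckpt for the last checkpoint, and, when use_last_ckpt is False, the alphabetically last of the remaining .ckpt files. The function name and the file names in the demo are hypothetical.

```python
import tempfile
from pathlib import Path


def sketch_run_to_checkpoint_path(run_path, use_last_ckpt=True):
    """Locate a checkpoint under run_path (assumed layout, see above)."""
    ckpt_dir = Path(run_path) / "checkpoints"  # assumed directory name
    if use_last_ckpt:
        ckpt = ckpt_dir / "last.ckpt"  # assumed file name
        if not ckpt.is_file():
            raise FileNotFoundError(ckpt)
        return ckpt
    # Assumed fallback rule: pick the alphabetically last non-"last" file.
    others = sorted(p for p in ckpt_dir.glob("*.ckpt") if p.name != "last.ckpt")
    if not others:
        raise FileNotFoundError(f"No other checkpoint found in {ckpt_dir}")
    return others[-1]


# Build a throwaway run directory to exercise both branches.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "checkpoints"
    demo_dir.mkdir()
    for name in ("epoch_003.ckpt", "epoch_007.ckpt", "last.ckpt"):
        (demo_dir / name).touch()
    last_name = sketch_run_to_checkpoint_path(tmp).name
    other_name = sketch_run_to_checkpoint_path(tmp, use_last_ckpt=False).name

print(last_name, other_name)
```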

set_torch_defaults_worker(id: int, num_threads: int)[source]: Set the torch defaults for a dataloader worker.

worker(process_idx: int, dataset: OFDataset, basis_info: BasisInfo, checkpoint_path: Path, guess_path: Path | None, optimizer: Optimizer, device: str | device, num_workers: int, num_threads: int, model_dtype: str | dtype, xc_functional: str, negative_integrated_density_penalty_weight: float, use_last_ckpt: bool, initialization: str, dataset_statistics_path: Path | str, convergence_criterion: str, plot_queue, plot_every_n: int, save_dir: Path, save_denop_samples: bool, fail_fast: bool, save_coeff_interval: int)[source]: Worker process for density optimization.