run_density_optimization
- class SampleGenerator(model_config: DictConfig, model: MLDFTLitModule, negative_integrated_density_penalty_weight: float = 0.0, transform_device: str | device = 'cpu')[source]
Class to generate samples from the model configuration.
- model_config
The model configuration.
- model
The model.
- transforms
The transforms.
- basis_info
The basis information.
- negative_integrated_density_penalty_weight
The weight for the negative integrated density penalty.
- __init__(model_config: DictConfig, model: MLDFTLitModule, negative_integrated_density_penalty_weight: float = 0.0, transform_device: str | device = 'cpu') None[source]
Initialize the SampleGenerator.
- Parameters:
model_config – The model configuration.
model – The model.
negative_integrated_density_penalty_weight – The weight for the negative integrated density penalty.
transform_device – The device to apply the transforms on.
- classmethod from_run_path(run_path: str | Path, device: str | device = 'cuda', transform_device: str | device = 'cpu', negative_integrated_density_penalty_weight: float = 0.0, use_last_ckpt: bool = True) SampleGenerator[source]
Create a SampleGenerator from a run path.
- Parameters:
run_path – The run path.
device – The device to load the model on.
transform_device – The device to apply the transforms on.
negative_integrated_density_penalty_weight – The weight for the negative integrated density penalty.
use_last_ckpt – Whether to use the last checkpoint.
- Returns:
The instantiated SampleGenerator
- get_functional_factory(xc_functional: str | None = None) FunctionalFactory[source]
Get a functional factory for the model and its config.
- Parameters:
xc_functional – The XC functional to use.
- Returns:
The functional factory.
- add_density_optimization_trajectories_to_sample(sample: OFData, callback: ConvergenceCallback, energies_label: Energies, basis_info: BasisInfo, save_coeff_interval: int = 100)[source]
Add the density optimization trajectories of energies and coefficients to the sample.
- calculate_basis_size(mol: Mole, basis_info: BasisInfo) int[source]
Calculate the size of the basis set for a given molecule.
- Parameters:
mol – The molecule.
basis_info – The basis information object.
- Returns:
The number of basis functions.
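As a rough illustration of what "size of the basis set" means here, the sketch below sums a per-element count of basis functions over the atoms of a molecule. It is not the mldft implementation: `sketch_basis_size`, the plain list of element symbols, and the per-element counts are hypothetical stand-ins for the `Mole` and `BasisInfo` objects, and the numbers are made up.

```python
def sketch_basis_size(atom_symbols, basis_functions_per_element):
    """Hypothetical stand-in for calculate_basis_size.

    atom_symbols: element symbol of each atom in the molecule, e.g. ["O", "H", "H"].
    basis_functions_per_element: assumed mapping from element symbol to the
        number of basis functions centered on one atom of that element.
    """
    # Every atom contributes the basis functions assigned to its element.
    return sum(basis_functions_per_element[symbol] for symbol in atom_symbols)


# Illustrative counts only; they do not correspond to a real basis set.
example_counts = {"H": 5, "O": 14}
water_size = sketch_basis_size(["O", "H", "H"], example_counts)
```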
- configure_dataset_indices(dataset_size: int, n_molecules: int, molecule_choice: str | list, seed: int | None, start_idx: int = 0) ndarray[source]
Configure the indices of the dataset to optimize.
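The following is a minimal sketch of how such an index selection could work, based only on the signature above and the `molecule_choice` options documented for `run_ofdft` (“first_n”, “random”, “seeded_random”, or a list of indices). The function name `sketch_configure_indices` is hypothetical, and the handling of `start_idx` for the random modes and the sorting of the result are assumptions, not the library's behavior.

```python
import numpy as np


def sketch_configure_indices(dataset_size, n_molecules, molecule_choice,
                             seed=None, start_idx=0):
    """Hypothetical stand-in for configure_dataset_indices."""
    if not isinstance(molecule_choice, str):
        # An explicit list of indices is used as given, after a bounds check.
        indices = np.asarray(molecule_choice, dtype=int)
        if indices.size and (indices.min() < 0 or indices.max() >= dataset_size):
            raise ValueError("molecule index out of range for the dataset")
        return indices

    if molecule_choice == "first_n":
        # Assumption: start_idx shifts the contiguous block of molecules.
        n = max(0, min(n_molecules, dataset_size - start_idx))
        return np.arange(start_idx, start_idx + n)

    if molecule_choice == "random":
        rng = np.random.default_rng()  # unseeded, differs between runs
    elif molecule_choice == "seeded_random":
        rng = np.random.default_rng(seed)  # reproducible for a fixed seed
    else:
        raise ValueError(f"unknown molecule_choice: {molecule_choice!r}")

    # Draw distinct indices; sorting is a choice made for this sketch only.
    n = min(n_molecules, dataset_size)
    return np.sort(rng.choice(dataset_size, size=n, replace=False))
```

With a fixed seed, “seeded_random” returns the same indices on every call, which makes runs over a random subset comparable.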
- main(cfg: DictConfig)[source]
Main function to use hydra main.
Keeping this as a thin wrapper means that run_ofdft can also be called directly from code.
- parse_run_path(run_path: Path | str) Path[source]
Parse the run path, making it absolute if necessary.
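A sketch of this kind of path normalization with `pathlib` is shown below. The source does not say which directory a relative run path is resolved against, so the `base_dir` argument and the name `sketch_parse_run_path` are hypothetical.

```python
from pathlib import Path


def sketch_parse_run_path(run_path, base_dir):
    """Hypothetical stand-in for parse_run_path.

    base_dir is an assumed root for relative run paths; the real function
    may resolve them against a different, project-specific directory.
    """
    path = Path(run_path).expanduser()
    if not path.is_absolute():
        # Relative paths are interpreted relative to the assumed base directory.
        path = Path(base_dir) / path
    return path
```

Absolute paths pass through unchanged, so callers can supply either form.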
- plotting_worker(plot_queue: Queue, save_dir: Path, basis_info: BasisInfo, enable_grid_plots: bool, save_individual_plots: bool = False, num_threads: int = 8, fail_fast: bool = False)[source]
Worker process for handling plotting.
- run_ofdft(run_path: Path | str, optimizer: Optimizer, guess_path: Path | str | None = None, use_last_ckpt: bool = True, device: device | str = 'cpu', transform_device: device | str = 'cpu', num_processes_per_device: int = 1, num_devices: int = 1, num_workers: int = 1, num_threads_per_process: int = 8, model_dtype: dtype = torch.float64, xc_functional: str = 'PBE', initialization: str = 'minao', n_molecules: int = 1, molecule_choice: str | list[int] = 'first_n', seed: int = 0, log_file: str | None = None, save_individual_plots: bool = False, save_denop_samples: bool = False, plot_every_n: int = 10, swarm_plot_subsample: float = 1.0, ofdft_kwargs: dict = None, split: str = 'val', split_file_path: str | None = None, plot_l1_norm: bool = True, enable_grid_operations: bool = True, l1_grid_level: int = 3, l1_grid_prune: str = 'nwchem_prune', negative_integrated_density_penalty_weight: float = 0.0, convergence_criterion: str = 'last_iter', fail_fast: bool = False, save_coeff_interval: int = 100)[source]
Script to run ofdft using a model checkpoint on multiple molecules.
Note: Right now this only supports density optimizations using a checkpoint.
- Parameters:
run_path (Path | str) – The path to the run directory.
guess_path (Path | str, optional) – The path to the guess directory. Defaults to None, in which case the same model is used for the proj_minao guess.
use_last_ckpt (bool, optional) – Whether to use the last checkpoint. Defaults to True.
device (torch.device | str, optional) – The device to run on. Defaults to “cpu”.
model_dtype (torch.dtype, optional) – The dtype of the model. Defaults to torch.float64.
xc_functional (str, optional) – The XC functional to use. Defaults to “PBE”. Irrelevant if the XC functional is part of the model prediction.
initialization (str, optional) – The initialization to use. Defaults to “minao”. Other possible values include “hueckel”, “proj_minao”, “label”.
optimizer (str, optional) – The optimizer to use, e.g. “gradient_descent” or “slsqp”. Defaults to “gradient_descent”.
n_molecules (int, optional) – The number of molecules to optimize. Defaults to 1.
molecule_choice (str | list[int], optional) – The choice of molecules to optimize. Options are “first_n”, “random”, “seeded_random”, or a list of indices. Defaults to “first_n”.
log_file (str | None, optional) – The path to the log file. Defaults to None, in which case no log file is created.
save_individual_plots (bool, optional) – Whether to keep individual plots for each molecule. Defaults to False.
save_denop_samples (bool, optional) – Whether to save the density optimization trajectories of the samples. Defaults to False.
swarm_plot_subsample (float, optional) – The subsample factor for the swarm plots. Defaults to 1.0.
ofdft_kwargs (dict, optional) – Additional keyword arguments for the OFDFT class. Defaults to None.
split (str, optional) – The split to use, i.e. “val” or “train”. Defaults to “val”.
plot_l1_norm (bool, optional) – Whether to plot the L1 norm of the density error, for which the integration grid is required. Defaults to True.
l1_grid_level (int, optional) – The grid level of the integration grid for the L1 norm. Defaults to 3.
l1_grid_prune (str, optional) – The pruning method for the integration grid for the L1 norm. Defaults to “nwchem_prune”.
negative_integrated_density_penalty_weight (float, optional) – The weight of the negative integrated density penalty. Defaults to 0.0.
convergence_criterion (str, optional) – The convergence criterion for the density optimization. Defaults to “last_iter”.
fail_fast (bool, optional) – Whether to raise an error immediately if a molecule fails. Defaults to False, in which case errors are logged and the script continues. Setting it to True is useful for debugging.
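The fail_fast behavior described above can be sketched as a loop that either re-raises the first error or logs it and moves on. This illustrates the pattern only and is not the code in run_ofdft; `sketch_optimize_all` and the `optimize_one` callable are hypothetical names.

```python
import logging

logger = logging.getLogger(__name__)


def sketch_optimize_all(molecule_ids, optimize_one, fail_fast=False):
    """Hypothetical loop illustrating the documented fail_fast semantics.

    optimize_one is an assumed callable that runs the density optimization
    for one molecule and raises an exception on failure.
    """
    results, failed = {}, []
    for mol_id in molecule_ids:
        try:
            results[mol_id] = optimize_one(mol_id)
        except Exception:
            if fail_fast:
                # Debugging mode: surface the first error immediately.
                raise
            # Default mode: record the failure and continue with the rest.
            logger.exception("Density optimization failed for molecule %s", mol_id)
            failed.append(mol_id)
    return results, failed
```

Returning the list of failed molecules alongside the results keeps a long batch run usable even when a few molecules fail.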
- run_singlepoint_ofdft(mol: Mole, sample_generator: SampleGenerator, func_factory: FunctionalFactory, optimizer: Optimizer = <mldft.ofdft.optimizer.TorchOptimizer object>, initial_guess_str: str = 'minao', callback: ConvergenceCallback | None = None, ofdft_kwargs=None, return_sample: bool = False) tuple[Energies, Tensor, bool] | tuple[Energies, Tensor, bool, OFData][source]
Run a single-point OFDFT calculation for the given molecule.
- Parameters:
mol – The molecule.
sample_generator – The sample generator.
func_factory – The functional factory.
optimizer – The optimizer.
initial_guess_str – The initial guess.
callback – The callback.
ofdft_kwargs – Additional keyword arguments for density optimization.
return_sample – Whether to return the sample as well.
- Returns:
The final energies, coefficients and whether the calculation converged. If return_sample is True, also returns the OFData sample.
- run_to_checkpoint_path(run_path: Path | str, use_last_ckpt: bool = True) Path[source]
Get the path to the checkpoint of a run.
- set_torch_defaults_worker(id: int, num_threads: int, device: device | str)[source]
Set the torch defaults for a dataloader worker.
- worker(process_idx: int, dataset: OFDataset, basis_info: BasisInfo, checkpoint_path: Path, guess_path: Path | None, optimizer: Optimizer, device: str | device, transform_device: str | device, num_workers: int, num_threads: int, model_dtype: str | dtype, xc_functional: str, negative_integrated_density_penalty_weight: float, use_last_ckpt: bool, initialization: str, dataset_statistics_path: Path | str, convergence_criterion: str, plot_queue, plot_every_n: int, save_dir: Path, save_denop_samples: bool, fail_fast: bool, save_coeff_interval: int)[source]
Worker process for density optimization.