summary_density_optimization

_add_energy_boxplot(ax: Axes, energy_errors_dict: dict[str, Tensor], energy_names: tuple[str] = None, title='Energy Contribution Errors', plot_electronic: bool = False)[source]

Add a boxplot of the energy errors of different contributions for a set of molecules.

_add_energy_error_over_n_atoms(ax: Axes, energy_errors_dict: dict, num_atoms: Tensor, energy_name='total', title='Absolute Stopping Energy Error vs Number of Atoms')[source]

Add a scatter plot of the total energy error vs the number of atoms in the molecule.

_add_energy_histogram(ax: Axes, energy_errors_dict: dict, energy_name='total', title='Stopping Energy Error Distribution', cut_off=1000)[source]

Add a histogram of the total energy error for a set of molecules.

_add_energy_mae(ax: Axes, energy_maes: Tensor, energy_stopping_maes: Tensor, energy_name: str = 'total', color: str = 'blue')[source]

Add a plot of the mean energy absolute errors.

_add_mean_density_differences_l2(ax: Axes, density_differences_l2: Tensor, stopping_density_differences_l2: Tensor, color: str = 'blue')[source]

Add a plot of the mean density differences.

_add_mean_gradient_norm(ax: Axes, gradient_norms: Tensor, stopping_indices: Tensor, color='blue')[source]

Add a plot of the mean gradient norms.

_add_mean_relative_density_differences_l1(ax: Axes, num_electrons: Tensor, density_differences_l1: Tensor, stopping_density_differences_l1: Tensor, color: str = 'blue')[source]

Add a plot of the mean density difference l1 norm divided by the number of electrons.

_add_stopping_index_histogram(ax, stopping_indices)[source]

Plot a histogram of the stopping indices.

_compare_initial_to_stopping_gradient_norms(ax: Axes, gradient_norms: tensor, stopping_indices: tensor, ground_state_energy_dict: dict[str, Tensor | float], energy_name: str = 'total')[source]

Scatter plot the initial gradient norm against the stopping gradient norm.

_compare_initial_to_stopping_l2_norms(ax: Axes, initial_l2_norms: Tensor | list[Tensor], stopping_l2_norms: Tensor | list[Tensor], ground_state_energy_dict: dict[str, Tensor | float], energy_name: str = 'total')[source]

Plot the initial density difference against the stopping density difference measured by the L2 norm.

_multi_shape_coeff_mae_scatter(ax: Axes, cumulative_coeff_errors: Tensor, cumulative_counts: Tensor, basis_function_indices: Tensor | ndarray, basis_info: BasisInfo)[source]

Plots the mean absolute coefficient error over basis dimensions for a possibly varying number of basis dimensions.

Parameters:
  • cumulative_coeff_errors (torch.Tensor) – Cumulative sum of coefficient absolute errors retrieved by cumulate_coeff_error().

  • cumulative_counts (torch.Tensor) – Counts of basis dimension appearances retrieved by cumulate_coeff_error().

  • basis_function_indices (torch.Tensor | np.ndarray) – A tensor of all basis dimensions present in the basis_info. This is not the same as a single sample's sample.basis_function_ind.

  • basis_info (BasisInfo) – The dataset’s basis_info.

_single_shape_coeff_mae_scatter(ax: Axes, coeff_errors: Tensor | list[Tensor], basis_info: BasisInfo, basis_function_ind: Tensor)[source]

Plots the mean absolute density coefficient error over basis dimensions.

This simple version can be used if all molecules have the same coefficient shape.

cumulate_coeff_error(sample: OFData, pred_ground_state_coeffs: Tensor, cumulative_error: Tensor, cumulative_counts: Tensor) → tuple[Tensor, Tensor][source]

Accumulates the absolute error of the coefficients by adding the (stopping) error of each basis function onto the cumulative error.

This is necessary if density coefficient dimensions vary across molecules. The cumulative counts are returned as well in order to build the average afterwards.
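The accumulate-then-average idea can be sketched in plain Python. This is only an illustration under stated assumptions: `accumulate_abs_error` and its list-based arguments are hypothetical stand-ins, not the signature of `cumulate_coeff_error()`, which works on `OFData` samples and tensors.

```python
def accumulate_abs_error(pred, target, basis_indices, cumulative_error, cumulative_counts):
    """Add one molecule's per-coefficient absolute error onto running totals.

    basis_indices[i] is the global basis-function slot of coefficient i
    (a hypothetical stand-in for a sample's basis-function index tensor).
    """
    for p, t, idx in zip(pred, target, basis_indices):
        cumulative_error[idx] += abs(p - t)
        cumulative_counts[idx] += 1
    return cumulative_error, cumulative_counts


# Two molecules with different numbers of coefficients share 4 global slots.
errors, counts = [0.0] * 4, [0] * 4
errors, counts = accumulate_abs_error([1.0, 2.0], [1.5, 2.0], [0, 2], errors, counts)
errors, counts = accumulate_abs_error([0.0, 1.0, 3.0], [0.5, 1.0, 2.0], [0, 1, 2], errors, counts)

# Mean absolute error per slot; slots that never appeared are left as None.
mae = [e / c if c else None for e, c in zip(errors, counts)]
print(mae)
```

Keeping the counts next to the summed errors is what makes the final average well defined when a basis function occurs in only some of the molecules.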

custom_round(value: float)[source]

Determine value’s order of magnitude, find the next highest integer in that magnitude and add one integer.
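One possible reading of this description is sketched below. It is an assumption, not the function's confirmed behavior: the name `custom_round_sketch` is hypothetical, and the sketch only handles positive values.

```python
import math


def custom_round_sketch(value):
    """Round up to the next integer multiple of the order of magnitude, plus one step.

    Assumed interpretation only; restricted to positive inputs.
    """
    if value <= 0:
        raise ValueError("this sketch only handles positive values")
    # Order of magnitude, e.g. 100 for 234 or 0.01 for 0.034.
    magnitude = 10 ** math.floor(math.log10(value))
    # Next highest integer at that magnitude, then one more step.
    return (math.ceil(value / magnitude) + 1) * magnitude


print(custom_round_sketch(234))  # 234 -> 3 * 100 -> 4 * 100
```

Under this reading the result always lies strictly above the input, which would suit uses such as picking an axis limit with some headroom.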

density_differences_swarm_line_plot(ax: Axes, stopping_indices: Tensor, density_differences_l2: Tensor, n_molecules: int = None, subsample: float = 1.0)[source]

Plot the L2 norm of the density differences for a set of molecules as a swarm plot with a line for each molecule.

density_optimization_summary_pdf_plot(out_pdf_path: Path | str, run_data_dict: dict, matplotlib_backend: str = 'pdf', subsample: float = 1.0)[source]

Create a summary page for the density optimization for a set of molecules.

Parameters:
  • out_pdf_path – Path to the output PDF file.

  • run_data_dict – Dictionary containing the n_molecules, basis_info, basis_function_indices, stopping indices, energy trajectories, gradient norms, coefficient trajectories and density differences for the set of molecules.

  • matplotlib_backend – Matplotlib backend to use for plotting. By default the pdf backend is used to create vectorized PDF plots, while for instance ‘agg’ could be used to create rasterized PNG plots.

  • subsample – Fraction of molecules to plot in the swarm plots (individual molecule trajectories). By default (1.0), all molecules are plotted. Only takes effect if n_molecules is larger than the number of molecules to plot and n_molecules > 2.

density_optimization_swarm_plot(energy_trajectories_dict: dict[str, Tensor], energy_ground_state_dict: dict[str, Tensor | float], gradient_norms: Tensor | list[Tensor], density_differences_l2: Tensor | list[Tensor], stopping_indices: Tensor, n_molecules: int = None, energy_names: tuple[str] | str = None, subsample: float = 1.0, **_)[source]

Summarize the density optimization process for a set of molecules by plotting one line per molecule for the energy error, the gradient norm and the L2 norm of the density error.

energy_error_swarm_line_plot(ax: Axes, stopping_indices: Tensor, energy_name: str, energy_errors_dict: dict[str, Tensor] = None, energy_errors: Tensor = None, n_molecules: int = None, subsample: float = 1.0)[source]

Plot the energy error for a set of molecules as a swarm plot with a line for each molecule.

get_density_difference_l1_norm(sample: OFData, basis_info: BasisInfo, l1_grid_level: int = 3, l1_grid_prune: str = 'nwchem_prune') → Tensor[source]

Get the L1 norm of the difference between the predicted and target densities for a single sample.

get_density_difference_l2_norm(sample: OFData)[source]

Get the L2 norm of the difference between the predicted and target densities for a single sample.

get_energy_error(sample: OFData, energy_name: str)[source]

Get the (signed) energy error of ‘energy_name’ for a single sample in mHa.

get_energy_errors_dict(energy_trajectories_dict: dict[str, list[Tensor]], energy_ground_state_dict: dict[str, Tensor], energy_names: tuple[str] = None) → tuple[dict[str, list], dict[str, list]][source]

Calculate the (signed) energy errors for a set of molecules from the energy trajectories and their ground state values.
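A minimal sketch of the signed-error computation, using nested lists instead of tensors. The helper name `signed_energy_errors`, the sign convention (trajectory value minus ground-state value) and the single returned dictionary are assumptions for illustration; the documented function returns a tuple of two dictionaries whose exact contents are not described here.

```python
def signed_energy_errors(energy_trajectories, ground_state_energies):
    """Return trajectory-minus-ground-state errors per energy name and molecule."""
    errors = {}
    for name, trajectories in energy_trajectories.items():
        errors[name] = [
            [step - e_gs for step in trajectory]
            for trajectory, e_gs in zip(trajectories, ground_state_energies[name])
        ]
    return errors


# Two molecules with trajectories of different length.
trajectories = {"total": [[-1.0, -1.5, -2.0], [-3.0, -4.5]]}
ground_states = {"total": [-2.0, -4.0]}
errors = signed_energy_errors(trajectories, ground_states)
print(errors["total"])
```

Keeping the sign shows whether an optimization step lies above or below the ground-state energy, which an absolute error would hide.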

get_overlapping_mean(data: list[Tensor | ndarray])[source]

Compute the mean of a list of 1D tensors/arrays with different length for overlapping regions.
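One plausible reading of "overlapping regions" is that each position is averaged over the sequences that are long enough to reach it. The sketch below follows that assumption with plain lists; `overlapping_mean` is a hypothetical name, not the library function.

```python
def overlapping_mean(series):
    """Mean at each position over the sequences long enough to reach it."""
    longest = max(len(s) for s in series)
    means = []
    for i in range(longest):
        # Only sequences that extend to position i contribute.
        values = [s[i] for s in series if len(s) > i]
        means.append(sum(values) / len(values))
    return means


mean = overlapping_mean([[1.0, 2.0, 3.0], [3.0, 4.0], [5.0]])
print(mean)
```

With this convention the tail of the mean curve is based on fewer and fewer sequences, so late positions are noisier than early ones.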

get_overlapping_quantiles(data: list[Tensor | ndarray], quantiles: list[float])[source]

Compute the quantiles of a list of 1D tensors/arrays with different length for overlapping regions.
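The same per-position idea can be sketched for quantiles. The assumptions here are that each position uses only the sequences that reach it and that quantiles are linearly interpolated; both helper names are hypothetical.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an ascending list, with 0 <= q <= 1."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def overlapping_quantiles(series, quantiles):
    """One row per requested quantile, one column per position."""
    longest = max(len(s) for s in series)
    result = []
    for q in quantiles:
        row = []
        for i in range(longest):
            # Only sequences that extend to position i contribute.
            values = sorted(s[i] for s in series if len(s) > i)
            row.append(quantile(values, q))
        result.append(row)
    return result


bands = overlapping_quantiles([[0.0, 2.0, 6.0], [4.0, 4.0], [8.0]], [0.0, 0.5, 1.0])
print(bands)
```

Rows returned in this layout can be passed pairwise to a band-filling routine, for example to shade the region between two neighbouring quantiles.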

get_runwise_density_optimization_data(sample_dir: Path | str, n_molecules: int = None, energy_names: str = None, plot_l1_norm: bool = True, l1_grid_level: int = 3, l1_grid_prune: str = 'nwchem_prune') → dict[source]

Get the data for a set of molecules for the density optimization process from a directory of saved samples.

Parameters:
  • sample_dir – Path to the directory containing the samples. These samples are expected to result from the mldft.ofdft.run_ofdft.py script.

  • n_molecules – Number of molecules to plot. If None, all molecules in the directory are plotted.

  • energy_names – Names of the energies to be plotted. If None, all available energies are plotted.

Returns:
  • run_data_dict (dict) – Dictionary containing the stopping indices, energy trajectories, gradient norms, coefficient trajectories and density differences for the set of molecules.

gradient_norm_swarm_line_plot(ax: Axes, stopping_indices: Tensor, gradient_norms: Tensor, n_molecules: int = None, subsample: float = 1.0)[source]

Plot the gradient norms for a set of molecules as a swarm plot with a line for each molecule.

initialize_energy_dicts(sample_dir: Path | str, n_molecules: int = None, energy_names: tuple[str] = None)[source]

Initialize the energy trajectories dictionary for a set of molecules.

We load the first sample in the given directory. As of yet, this seems necessary in order to check for available energy names (ground state and trajectories) and not do it within the loop over the sample directory itself. If the energy names are not provided, all available energies are extracted from the first sample.

plot_density_optimization_trajectory_means(energy_trajectories_dict: dict[str, Tensor], energy_ground_state_dict: dict[str, Tensor | float], coeffs_trajectories: Tensor | list[Tensor], coeffs_ground_state: Tensor | list[Tensor], gradient_norms: Tensor | list[Tensor], density_differences_l2: Tensor | list[Tensor], stopping_indices: Tensor, num_electrons: Tensor, n_molecules: int, energy_names: tuple[str] | str = None, **kwargs)[source]

Summarize the density optimization process for a set of molecules by plotting the mean energy difference, gradient norm and density difference.

plot_energy_summary_scatter(energy_trajectories_dict: dict[str, Tensor], energy_ground_state_dict: dict[str, Tensor | float], stopping_indices: Tensor, energy_names: tuple[str] | str = None, n_molecules: int = None, **_)[source]

Scatter plots of the stopping energy errors against the initial energy errors for a set of molecules.

Parameters:
  • energy_trajectories_dict – Dictionary containing the energy trajectories for each molecule. Keys are energy names and values are tensors of shape (n_molecules, n_cycles).

  • energy_ground_state_dict – Dictionary containing the ground state energies for each molecule. Keys are energy names and values are tensors of shape (n_molecules,).

  • stopping_indices – Tensor containing the stopping indices for each molecule determined by the convergence criteria of the density optimization run.

  • energy_names – Tuple of energy names to be plotted.

  • n_molecules – Number of molecules to be plotted.

plot_mean_and_fill_between(data: Tensor, ax: Axes, color: str = 'blue')[source]

Plot the mean of the data and fill the area between mean - std and mean + std.

plot_ofdft_energy_distribution(energy_trajectories_dict: dict[str, Tensor], energy_ground_state_dict: dict[str, Tensor | float], stopping_indices: Tensor, energy_names: tuple[str] | str = None, num_atoms: Tensor = None, **_)[source]

Create an energy distribution summary page for the density optimization process for a set of molecules.

plot_ofdft_run_summary(energy_trajectories_dict: dict[str, list[Tensor]], energy_ground_state_dict: dict[str, Tensor | float], coeffs_trajectories: Tensor | list[Tensor], coeffs_ground_state: Tensor | list[Tensor], coeffs_pred_ground_state: Tensor | list[Tensor], gradient_norms: Tensor | list[Tensor], density_differences_l2: Tensor | list[Tensor], stopping_indices: Tensor, n_molecules: int, energy_names: tuple[str] | str = None, basis_info: BasisInfo = None, basis_function_indices: Tensor = None, sample_basis_function_ind: Tensor = None, cumulative_coeff_error: Tensor = None, cumulative_counts: Tensor = None, **_)[source]

Create a summary page for the density optimization process for a set of molecules.

plot_quantiles_data(data: Tensor | ndarray, ax: Axes, color: str = 'blue', quantiles=[0.0, 0.1, 0.9, 1.0], quantile_colors=['lightgreen', 'blue', 'lightcoral'])[source]

Plot ensemble data on a given Axes object.

Parameters:
  • data – 2D array of shape (num_datasets, num_points).

  • ax – Matplotlib Axes object to plot on.

  • color – Color for the mean line (default: ‘blue’).

  • quantiles – List of quantiles to plot (default: [0.0, 0.1, 0.9, 1.0]).

  • quantile_colors – List of colors for the quantile areas (default: [‘lightgreen’, ‘blue’, ‘lightcoral’]).

save_density_optimization_metrics(output_path: Path, run_data_dict: dict)[source]

Save the density optimization metrics for a set of molecules to a file.

Parameters:
  • output_path – Path where to save the metrics yaml file.

  • run_data_dict – Dictionary containing the n_molecules, basis_info, basis_function_indices, stopping indices, energy trajectories, gradient norms, coefficient trajectories and density differences for the set of molecules.

stopping_index_line_plot(ax: Axes, data_array: Tensor | ndarray, stopping_index: int, color='blue', **kwargs)[source]

Plot the data array as a solid line up to the stopping index and as a dashed line afterwards.
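The solid-then-dashed styling can be sketched with two `ax.plot` calls. This is an illustrative sketch, not the library's implementation: `stopping_line_sketch` is a hypothetical name, and whether the stopping index belongs to the solid or the dashed part is an assumption (here both segments share it so the line has no gap).

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def stopping_line_sketch(ax, values, stopping_index, color="blue"):
    steps = list(range(len(values)))
    # Solid segment: everything up to and including the stopping index.
    ax.plot(steps[: stopping_index + 1], values[: stopping_index + 1],
            color=color, linestyle="-")
    # Dashed segment: starts at the stopping index so the two pieces join.
    ax.plot(steps[stopping_index:], values[stopping_index:],
            color=color, linestyle="--")


fig, ax = plt.subplots()
stopping_line_sketch(ax, [5.0, 3.0, 2.0, 1.5, 1.4], stopping_index=2)
```

Drawing the continuation dashed keeps the post-convergence behaviour visible while making clear which part of the trajectory counts toward the reported stopping values.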

subsample_swarm(ax: Axes, n_molecules: int, subsample: float, data_array: Tensor, stopping_indices: Tensor, label: str = 'error') → tuple[Tensor, Tensor, int][source]

Subsample the data array and stopping indices, and plot the minimum and maximum on the given axis.