generate_labels_dataset

Compute labels for molecules in the dataset and save them as zarr.zip files.

This module can be run with the mldft_labelgen command.

get_id_and_sample_id_from_chk_file(file: Path) → tuple[int, int | None]

Get the molecule id and sample id from the .chk file.

Parameters:

file – The .chk file.

Returns:

Tuple of molecule id and sample id. The sample id may be None.
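
The naming scheme of the .chk files is not documented here, so the following is only a minimal sketch of how such a parser might look. It assumes a hypothetical file name of the form `<molecule id>.chk` or `<molecule id>_<sample id>.chk`. The helper name `parse_chk_name` and the pattern are illustrative assumptions, not the actual mldft implementation.

```python
import re
from pathlib import Path
from typing import Optional

# Hypothetical naming scheme (an assumption, not taken from mldft):
#   "<molecule id>.chk"  or  "<molecule id>_<sample id>.chk"
_CHK_STEM = re.compile(r"(?P<mol>\d+)(?:_(?P<sample>\d+))?")


def parse_chk_name(file: Path) -> tuple[int, Optional[int]]:
    """Return (molecule id, sample id); sample id is None if absent."""
    match = _CHK_STEM.fullmatch(file.stem)
    if match is None:
        raise ValueError(f"Unexpected .chk file name: {file.name}")
    sample = match.group("sample")
    return int(match.group("mol")), (int(sample) if sample is not None else None)
```

Under this assumed scheme, a name without a sample suffix yields `None` as the second tuple element, which matches the `int | None` in the documented return type.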

main(cfg: DictConfig)

Hydra entry point for label generation.

run_label_generation(cfg: DictConfig)

Run the label generation for the dataset.

run_labelgen_task(dataset: DataGenDataset, chk_file: Path, calculation_fct: callable, orbital_basis: str, kohn_sham_basis: str, of_basis_set: str)

Run the label generation task for a single molecule.

save_dataset_info(cfg: DictConfig, label_dir: Path)

Save relevant config information for the dataset in a yaml file.