transform_dataset

Applying a basis transformation to the dataset.

To convert a dataset to a new basis, this script can be run with:

python mldft/datagen/transform_dataset.py data=qm9 data/transforms=local_frames

You can adapt data and data/transforms to your needs. The use_cached_data flag has to be set to `false`.

convert_zarr_file(args)

Convert one label file with the new transforms.

is_valid_label(path: Path, spatial_keys_to_check: list[str]) → bool

Check if a label file is valid. Does not guarantee that the file is not broken, but checks that most of the necessary data is present.
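As a rough illustration of this kind of presence check, the sketch below tests a mapping-like object for the expected entries. The function name looks_like_valid_label, the assumption that the spatial keys sit directly inside an of_labels group, and the use of a plain dict in place of an opened label file are all hypothetical; the actual file layout and the checks performed by is_valid_label may differ.

```python
from collections.abc import Mapping


def looks_like_valid_label(root: Mapping, spatial_keys_to_check: list[str]) -> bool:
    """Return True if the expected entries are present (hypothetical layout).

    Like the documented function, this only checks for presence; it does not
    prove that the stored arrays themselves are intact.
    """
    # Assumption: labels live in a group called "of_labels".
    if "of_labels" not in root:
        return False
    of_labels = root["of_labels"]
    if len(of_labels) == 0:
        return False
    # Assumption: every spatial key is stored directly inside that group.
    return all(key in of_labels for key in spatial_keys_to_check)
```

A check of this shape is cheap because it never reads array contents, which is also why it cannot rule out a corrupted file.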

main(cfg: DictConfig) → float | None

Hydra entry point for the script.

remove_broken_files(path_list: list[Path], old_label_dir: Path, new_label_dir: Path)

Scan the list of zarr files and remove those that are broken, e.g. files that cannot be opened or that have no of_labels.
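The scan can be pictured as partitioning the path list with a validity predicate. The sketch below is an assumption-laden illustration, not the module's implementation: split_valid_and_broken and the is_valid callable are hypothetical names, and nothing is deleted from disk here, since this page does not say what removal involves for the two label directories.

```python
from pathlib import Path
from typing import Callable, Iterable


def split_valid_and_broken(
    paths: Iterable[Path], is_valid: Callable[[Path], bool]
) -> tuple[list[Path], list[Path]]:
    """Partition paths into (valid, broken) using a caller-supplied check."""
    valid: list[Path] = []
    broken: list[Path] = []
    for path in paths:
        try:
            ok = is_valid(path)
        except Exception:
            # A file that cannot even be opened is treated as broken.
            ok = False
        (valid if ok else broken).append(path)
    return valid, broken
```

Returning both lists keeps the decision about what to do with broken files (log, skip, delete) separate from the detection step.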

transform_dataset(cfg: DictConfig) → None

Basis-transform the dataset.

Applies the given transform to the dataset and saves the new labels in a new folder.
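One way to picture saving the new labels in a new folder is to pair each existing label path with a destination that mirrors its relative location. The sketch below assumes that mirroring scheme and a worker that takes one packed argument (as the convert_zarr_file(args) signature suggests); plan_conversions, convert_all, and the convert_one callable are hypothetical names, and how the script actually lays out the new folder is not stated here.

```python
from pathlib import Path
from typing import Callable, Iterable


def plan_conversions(
    paths: Iterable[Path], old_label_dir: Path, new_label_dir: Path
) -> list[tuple[Path, Path]]:
    """Pair each old label path with a mirrored destination path."""
    return [
        (path, new_label_dir / path.relative_to(old_label_dir))
        for path in paths
    ]


def convert_all(
    jobs: Iterable[tuple[Path, Path]],
    convert_one: Callable[[tuple[Path, Path]], None],
) -> None:
    """Run the worker on every job; each job is passed as a single tuple."""
    for job in jobs:
        convert_one(job)
```

Passing each job as a single tuple keeps the worker compatible with map-style helpers, which only hand one argument to the function they call.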