transform_dataset

Applies a basis transformation to a dataset.

To convert a dataset to a new basis, this script can be run with:

python mldft/datagen/transform_dataset.py data=qm9 data/transforms=local_frames

You can adapt data and data/transforms to your needs. The use_cached_data flag has to be set to `false`.

convert_zarr_file(args)

Convert one label file with the new transforms.

Parameters:

args – Tuple of arguments for the convert function.
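
Taking a single tuple instead of separate parameters is a common shape for a function that is mapped over a pool of workers, because the mapping call hands each task over as one item. The sketch below illustrates that pattern only. The name `convert_one`, the tuple layout, and the use of a thread pool are assumptions made for illustration and are not taken from the script.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def convert_one(args: tuple[Path, Path]) -> Path:
    """Hypothetical worker: unpack the tuple and return the output path."""
    old_path, new_dir = args  # assumed tuple layout, not the script's
    # A real worker would load the label, apply the transforms and write
    # the result; this sketch only computes where the result would go.
    return new_dir / old_path.name


tasks = [
    (Path("labels/mol_0.zarr"), Path("labels_new")),
    (Path("labels/mol_1.zarr"), Path("labels_new")),
]

# map() passes each tuple to the worker as its single argument and
# returns the results in the order of the inputs.
with ThreadPoolExecutor(max_workers=2) as pool:
    outputs = list(pool.map(convert_one, tasks))
```

Packing the arguments keeps the worker compatible with any map-style interface that accepts a one-argument callable.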

is_valid_label(path: Path, spatial_keys_to_check: list[str]) -> bool

Check if a label file is valid. Does not guarantee that the file is not broken, but checks that most of the necessary data is present.

Parameters:

path (Path) – Path to the label file.

Returns:

True if the label file is valid, False otherwise.

Return type:

bool
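
A presence check of this kind could look roughly like the sketch below. It assumes a label is stored as a directory with one non-empty sub-directory per required key, which is a simplification chosen for illustration. The function name `looks_valid` and the key names in the usage note are hypothetical. Like the function documented above, the sketch does not read any array data, so it cannot guarantee that a file is intact.

```python
from pathlib import Path


def looks_valid(path: Path, required_keys: list[str]) -> bool:
    """Hypothetical check: every required key exists and is non-empty."""
    if not path.is_dir():
        return False
    for key in required_keys:
        entry = path / key
        # A missing or empty sub-directory is treated as missing data.
        if not entry.is_dir() or not any(entry.iterdir()):
            return False
    return True
```

For example, `looks_valid(Path("labels/mol_0.zarr"), ["of_labels"])` would return `False` if the directory or its `of_labels` entry were absent.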

main(cfg: DictConfig) -> float | None

Hydra entry point for the script.

remove_broken_files(path_list: list[Path], old_label_dir: Path, new_label_dir: Path)

Scan the list of zarr files and remove those that are broken, e.g. those that cannot be opened or have no of_labels.

Parameters:

path_list (list) – List of paths to zarr files.
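
The scan can be pictured as splitting the list with a validity predicate, where a file that raises on opening counts as broken. The following is a minimal sketch under that assumption. `split_broken` and `is_ok` are hypothetical names, and the sketch only partitions the list; it does not delete anything or use the two label directories.

```python
from pathlib import Path
from typing import Callable


def split_broken(
    path_list: list[Path], is_ok: Callable[[Path], bool]
) -> tuple[list[Path], list[Path]]:
    """Hypothetical helper: return (kept, broken) lists of paths."""
    kept: list[Path] = []
    broken: list[Path] = []
    for path in path_list:
        try:
            ok = is_ok(path)
        except Exception:
            ok = False  # a file that cannot be opened is treated as broken
        (kept if ok else broken).append(path)
    return kept, broken
```

Returning both lists makes it possible to log or inspect the broken files before anything is removed.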

transform_dataset(cfg: DictConfig) -> None

Script to apply a basis transformation to the dataset.

Applies the given transform to the dataset and saves the new labels in a new folder.

Parameters:

cfg (DictConfig) – Config, see configs/ml/train.yaml.
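
Saving the new labels in a new folder implies a mapping from each old label path to a path under the new folder. One way to do this, shown here as a sketch and not as the script's actual logic, is to keep the path relative to the old folder. The function name `new_label_path` and the directory names in the usage note are hypothetical.

```python
from pathlib import Path


def new_label_path(old_path: Path, old_label_dir: Path, new_label_dir: Path) -> Path:
    """Hypothetical helper: mirror old_path under new_label_dir."""
    # relative_to() raises ValueError if old_path is not inside old_label_dir.
    return new_label_dir / old_path.relative_to(old_label_dir)
```

With this mapping, `new_label_path(Path("data/labels/a/mol.zarr"), Path("data/labels"), Path("data/labels_local_frames"))` gives `data/labels_local_frames/a/mol.zarr`, so the original labels are left untouched.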