DatasetGeneration/

Generate a pool of protein shapes and use it to create the IP (docking pose prediction) and FI (fact-of-interaction) datasets.

Protein:

Generate either a convex or concave hull based on the specified alpha and num_points. Hulls are created by radially distributing random points, then optimizing a perimeter around them based on the specified alpha. Shapes can be made more convex by setting a relatively lower alpha and a higher num_points. Hull coordinates are then converted to grid-based shapes, where pixels within the hull perimeter are assigned a value of 1 and all other pixels a value of 0.
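The last step, turning a hull perimeter into a binary grid, can be sketched in plain Python. This is a minimal illustration and not the repository's implementation: the function names, the jittered-sector sampling, the radius range, and the 32x32 grid are all assumptions, and the alpha-based perimeter optimization is skipped entirely (the points are simply joined in angular order).

```python
import math
import random


def radial_points(num_points, r_min=0.5, r_max=1.0, seed=0):
    """Place one random point in each of num_points equal angular sectors.

    Hypothetical stand-in for the radial point distribution; the real
    sampling scheme and radius range may differ.
    """
    rng = random.Random(seed)
    sector = 2.0 * math.pi / num_points
    points = []
    for i in range(num_points):
        angle = (i + rng.random()) * sector  # jittered angle inside sector i
        radius = rng.uniform(r_min, r_max)
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points


def point_in_polygon(x, y, polygon):
    """Even-odd ray casting: count edge crossings of a ray going in +x."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygon, grid_size=32, extent=1.0):
    """Return a grid_size x grid_size list of 0/1 values.

    A pixel is 1 when its center lies inside the polygon, 0 otherwise.
    The grid covers the square [-extent, extent] in both axes.
    """
    step = 2.0 * extent / grid_size
    grid = []
    for row in range(grid_size):
        y = -extent + (row + 0.5) * step
        grid.append([
            1 if point_in_polygon(-extent + (col + 0.5) * step, y, polygon) else 0
            for col in range(grid_size)
        ])
    return grid


# Perimeter here is just the sampled points in angular order (assumption).
shape = rasterize(radial_points(num_points=12))
```

Testing pixel centers keeps the example short; a real implementation might use a polygon-fill routine from an imaging library instead.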

ProteinPool:

The protein pool is generated by creating protein shapes using input parameter distributions for alpha (the concavity of the shapes) and num_points (the number of points used in shape hull generation). The pool is saved as a .pkl file and loaded in DatasetGenerator to compute interactions.
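As a rough sketch of this flow, the example below samples (alpha, num_points) pairs from weighted distributions and round-trips them through a .pkl file with the standard pickle module. The distribution format (lists of value and weight pairs), the numeric values, the function names, and the file name are hypothetical, and generating the actual shape from each pair is omitted.

```python
import os
import pickle
import random
import tempfile


def sample_parameters(pool_size, alpha_dist, num_points_dist, seed=0):
    """Draw one (alpha, num_points) pair per protein in the pool.

    Each distribution is a list of (value, weight) pairs; this format
    is an assumption made for the example.
    """
    rng = random.Random(seed)
    alphas, alpha_weights = zip(*alpha_dist)
    counts, count_weights = zip(*num_points_dist)
    return [
        (rng.choices(alphas, weights=alpha_weights)[0],
         rng.choices(counts, weights=count_weights)[0])
        for _ in range(pool_size)
    ]


def save_pool(pool, path):
    """Serialize the pool to a .pkl file."""
    with open(path, "wb") as f:
        pickle.dump(pool, f)


def load_pool(path):
    """Load a pool previously written by save_pool."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Illustrative values only; not the distributions used by the project.
alpha_dist = [(0.80, 1), (0.85, 2), (0.90, 1)]
num_points_dist = [(60, 1), (80, 2), (100, 1)]
params = sample_parameters(5, alpha_dist, num_points_dist)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "protein_pool.pkl")
    save_pool(params, path)
    restored = load_pool(path)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.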

DatasetGenerator:

Load or create shapes from the protein pool and compute interactions for the IP (docking pose prediction) and FI (fact-of-interaction) datasets. Interactions that score below the specified decision thresholds are added to their respective datasets. Figures and statistics for dataset generation can also be generated and saved to file.
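The threshold step can be pictured with a small filter. This is only a sketch under stated assumptions: the tuple layout (receptor, ligand, score), the function name, the scores, and both threshold values are made up for illustration, and the scoring itself is not shown.

```python
def select_below_threshold(scored_interactions, threshold):
    """Keep interactions whose score is strictly below the threshold."""
    return [item for item in scored_interactions if item[2] < threshold]


# Hypothetical (receptor, ligand, score) records.
scored = [
    ("p0", "p1", -120.0),
    ("p0", "p2", -40.0),
    ("p1", "p2", -95.0),
]

# Hypothetical decision thresholds, one per dataset.
ip_threshold = -100.0
fi_threshold = -90.0

ip_dataset = select_below_threshold(scored, ip_threshold)
fi_dataset = select_below_threshold(scored, fi_threshold)
```

With these example values, only the p0/p1 pair passes the IP threshold, while both p0/p1 and p1/p2 pass the FI threshold.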