check_shape_distributions.py
- class Dock2D.Tests.check_shape_distributions.ShapeDistributions(protein_pool, dataset_name, show=False)
- __init__(protein_pool, dataset_name, show=False)
Initialize checks for generated protein pool.
- Parameters
protein_pool – filename of the protein pool (.pkl)
dataset_name – data set name
show – show plots (does not affect saving)
- check_missing_examples(combination_list, found_list, protein_shapes, params_list)
Check for unrepresented combinations of parameters and regenerate them purely for plotting purposes. For example, given a small protein pool and a broad distribution of parameters, the tail combinations of parameters might never be encountered and saved to the protein pool.
Note
This is done only to give an idea of what the desired shape distribution might look like. To actually represent the missing combinations, the user must either increase the protein pool size or increase the relative probability of each missing parameter. Increasing the protein pool size is the simplest solution.
- Parameters
combination_list – parameter combinations expected to be encountered
found_list – parameter combinations encountered
protein_shapes – protein shapes extracted from protein pool file
params_list – list of parameters extracted from protein pool file
- Returns
indices
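The check described above can be sketched as a comparison between the full parameter grid and the combinations actually found. This is a minimal illustration, not the method's implementation: the function name `find_missing_combinations` and the sample values are hypothetical, combinations are assumed to be `(alpha, num_points)` tuples, and the sketch returns the missing combinations themselves rather than the indices the method returns.

```python
from itertools import product


def find_missing_combinations(alphas, num_points, found):
    """Return expected (alpha, num_points) pairs absent from `found`.

    Hypothetical helper: assumes each combination is a hashable tuple.
    The result follows the order of the full parameter grid.
    """
    found_set = set(found)
    expected = product(alphas, num_points)
    return [combo for combo in expected if combo not in found_set]


# Illustrative values only; a real pool defines its own parameter ranges.
alphas = [0.80, 0.85, 0.90]
num_points = [60, 80, 100]
found = [(0.80, 60), (0.80, 80), (0.85, 80), (0.90, 60)]

missing = find_missing_combinations(alphas, num_points, found)
print(len(missing), "combinations were never sampled")
```

A small pool drawn from a broad distribution tends to leave the corners of this grid empty, which is why the method regenerates those shapes for plotting only.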
- get_counts(counts)
Used to get unique, sorted parameters.
- Parameters
counts – unique counts of current parameter
- Returns
unique, counts
- get_dict_counts(shape_params)
Initial counts of the shape-generating parameters used in the current protein pool.
- Parameters
shape_params – list of shape parameters in order
- Returns
alpha_counts, numpoints_counts, params_list
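One way to picture this counting step is with `collections.Counter`. The sketch below is an assumption-laden illustration rather than the method's code: it assumes `shape_params` is a list of `(alpha, num_points)` pairs, and the name `count_shape_params` is hypothetical.

```python
from collections import Counter


def count_shape_params(shape_params):
    """Count how often each alpha and each num_points value occurs.

    Hypothetical helper: assumes each entry is an (alpha, num_points) pair.
    """
    alpha_counts = Counter(alpha for alpha, _ in shape_params)
    numpoints_counts = Counter(n for _, n in shape_params)
    return alpha_counts, numpoints_counts


# Illustrative values only.
shape_params = [(0.80, 60), (0.80, 80), (0.85, 80), (0.90, 60)]
alpha_counts, numpoints_counts = count_shape_params(shape_params)
print(dict(alpha_counts))
print(dict(numpoints_counts))
```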
- get_shape_distributions(data, debug=False)
Parse protein pool file and get shape distributions based on parameter combinations.
- Parameters
data – loaded protein pool .pkl
- Returns
shapes_plot, alphas_packed, numpoints_packed
- get_unique_fracs(counts, dataname)
Fraction of the total counts that each unique parameter represents in the current protein pool.
- Parameters
counts – counts of each current parameter
dataname – name of protein pool
- Returns
unique, fracs, barwidth
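The fraction computation can be illustrated as each count divided by the total. This is a rough sketch under stated assumptions: `counts` is taken to be a mapping from parameter value to count, the name `unique_fractions` is hypothetical, and `barwidth` is omitted because the source does not say how it is derived.

```python
def unique_fractions(counts):
    """Return sorted unique parameter values and their share of the total.

    Hypothetical helper: assumes `counts` maps parameter value -> count.
    """
    total = sum(counts.values())
    unique = sorted(counts)
    fracs = [counts[value] / total for value in unique]
    return unique, fracs


# Illustrative values only.
unique, fracs = unique_fractions({0.90: 1, 0.80: 3})
print(unique, fracs)
```

Normalizing to fractions makes histograms from pools of different sizes directly comparable.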
- plot_shapes_and_params(plot_pub=False, debug=False)
Plot a 2D array of example shapes generated using all desired alpha and num_points parameter combinations, and plot two histograms opposite the axes, one for each parameter.