dynasigml.dynasig_df
- class dynasigml.dynasig_df.DynaSigDF(files_list, exp_measures, exp_labels, output_name, id_func=None, beta_values=None, models=None, models_labels=None, added_atypes_list=None, added_massdef_list=None, use_svib=False, verbose=False, from_other=None, use_localsig=False)
Class representing a set of Dynamical Signatures (DynaSigs), which can be outputted as a data frame.
The DynaSigDF class allows for the parallel computation of DynaSigs, stores them with minimal disk space using NumPy .npy binary files, and implements methods to combine separate DynaSigDF objects into a single data frame for further analysis.
- files_list
The list of PDB files of the sequence variants for which to compute the Dynamical Signatures.
- Type:
list
- exp_measures
A list of experimental measures, in the same order as files_list. There can be more than one measure per variant (in which case exp_measures is a list of lists).
- Type:
list
- exp_labels
Labels for the experimental measures, in the order they appear in each sublist. If there is only one measure per variant, this can be a str instead of a list.
- Type:
list
- n_exp
The number of experimental measures per variant.
- Type:
int
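To illustrate how files_list, exp_measures and exp_labels line up, the following sketch checks that every variant has one value per label and returns the resulting number of measures. The helper `count_measures`, the file names and the numeric values are hypothetical placeholders and are not part of dynasigml.

```python
def count_measures(files_list, exp_measures, exp_labels):
    """Return the number of measures per variant after checking the shapes.

    Hypothetical helper for illustration only; not part of dynasigml.
    """
    # A single label may be given as a plain string
    labels = [exp_labels] if isinstance(exp_labels, str) else list(exp_labels)
    if len(files_list) != len(exp_measures):
        raise ValueError("exp_measures must have one entry per file")
    for measures in exp_measures:
        # With one measure per variant, each entry is a scalar
        row = measures if isinstance(measures, (list, tuple)) else [measures]
        if len(row) != len(labels):
            raise ValueError("each variant needs one value per label")
    return len(labels)


# Placeholder inputs: two variants with two measures each
files_list = ["variants/wt.pdb", "variants/A23G.pdb"]
exp_measures = [[1.0, 0.5], [0.8, 0.7]]
exp_labels = ["activity", "stability"]

n_exp = count_measures(files_list, exp_measures, exp_labels)
```

With a single measure per variant, the same check accepts a flat list of values and a string label.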
- outname
The output name where the DynaSigDF object will be saved (with the .pickle extension).
- Type:
str
- id_func
The function that gives the variant identifier from the PDB filename. By default, it gives the name of the file without the path to the directory and without the extension.
- Type:
function
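The default behaviour described above can be approximated with the standard library, as in the sketch below. Both function names are hypothetical, and the actual default implemented in dynasigml may differ in detail. The second function shows what a custom id_func could look like, assuming file names that end with the variant name after an underscore.

```python
import os


def default_style_id(pdb_path):
    """File name without its directory and without its extension."""
    return os.path.splitext(os.path.basename(pdb_path))[0]


def variant_only_id(pdb_path):
    """Hypothetical custom id_func: keep the text after the last underscore."""
    return default_style_id(pdb_path).rsplit("_", 1)[-1]


# Placeholder path used only to show the two behaviours
example_path = "data/variants/protein_A23G.pdb"
default_id = default_style_id(example_path)
custom_id = variant_only_id(example_path)
```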
- beta_values
The list of beta values for the Entropic Signatures computed. By default, beta=1 is the only value tested.
- Type:
list
- models
The list of ENM models to run. By default, only ENCoM is run.
- Type:
list
- models_labels
The labels for the ENM models.
- Type:
list
- added_atypes_list
Can be used to add custom atom type definitions. Needs to be of the same length as ‘files_list’ (some elements can be None).
- Type:
list
- added_massdef_list
Same as ‘added_atypes_list’, for mass definition files.
- Type:
list
- use_svib
If True, the vibrational entropy will be computed and added as a potential predictor variable. Defaults to False because usually the Entropic Signature is enough for the model to capture the vibrational entropy.
- Type:
bool
- index_dict
Dictionary of parameter combinations for every row index in the data frame.
- Type:
dict
- params_dict
Dictionary of data frame row indices for every combination of parameters.
- Type:
dict
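index_dict and params_dict are inverse mappings of each other. The sketch below builds such a pair, assuming that a parameter combination is a tuple of (variant identifier, model label, beta value). That key layout is an assumption made for illustration, since the exact structure used by DynaSigDF is not specified here.

```python
from itertools import product

# Placeholder values; the tuple layout of the keys is an assumption
variant_ids = ["wt", "A23G"]
models_labels = ["ENCoM"]
beta_values = [1, 2]

index_dict = {}   # row index -> parameter combination
params_dict = {}  # parameter combination -> row index
for row, params in enumerate(product(variant_ids, models_labels, beta_values)):
    index_dict[row] = params
    params_dict[params] = row
```

Keeping both directions makes it cheap to find the row for a given combination and to label a given row.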
- dynasigs_masslabels
List of labels for the masses in the DynaSig. Every PDB file must generate the same labels.
- Type:
list
- data_array
Storage of all the data as a data frame (each observation on a row), using a NumPy 2D array.
- Type:
ndarray
- __init__(files_list, exp_measures, exp_labels, output_name, id_func=None, beta_values=None, models=None, models_labels=None, added_atypes_list=None, added_massdef_list=None, use_svib=False, verbose=False, from_other=None, use_localsig=False)
Constructor for the DynaSigDF class.
- Parameters:
files_list (list) – The list of PDB files on which to compute DynaSigs.
exp_measures (list) – List of experimental measures, matching the files_list. If several experimental measures are used, this has to be a list of lists: [[measure1_file1, measure2_file1], [measure1_file2, measure2_file2], …]
exp_labels (list) – The labels for each of the experimental measures. If only one measure is supplied, can be a string instead of a list.
output_name (str) – The output name to use for the saved DynaSigDF object.
id_func (function, optional) – Function to be run on the input files to generate the filename IDs (which are used to identify each DynaSig in the final data frame). If not supplied, defaults to the filename at the end of the path, minus the extension.
beta_values (list, optional) – If supplied, the list of beta values for the computation of DynaSigs. Positive values will use the Entropic Signature, any negative value will compute the MSF (predicted B-factors).
models (list, optional) – If supplied, the list of ENM models to run (each needs to inherit from the ENM metaclass from the NRGTEN package). Defaults to ENCoM.
models_labels (list, optional) – List of model labels (to identify the model used in the dataframe).
added_atypes_list (list, optional) – Can be used to add custom atom type definitions. Needs to be of the same length as ‘files_list’ (some elements can be None).
added_massdef_list (list, optional) – Same as ‘added_atypes_list’, for mass definition files.
use_svib (bool, optional) – If True, the vibrational entropy will be computed and added as a potential predictor variable. Defaults to False because usually the Entropic Signature is enough for the model to capture the vibrational entropy.
verbose (bool, optional) – If True, progress will be printed. False by default.
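A call to the constructor might look like the sketch below, which follows the signature given above. The wrapper `build_dynasig_df`, the file paths, the measure values and the output name are hypothetical placeholders. The from_other and use_localsig arguments are left at their defaults because they are not described in this section.

```python
def build_dynasig_df(files_list, exp_measures, exp_labels, output_name):
    """Create a DynaSigDF from the documented constructor arguments.

    Hypothetical wrapper for illustration; it requires dynasigml to be
    installed and the listed PDB files to exist.
    """
    # Imported lazily so the sketch can be read without dynasigml installed
    from dynasigml.dynasig_df import DynaSigDF

    return DynaSigDF(
        files_list,
        exp_measures,
        exp_labels,
        output_name,
        beta_values=[1],  # positive values select the Entropic Signature
        use_svib=False,   # documented default
        verbose=True,     # print progress
    )


# Placeholder inputs; replace with real PDB files and measurements
example_args = {
    "files_list": ["variants/wt.pdb", "variants/A23G.pdb"],
    "exp_measures": [1.0, 0.8],
    "exp_labels": "activity",
    "output_name": "my_dynasigs",
}

# With dynasigml installed and the files present:
# dsdf = build_dynasig_df(**example_args)
```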