dynasigml.dynasig_df

class dynasigml.dynasig_df.DynaSigDF(files_list, exp_measures, exp_labels, output_name, id_func=None, beta_values=None, models=None, models_labels=None, added_atypes_list=None, added_massdef_list=None, use_svib=False, verbose=False, from_other=None, use_localsig=False)

Class representing a set of Dynamical Signatures (DynaSigs), which can be output as a data frame.

The DynaSigDF class allows for the parallel computation of DynaSigs, stores them with minimal disk space using NumPy .npy binary files, and implements methods to easily combine separate DynaSigDFs into a single data frame for further analysis.

files_list

The list of PDB files, one per sequence variant, for which to compute the Dynamical Signatures.

Type:

list

exp_measures

A list of experimental measures, in the same order as files_list. There can be more than one measure per variant (in which case exp_measures is a list of lists).

Type:

list

exp_labels

Labels for the experimental measures, in the order they appear in each sublist. If there is only one measure per variant, this can be a str instead of a list.

Type:

list

n_exp

The number of experimental measures per variant.

Type:

int
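
The two accepted shapes of exp_measures and exp_labels (one measure per variant, or several) can be illustrated with a small sketch. The helper `normalize_measures` and the labels "dG" and "kcat" are hypothetical and exist only for illustration; this is not how DynaSigDF itself processes its inputs.

```python
def normalize_measures(exp_measures, exp_labels):
    """Hypothetical helper: return (rows, labels, n_exp) in list-of-lists form."""
    # A single label given as a string becomes a one-element list.
    labels = [exp_labels] if isinstance(exp_labels, str) else list(exp_labels)
    # A flat list of scalars means one measure per variant, so wrap each value.
    rows = [list(m) if isinstance(m, (list, tuple)) else [m] for m in exp_measures]
    n_exp = len(labels)
    # Every variant must supply exactly one value per label.
    for i, row in enumerate(rows):
        if len(row) != n_exp:
            raise ValueError(f"variant {i} has {len(row)} measures, expected {n_exp}")
    return rows, labels, n_exp


# Two measures per variant: exp_measures is a list of lists.
rows, labels, n_exp = normalize_measures([[1.2, 0.5], [0.8, 0.9]], ["dG", "kcat"])

# One measure per variant: a flat list and a single string label.
single_rows, single_labels, single_n = normalize_measures([1.2, 0.8], "dG")
```

In both cases the number of labels equals the number of measures per variant, which is what n_exp records.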

outname

The output name where the DynaSigDF object will be saved (with the .pickle extension).

Type:

str

id_func

The function that gives the variant identifier from the PDB filename. By default, it gives the name of the file without the path to the directory and without the extension.

Type:

function
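
As a rough illustration of the default behaviour described above, the sketch below derives an identifier from a PDB path with the standard library, and shows what a custom id_func could look like. Both functions and the example path are assumptions made for illustration; the actual default lives inside DynaSigDF and may differ in detail.

```python
import os


def default_id(pdb_path):
    """Sketch of the documented default: file name without directory or extension."""
    return os.path.splitext(os.path.basename(pdb_path))[0]


def mutation_id(pdb_path):
    """Hypothetical custom id_func: keep only the text after the last underscore."""
    return default_id(pdb_path).rsplit("_", 1)[-1]


# Hypothetical file name of the form <protein>_<mutation>.pdb
example_path = os.path.join("variants", "protein_A45G.pdb")
default_name = default_id(example_path)
custom_name = mutation_id(example_path)
```

Any callable that takes the file path and returns an identifier could presumably be passed as id_func in the same way.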

beta_values

The list of beta values for the Entropic Signatures computed. By default, beta=1 is the only value tested.

Type:

list
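
The constructor documentation states that positive beta values select the Entropic Signature and negative values select the MSF (predicted B-factors). The sketch below only restates that rule; `signature_kind` is a hypothetical helper, and the treatment of beta = 0 is not documented, so the sketch rejects it rather than guessing.

```python
def signature_kind(beta):
    """Hypothetical helper restating the documented sign convention for beta."""
    if beta > 0:
        return "entropic_signature"
    if beta < 0:
        return "msf"
    # Assumption: the documentation does not say how beta = 0 is treated.
    raise ValueError("beta = 0 is not covered by the documented convention")


# A beta_values list mixing Entropic Signatures and one MSF computation.
beta_values = [0.5, 1, 2, -1]
kinds = [signature_kind(b) for b in beta_values]
```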

models

The list of ENM models to run. By default, only ENCoM is run.

Type:

list

models_labels

The labels for the ENM models.

Type:

list

added_atypes_list

Can be used to add custom atom types definitions. Needs to be of the same length as ‘files_list’ (some elements can be None).

Type:

list

added_massdef_list

Same as ‘added_atypes_list’, for mass definition files.

Type:

list

use_svib

If True, the vibrational entropy will be computed and added as a potential predictor variable. Defaults to False because usually the Entropic Signature is enough for the model to capture the vibrational entropy.

Type:

bool

index_dict

Dictionary of parameter combinations for every row index in the data frame.

Type:

dict

params_dict

Dictionary of data frame row indices for every combination of parameters.

Type:

dict
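
index_dict and params_dict are described as inverse mappings of each other. A minimal sketch of that relationship follows, assuming that a parameter combination is a (variant ID, model label, beta) tuple; the real key layout used by DynaSigDF is not specified here and may differ.

```python
from itertools import product

# Hypothetical inputs; the names and values are for illustration only.
variant_ids = ["wt", "A45G"]
model_labels = ["ENCoM"]
beta_values = [1, -1]

index_dict = {}   # row index -> parameter combination
params_dict = {}  # parameter combination -> row index
for row, combo in enumerate(product(variant_ids, model_labels, beta_values)):
    index_dict[row] = combo
    params_dict[combo] = row

# Looking a row up in one dictionary and back in the other is the identity.
round_trip_ok = all(params_dict[index_dict[i]] == i for i in index_dict)
```

Keeping both directions makes it cheap either to find the row for a given combination or to recover the parameters behind a given row.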

dynasigs_masslabels

List of labels for the masses in the DynaSig. Every PDB file must generate the same labels.

Type:

list

data_array

Storage of all the data as a data frame (each observation on a row), using a NumPy 2D array.

Type:

ndarray

__init__(files_list, exp_measures, exp_labels, output_name, id_func=None, beta_values=None, models=None, models_labels=None, added_atypes_list=None, added_massdef_list=None, use_svib=False, verbose=False, from_other=None, use_localsig=False)

Constructor for the DynaSigDF class.

Parameters:
  • files_list (list) – The list of PDB files on which to compute DynaSigs.

  • exp_measures (list) – List of experimental measures, in the same order as files_list. If there is more than one measure per variant, this must be a list of lists: [[measure1_file1, measure2_file1], [measure1_file2, measure2_file2], …]

  • exp_labels (list) – The labels for each of the experimental measures. If only one measure is supplied, can be a string instead of a list.

  • output_name (str) – The output name to use for the saved DynaSigDF object.

  • id_func (function, optional) – Function to be run on the input files to generate the filename IDs (which are used to identify each DynaSig in the final data frame). If not supplied, defaults to the filename at the end of the path, minus the extension.

  • beta_values (list, optional) – If supplied, the list of beta values for the computation of DynaSigs. Positive values use the Entropic Signature; any negative value computes the MSF (predicted B-factors) instead.

  • models (list, optional) – If supplied, the list of ENM models to run (these need to inherit from the ENM metaclass from the NRGTEN package). Defaults to ENCoM.

  • models_labels (list, optional) – List of model labels (to identify the model used in the dataframe).

  • added_atypes_list (list, optional) – Can be used to add custom atom types definitions. Needs to be of the same length as ‘files_list’ (some elements can be None).

  • added_massdef_list (list, optional) – Same as ‘added_atypes_list’, for mass definition files.

  • use_svib (bool, optional) – If True, the vibrational entropy will be computed and added as a potential predictor variable. Defaults to False because usually the Entropic Signature is enough for the model to capture the vibrational entropy.

  • verbose (bool, optional) – If True, progress will be printed. False by default.
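
The parameters above can be put together as in the following sketch. The file paths, the "ddG" label, the output name, and the helpers `build_constructor_args` and `make_dynasig_df` are hypothetical; only the import path and the keyword names come from the signature documented above. The constructor call is wrapped in a function that is not invoked here, since running it requires the dynasigml package and real PDB files.

```python
def build_constructor_args(pdb_files, measures, label, output_name):
    """Hypothetical helper: collect and sanity-check the required arguments."""
    # exp_measures must follow the same order (and length) as files_list.
    if len(pdb_files) != len(measures):
        raise ValueError("files_list and exp_measures must have the same length")
    return {
        "files_list": list(pdb_files),
        "exp_measures": list(measures),
        "exp_labels": label,        # a single str is allowed for one measure
        "output_name": output_name,
        "verbose": True,            # print progress
    }


def make_dynasig_df(kwargs):
    """Build the object; the import is lazy so the sketch loads without the package."""
    from dynasigml.dynasig_df import DynaSigDF
    return DynaSigDF(**kwargs)


# Hypothetical inputs: two variants with one measure each.
kwargs = build_constructor_args(
    ["variants/wt.pdb", "variants/A45G.pdb"],
    [0.0, 1.3],
    "ddG",
    "my_dynasigs",
)
# dsdf = make_dynasig_df(kwargs)  # requires dynasigml and the PDB files on disk
```

Optional arguments such as beta_values, models, and use_svib are left at their defaults in this sketch.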