# gaudi.objectives

These are the built-in objectives in GaudiMM. You can also build your own, but these are ready to use.

## Angle objective

This objective calculates the angle formed by three given atoms (or the dihedral, if four atoms are given) and returns the absolute difference of that angle and the target value.
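As a rough illustration of the calculation described above, the following standalone sketch computes the angle for three points or the dihedral for four, then returns the absolute difference to a target. It works on plain coordinate tuples rather than Chimera atoms, and the function names (`angle`, `dihedral`, `angle_score`) are hypothetical helpers, not GaudiMM's implementation.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    return math.sqrt(_dot(a, a))


def angle(p1, p2, p3):
    """Angle p1-p2-p3 (vertex at p2), in degrees."""
    v1, v2 = _sub(p1, p2), _sub(p3, p2)
    cos = _dot(v1, v2) / (_norm(v1) * _norm(v2))
    # Clamp to [-1, 1] to guard against rounding errors before acos
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def dihedral(p1, p2, p3, p4):
    """Signed dihedral p1-p2-p3-p4, in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    b2_unit = tuple(x / _norm(b2) for x in b2)
    return math.degrees(math.atan2(_dot(b2_unit, _cross(n1, n2)), _dot(n1, n2)))


def angle_score(points, threshold):
    """Absolute deviation from `threshold` (degrees) for 3 or 4 points."""
    if len(points) == 3:
        value = angle(*points)
    elif len(points) == 4:
        value = dihedral(*points)
    else:
        raise ValueError("Expected three or four points")
    # Plain absolute difference, as in the description; no periodic wrapping
    return abs(value - threshold)
```

For example, `angle_score([(1, 0, 0), (0, 0, 0), (0, 1, 0)], 109.5)` measures how far a right angle is from a tetrahedral target.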

class gaudi.objectives.angle.Angle(threshold=None, probes=None, *args, **kwargs)

Angle class

Parameters:
• threshold (float) – Optimum angle
• probes (list of str) – Atoms that make the angle, expressed as a series of / strings

Returns: Deviation from threshold angle, in degrees

Return type: float

evaluate(ind)

Return the score of the individual under the current conditions.

probes(ind)
gaudi.objectives.angle.enable(**kwargs)

## Contacts objective

This objective provides a wrapper around Chimera’s DetectClash, which detects clashes and contacts. Clashes are understood as steric conflicts that increase the energy of the system. They are evaluated as the sum of the volumetric overlaps of the van der Waals spheres of the atoms involved. Contacts are considered stabilizing, and they are evaluated with a Lennard-Jones 12-6-like function.
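To make the two scoring ideas concrete, here is a minimal sketch of each one for a single atom pair. The overlap uses the standard closed-form volume of the lens shared by two intersecting spheres, and the contact term uses a textbook 12-6 form. Both functions and their parameters (`r_min`, `epsilon`) are illustrative assumptions; the exact expressions and constants used by GaudiMM may differ.

```python
import math


def sphere_overlap_volume(r1, r2, d):
    """Volume shared by two spheres of radii r1 and r2 whose centres are d apart."""
    if d >= r1 + r2:
        return 0.0  # spheres do not touch
    if d <= abs(r1 - r2):
        # The smaller sphere lies entirely inside the larger one
        return 4.0 / 3.0 * math.pi * min(r1, r2) ** 3
    # Lens volume of two partially overlapping spheres
    return (math.pi * (r1 + r2 - d) ** 2
            * (d * d + 2 * d * (r1 + r2) - 3 * (r1 - r2) ** 2)
            / (12 * d))


def lennard_jones_12_6(d, r_min, epsilon=1.0):
    """12-6 potential with its minimum (-epsilon) at d == r_min.

    r_min (e.g. the sum of both van der Waals radii) and epsilon are
    assumptions made for this sketch.
    """
    ratio = r_min / d
    return epsilon * (ratio ** 12 - 2 * ratio ** 6)
```

A clash score in this spirit would sum `sphere_overlap_volume` over all clashing pairs, while a contact score would sum `lennard_jones_12_6` over all pairs within the search radius.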

class gaudi.objectives.contacts.Contacts(probes=None, radius=5.0, which='hydrophobic', clash_threshold=0.6, hydrophobic_threshold=-0.4, cutoff=0.0, hydrophobic_elements=('C', 'S'), bond_separation=4, same_residue=True, only_internal=False, *args, **kwargs)

Contacts class

Parameters:
• probes (str) – Name of molecule gene that is object of contacts analysis
• radius – Maximum distance from any point of probes that is searched for possible interactions
• which ({'hydrophobic', 'clashes'}) – Type of interactions to measure
• clash_threshold (float, optional) – Maximum overlap of van der Waals spheres that is considered as a contact (attractive). If the overlap is greater, it’s considered a clash (repulsive)
• hydrophobic_threshold (float, optional) – Maximum overlap for hydrophobic patches.
• hydrophobic_elements (list of str, optional, defaults to [C, S]) – Which elements are allowed to interact in hydrophobic patches
• cutoff (float, optional) – If the overlap volume is greater than this, a penalty is applied. Useful to filter bad solutions.
• bond_separation (int, optional) – Ignore clashes or contacts between atoms within n bonds.
• only_internal (bool, optional) – If set to True, take into account only intramolecular interactions, defaults to False

Returns: Lennard-Jones-like energy when which=hydrophobic, and volumetric overlap of VdW spheres in Å³ if which=clashes.

Return type: float

evaluate_clashes(ind)
evaluate_hydrophobic(ind)
find_interactions(ind)
molecules(ind)
probes(ind)
gaudi.objectives.contacts.enable(**kwargs)

## Coordination objective

This objective performs rough estimations of good orientations of ligating residues in a protein to coordinate a given metal or small molecule. The geometry is approximated by computing average distances from ligating atoms to the metal centre (self.probe) as well as the angles formed by the probe, the ligating atom and its immediate neighbor. Good planarity is assured by a dihedral check.

class gaudi.objectives.coordination.Coordination(probe=None, radius=3.0, atom_types=(), atom_elements=(), atom_names=(), residues=(), geometry='tetrahedral', distance=0, min_atoms=1, prevent_intruders=True, enforce_all_residues=False, only_one_ligand_per_residue=False, center_of_mass_correction=False, distance_correction=False, *args, **kwargs)

Coordination class

Parameters:
• probe (tuple) – The atom that acts as the metal center, expressed as /. This will be parsed later on.
• residues (list of str) – Residues that must coordinate to probe, expressed as /. Position can be *.
• radius (float, optional, default=3.0) – Distance from probe where ligating atoms must be found
• atom_types (list of str, optional) – Types of atoms that are considered ligands to probe
• atom_names (list of str, optional) – Names of atoms that are considered ligands to probe
• atom_elements (list of str, optional) – Elements of atoms that are considered ligands to probe
• distance (float, optional) – Perfect distance a ligand atom should be from target.
• geometry (str or list of 3-tuple floats, optional) – Which geometry should be fitted. Choose from GEOMETRIES dict or specify a set of vectors.
• enforce_all_residues (bool, optional) – Whether all specified residues must coordinate.
• only_one_ligand_per_residue (bool, optional) – Enforce that only one ligand for each residue should coordinate.
• prevent_intruders (bool, optional) – Don’t let non-ligand atoms be closer to the target than the selected ligand atoms.
• center_of_mass_correction (bool, optional) – If True, calculate the distance between the metal center and the center of mass of the ligand atoms, and add that to the final score.
• distance_correction (bool, optional) – If True, compute the deviation of the experimental coordination bond length from the ideal one, as tabulated by chimera.Element, and add that to the final score.

Returns: Sum of RMSD of vertices from ideal RMSD and average cosine of angle deviation from ideal orientation of ligand neighbors. A perfect match should report 0.0.

Return type: float

coordination_sphere(ind)

1. Get atoms and residues found within self.radius angstroms from self.probe. Found residues MUST include self.residues. Otherwise, apply penalty.

2. Sort atoms by absolute difference of self.distance and distance to self.probe. That way, nearest atoms are computed first. If found atoms do not include some of the requested types, apply penalty.

evaluate(ind)
1. Get requested atoms sorted by distance
2. If they meet the minimum quantity, return the rmsd for that geometry
3. If that’s not possible or they are not enough, return penalty
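The selection logic outlined in the steps above can be sketched on plain coordinates as follows. This is only an approximation under stated assumptions: atoms are `(name, xyz)` tuples instead of Chimera atoms, the residue and atom-type checks are left out, `PENALTY` is a made-up value, and the geometry fit is delegated to a caller-supplied `geometry_rmsd` callable because the real fitting routine is not described here.

```python
import math

PENALTY = 1000.0  # hypothetical penalty value, not taken from GaudiMM


def _distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def coordination_sphere(probe, atoms, radius=3.0, distance=0.0):
    """Return (name, d) pairs for atoms within `radius` of `probe`.

    Results are sorted by |d - distance|, so atoms closest to the
    ideal bond length come first.
    """
    found = [(name, _distance(probe, xyz)) for name, xyz in atoms]
    found = [(name, d) for name, d in found if d <= radius]
    return sorted(found, key=lambda item: abs(item[1] - distance))


def evaluate(probe, atoms, geometry_rmsd, min_atoms=1, radius=3.0, distance=0.0):
    """Score the candidate ligands, or return PENALTY if there are too few."""
    ligands = coordination_sphere(probe, atoms, radius=radius, distance=distance)
    if len(ligands) < min_atoms:
        return PENALTY
    # geometry_rmsd is a placeholder for the actual geometry fit
    return geometry_rmsd(ligands)
```

Keeping the fit behind a callable makes it easy to see that the penalty path is taken before any geometry is computed.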
molecules(ind)
probe(ind)
residues(ind)
gaudi.objectives.coordination.enable(**kwargs)
gaudi.objectives.coordination.ideal_bond_deviation(metal, ligand, other_ligands=())

Assess if the current bond vector is well oriented with respect to the ideal bond vector.

Parameters:
• metal (chimera.Atom) – The ion ligands are coordinating to
• ligand (chimera.Atom) – Potential ligand atoms to metal

Returns: Absolute sine of the angle between the ideal vector and the ligand-metal one.

Return type: float

gaudi.objectives.coordination.ideal_bonded_positions(atom, element, geometry=None)

## Distance objective

This objective calculates the distance between two given atoms. It returns the absolute difference between the calculated distance and the target value.
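A minimal sketch of this score on plain coordinates is shown below. It averages the deviation over several probe points and treats deviations within an optional tolerance as zero. That handling of `tolerance` is an assumption made for the example, and `distance_score` is a hypothetical helper, not the class method.

```python
import math


def distance_score(target, probes, threshold, tolerance=None):
    """Mean absolute deviation from `threshold` of each probe-target distance."""
    deviations = []
    for probe in probes:
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(target, probe)))
        deviation = abs(d - threshold)
        # Assumption: deviations inside the tolerance are not penalized at all
        if tolerance is not None and deviation <= tolerance:
            deviation = 0.0
        deviations.append(deviation)
    return sum(deviations) / len(deviations)
```

With a target at the origin, probes at 3 Å and 4 Å, and a threshold of 3 Å, the score is 0.5 Å; setting `tolerance=1.0` brings it down to 0.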

class gaudi.objectives.distance.Distance(threshold=None, tolerance=None, target=None, probes=None, center_of_mass=False, *args, **kwargs)

Distance class

Parameters:
• threshold (float) – Optimum distance to meet
• tolerance (float) – Maximum deviation from threshold that is not penalized
• target (str) – The atom to measure the distance to, expressed as /
• probes (list of str) – The atoms whose distance to target is being measured, expressed as /. If more than one is provided, the average of all of them is returned
• center_of_mass (bool)

Returns: (Mean of) absolute deviation from threshold distance, in Å.

Return type: float

atoms(ind, *targets)
evaluate_center_of_mass(ind)
evaluate_distances(ind)

Measure the distance

gaudi.objectives.distance.enable(**kwargs)

## DrugScoreX objective

This objective is a wrapper around the binaries provided by Neudert and Klebe and calculates the score of the current pose.

The lower, the better, so usually you will use a -1.0 weight.
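The effect of a negative weight can be illustrated with a small sketch that mimics how DEAP compares fitness values: each raw value is multiplied by its weight, and the larger weighted tuple wins. The helper names below are hypothetical, and the sketch assumes the objective weight is passed to DEAP unchanged.

```python
def weighted_values(values, weights):
    """Multiply each raw objective value by its weight."""
    return tuple(v * w for v, w in zip(values, weights))


def is_better(candidate, other, weights):
    """True if `candidate` beats `other` once the weights are applied."""
    return weighted_values(candidate, weights) > weighted_values(other, weights)


# With weight -1.0, the lower raw score wins
lower_wins = is_better((-120.0,), (-80.0,), weights=(-1.0,))
```

Here `lower_wins` is `True`: a score of -120 becomes 120 after weighting, which beats 80.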

class gaudi.objectives.dsx.DSX(binary=None, potentials=None, proteins=('Protein', ), ligands=('Ligand', ), terms=None, sorting=1, cofactor_mode=0, with_covalent=False, with_metals=True, *args, **kwargs)

DSX class

Parameters:
• protein (str) – The molecule name that is acting as a protein
• ligand (str) – The molecule name that is acting as a ligand
• binary (str, optional) – Path to the DSX binary. Only needed if drugscorex is not in PATH.
• potentials (str, optional) – Path to DSX potentials. Only needed if DSX_POTENTIALS env var has not been set by the installation process (conda install -c insilichem drugscorex normally takes care of that).
• terms (list of bool, optional) – Enable (True) or disable (False) certain terms in the score function in this order: distance-dependent pair potentials, torsion potentials, intramolecular clashes, sas potentials, hbond potentials
• sorting (int, defaults to 1) – Sorting mode. An int between 0-6, read binary help for -S:

    -S int : Here you can specify the mode that affects how the results will be sorted. The default mode is '-S 1', which sorts the ligands in the same order as they are found in the lig_file. The following modes are possible:

    0: Same order as in the ligand file
    1: Ordered by increasing total score
    2: Ordered by increasing per-atom-score
    3: Ordered by increasing per-contact-score
    4: Ordered by increasing rmsd
    5: Ordered by increasing torsion score
    6: Ordered by increasing per-torsion-score

• cofactor_mode (int, defaults to 0) – Cofactor handling mode. An int between 0-7, read binary help for -I:

    -I int : Here you can specify the mode that affects how cofactors, waters and metals will be handled. The default mode is '-I 1', which means that all molecules are treated as part of the protein. If a structure should not be treated as part of the protein you have to supply a separate file with separate MOLECULE entries corresponding to each MOLECULE entry in the ligand_file (It is assumed that the structure, e.g. a cofactor, was kept flexible in docking, so that there should be a different geometry corresponding to each solution. Otherwise it won't make sense not to treat it as part of the protein.). The following modes are possible:

    0: cofactors, waters and metals interact with protein, ligand and each other
    1: cofactors, waters and metals are treated as part of the protein
    2: cofactors and metals are treated as part of the protein (waters as in mode 0)
    3: cofactors and waters are treated as part of the protein
    4: cofactors are treated as part of the protein
    5: metals and waters are treated as part of the protein
    6: metals are treated as part of the protein
    7: waters are treated as part of the protein

    Please note: Only those structures can be treated individually, which are supplied in separate files.

• with_covalent (bool, defaults to False) – Whether to deal with covalently bonded atoms as normal atoms (False) or not (True)
• with_metals (bool, defaults to True) – Whether to deal with metal atoms as normal atoms (False) or not (True)

Returns: Interaction energy as reported by DSX output logs.

Return type: float

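Since `sorting` and `cofactor_mode` map to the `-S` and `-I` options of the binary, a wrapper could validate both values before building the command line. The sketch below shows one way to do that. `dsx_mode_flags` is a hypothetical helper and only covers these two options; it is not how `prepare_command()` is necessarily written.

```python
SORTING_MODES = range(0, 7)    # valid values for -S: 0 to 6
COFACTOR_MODES = range(0, 8)   # valid values for -I: 0 to 7


def dsx_mode_flags(sorting=1, cofactor_mode=0):
    """Return the -S/-I arguments as a list, after validating both modes."""
    if sorting not in SORTING_MODES:
        raise ValueError("sorting must be an int between 0 and 6")
    if cofactor_mode not in COFACTOR_MODES:
        raise ValueError("cofactor_mode must be an int between 0 and 7")
    return ["-S", str(sorting), "-I", str(cofactor_mode)]
```

Returning a list rather than a single string keeps the arguments ready for `subprocess` calls without shell quoting.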
clean()
evaluate(ind)

Run a subprocess calling DSX binary with provided options, and parse the results. Clean tmp files at exit.

get_molecule_by_name(ind, *names)

Get a molecule gene instance of individual by its name

parse_output(stream)
prepare_command()
prepare_ligands(ligands)
prepare_proteins(proteins)
gaudi.objectives.dsx.enable(**kwargs)

## Energy (OpenMM) objective

This objective is a wrapper around OpenMM, providing a GPU-accelerated energy calculation of the system with a simple forcefield evaluation.

class gaudi.objectives.energy.Energy(targets=None, forcefields=('amber99sbildn.xml', ), auto_parametrize=None, parameters=None, platform=None, *args, **kwargs)

Calculate the energy of a system

Parameters:
• targets (list of str, default=None) – If set, which molecules should be evaluated. Else, all will be evaluated.
• forcefields (list of str, default=('amber99sbildn.xml',)) – Which forcefields to use
• auto_parametrize (list of str, default=None) – List of Molecule instances GAUDI should try to auto parametrize with antechamber.
• parameters (list of 2-item list of str) – List of (gaff.mol2, .frcmod) files to use as parametrization source.
• platform (str) – Which platform to use for calculations. Choose between CPU, CUDA, OpenCL.

Returns: The estimated potential energy, in kJ/mol

Return type: float

calculate_energy(coordinates)

Set up an OpenMM simulation with default parameters and return the potential energy of the initial state

Parameters:
• coordinates (simtk.unit.Quantity) – Positions of the atoms in the system

Returns: potential_energy – Potential energy of the system, in kJ/mol

Return type: float

static chimera_molecule_to_openmm_positions(*molecules)
static chimera_molecule_to_openmm_topology(*molecules)

Convert a Chimera Molecule object to OpenMM structure, providing topology and coordinates.

Parameters:
• molecule (chimera.Molecule)

Returns:
• topology (simtk.openmm.app.topology.Topology)
• coordinates (simtk.unit.Quantity)

evaluate(individual)

Calculates the energy of current individual

Notes

For static calculations, where molecules are essentially always the same, but with different coordinates, we only need to generate topologies once. However, for dynamic jobs, with potentially different molecules involved each time, we cannot guarantee having the same topology. As a result, we generate it again for each evaluation.

molecules(individual)
simulation

Build a new OpenMM simulation if not yet defined and return it

Notes

self.topology must be defined previously! Use self.chimera_molecule_to_openmm_topology to set it.

gaudi.objectives.energy.calculate_energy(filename, forcefields=None)

Calculate energy from PDB file with desired forcefields. If not specified, amber99sbildn will be used. Returns potential energy in kJ/mol.

gaudi.objectives.energy.enable(**kwargs)

## GOLD objective

This objective is a wrapper around the scoring functions provided by CCDC’s GOLD.

It will use the rescoring abilities in GOLD to extract the fitness corresponding to any of the available scoring functions:

• GoldScore (goldscore)
• ChemScore (chemscore)
• Astex Statistical Potential (asp)
• CHEMPLP (chemplp)

Since GOLD is commercial software, you will need to install it separately and provide a valid license! This is just a wrapper. Make sure to set all the needed environment variables, such as CCDC_LICENSE_FILE, and that ‘gold_auto’ is in PATH. Check tests/test_objectives_gold.py for an example; make sure to have GOLDXX/bin before GOLDXX/GOLD/bin!

class gaudi.objectives.gold.Gold(protein='Protein', ligand='Ligand', scoring='chemscore', score_component='Score', radius=10, *args, **kwargs)

Gold class

Parameters:
• protein (str) – The name of molecule acting as protein
• ligand (str) – The name of molecule acting as ligand
• scoring (str, optional, defaults to chemscore) – Fitness function to use. Choose between chemscore, chemplp, goldscore and asp.
• score_component (str, optional, defaults to 'Score') – Scoring fields to parse out of the rescore.log file, such as Score, DG, S(metal), etc.
• radius (float, optional, defaults to 10.0) – Radius (in Å) of binding site sphere, the origin of which is automatically centered at the ligand’s center of mass.

Returns: Interaction energy as reported by GOLD’s chosen scoring function

Return type: float

clean()
evaluate(ind)

Run a subprocess calling the GOLD binary with provided options, and parse the results. Clean tmp files at exit.

get_molecule_by_name(ind, *names)

Get a molecule gene instance of individual by its name

origin(molecule)
parse_output(filename)

Get the last word of the first (and only) line and parse it into a float

prepare_command(protein_path, ligand_path, origin)
prepare_ligands(ligands)
prepare_proteins(proteins)
gaudi.objectives.gold.enable(**kwargs)

## Hydrogen bonds objective

This objective is a wrapper around Chimera’s FindHBond. It returns the number of hydrogen bonds that can be formed between the target molecule and its environment.

Todo

Evaluate the possible HBonds with some kind of function that gives a rough idea of the strength (energy) of each of them.

class gaudi.objectives.hbonds.Hbonds(probes=None, radius=5.0, distance_tolerance=0.4, angle_tolerance=20.0, only_intermolecular=True, only_probes=False, *args, **kwargs)

Hbonds class

Parameters:
• probes (list of str) – Names of molecules being object of analysis
• radius – Maximum distance from any point of probe that is searched for a possible interaction
• distance_tolerance (float, optional) – Allowed deviation from ideal distance to consider a valid H bond.
• angle_tolerance (float, optional) – Allowed deviation from ideal angle to consider a valid H bond.
• only_intermolecular (boolean, optional) – Only intermolecular interactions are considered (defaults to True)
• only_probes (boolean, optional) – Only interactions between probe molecules are considered, excluding other molecule genes. (defaults to False)

Returns: Number of detected hydrogen bonds.

Return type: int

display(bonds)

Mock method to show a graphical depiction of the found H bonds.

evaluate(ind)

Find H bonds within self.radius angstroms from self.probes, and return only those that interact with the probe. That is, discard any H bond in that search space in which none of the atoms involved belongs to self.probes.

molecules(ind)
probes(ind)
gaudi.objectives.hbonds.enable(**kwargs)

## Inertia objective

This objective calculates the alignment between the axes of inertia of the given molecules.

class gaudi.objectives.inertia.AxesOfInertia(reference=None, targets=None, only_primaries=False, threshold=0.84, *args, **kwargs)

Calculates the axes of inertia of given molecules and returns their alignment deviation.

Parameters:
• reference (str) – Molecule name targets should align to.
• targets (list of str) – Names of molecules to be aligned to reference
• threshold (float) – Target average of cosine of angle of alignment between targets and reference.
• only_primaries (bool) – Consider only the largest inertia vectors.

Returns: Mean absolute difference of threshold alignment and mean of all the cosines involved for each axis.

Return type: float

evaluate(individual)

Return the score of the individual under the current conditions.

reference(individual)

The reference molecule. Usually, the biggest in size

targets(individual)
gaudi.objectives.inertia.calculate_alignment(reference_axis, *probes_axes)
gaudi.objectives.inertia.calculate_axes_of_inertia(molecule)
gaudi.objectives.inertia.calculate_inertial_matrix(coordinates, masses)
gaudi.objectives.inertia.centroid(coordinates, masses)
gaudi.objectives.inertia.enable(**kwargs)

## LigScore objective

This objective is a wrapper around the scoring function provided by IMP’s ligand_score.

The lower, the better, so usually you will use a -1.0 weight.

class gaudi.objectives.ligscore.LigScore(proteins=('Protein', ), ligands=('Ligand', ), method='pose', binary=None, library=None, *args, **kwargs)

LigScore class

Parameters:
• proteins (list of str) – The name of molecules that are acting as proteins
• ligands (list of str) – The name of molecules that are acting as ligands
• binary (str, optional) – Path to ligand_score executable
• library (str, optional) – Path to LigScore lib file

Returns: Interaction energy as reported by IMP’s ligand_score.

Return type: float

clean()
evaluate(ind)

Run a subprocess calling LigScore binary with provided options, and parse the results. Clean tmp files at exit.

get_molecule_by_name(ind, *names)

Get a molecule gene instance of individual by its name

parse_output(stream)

Get the last word of the first (and only) line and parse it into a float

prepare_command(protein_path, ligand_path)
prepare_ligands(ligands)
prepare_proteins(proteins)
gaudi.objectives.ligscore.enable(**kwargs)

## NWChem objective

This objective is a wrapper around NWChem. It expects an additional input template with the keyword $MOLECULE, which will be replaced by the currently expressed molecule(s). See TEMPLATE for an example, which works as a default template if none is provided.

A ~/.nwchemrc file should be present. If you installed NWChem with our conda recipe, you will find the file in $CONDA_PREFIX/etc/default.nwchemrc. Copy it to your $HOME.

class gaudi.objectives.nwchem.NWChem(template=None, targets=('Ligand', ), parser=None, title=None, executable=None, basis_library=None, processors=None, *args, **kwargs)

NWChem class

Parameters:
• targets (list of str) – Molecule name(s) to be processed with NWChem. Small ones!
• template (str, optional) – NWChem input template (or path to a file with such contents) containing a $MOLECULE placeholder to be replaced by the currently expressed molecule(s) requested in targets, and optionally, a $TITLE placeholder to be replaced by the job name. If not provided, it will default to the TEMPLATE example (single-point dft energy).
• parser (str, optional) – Path to a Python script containing a top-level function called parse_output which will parse the NWChem output and return a float. This replaces the default parser, which looks for the last ‘Total energy’ value.
• processors (int, optional=None) – Number of physical processors to use with openmpi

Returns: Any numeric value as reported by the parser routines. By default, last ‘Total energy’ value.

Return type: float

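The placeholder substitution described for `template` can be pictured with Python's `string.Template`, which uses the same `$NAME` syntax. The template text below is a made-up single-point DFT input written for this example, not the module's own TEMPLATE. The helpers `xyz_block` and `render_input` are hypothetical as well.

```python
from string import Template

# Illustrative NWChem input; an assumption for this sketch, not GaudiMM's TEMPLATE
EXAMPLE_TEMPLATE = """\
start $TITLE
title "$TITLE"
geometry units angstroms
$MOLECULE
end
basis
  * library 6-31g*
end
task dft energy
"""


def xyz_block(atoms):
    """Format (element, (x, y, z)) tuples as one 'El x y z' line per atom."""
    return "\n".join(
        "{:<2} {:>12.6f} {:>12.6f} {:>12.6f}".format(element, *xyz)
        for element, xyz in atoms
    )


def render_input(template, title, atoms):
    """Fill in the $TITLE and $MOLECULE placeholders of an input template."""
    # safe_substitute leaves any other $-prefixed text untouched
    return Template(template).safe_substitute(TITLE=title, MOLECULE=xyz_block(atoms))


water = [("O", (0.0, 0.0, 0.0)), ("H", (0.757, 0.586, 0.0)), ("H", (-0.757, 0.586, 0.0))]
nw_input = render_input(EXAMPLE_TEMPLATE, "water", water)
```

Using `safe_substitute` instead of `substitute` avoids a `KeyError` if a user template contains other dollar signs.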
clean()
evaluate(ind)

Run a subprocess calling the NWChem binary with provided options, and parse the results. Clean tmp files at exit.

get_molecule_by_name(ind, *names)

Get a molecule gene instance of individual by its name

get_xyz(*molecules)
parse_output(stream)
prepare_nwfile(*molecules)
gaudi.objectives.nwchem.enable(**kwargs)
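A custom parser script, as accepted by the `parser` parameter, only needs a top-level `parse_output` function that returns a float. The sketch below returns the last total energy found in the output. It assumes the output is passed in as a text string and that energy lines look like `Total DFT energy = <value>`; both points should be checked against your own NWChem logs.

```python
import re

# Assumed line format: "Total <method> energy = <number>"; adjust to your output
_ENERGY = re.compile(r"Total\s+(?:\w+\s+)?energy\s*=\s*(-?\d+(?:\.\d+)?)")


def parse_output(stream):
    """Return the last total energy reported in `stream` (assumed to be a str)."""
    matches = _ENERGY.findall(stream)
    if not matches:
        raise ValueError("No total energy found in NWChem output")
    return float(matches[-1])
```

Taking the last match mirrors the default behaviour described above, where the final reported value is the one that counts.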

## Solvation objective

This objective calculates the solvent-accessible surface area (SASA) for the given system (or region).

class gaudi.objectives.solvation.Solvation(targets=None, threshold=0.0, radius=5.0, method='area', *args, **kwargs)

Solvation class

Parameters:
• targets ([str]) – Names of the molecule genes being analyzed
• threshold (float, optional, default=0) – Optimize the difference to this value
• radius (float, optional, default=5.0) – Max distance to search for neighbor atoms from targets.
• method (str, optional, default=area) – Which method should be used. Both methods compute the surface of the solvated molecule. area returns the area of that surface, while volume returns the volume occupied by the model.

Returns: Surface area of solvated shell, in Å² (if method=area), or volume of solvated shell, in Å³ (if method=volume).

Return type: float

evaluate_area(ind)
evaluate_volume(ind)
molecules(ind)
surface(ind)
targets(ind)
zone_atoms(probes, molecules)
gaudi.objectives.solvation.enable(**kwargs)
gaudi.objectives.solvation.grid_sas_surface(atoms, probe_radius=1.4, grid_spacing=0.5)

Stripped from Chimera’s Surface.gridsurf

## Vina objective

This objective is a wrapper around the scoring functions provided by AutoDock Vina.

class gaudi.objectives.vina.Vina(receptor='Protein', ligand='Ligand', prepare_each=False, *args, **kwargs)

Vina class

Parameters:
• receptor (str) – Key of the gene containing the molecule acting as receptor (protein)
• ligand (str) – Key of the gene containing the molecule acting as ligand
• prepare_each (bool) – Whether to prepare receptors and ligands in every evaluation or try to cache the results for faster performance.

Returns: Interaction energy in kcal/mol, as reported by AutoDock Vina --score_only.

Return type: float

Notes

• AutoDock scripts prepare_ligand4.py and prepare_receptor4.py are used to prepare the corresponding .pdbqt files that will be used as input for AutoDock Vina scorer.
• No repairs nor cleanups will be performed on ligand/receptor molecules, so the user must make sure that the provided .mol2 or .pdb files have correct atom types and correct structure (including hydrogen atoms, which will be taken into account in the docking evaluation). Otherwise, AutoDock errors/warnings could appear (e.g. ValueError: Could not find atomic number for Lp Lp)
• Gasteiger charges will be added during the preparation of the .pdbqt files.
• All torsions of the ligand will be marked as inactive for AutoDock, because torsion changes are part of GaudiMM genes.

clean()
evaluate(ind)

Run a subprocess calling Vina binary with provided options, and parse the results. Clean tmp files at exit.

parse_output(stream)
prepare_ligand(molecule)
prepare_receptor(molecule)
tmpfile
gaudi.objectives.vina.enable(**kwargs)

## Volume objective

This objective calculates the volume occupied by the requested Molecule gene instance.

Note

Volume is calculated from the surfacePiece created by a new experimental method found in Chimera’s Surface.gridsurf. This could be used as an objective for SES, instead of Solvation.

class gaudi.objectives.volume.Volume(threshold=0.0, target=None, cavities=False, *args, **kwargs)

Volume class

Parameters:
• threshold (float or 'auto') – Final volume to target. If ‘auto’, it will calculate the sum of VdW volumes of all requested atoms in probes. (Unimplemented!)
• target (list of str) – Molecule gene name to calculate volume over
• cavities (boolean, optional, default=False) – If True, evaluate cavities volume creating a convex hull and calculating the difference between convex hull volume and molecule volume

Returns: Calculated volume in Å³

Return type: float

evaluate_convexhull(ind)
evaluate_volume(ind)
target(ind)
gaudi.objectives.volume.convexhull_volume(surface)

This function gets a surface, creates the convex hull and calculates its volume

Parameters:
• surface (Surface.gridsurf.ses_surface(molecule.atoms))

Returns: volume – Convex hull volume

Return type: float

Notes

Some systems may produce small volume blobs, resulting in a number of different surface pieces. This should be discussed in the future.

    # points = surface.surfacePieces[0].geometry[0]
    # convexhull = scipy.spatial.ConvexHull(points)

gaudi.objectives.volume.enable(**kwargs)

## Base class for all objectives

Objectives are modules that reside in the gaudi.objectives package and follow a common class structure.

class gaudi.objectives.ObjectiveProvider(environment=None, name=None, weight=None, zone=None, precision=3, **kwargs)

Bases: object

Base class that every objectives plugin MUST inherit.

Mount point for plugins implementing new objectives to be evaluated by DEAP. The objective resides within the Fitness attribute of the individual. Do whatever you want, but use an evaluate() function to return the results. Apart from that, there are no requirements.

The base class includes some useful attributes, so don’t forget to call ObjectiveProvider.__init__ in your overridden __init__. For example, self.zone is a Chimera.selection.ItemizedSelection object which is shared among all objectives. Use that to get atoms in the surroundings of the target gene, and remember to self.zone.clear() it before use.

From [M.A. himself](http://martyalchin.com/2008/jan/10/simple-plugin-framework/): Now that we have a mount point, we can start stacking plugins onto it. As mentioned above, individual plugins will subclass the mount point. Because that also means inheriting the metaclass, the act of subclassing alone will suffice as plugin registration. Of course, the goal is to have plugins actually do something, so there would be more to it than just defining a base class, but the point is that the entire contents of the class declaration can be specific to the plugin being written. The plugin framework itself has absolutely no expectation for how you build the class, allowing maximum flexibility. Duck typing at its finest.
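The mount-point pattern described in the quote can be sketched in a few lines. The example below is written in Python 3 syntax and uses stand-in names (`PluginMount`, `ObjectiveBase`, `CountAtoms`) rather than the real GaudiMM classes, so treat it as an illustration of the registration mechanism only.

```python
class PluginMount(type):
    """Metaclass that records every subclass of the mount point."""

    def __init__(cls, name, bases, attrs):
        super().__init__(name, bases, attrs)
        if not hasattr(cls, "plugins"):
            # First class using the metaclass: this is the mount point itself
            cls.plugins = []
        else:
            # Any subclass registers itself simply by being defined
            cls.plugins.append(cls)


class ObjectiveBase(metaclass=PluginMount):
    """Stand-in for the base class; stores the attributes shared by all plugins."""

    def __init__(self, name=None, weight=None, **kwargs):
        self.name = name
        self.weight = weight

    def evaluate(self, individual):
        raise NotImplementedError


class CountAtoms(ObjectiveBase):
    """Toy objective: deviation of the number of items from a threshold."""

    def __init__(self, threshold=0, **kwargs):
        # Call the base __init__ so the shared attributes are set up
        super().__init__(**kwargs)
        self.threshold = threshold

    def evaluate(self, individual):
        return abs(len(individual) - self.threshold)


objective = CountAtoms(threshold=3, name="count", weight=-1.0)
score = objective.evaluate(["C1", "C2", "N1", "O1", "H1"])
```

After the class definitions run, `ObjectiveBase.plugins` contains `CountAtoms` without any explicit registration call, which is the behaviour the `plugins` attribute listed below reflects.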

classmethod clear_cache()
evaluate(individual)

Return the score of the individual under the current conditions.

plugins = [<class 'gaudi.objectives.angle.Angle'>, <class 'gaudi.objectives.contacts.Contacts'>, <class 'gaudi.objectives.coordination.Coordination'>, <class 'gaudi.objectives.distance.Distance'>, <class 'gaudi.objectives.dsx.DSX'>, <class 'gaudi.objectives.energy.Energy'>, <class 'gaudi.objectives.gold.Gold'>, <class 'gaudi.objectives.hbonds.Hbonds'>, <class 'gaudi.objectives.inertia.AxesOfInertia'>, <class 'gaudi.objectives.ligscore.LigScore'>, <class 'gaudi.objectives.nwchem.NWChem'>, <class 'gaudi.objectives.solvation.Solvation'>, <class 'gaudi.objectives.vina.Vina'>, <class 'gaudi.objectives.volume.Volume'>]
classmethod validate(data, schema=None)
classmethod with_validation(**kwargs)