Introduction


Protein complexes are involved in many important processes in a living cell. To understand the mechanisms of these processes, it is necessary to solve the 3D structures of the protein complexes. Experimental techniques such as X-ray crystallography, nuclear magnetic resonance (NMR), and cryo-electron microscopy have been used to solve the 3D structures of protein complexes, as reflected in the large number of complex structures deposited in the Protein Data Bank (PDB). When a protein complex structure has not been solved by experiment, computational tools can be used to construct models of the complex. A protein docking program takes two or more component protein structures as input and assembles them into 3D structure models of a protein complex. Input structures can be either experimentally solved or computationally modeled with protein structure prediction programs.

This server provides access to LZerD for pairwise protein docking and Multi-LZerD for docking 3 or more proteins simultaneously. As input, LZerD takes two protein structures, while Multi-LZerD takes 3 to 6 protein structures. Both methods output docked models of the input proteins. By combining a soft protein surface representation using 3D Zernike descriptors (which are based on a mathematical moment expansion of the shape function) with geometric hashing, LZerD and Multi-LZerD can quickly search the space of binding poses while tolerating some subunit flexibility, including side-chain flexibility.

Algorithm

LZerD pairwise docking

Pairwise docking by LZerD (Local 3D Zernike descriptor-based protein Docking) is computed by the following three steps:
  1. LZerD takes two structures provided by the user (called the receptor and the ligand) as input and generates tens of thousands of docking conformations, sampling all possible interface regions and interaction angles. A conformation is rejected if it has too many atom clashes, too small an interaction area, or low shape complementarity at the interface. In LZerD, a protein structure is represented by its molecular surface, which is segmented into overlapping local surface regions. Each local surface region is represented by a mathematical moment-based shape descriptor called the 3D Zernike descriptor (3DZD). The 3DZD is rotation-invariant, which makes computing shape complementarity fast. It also provides a “soft” representation of the surface, and is thus robust, to a certain degree, to the induced conformational changes that occur upon docking. The conformational search is performed with the geometric hashing algorithm. If the user provides constraints on residue-residue distances or interface residues, models that do not satisfy the constraints are rejected.
  2. The generated docking models are clustered with a user-defined cluster cutoff (the default is a root-mean-square deviation, RMSD, of 4 Angstroms). Typically, this step reduces the number of docking models to between a few thousand and a few tens of thousands, depending on the proteins and the cutoff.
  3. The remaining models are ranked by the sum of score ranks (ranksum) from 3 scoring functions: DFIRE, GOAP, and ITScore. These 3 scoring functions essentially check whether atom interactions in a model have distance and angle features similar to those observed in experimentally determined protein structures. If a model is ranked first by all three scoring functions, its ranksum is 1+1+1 = 3. Ranksum was shown to perform very well in docking model ranking in CAPRI protein docking assessments. On the docking results page, models are initially ranked by ranksum. Refinement is not currently applied to the models. Thus, the structures of the individual receptor and ligand are the same as those the user provided.
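Steps 2 and 3 above can be illustrated with a minimal Python sketch. It is not the LZerD implementation: the greedy "leader" clustering, the RMSD computed directly on ligand coordinates without superposition, the convention that a lower score is better, and all coordinates and score values are assumptions made for illustration only.

```python
import math

def rmsd(a, b):
    """RMSD between two equal-length lists of (x, y, z) points (no superposition)."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(total / len(a))

def cluster_models(models, cutoff=4.0):
    """Greedy leader clustering (an assumed scheme): keep a model only if it is
    farther than `cutoff` from every representative kept so far.
    Returns the indices of the kept representatives."""
    kept = []
    for i, coords in enumerate(models):
        if all(rmsd(coords, models[j]) > cutoff for j in kept):
            kept.append(i)
    return kept

def ranks(scores):
    """1-based ranks; the lowest score gets rank 1 (ties broken by input order)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def ranksum(score_table):
    """score_table maps a scoring-function name to one score per model.
    Returns the sum of per-function ranks for each model."""
    per_function = [ranks(s) for s in score_table.values()]
    return [sum(r) for r in zip(*per_function)]

# Hypothetical ligand coordinates for four docking models (two atoms each).
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)],    # 0.5 Angstroms from model 0
    [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)],
    [(0.0, 20.0, 0.0), (1.0, 20.0, 0.0)],
]
kept = cluster_models(models, cutoff=4.0)  # model 1 falls within 4 Angstroms of model 0

# Made-up scores for the three remaining models (assumed: lower is better).
scores = {
    "DFIRE":   [-120.0, -95.0, -101.0],
    "GOAP":    [-320.0, -310.0, -250.0],
    "ITScore": [-45.0, -40.0, -30.0],
}
totals = ranksum(scores)  # the first model is ranked first by all three, so its ranksum is 3
```

Sorting the remaining models by `totals` in ascending order gives the initial ordering described in step 3.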

Multi-LZerD multiple-chain docking

Multi-LZerD takes 3 or more protein structures as input and assembles all of them into complex structures.
  1. First, LZerD is used to generate pairwise docking models for every pair of input structures. For example, if 3 chains, A, B, and C, are input, then pairwise models are generated for A-B, A-C, and B-C. They are then clustered with a user-configurable RMSD cutoff (default 10 Å).
  2. Next, Multi-LZerD uses a genetic algorithm to combine pairwise models into full-chain models. In the genetic algorithm, different combinations of pairwise models are iteratively generated and selected. To select models during this process, a molecular mechanics force field that was specially trained for docking model selection is used. Finally, models are generated according to the user-configurable population size (default 200) and clustered with the same user-configurable cutoff as before.
  3. The resulting models are ranked by ranksum and presented on the results page. Refinement is not currently applied to the models.
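The first two steps above can be sketched in Python as follows. This is a toy illustration, not the Multi-LZerD code: the encoding (one pairwise-model index per chain pair) is a simplification that ignores geometric consistency, `toy_fitness` is a hypothetical stand-in for the trained force field, and every parameter other than the default population size of 200 is an assumption.

```python
import random
from itertools import combinations

def chain_pairs(chains):
    """All unordered chain pairs that need a pairwise LZerD run."""
    if len(chains) < 3:
        raise ValueError("Multi-LZerD expects 3 or more chains")
    return list(combinations(chains, 2))

def evolve(pairs, models_per_pair, fitness, population_size=200,
           generations=50, mutation_rate=0.1, seed=0):
    """Toy genetic algorithm. An individual is a tuple holding one
    pairwise-model index per chain pair; lower fitness is better."""
    rng = random.Random(seed)
    n = len(pairs)

    def random_individual():
        return tuple(rng.randrange(models_per_pair) for _ in range(n))

    population = [random_individual() for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the better half of the population as parents.
        population.sort(key=fitness)
        parents = population[: max(2, population_size // 2)]
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)            # one-point crossover
            child = list(a[:cut] + b[cut:])
            for k in range(n):                   # random mutation
                if rng.random() < mutation_rate:
                    child[k] = rng.randrange(models_per_pair)
            children.append(tuple(child))
        population = parents + children
    return sorted(population, key=fitness)

def toy_fitness(individual):
    """Hypothetical score: pretends lower model indices are better."""
    return sum(individual)

pairs = chain_pairs(["A", "B", "C"])   # A-B, A-C, and B-C
final = evolve(pairs, models_per_pair=10, fitness=toy_fitness,
               population_size=200, generations=20)
best = final[0]                        # best-scoring combination found
```

For n input chains there are n(n-1)/2 pairs, so 6 chains require 15 pairwise docking runs before the combination step begins.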
For more details, see the original papers listed in References.

Which docking method should you use?

                                LZerD      Multi-LZerD  IDP-LZerD
Available through webserver?    Yes        Yes          No
Available for download?         Yes, here  Yes, here    Yes, here
Can dock 2 subunits?            Yes        Yes          Yes
Can dock 3+ subunits?           No         Yes          No
Can dock a disordered subunit?  No         No           Yes
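The comparison above can also be read as a simple decision rule. The Python sketch below encodes one reading of it. The function name `choose_method`, the `ON_WEBSERVER` dictionary, and the choice to prefer LZerD over Multi-LZerD for exactly 2 ordered subunits are illustrative assumptions, not part of any LZerD tool.

```python
# Webserver availability as listed in the comparison above.
ON_WEBSERVER = {"LZerD": True, "Multi-LZerD": True, "IDP-LZerD": False}

def choose_method(n_subunits, has_disordered_subunit=False):
    """Pick a docking method from the number of subunits and whether
    one of them is disordered (hypothetical helper)."""
    if n_subunits < 2:
        raise ValueError("docking needs at least 2 subunits")
    if has_disordered_subunit:
        if n_subunits != 2:
            raise ValueError("no listed method docks 3+ subunits with a disordered subunit")
        return "IDP-LZerD"
    return "LZerD" if n_subunits == 2 else "Multi-LZerD"

method = choose_method(4)
available_online = ON_WEBSERVER[method]
```

Note that the webserver limits Multi-LZerD to 3 to 6 input structures; this sketch does not enforce that upper bound.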