Server Benchmark Results (2015)

EVfold.org: Evolutionary Couplings and Protein 3D Structure Prediction
Robert Sheridan, Robert J. Fieldhouse, Sikander Hayat, Yichao Sun, Yevgeniy Antipin, Li Yang, Thomas Hopf, Debora S. Marks, Chris Sander
bioRxiv doi: http://dx.doi.org/10.1101/021022

Links:
View preprint at biorxiv.org
Download pdf

 

Transmembrane Protein 3D Structures - Predicted by Evolutionary Couplings

New EVfold Predictions for α-Helical Transmembrane Proteins

This page links to our most recent EVfold runs for α-helical membrane proteins (EVfold - Membrane 2.0), including updated results for all the proteins presented in our 2012 paper (Hopf et al 2012). We recommend using these rather than the older results: wherever a comparison to a known crystal structure is possible (our benchmark set), all evolutionary couplings (ECs) agree better with real contacts and all predicted 3D structures are closer to the crystal structures. The improvement comes from the improved algorithm and the larger number of sequences now available.


Come back to this site often.
This site will be updated regularly to include all α-helical transmembrane protein families that have enough diverse sequences.

Comparison to known structures:

Evolutionary couplings and predicted 3D folds for membrane proteins compared to known 3D structures.
 

Predictions for unknown structures:

Evolutionary couplings and predicted 3D folds for proteins whose structure is unknown, both for the specific protein and for every other member of its family. The high-ranked EC residue pairs may also be informative about functional sites.

 

Sequence co-evolution gives 3D contacts and structures of protein complexes (2014)

You can download the 2014 publication and supplementary materials.

Sequence co-evolution gives 3D contacts and structures of protein complexes
Thomas A Hopf, Charlotta P I Schärfe, João P G L M Rodrigues, Anna G Green, Oliver Kohlbacher, Chris Sander, Alexandre M J J Bonvin, Debora S Marks
eLife 2014;3:e03430 September 25, 2014 http://dx.doi.org/10.7554/eLife.03430

Links:
View publication on journal web site.
Download PDF


Supplementary Files for EVcomplex

  1. File_1 : Benchmark data set and results
  2. File_2 : De novo prediction data set and results
  3. File_3 : Docking results
  4. File_4 : Predicted inter-ECs for complexes in de novo prediction data set with EVcomplex score >= 0.8
  5. File_5 : ATP synthase interaction predictions
  6. File_6 : Comparison of ATP synthase EVcomplex predictions of a and b subunit with cross-linking studies
  7. File_7 : PDB identifiers used for comparison of predicted evolutionary couplings to known 3D structures

Table 1. EVcomplex predictions and docking results for 15 protein complexes

Columns "ECs" and "TP rate" are EVcomplex contacts; columns "Top ranked model" and "Best model" are docking quality (RMSD).

Complex name | Subunits | Seqs | ECs | TP rate | Top ranked model | Best model
Carbamoyl-phosphate synthase | CarB:CarA | 2.3 | 17 | 0.88 | 1.9 | 1.9
Aminomethyltransferase/Glycine cleavage system H protein | GcsH:GcsT | 2.9 | 5 | 0.2 | 5.4 | 5.4
Histidine kinase/response regulator | KdpD:CheY (T. maritima) | 95.4 | 78 | 0.72 | 2.1 | 2.0
Ubiquinol oxidase | CyoB:CyoA | 1.0 | 11 | 0.55 | 1.8 | 1.2
Outer membrane usher protein/Chaperone protein | FimD:FimC | 3.6 | 6 | 0.83 | 3.2 | 3.0
Molybdopterin synthase | MoaD:MoaE | 3.6 | 8 | 1.0 | 4.4 | 4.1
Methionine transporter complex | MetN:MetI | 1.9 | 14 | 0.86 | 1.5 | 1.2
Dihydroxyacetone kinase | DhaL:DhaK | 1.4 | 12 | 0.42 | 6.7 | 2.4
Vitamin B12 uptake system | BtuC:BtuF | 3.2 | 5 | 0.6 | 2.8 | 2.8
Vitamin B12 uptake system | BtuC:BtuD | 9.8 | 21 | 0.88 | 1.1 | 0.9
ATP synthase γ and ε subunits | AtpE:AtpG | 2.9 | 15 | 0.53 | 1.4 | 1.4
IIA-IIB complex of the N,N'-diacetylchitobiose (Chb) transporter | PtqA:PtqB | 3.1 | 5 | 0.2 | 7.2 | 5.5
30 S Ribosomal proteins | RS3:RS14 | 1.4 | 11 | 0.91 | 1.1 | 1.1
Succinate:quinone oxido-reductase flavoprotein/iron-sulfur subunits | SdhB:SdhA | 3.0 | 8 | 0.62 | 1.4 | 1.4
30 S Ribosomal proteins | RS10:RS14 | 1.2 | 6 | 1.0 | 5.3 | 2.5

  • Column "Seqs" : Number of non-redundant sequences in concatenated alignment normalized by alignment length.
  • Column "ECs" : Inter-ECs with EVcomplex score >= 0.8.
  • Column "TP rate" : True Positive rate for inter-ECs above score threshold.
  • Column "Top ranked model" : iRMSD positional deviation of model from known structure, for docked model with best HADDOCK score.
  • Column "Best model" : Lowest iRMSD observed across all models.
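
The "ECs" and "TP rate" columns can be illustrated with a short Python sketch. It assumes that the TP rate is the fraction of inter-ECs at or above the score threshold that are true contacts in the known structure, which is consistent with rows such as 5 ECs with a TP rate of 0.2. The function name and the example scores are hypothetical and are not taken from the supplementary files.

```python
def ec_summary(inter_ecs, threshold=0.8):
    """Return (number of inter-ECs with score >= threshold, TP rate among them).

    inter_ecs: iterable of (evcomplex_score, is_true_contact) tuples.
    The TP rate is None when no inter-EC reaches the threshold.
    """
    selected = [is_contact for score, is_contact in inter_ecs if score >= threshold]
    if not selected:
        return 0, None
    return len(selected), sum(selected) / len(selected)


# Hypothetical inter-ECs: five reach the 0.8 threshold, one of them is a true contact.
example = [(1.30, True), (1.10, False), (0.95, False),
           (0.90, False), (0.81, False), (0.60, True)]
n_ecs, tp_rate = ec_summary(example)
print(n_ecs, tp_rate)
```

Pairs below the threshold, such as the last one in the example, do not count toward either column.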

 

3D structures of helical transmembrane proteins (2012) - EVfold_membrane

You can download the 2012 publication, supplementary materials and data files.

Data files include, for known and unknown structures:

  • multiple sequence alignments for protein domain family
  • tables of derived evolutionary constraints
  • predicted contact maps
  • predicted all-atom coordinates

For benchmark set of known structures only:

  • evaluation of prediction accuracy

Three-Dimensional Structures of Membrane Proteins from Genomic Sequencing
Thomas A. Hopf, Lucy J. Colwell, Robert Sheridan, Burkhard Rost, Chris Sander, Debora S. Marks
Cell (2012), doi: 10.1016/j.cell.2012.04.012

Links:
View publication on journal web site.
Download PDF

2011 Publication: EVfold benchmark for 15 protein domains


Data Downloads for Membrane Proteins

Assembled top ranked structure models with related files across all reported protein domains
evfold_membrane_top_models_with_related_files.tar.gz

All structure models across all reported protein domains
evfold_membrane_all_models.tar.gz


Individual protein domains blindly predicted with experimentally KNOWN structures

protein_name | hhblits_parameters | download link
ADIC_SALTY | e20_n2_m30_f0 | download_results
ADRB2_HUMAN | e20_n2_m30_f0 | download_results
ADT1_BOVIN | e40_n2_m30_f0 | download_results
AMTB_ECOLI | e5_n2_m30_f0 | download_results
AQP4_HUMAN | e10_n2_m30_f0 | download_results
BTUC_ECOLI | e10_n2_m30_f0 | download_results
C3NQD8_VIBCJ | e20_n2_m30_f0 | download_results
C6E9S6_ECOBD | e10_n2_m80_f0 | download_results
COX1_BOVIN | e40_n2_m80_f0 | download_results
COX3_BOVIN | e3_n2_m30_f0 | download_results
CYB_BOVIN | e3_n2_m80_f0 | download_results
FIEF_ECOLI | e5_n2_m30_f0 | download_results
GLPG_ECOLI | e5_n2_m30_f0 | download_results
GLPT_ECOLI | e30_n2_m30_f0 | download_results
METI_ECOLI | e15_n2_m30_f0 | download_results
MIP_BOVIN | e10_n2_m30_f0 | download_results
MSBA_SALTY | e3_n2_m30_f0 | download_results
O67854_AQUAE | e3_n2_m30_f0 | download_results
OPSD_BOVIN | e20_n2_m30_f0 | download_results
Q87TN7_VIBPA | e10_n2_m30_f0 | download_results
Q8EKT7_SHEON | e10_n2_m30_f0 | download_results
Q9K0A9_NEIMB | e10_n2_m30_f0 | download_results
SGLT_VIBPA | e5_n2_m30_f0 | download_results
TEHA_HAEIN | e3_n2_m30_f0 | download_results
URAA_ECOLI | e3_n2_m30_f0 | download_results

Individual protein domains of UNKNOWN structure

protein_name | hhblits_parameters | download link
ABCG2_HUMAN | e10_n2_m30_f0 | download_results
ADR1_HUMAN | e5_n2_m30_f0 | download_results
B1B3L4_STAAU | e10_n2_m50_f0 | download_results
CCG1_HUMAN | e3_n2_m30_f0 | download_results
CTNS_HUMAN | e3_n2_m50_f0 | download_results
EAMA_ECOLI | e5_n2_m30_f0 | download_results
ELOV4_HUMAN | e3_n2_m30_f0 | download_results
GABR1_HUMAN | e5_n2_m30_f0 | download_results
LIVH_ECOLI | e3_n2_m30_f0 | download_results
MSMO1_HUMAN | e20_n2_m30_f0 | download_results
NAC1_HUMAN | e15_n2_m40_f0 | download_results
NU1M_HUMAN | e10_n2_m40_f0 | download_results
S13A1_HUMAN | e20_n2_m30_f0 | download_results
S22A4_HUMAN | e30_n2_m30_f0 | download_results
S5A1_HUMAN | e5_n2_m70_f0 | download_results
SL9A1_HUMAN | e10_n2_m30_f0 | download_results
TSPOA_HUMAN | e5_n2_m30_f0 | download_results
VIAAT_HUMAN | e5_n2_m50_f0 | download_results
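
When working with these tables in a script, it can help to split the hhblits_parameters labels into their parts. The Python sketch below does only that. The function name is hypothetical, and because this page does not say what the letters e, n, m and f stand for, the sketch leaves them uninterpreted.

```python
import re


def parse_params(label):
    """Split a label such as 'e20_n2_m30_f0' into a {letter: integer} dict."""
    params = {}
    for token in label.split("_"):
        match = re.fullmatch(r"([a-z])(\d+)", token)
        if match is None:
            raise ValueError(f"unexpected token {token!r} in {label!r}")
        params[match.group(1)] = int(match.group(2))
    return params


print(parse_params("e20_n2_m30_f0"))
print(parse_params("e5_n2_m70_f0"))
```

Parsed this way, rows can be grouped or filtered by any one parameter, for example to list the proteins whose label differs from the most common m30 setting.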

You may download the code for calculating evolutionary constraints from multiple sequence alignments. Please also request to be notified of code updates.

 

EVfold benchmark for 15 protein domains (2011)

You can download the 2011 publication, supplementary materials and data files.

Data files include:

  • multiple sequence alignments for protein domain family
  • tables of derived evolutionary constraints
  • predicted contact maps
  • predicted all-atom coordinates
  • evaluation of prediction accuracy

Protein 3D structure computed from evolutionary sequence variation.
Debora S. Marks*, Lucy J. Colwell*, Robert Sheridan, Thomas A. Hopf, Andrea Pagnani, Riccardo Zecchina, Chris Sander.
PLoS One. 2011;6(12):e28766. Epub 2011 Dec 7.
*Joint first

Links:
View Publication on Journal Website
Download Publication PDF
Download Supplementary Text and Figures
News reports: Sciencedaily   HMS_News   Eurekalert   Blog/Becky.Ward   Blog/Bosco.K.Ho   CurrentBiology

2012 Publication on Predicting Unknown 3D Structures of Transmembrane Proteins:


Supplemental Data Downloads

Appendix Name : Description
Appendix_A1 : 1. All DI scores 2. Top 500 filtered and ranked DI scores [1]
Appendix_A2 : Distance geometry and simulated annealing protocol
Appendix_A3 : All predicted structure coordinates for all proteins
Appendix_A4 : Pymol sessions for a. All EIC top ranked predicted structures, the best TM score structure and a reference crystal structure. b. Best TM score BNM method predicted structures.
Appendix_A5 : Discrimination scores
Appendix_A6 : TM and GDT_TS scores
Appendix_A7 : Pymol rmsd scores
Appendix_A8 : DI, MI, BNM and SCA top 500 scores plus (i) cys, conservation and secondary structure clashes flagged, (ii) mapping to a crystal structure, (iii) distance between predicted residues in a reference crystal structure
Appendix_A9 : Sequence, secondary structure prediction and residue mapping to crystal structure
Appendix_A10 : Atom types used in applying residue-residue distance constraints (in addition to C-alpha and C-beta)

[1] Selection of those rows with no annotation in columns 7, 8 or 9 gives EICs

Note: The Trypsin content in the appendices above reflects a minor method adjustment. See below for an explanation and additional resources.

You may download the PFAM sequence alignments we used for calculating constraints, and the code for calculating evolutionary constraints from multiple sequence alignments. Please also request to be notified of code updates.

All input files for CNS_solve are available on request. These include formatted distance and dihedral constraints and the extended polypeptide coordinates.

Higher resolution versions of the main text figures can be downloaded here: main text figures


Useful ID Mappings

Pfam Family Id | Pfam Family Name | Uniprot Id | Uniprot Name | Pdb Id
PF00001 | 7tm_1 | P02699 | OPSD_BOVIN | 1hzx
PF00089 | Trypsin | P00763 | TRY2_RAT | 3tgi
PF00071 | Ras | P01112 | RASH_HUMAN | 5p21
PF00075 | RNase_H | P0A7Y4 | RNH_ECOLI | 1f21
PF00072 | Response_reg | P0AE67 | CHEY_ECOLI | 1e6k
PF00307 | CH | Q01082 | SPTB2_HUMAN | 1bkr
PF00059 | Lectin_C | Q9NNX6 | A8MVQ9_HUMAN | 2it6
PF00085 | Thioredoxin | P80579 | THIO_ALIAC | 1rqm
PF00028 | Cadherin | P12830 | CADH1_HUMAN | 2o72
PF00254 | FKBP_C | O45418 | O45418_CAEEL | 1r9h
PF00486 | Trans_reg_C | P0AA16 | OMPR_ECOLI | 1odd
PF00076 | RRM_1 | P26378 | ELAV4_HUMAN | 1g2e
PF00013 | KH_1 | Q15365 | PCBP1_HUMAN | 1wvn
PF00014 | Kunitz_BPTI | P00974 | BPT1_BOVIN | 5pti
PF00018 | SH3_1 | P07947 | YES_HUMAN | 2hda
 

Appendix A3

Appendix Subname : Description
Appendix_A3_PF00001_P02699 : coordinates of predicted structures for OPSD_BOVIN
Appendix_A3_PF00089_P00763 : coordinates of predicted structures for TRY2_RAT
Appendix_A3_PF00071_P01112 : coordinates of predicted structures for RASH_HUMAN
Appendix_A3_PF00075_P0A7Y4 : coordinates of predicted structures for RNH_ECOLI
Appendix_A3_PF00072_P0AE67 : coordinates of predicted structures for CHEY_ECOLI
Appendix_A3_PF00307_Q01082 : coordinates of predicted structures for SPTB2_HUMAN
Appendix_A3_PF00059_Q9NNX6 : coordinates of predicted structures for A8MVQ9_HUMAN
Appendix_A3_PF00085_P80579 : coordinates of predicted structures for THIO_ALIAC
Appendix_A3_PF00028_P12830 : coordinates of predicted structures for CADH1_HUMAN
Appendix_A3_PF00254_O45418 : coordinates of predicted structures for O45418_CAEEL
Appendix_A3_PF00486_P0AA16 : coordinates of predicted structures for OMPR_ECOLI
Appendix_A3_PF00076_P26378 : coordinates of predicted structures for ELAV4_HUMAN
Appendix_A3_PF00013_Q15365 : coordinates of predicted structures for PCBP1_HUMAN
Appendix_A3_PF00014_P00974 : coordinates of predicted structures for BPT1_BOVIN
Appendix_A3_PF00018_P07947 : coordinates of predicted structures for YES_HUMAN
 

Appendix A4

Appendix Subname : Description
Appendix_A4_EIC_DI : pymol sessions for main results (DI/EIC calculations)
Appendix_A4_BNM : pymol session files for BNM score predictions
 

Appendix A8

Appendix Subname : Description
Appendix_A8_DI.tar.gz : top ranked DI predictions with distance in crystal structure between predicted pairs
Appendix_A8_MI.tar.gz : top ranked MI predictions with distance in crystal structure between predicted pairs
Appendix_A8_SCA.tar.gz : top ranked SCA predictions with distance in crystal structure between predicted pairs
Appendix_A8_BNM.tar.gz : top ranked BNM predictions with distance in crystal structure between predicted pairs
 

Appendices: Trypsin Notes

Structure computations for Trypsin included in the Appendices were done using a slight adjustment to the method. After the DI scores were computed for all pairs of amino acids in the domain with sequence position difference greater than five, all pairings of Cysteine to Cysteine which ranked in the top 500 pairs were examined.

first pos | second pos | DI score | rank | subsequent pairing flag
48 | 64 | 0.5659 | 1 |
196 | 220 | 0.35731 | 2 |
171 | 185 | 0.35237 | 3 |
139 | 206 | 0.3067 | 5 |
185 | 196 | 0.10561 | 14 | paired again
185 | 220 | 0.047089 | 44 | paired again
64 | 185 | 0.045677 | 46 | paired again
171 | 220 | 0.031797 | 93 | paired again
64 | 196 | 0.027111 | 118 | paired again
64 | 171 | 0.027015 | 120 | paired again
30 | 160 | 0.022444 | 173 |
171 | 196 | 0.021539 | 218 | paired again
48 | 171 | 0.018447 | 238 | paired again
132 | 233 | 0.017758 | 255 |
64 | 206 | 0.012812 | 483 | paired again

Cysteine to Cysteine pairings involving an amino acid that was already paired in a higher-ranking pair were flagged and filtered from the list. The remaining six Cysteine to Cysteine pairs were promoted to the top of the overall DI score rankings by replacing their DI scores with values greater than any other DI score in the table, preserving the relative order of these six pairings:

first pos | second pos | replacement score | new rank
48 | 64 | 0.99999 | 1
196 | 220 | 0.99998 | 2
171 | 185 | 0.99997 | 3
139 | 206 | 0.99996 | 4
30 | 160 | 0.99995 | 5
132 | 233 | 0.99994 | 6

By doing this we were able to use the same processing pipeline while prioritizing the presumed disulfide bonding.
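
The flag-and-promote step described above can be sketched in Python. This is a minimal illustration, not the code used to produce the appendices. The function names are hypothetical, the input holds only the positions and ranks from the first table, and the sketch assumes a residue counts as "paired" once it appears in a pair that was kept.

```python
# Cys-Cys pairs from the first Trypsin table: (first pos, second pos, rank).
CYS_PAIRS = [
    (48, 64, 1), (196, 220, 2), (171, 185, 3), (139, 206, 5),
    (185, 196, 14), (185, 220, 44), (64, 185, 46), (171, 220, 93),
    (64, 196, 118), (64, 171, 120), (30, 160, 173), (171, 196, 218),
    (48, 171, 238), (132, 233, 255), (64, 206, 483),
]


def filter_cys_pairs(pairs):
    """Keep a pair only if neither residue is in a higher-ranked kept pair."""
    used = set()
    kept = []
    for first, second, _rank in sorted(pairs, key=lambda p: p[2]):
        if first in used or second in used:
            continue  # corresponds to the "paired again" flag
        kept.append((first, second))
        used.update((first, second))
    return kept


def promote(kept, top_score=0.99999, step=0.00001):
    """Assign descending replacement scores so the kept pairs keep their order."""
    return [(first, second, round(top_score - i * step, 5), i + 1)
            for i, (first, second) in enumerate(kept)]


kept_pairs = filter_cys_pairs(CYS_PAIRS)
promoted = promote(kept_pairs)
for row in promoted:
    print(row)
```

Run on the fifteen pairs listed above, the sketch keeps the same six pairs and assigns the same replacement scores as the second table.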

For the appendices affected by this change in method, below are links to the appendix content obtained when the standard method is applied without the adjustment.

Appendix Name : Description
Appendix_A1_ST : 1. All DI scores 2. Top 500 filtered and ranked DI scores [1] for Trypsin
Appendix_A3_ST : All predicted structure coordinates for Trypsin
Appendix_A7_ST : Pymol rmsd scores for Trypsin
Appendix_A8_DI_ST : DI top 500 scores plus (i) cys, conservation and secondary structure clashes flagged, (ii) mapping to a crystal structure, (iii) distance between predicted residues in a reference crystal structure
Appendix_A9_ST : Sequence, secondary structure prediction and residue mapping to crystal structure

[1] Selection of those rows with no annotation in columns 7, 8 or 9 gives EICs