Transcription factor
AF-G5EGB2-F1-v4
Information
Predicted aligned error (PAE)
[PAE plot; scale ticks from 0 to 30]
PAE data is useful for assessing inter-domain accuracy – go to Help section below for more information.
E-value: likelihood of a match between the query and target sequence in a structural alignment. The lower the E-value, the more significant the alignment.
Seq. identity: percentage of identical residues between aligned sequences over the aligned length.
Res. (Å): resolution. Indicates the level of detail present in the 3D structure. A smaller value means finer details of the structure and higher quality.
PDB ID and chain | Description | Species | Residue range | E-value | Seq. identity | Res. (Å) |
---|---|---|---|---|---|---|
2i13_A_3 | Zinc finger and SCAN domain-containing protein 2 | Mus musculus | 218 - 388 (Q07230) | | 39.2% | 1.96 |
5v3g_A_3 | C2H2-type domain-containing protein | Homo sapiens | 301 - 470 (D9IWL3) | | 37.1% | 2.42 |
8sss_D_1 | Transcriptional repressor CTCF | Homo sapiens | 263 - 465 (P49711) | | 27.2% | 2.30 |
5v3g_D_3 | C2H2-type domain-containing protein | Homo sapiens | 301 - 470 (D9IWL3) | | 36.4% | 2.42 |
8sst_A_1 | Transcriptional repressor CTCF | Homo sapiens | 263 - 465 (P49711) | | 28.6% | 2.19 |
2i13_B_3 | Zinc finger and SCAN domain-containing protein 2 | Mus musculus | 218 - 388 (Q07230) | | 43% | 1.96 |
6ml6_A_1 | Zinc finger and BTB domain-containing protein 24 | Mus musculus | 375 - 519 (Q80X44) | | 36.7% | 1.54 |
6ml4_A_1 | Zinc finger and BTB domain-containing protein 24 | Mus musculus | 375 - 519 (Q80X44) | | 36.7% | 1.48 |
8sss_A_1 | Transcriptional repressor CTCF | Homo sapiens | 263 - 465 (P49711) | | 31.6% | 2.30 |
5t0u_D_1 | Transcriptional repressor CTCF | Homo sapiens | 294 - 465 (P49711) | | 29.5% | 3.20 |
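The sequence identity column is defined as the percentage of identical residues over the aligned length. A minimal sketch of that calculation follows, assuming that "aligned length" means the alignment columns in which both sequences have a residue (no gap). The function name `percent_identity` and the two short aligned strings are hypothetical and are not taken from any entry in the table.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over the aligned (gap-free) columns.

    Assumption: columns where either sequence has a gap are excluded from
    the aligned length. Other tools may count gaps differently.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Keep only the columns where both sequences contribute a residue.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not pairs:
        return 0.0

    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned fragments, for illustration only.
identity = percent_identity("CKEC-GKAF", "CPECGGKSF")
print(f"{identity:.1f}%")
```

Because the gap convention changes the denominator, identities computed this way may differ slightly from the values reported by a given alignment tool.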
Structures are aligned using alpha carbon atoms as reference points. Once aligned, RMSD values will appear in the list below; lower values indicate greater similarity between the two structures.
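To make the RMSD figure concrete, here is a rough sketch of how an RMSD over matched alpha carbon coordinates can be computed after optimal superposition. The page does not say which superposition method the viewer uses; the Kabsch algorithm below is a standard choice and is an assumption here. The function name `kabsch_rmsd` and the toy coordinates are hypothetical, and the sketch presumes the two coordinate sets are already paired residue by residue.

```python
import numpy as np


def kabsch_rmsd(coords_a, coords_b) -> float:
    """RMSD between two paired (N, 3) coordinate sets after optimal rigid superposition.

    Assumption: row i of coords_a corresponds to row i of coords_b
    (for example, matched alpha carbons from a prior alignment).
    """
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("expected two arrays of identical shape (N, 3)")

    # Remove translation by centring both sets on their centroids.
    a_c = a - a.mean(axis=0)
    b_c = b - b.mean(axis=0)

    # Kabsch: SVD of the covariance matrix gives the optimal rotation.
    u, _, vt = np.linalg.svd(a_c.T @ b_c)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    a_rot = a_c @ rotation.T
    return float(np.sqrt(((a_rot - b_c) ** 2).sum(axis=1).mean()))


# Toy example: the second set is the first one rotated and translated,
# so the RMSD after superposition should be close to zero.
ca = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [3.8, 3.8, 0.0], [0.0, 3.8, 3.8]])
theta = np.pi / 3
rot_z = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
moved = ca @ rot_z.T + np.array([10.0, -5.0, 2.0])
rmsd_same = kabsch_rmsd(ca, moved)
```

A value near zero means the two sets superimpose almost exactly; larger values indicate greater structural divergence over the matched residues.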
AlphaFold database protein sequences are clustered with the MMseqs2 algorithm (Steinegger M. and Söding J., Nat. Commun. 9, 2018). Each cluster comprises sequences that fulfil two criteria: maintaining a maximum sequence identity of 50% and achieving a 90% bi-directional sequence overlap with the longest sequence of the cluster representative.
AFDB accession | Description | Species | Sequence length | Average pLDDT |
---|---|---|---|---|
AF-A0A5N6KJN1-F1 | C2H2-type domain-containing protein | Monilinia laxa | 427 | 56.47 |
AF-A0A6P4KS37-F1 | zinc finger protein 320 | Drosophila bipectinata | 431 | 55.69 |
AF-A0A6P5UKK7-F1 | zinc finger protein 84 | Drosophila obscura | 442 | 55.69 |
AF-A0A6I8UMJ6-F1 | zinc finger protein 28 | Drosophila pseudoobscura pseudoobscura | 442 | 55.41 |
AF-A0A6P7T363-F1 | zinc finger protein 184-like | Octopus vulgaris | 435 | 54.72 |
AF-A0A6J2XX53-F1 | zinc finger protein 484-like isoform X1 | Sitophilus oryzae | 477 | 53.94 |
AF-A0A0M4F656-F1 | CG42726 | Drosophila busckii | 419 | 52.72 |
AF-A0A199VD53-F1 | Zinc finger protein ZAT4 | Ananas comosus | 442 | 52.62 |
AF-A0A6V7QX22-F1 | Uncharacterized protein | Ananas comosus var. bracteatus | 522 | 51.75 |
AF-A0A6J2XVW3-F1 | putative zinc finger protein 840 isoform X3 | Sitophilus oryzae | 410 | 50.12 |
How to interpret the Predicted Aligned Error
The Predicted Aligned Error (PAE) measures the confidence in the relative position of two residues within the predicted structure, providing insight into the reliability of the relative positions and orientations of different domains.
Consider the human protein encoded by the gene GNE (Q9Y223). GNE has two distinct domains according to experimentally determined structures in the Protein Data Bank (PDBe-KB). Does AlphaFold confidently predict their relative positions? We can use the interactive PAE plot to answer this question.
The PAE plot is not an inter-residue distance map or a contact map. Instead, the shade of green indicates the expected distance error in Ångströms (Å), ranging from 0 Å to an arbitrary cut-off of 31 Å. The colour at (x, y) corresponds to the expected distance error in residue x's position when the predicted and the true structures are aligned on residue y.
The two low-error, dark green squares correspond to the two domains. By clicking and dragging, you can highlight these squares on the structure. If you want to remove the highlighting, click the cross icon.
When selecting an off-diagonal region, the plot visually represents the relationship between the selected ranges on the sequence and structure. The x range corresponds to the selection for scored residues, highlighted in orange, while the y range of aligned residues is highlighted in emerald green.
Let's consider another inter-domain example, the human protein encoded by DIP2B (Q9P265). In this case, we have confidence in the relative position of scored residues around 1450 when aligned with residues around 850, suggesting a packing between the small central domains.
Note that the PAE scores are asymmetrical, meaning the PAE values at positions (x, y) and (y, x) may differ. This is particularly relevant for loop regions with highly uncertain orientations, as seen in DNA topoisomerase 3 (Q8T2T7).
A dark green tile corresponds to a good prediction (low error), whereas a light green tile indicates poor prediction (high error). For example, when aligning on residue 300:
The high PAE values across the whole inter-domain region indicate that for this particular protein, AlphaFold does not reliably predict the relative position of the domains.
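The inter-domain reading of a PAE plot can also be done numerically. The sketch below averages PAE over a block of scored and aligned residues and shows that swapping the two ranges can give a different value, which is the asymmetry described in this section. Everything here is illustrative: the 6-residue matrix is synthetic, the domain boundaries are hypothetical, and the layout (rows index the aligned residue y, columns index the scored residue x) is an assumption that should be checked against the actual PAE file before use.

```python
import numpy as np


def mean_block_pae(pae, scored, aligned) -> float:
    """Mean PAE (in Å) of `scored` residues when aligned on `aligned` residues.

    Assumption: pae[y, x] is the expected error of scored residue x
    when the structures are aligned on residue y (0-based indices).
    """
    return float(pae[np.ix_(list(aligned), list(scored))].mean())


# Synthetic matrix: two 3-residue "domains" with low error inside each
# domain and high, asymmetric error between them.
pae = np.empty((6, 6))
pae[:3, :3] = 2.0    # within hypothetical domain A
pae[3:, 3:] = 3.0    # within hypothetical domain B
pae[:3, 3:] = 20.0   # aligned on A, scoring B
pae[3:, :3] = 28.0   # aligned on B, scoring A

domain_a = range(0, 3)
domain_b = range(3, 6)

within_a = mean_block_pae(pae, scored=domain_a, aligned=domain_a)
b_given_a = mean_block_pae(pae, scored=domain_b, aligned=domain_a)
a_given_b = mean_block_pae(pae, scored=domain_a, aligned=domain_b)

print(f"within A: {within_a:.1f} Å")
print(f"B scored, aligned on A: {b_given_a:.1f} Å")
print(f"A scored, aligned on B: {a_given_b:.1f} Å")
```

Low within-domain means combined with high between-domain means correspond to the pattern described above: confident domains whose relative position is not reliably predicted.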
Last updated
Last updated in AlphaFold DB version 2022-11-01, created with the AlphaFold Monomer v2.0 pipeline.
Licence and attribution
Data is available for academic and commercial use, under a CC-BY-4.0 licence.
EMBL-EBI expects attribution (e.g. in publications, services or products) for any of its online services, databases or software in accordance with good scientific practice.
If you make use of an AlphaFold prediction, please cite the following papers: Jumper, J et al. Highly accurate protein structure prediction with AlphaFold. Nature (2021).
Varadi, M et al. AlphaFold Protein Structure Database in 2024: providing structure coverage for over 214 million protein sequences. Nucleic Acids Research (2024).
If you use data from AlphaMissense in your work, please cite the following paper: Cheng, J et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science (2023).
AlphaFold Data Copyright (2022) DeepMind Technologies Limited.
AlphaMissense Copyright (2023) DeepMind Technologies Limited.
Feedback and questions
If you want to share feedback on an AlphaFold structure prediction, consider using the feedback buttons at the top of each structure page. If you have any questions that are not covered in the FAQs, please contact alphafold@deepmind.com. If you have feedback on the website or experience any bugs, please contact afdbhelp@ebi.ac.uk.
Let us know how the AlphaFold Protein Structure Database has been useful in your research at alphafold@deepmind.com.
Disclaimer
The AlphaFold and AlphaMissense Data and other information provided on this site contain predictions with varying levels of confidence, is for theoretical modelling only and caution should be exercised in its use. It is provided 'as-is' without any warranty of any kind, whether expressed or implied. For clarity, no warranty is given that use of the information shall not infringe the rights of any third party. The information is not intended to be a substitute for professional medical advice, diagnosis, or treatment, and does not constitute medical or other professional advice. The AlphaFold and AlphaMissense Data have not been validated for, and are not approved for, any clinical use.
Use of the AlphaFold Protein Structure Database is subject to EMBL-EBI Terms of Use.