1. Introduction
Primary usage: identification of cis-regulatory elements. Candidate sites are first identified by matrix scoring and then scored on 7 additional types of contextual data. The analysis is based on protein-coding transcripts in the Ensembl database.
- This work is a derivative of “Transcription factors” by kelvin13, used under CC BY 3.0.
Note
TFBS_footprinting is now available in a Docker image.
Predict TFBSs in the promoters of anywhere from 1 to 80,000 human protein-coding transcripts in the Ensembl database. TFBS predictions can also be made for 87 unique non-human species (including model organisms such as mouse and zebrafish), present in the following groups (which overlap, since primates are also eutherian mammals):
- 70 Eutherian mammals
- 24 Primates
- 11 Fish
- 7 Sauropsids
The TFBS footprinting method computationally predicts transcription factor binding sites (TFBSs) in a target species (e.g. Homo sapiens) using 575 position weight matrices (PWMs) based on binding data from the JASPAR database. Additional experimental data from a variety of sources are used to support or weaken these predictions:
- DNA sequence conservation in homologous sequences from other mammal species
- proximity to CAGE-supported transcription start sites (TSSs)
- correlation of expression between target gene and predicted transcription factor (TF) across 1800+ samples
- proximity to ChIP-Seq-determined TFBSs (GTRD project)
- proximity to expression quantitative trait loci (eQTLs) affecting expression of the target gene (GTEx project)
- proximity to CpGs
- proximity to ATAC-Seq peaks (ENCODE project)
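The initial matrix-scoring step can be illustrated with a minimal sketch. This is not the TFBS_footprinting implementation: the count matrix is a made-up 4-bp example rather than a JASPAR profile, the uniform background frequency and pseudocount are assumptions, the function names `build_pwm` and `scan` are hypothetical, and only the forward strand is scanned. The contextual scores listed above are not modelled here.

```python
import math

# Hypothetical 4-position count matrix (not a real JASPAR profile).
# Each list holds the counts of one base at positions 0..3 of the motif.
COUNTS = {
    "A": [8, 0, 0, 1],
    "C": [0, 9, 0, 1],
    "G": [1, 0, 10, 0],
    "T": [1, 1, 0, 8],
}
BACKGROUND = 0.25   # assumed uniform base frequency
PSEUDOCOUNT = 0.8   # assumed smoothing value to avoid log(0)


def build_pwm(counts, background=BACKGROUND, pseudocount=PSEUDOCOUNT):
    """Convert a count matrix into per-position log2-odds weights."""
    width = len(counts["A"])
    pwm = []
    for i in range(width):
        total = sum(counts[base][i] for base in "ACGT") + pseudocount
        column = {}
        for base in "ACGT":
            freq = (counts[base][i] + pseudocount * background) / total
            column[base] = math.log2(freq / background)
        pwm.append(column)
    return pwm


def scan(sequence, pwm):
    """Score every forward-strand window; return (start, site, score) tuples."""
    width = len(pwm)
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits


if __name__ == "__main__":
    pwm = build_pwm(COUNTS)
    hits = scan("TTACGTAA", pwm)
    best = max(hits, key=lambda hit: hit[2])
    print(best)
```

In a full pipeline, each candidate hit from a scan like this would then be weighed against the contextual evidence (conservation, CAGE, expression correlation, and so on) before being reported.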