To investigate the co-localization of transcription factors (TFs), modENCODE ChIP-chip data sets for 25 factors were integrated with ChIP-chip data sets for 16 additional factors produced by the Berkeley Drosophila Transcription Network Project (BDTNP, FBrf0192397). Data sets generated for the same factor were merged, and the union was used for further analysis. Highly occupied target (HOT) regions were identified by Gaussian kernel density estimation across the genome with a bandwidth of 300 bp, using the center of each TF binding peak as a point. The density was then scanned for peaks, and each peak was denoted a HOT region. To determine the complexity of a HOT region, the sum of the Gaussian-kernelized distances from the peak to each transcription factor that contributed at least 0.1 to the peak's strength was calculated. The reported window around each HOT peak was derived by finding the maximum distance (in bp) from the HOT peak to a contributing TF and then adding 150 bp (one half of the bandwidth). Each window is centered on the HOT peak. A TF complexity score was assigned to each of 38,562 distinct TF binding sites, corresponding to the number of distinct TFs bound (from 1 to ~21). Of these distinct TF binding sites, a subset of 1,962 HOT regions (hotspots) had a TF complexity of eight or greater.
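The procedure described above can be illustrated with a minimal Python sketch. It is not the pipeline used to produce the data set, and several details are assumptions made for illustration: the 300 bp bandwidth is treated as the standard deviation of the Gaussian, the kernel is left unnormalized so that a binding peak contributes exactly 1.0 at its own center, the density is evaluated on a 10 bp grid over a single chromosome, and the names `kernel` and `find_hot_regions` are hypothetical. The sketch reports both the kernel-weighted sum and the count of distinct contributing TFs for each HOT peak.

```python
import numpy as np

BANDWIDTH = 300        # bp, from the text; treated here as the Gaussian sigma (assumption)
MIN_CONTRIB = 0.1      # minimum contribution to a HOT peak, from the text
PAD = BANDWIDTH // 2   # 150 bp added to each side of the window, from the text


def kernel(dist):
    """Unnormalized Gaussian kernel: equals 1.0 at distance 0 (assumption)."""
    return np.exp(-0.5 * (np.asarray(dist, dtype=float) / BANDWIDTH) ** 2)


def find_hot_regions(centers, factors, chrom_len, step=10):
    """Hypothetical helper: locate HOT peaks on one chromosome.

    centers   -- center coordinate (bp) of each TF binding peak
    factors   -- TF name for each entry in `centers`
    chrom_len -- chromosome length in bp
    step      -- grid spacing in bp used to evaluate the density (assumption)
    """
    centers = np.asarray(centers, dtype=float)
    grid = np.arange(0, chrom_len + 1, step, dtype=float)

    # Density at each grid point: sum of kernel contributions from all peaks.
    # This builds a (grid x peaks) matrix, so it only suits small inputs.
    density = kernel(grid[:, None] - centers[None, :]).sum(axis=1)

    # Interior local maxima of the density are the candidate HOT peaks.
    is_max = (density[1:-1] > density[:-2]) & (density[1:-1] >= density[2:])

    regions = []
    for pos in grid[1:-1][is_max]:
        contrib = kernel(pos - centers)
        keep = contrib >= MIN_CONTRIB          # TFs contributing at least 0.1
        if not keep.any():
            continue
        # Window half-width: farthest contributing TF plus half the bandwidth.
        half = np.abs(centers[keep] - pos).max() + PAD
        regions.append({
            "peak": int(pos),
            "complexity": float(contrib[keep].sum()),
            "n_factors": len({f for f, k in zip(factors, keep) if k}),
            "start": int(max(0, pos - half)),
            "end": int(min(chrom_len, pos + half)),
        })
    return regions


# Toy input: three factors clustered near 1 kb and one isolated peak at 5 kb.
regions = find_hot_regions([1000, 1050, 1100, 5000], ["A", "B", "C", "A"], 6000)
for region in regions:
    print(region)
```

In the toy input the clustered peaks merge into a single HOT peak with three contributing factors, while the isolated peak forms its own region with one. A genome-scale run would need a sparser evaluation strategy than the dense matrix used here.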