Genome-wide histone modification distributions in S2-DRSC and ML-DmBG3-c2 cells, as determined by ChIP-chip, were used in this analysis. Only ChIP data from antibodies that showed less than 50% of total signal associated with non-histone proteins, and more than fivefold higher affinity for the corresponding histone peptide, were considered. To derive the chromatin state model, the genome was divided into 200 bp bins, and the average enrichment in each bin was calculated from unsmoothed log2 intensity ratio values. For H1, H4 and H3K23ac, regions of significant depletion rather than enrichment were called. Polycomb (Pc) distribution was used to discount the genome-wide difference in S2 H3K27me3 profiles. Bin-average values of each modification were shifted by the genome-wide mean, scaled by the genome-wide variance, and quantile-normalized between the two cell types. A hidden Markov model (HMM) with multivariate normal emission distributions was then fitted with the Baum–Welch algorithm, using data from both cell types and 30 seeding configurations determined by K-means clustering. States with minor intensity variations (Euclidean distance between mean emission values < 0.15) were merged.
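The binning and normalization steps described above can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the treatment of empty bins as NaN, and the requirement that both cell types supply equal-length, gap-free vectors for quantile normalization are assumptions made for illustration. Scaling divides by the variance, as the text states.

```python
import numpy as np

BIN_SIZE = 200  # bp, the bin width given in the text


def bin_average(positions, log2_ratios, chrom_length, bin_size=BIN_SIZE):
    """Average unsmoothed log2 ratios of the probes falling in each bin.

    Bins that contain no probes are returned as NaN (an assumption).
    """
    n_bins = int(np.ceil(chrom_length / bin_size))
    idx = np.asarray(positions, dtype=np.int64) // bin_size
    sums = np.bincount(idx, weights=log2_ratios, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts


def shift_and_scale(values):
    """Subtract the genome-wide mean and divide by the genome-wide variance."""
    values = np.asarray(values, dtype=float)
    return (values - np.nanmean(values)) / np.nanvar(values)


def quantile_normalize_pair(a, b):
    """Give two equal-length, NaN-free vectors the same value distribution.

    Each value is replaced by the mean of the two samples at its rank.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    reference = (np.sort(a) + np.sort(b)) / 2.0
    out_a = np.empty_like(a)
    out_b = np.empty_like(b)
    out_a[np.argsort(a)] = reference
    out_b[np.argsort(b)] = reference
    return out_a, out_b
```

In this sketch each modification would be processed independently: binned per chromosome, shifted and scaled per cell type, and then quantile-normalized across the two cell types before the values are passed to the HMM.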
State c1: Active promoter/transcription start site region.
State c2: Actively transcribed exon.
State c3: Actively transcribed intron (enhancer).
State c4: Other open chromatin.
State c5: Actively transcribed exon on the male X chromosome (dosage compensation).
State c6: Region of Polycomb-mediated repression.
State c7: Heterochromatin.
State c8: Heterochromatin-like region embedded in euchromatin.
State c9: Transcriptionally silent intergenic euchromatin.
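The state-merging criterion mentioned in the methods (Euclidean distance between mean emission values below 0.15) can be sketched as follows. The text does not specify how merged states were combined, so the single-linkage grouping and the unweighted averaging of member means used here are assumptions, and `merge_close_states` is a hypothetical helper name.

```python
import numpy as np


def merge_close_states(means, threshold=0.15):
    """Group HMM states whose mean emission vectors lie within `threshold`.

    means: array of shape (n_states, n_marks).
    Returns (labels, merged_means), where labels[i] is the merged-state
    index of original state i.
    """
    means = np.asarray(means, dtype=float)
    n = len(means)
    parent = list(range(n))  # union-find forest over the original states

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Link every pair of states closer than the threshold (single linkage).
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(means[i] - means[j]) < threshold:
                parent[find(j)] = find(i)

    roots = [find(i) for i in range(n)]
    unique_roots = sorted(set(roots))
    labels = np.array([unique_roots.index(r) for r in roots])
    # Assumption: a merged state's mean is the unweighted mean of its members.
    merged = np.array(
        [means[labels == k].mean(axis=0) for k in range(len(unique_roots))]
    )
    return labels, merged


# Toy example with four states and two marks: states 0 and 1 merge,
# as do states 2 and 3.
labels, merged = merge_close_states(
    [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.05, 1.0]]
)
```

Single linkage can chain together states that are individually farther apart than the threshold, so a stricter grouping rule may be preferable when many states lie close to one another.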