Before peak calling, PCR duplicates, defined as reads sharing the same alignment, were removed from individual data sets, and data for the 36 different stages were then combined. To account for non-specific signal from abundant transcripts, 5' end frequency was normalized to transcript abundance as measured by the 3' portion of the paired-end reads. Peaks were called by a sliding-window algorithm that assesses the significance of local signal enrichment given a null distribution. Signal-enriched windows in close proximity to each other were merged into peaks, which were subsequently trimmed at the edges down to the first base with signal.
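The window, merge and trim steps described above can be illustrated with a minimal sketch. It is not the implementation used here: the text does not specify the null distribution, so a Poisson null with a rate set by the mean signal is assumed, and the names `call_peaks`, `window`, `max_gap` and `alpha` and their default values are hypothetical. The input is assumed to be integer 5' end counts per base.

```python
import math


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def call_peaks(signal, window=5, max_gap=2, alpha=1e-3):
    """Call peaks on per-base counts; returns half-open (start, end) intervals.

    Assumed null (not from the text): window counts are Poisson with a rate
    equal to the mean per-base signal times the window width.
    """
    n = len(signal)
    lam = sum(signal) / n * window

    # 1. Keep windows whose count is significantly enriched under the null.
    hits = []
    for start in range(n - window + 1):
        count = int(sum(signal[start:start + window]))
        if poisson_sf(count, lam) < alpha:
            hits.append((start, start + window))

    # 2. Merge enriched windows that overlap or lie within max_gap bases.
    merged = []
    for s, e in hits:
        if merged and s - merged[-1][1] <= max_gap:
            merged[-1][1] = max(merged[-1][1], e)
        else:
            merged.append([s, e])

    # 3. Trim each merged region inward to the first base with signal.
    peaks = []
    for s, e in merged:
        while s < e and signal[s] == 0:
            s += 1
        while e > s and signal[e - 1] == 0:
            e -= 1
        if s < e:
            peaks.append((s, e))
    return peaks


signal = [0] * 20 + [0, 8, 12, 9, 0] + [0] * 20
print(call_peaks(signal))  # [(21, 24)]
```

In this toy input, five overlapping windows pass the threshold and are merged into one region spanning bases 18 to 26, which trimming then reduces to the three bases that carry signal.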
Sequences corresponding to library identification barcodes and primers were trimmed. Reads were mapped with STAR. All uniquely mapping reads were kept. For multiply mapping reads, if all alignments started within an annotated transposon and overlapped the same gene annotation, the alignment starting in the closest transposon was selected. Transcript annotations were obtained from FlyBase (release 5.32).