All 23-mers having a 3-prime proto-spacer adjacent motif (PAM) sequence (NGG) and a 15-base sequence unique to the genome (including the PAM) were identified in the D. melanogaster Release 6 genome assembly (forward and reverse strands); these constitute all 23-mer designs that could be suitable as sgRNAs. These designs were further assessed for predicted efficiency and off-target effects. Because base pair mismatches can be tolerated outside the seed region, predicted sgRNAs were evaluated for potential off-target sites allowing for 3, 4 or 5 mismatches, corresponding to low, medium and high stringencies, respectively.
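The first step above, enumerating 23-mers that end in an NGG PAM on both strands, can be sketched as follows. This is a minimal illustration, not the pipeline used to build the dataset: the function names are hypothetical, the 23-mer is assumed to be 20 bases followed by the 3-base PAM, windows containing ambiguous bases are skipped, and the genome-wide uniqueness filter is not implemented.

```python
import re

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

# Lookahead so that overlapping 23-mers are all reported:
# 20 bases + N (any of A/C/G/T) + GG = 23 bases.
_PAM_23MER = re.compile(r"(?=([ACGT]{21}GG))")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_candidate_23mers(seq):
    """List (start, strand, 23-mer) for every 23-mer ending in NGG.

    `start` is the 0-based forward-strand coordinate of the leftmost
    base covered by the 23-mer, for hits on either strand.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for m in _PAM_23MER.finditer(seq):
        hits.append((m.start(), "+", m.group(1)))
    # Scan the reverse complement and map coordinates back.
    for m in _PAM_23MER.finditer(reverse_complement(seq)):
        hits.append((n - (m.start() + 23), "-", m.group(1)))
    return hits
```

Each returned 23-mer would then still need to pass the uniqueness criterion described above before it counts as a design.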
The "efficiency" scores range from 1.47 to 12.32 (higher is better, >5 recommended, see FBrf0229582 for details).
The "seed" score measures the number of bases in the 15 bases next to the PAM that are unique in the genome; these scores range from 12 to 15.
The "frameshift" score represents the percentage of frameshift changes predicted by micro-homology around the target site; these scores range from 0 to 100 (higher is better, >50 recommended, see PMID:24972169 for details).
The "machine learning" score was developed by training and testing against a fly cell-based screen dataset (scores not yet validated in cells or in vivo); scores range from 0 to 1 (higher is better, see FBrf0239612 for details).
The "OT" (off-target) score represents a weighted sum of the number of off-target sites at various mismatch stringencies (low_stringency_mismatches + 0.1*med_stringency_mismatches + 0.01*high_stringency_mismatches); scores range from 0 to 5441.73 (lower is better, <1 recommended).
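As a worked illustration of the OT formula, the sketch below computes the weighted sum from three off-target site counts. The function and parameter names are hypothetical; only the weights (1, 0.1, 0.01) and the recommended cutoff (less than 1) come from the description above.

```python
def ot_score(low_count, med_count, high_count):
    """Weighted off-target score; lower is better.

    Arguments are the numbers of off-target sites found at low,
    medium and high mismatch stringency, respectively.
    """
    return low_count + 0.1 * med_count + 0.01 * high_count


def is_recommended(score, cutoff=1.0):
    """True when the OT score is below the recommended cutoff."""
    return score < cutoff
```

For example, a design with no low-stringency sites, 2 medium-stringency sites and 5 high-stringency sites scores 0 + 0.2 + 0.05 = 0.25, which is under the recommended cutoff, whereas a single low-stringency site already brings the score to at least 1.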
These designs were assessed for overlapping genomic variants in lines commonly used for CRISPR work (S2R+, TRiP Cas9 lines); these variants could prevent sgRNA targeting.
The first data version provided (2018-09-24) has been deprecated.
This data update incorporates newer FlyBase gene model annotations, information on genomic variants overlapping sgRNA designs, and a new machine learning prediction score.
At FlyBase, these sgRNA designs are viewable in JBrowse, sorted into five different JBrowse tracks based on their specificity (predicted off-target matches): 1. no off-target sites (high stringency), 2. no off-target sites (medium stringency), 3. no off-target sites (low stringency), 4. 1-3 off-target sites (low stringency), 5. more than 3 off-target sites and/or at least 1 off-target site in a CDS (low stringency).
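The five-track sorting described above can be sketched as a simple decision function. This is an interpretation rather than FlyBase's actual implementation: the function name and inputs are hypothetical, and it assumes that a design qualifies for a "no off-target sites" track only when it also has no off-target sites at every lower stringency.

```python
def jbrowse_track(low_count, med_count, high_count, cds_hit_low=False):
    """Return the track number (1-5) for one sgRNA design.

    low_count, med_count, high_count: off-target site counts at low,
    medium and high stringency. cds_hit_low: True if at least one
    low-stringency off-target site lies in a CDS.
    """
    # Track 5: more than 3 low-stringency sites and/or a CDS hit.
    if low_count > 3 or cds_hit_low:
        return 5
    # Track 4: 1-3 low-stringency sites, none of them in a CDS.
    if low_count >= 1:
        return 4
    # From here on there are no low-stringency off-target sites.
    if med_count == 0 and high_count == 0:
        return 1  # clean even at high stringency
    if med_count == 0:
        return 2  # clean at medium stringency
    return 3      # clean at low stringency only
```

Under this reading, a design with counts (0, 0, 2) falls in track 2, and any design with a low-stringency off-target site in a CDS falls in track 5 regardless of its counts.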
For additional information on overlapping genomic variants and off-target sites, use the sgRNA design's sequence to search the DRSC CRISPR tool (http://www.flyrnai.org/crispr/).