FlyBase Reference Report: Guruharsha et al., 2013.7.9, Refined DPiM dataset (DPiM-2) supersedes original unfiltered protein identification dataset (DPiM-1).

Reference

Citation

Guruharsha, K.G., Mintseris, J., Obar, R.A., Gygi, S.P., Artavanis-Tsakonas, S. (2013.7.9). Refined DPiM dataset (DPiM-2) supersedes original unfiltered protein identification dataset (DPiM-1).

FlyBase ID

FBrf0222073

Publication Type

Personal communication to FlyBase

Abstract

PubMed ID

PubMed Central ID

Text of Personal Communication

As part of the Drosophila Protein Interaction Mapping (DPiM) project, more than 5,000 Drosophila proteins were subjected to co-affinity purification followed by tandem mass spectrometry (coAP-MS). The raw, unfiltered protein identification data from these individual pull-downs were made available to the fly community regularly, prior to publication, through FlyBase and the DPiM website at Harvard. This large dataset was referred to as a personal communication to FlyBase (Mintseris et al., 2009.11.23; FBrf0209452), and it formed the basis of all the results and analyses in a later publication, Guruharsha et al., 2011, Cell (FBrf0216491 and FBrf0218395). The published dataset supersedes the previous personal communication to FlyBase.
It is well known that proteins identified by coAP-MS represent a mixture of genuine interactors and many nonspecific interactors. The nonspecific interactors are present in a large number of data sets independent of the bait (tagged protein) used, whereas genuine interactors should co-occur only across relevant experiments. Most algorithms for identifying specific interactions from affinity purification data, including the one we used (Guruharsha et al., 2011), rely on distinguishing those proteins that occur rarely with respect to some model or background frequencies. Therefore, any potential bait-prey interactions inferred directly from the raw dataset (Mintseris et al., 2009.11.23) without any filtering or scoring will be misleading. While a portion of those interactions are likely to be true, others will be considered false by the algorithms used to generate the protein interaction network.
Different types of potential false positives were filtered out in the meta-analyses of the DPiM dataset (detailed in the supplementary methods, Guruharsha et al., 2011). After all the filtering, the combined data set of 3,488 high-quality coAP-MS experiments contained 2,770,552 total peptides at 0.007% FDR, identified from 4,927 proteins at 0.8% FDR.
Our statistical analysis of the filtered dataset led to the prediction of 10,969 high-confidence co-complex membership interactions (0.05% FDR) between 2,297 proteins, including thousands of prey-prey interactions. The remaining 2,630 of the 4,927 proteins have interactions in this dataset that fall below the statistical cut-off (Supplemental Table S3, Guruharsha et al., 2011; FBrf0218395). Users are advised to refer to the 2011 publication for the DPiM dataset.
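
To illustrate why raw bait-prey pairs can mislead, here is a minimal sketch of the general idea of comparing each prey against its background frequency across pull-downs. It is not the scoring method used in Guruharsha et al., 2011. The function names, the 0.5 frequency threshold, and the bait and prey names are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def prey_background_frequency(pulldowns):
    """Return the fraction of pull-downs in which each prey was identified.

    pulldowns: dict mapping a bait name to the set of preys seen with it.
    """
    n_experiments = len(pulldowns)
    counts = Counter(prey for preys in pulldowns.values() for prey in set(preys))
    return {prey: count / n_experiments for prey, count in counts.items()}


def split_by_frequency(pulldowns, max_frequency=0.5):
    """Split raw bait-prey pairs into rare and frequent preys.

    max_frequency is an arbitrary illustrative threshold (an assumption),
    not a value taken from the DPiM analysis.
    """
    freq = prey_background_frequency(pulldowns)
    rare, frequent = [], []
    for bait, preys in pulldowns.items():
        for prey in sorted(preys):
            target = rare if freq[prey] <= max_frequency else frequent
            target.append((bait, prey))
    return rare, frequent


# Toy data with hypothetical names: "sticky1" co-purifies with every bait.
toy = {
    "baitA": {"preyX", "sticky1"},
    "baitB": {"preyX", "sticky1"},
    "baitC": {"preyY", "sticky1"},
    "baitD": {"preyZ", "sticky1"},
}

rare_pairs, frequent_pairs = split_by_frequency(toy)
print("rare:", rare_pairs)
print("frequent:", frequent_pairs)
```

In this toy input, the prey found with every bait is set apart as a likely nonspecific interactor, while preys seen in only a few pull-downs remain candidates. A real analysis would use a statistical model and FDR control rather than a fixed threshold.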

DOI

Related Publication(s)

Research paper

A Protein Complex Network of Drosophila melanogaster.

Guruharsha et al., 2011, Cell 147(3): 690-703 [FBrf0216491]

FlyBase analysis

Rollback of original unfiltered protein interaction dataset (DPiM-1).

FlyBase curators, 2014, Rollback of original unfiltered protein interaction dataset (DPiM-1). [FBrf0226221]

Associated Information

Comments

Associated Files

Other Information

Secondary IDs

Language of Publication

English

Additional Languages of Abstract

Parent Publication

Publication Type

Abbreviation

Title

ISBN/ISSN

Data From Reference

Datasets (0)
