
Abstract

Computational proteomics is the data science concerned with the identification and quantification of proteins from high-throughput data and the biological interpretation of their concentration changes, posttranslational modifications, interactions, and subcellular localizations. Today, these data most often originate from mass spectrometry–based shotgun proteomics experiments. In this review, we survey computational methods for the analysis of such proteomics data, focusing on the explanation of the key concepts. We start with mass spectrometric feature detection and then cover methods for the identification of peptides. Next, we address two highly important topics: protein inference and the control of false discovery rates. We then discuss methods for the quantification of peptides and proteins. A section on downstream data analysis covers exploratory statistics, network analysis, machine learning, and multiomics data integration. Finally, we discuss current developments and provide an outlook on what the near future of computational proteomics might hold.

/content/journals/10.1146/annurev-biodatasci-080917-013516
2018-07-20
2024-04-19

    [Google Scholar]
  170. 170.  Hochstrasser M 1996. Ubiquitin-dependent protein degradation. Annu. Rev. Genet. 30:405–39
    [Google Scholar]
  171. 171.  Teo G, Vogel C, Ghosh D, Kim S, Choi H 2014. PECA: a novel statistical tool for deconvoluting time-dependent gene expression regulation. J. Proteome Res. 13:129–37
    [Google Scholar]
  172. 172.  Cheng Z, Teo G, Krueger S, Rock TM, Koh HW et al. 2016. Differential dynamics of the mammalian mRNA and protein expression response to misfolding stress. Mol. Syst. Biol. 12:1855–855
    [Google Scholar]
  173. 173.  Swainston N, Smallbone K, Hefzi H, Dobson PD, Brewer J et al. 2016. Recon 2.2: from reconstruction to model of human metabolism. Metabolomics 12:7109
    [Google Scholar]
  174. 174.  Yuan G-C, Cai L, Elowitz M, Enver T, Fan G et al. 2017. Challenges and emerging directions in single-cell analysis. Genome Biol 18:184
    [Google Scholar]
  175. 175.  Budnik B, Levy E, Slavov N 2017. Mass-spectrometry of single mammalian cells quantifies proteome heterogeneity during cell differentiation. bioRxiv 102681. https://doi.org/10.1101/102681
    [Crossref]
  176. 176.  Regev A, Teichmann SA, Lander ES, Amit I, Benoist C, Yosef N 2017. The Human Cell Atlas. bioRxiv 121202. http://dx.doi.org/10.1101/121202
    [Crossref]
  177. 177.  Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M 2016. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res 44:D1D457–62
    [Google Scholar]
  178. 178.  Terfve CDA, Wilkes EH, Casado P, Cutillas PR, Saez-Rodriguez J 2015. Large-scale models of signal propagation in human cells derived from discovery phosphoproteomic data. Nat. Commun. 6:8033
    [Google Scholar]
  179. 179.  Hoops S, Sahle S, Gauges R, Lee C, Pahle J et al. 2006. COPASI–a COmplex PAthway SImulator. Bioinformatics 22:243067–74
    [Google Scholar]
  180. 180.  Angermann BR, Klauschen F, Garcia AD, Prustel T, Zhang F et al. 2012. Computational modeling of cellular signaling processes embedded into dynamic spatial contexts. Nat. Methods 9:3283–89
    [Google Scholar]
  181. 181.  Fenn JB, Mann M, Meng CK, Wong SF, Whitehouse CM 1989. Electrospray ionization for mass spectrometry of large biomolecules. Science 246:492664–71
    [Google Scholar]
  182. 182.  Hillenkamp F, Karas M, Beavis RC, Chait BT 1991. Matrix-assisted laser desorption/ionization mass spectrometry of biopolymers. Anal. Chem. 63:24A1193–1203
    [Google Scholar]
  183. 183.  Eliuk S, Makarov A 2015. Evolution of orbitrap mass spectrometry instrumentation. Annu. Rev. Anal. Chem. 8:61–80
    [Google Scholar]
  184. 184.  Meier F, Beck S, Grassl N, Lubeck M, Park MA et al. 2015. Parallel accumulation-serial fragmentation (PASEF): multiplying sequencing speed and sensitivity by synchronized scans in a trapped ion mobility device. J. Proteome Res. 14:125378–87
    [Google Scholar]
  185. 185.  Graumann J, Hubner NC, Kim JB, Ko K, Moser M et al. 2008. Stable isotope labeling by amino acids in cell culture (SILAC) and proteome quantitation of mouse embryonic stem cells to a depth of 5,111 proteins. Mol. Cell Proteom. 7:4672–83
    [Google Scholar]
  186. 186.  Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B 2008. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5:7621–28
    [Google Scholar]
  • Article Type: Review Article