Modern astronomy increasingly relies upon systematic surveys, whose dedicated telescopes continuously observe the sky across varied wavelength ranges of the electromagnetic spectrum; some surveys also observe nonelectromagnetic messengers, such as high-energy particles or gravitational waves. Stars and galaxies look different through the eyes of different instruments, and their independent measurements have to be carefully combined to provide a complete, sound picture of the multicolor and eventful universe. The association of an object's independent detections is, however, a difficult problem scientifically, computationally, and statistically, raising varied challenges across diverse astronomical applications. The fundamental problem is finding records in survey databases with directions that match to within the direction uncertainties. Such astronomical versions of the record linkage problem are known by various terms in astronomy: cross-matching; cross-identification; and directional, positional, or spatiotemporal coincidence assessment. Astronomers have developed several statistical approaches for such problems, largely independent of related developments in other disciplines. Here, we review emerging approaches that compute (Bayesian) probabilities for the hypotheses of interest: possible associations or demographic properties of a cosmic population that depend on identifying associations. Many cross-identification tasks can be formulated within a hierarchical Bayesian partition model framework, with components that explicitly account for astrophysical effects (e.g., source brightness versus wavelength, source motion, or source extent), selection effects, and measurement error. We survey recent developments and highlight important open areas for future research.
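As a minimal illustration of the fundamental problem described above (finding records whose directions match to within their uncertainties), the following Python sketch compares two toy catalogs pairwise. It rests on assumptions that are not part of the abstract: each detection has a circular Gaussian direction uncertainty much smaller than the sky, the prior on source direction is uniform over the sphere, and the function names, the catalog layout `(ra, dec, sigma)` in radians, and the threshold `log_bf_min` are hypothetical choices made for illustration. Under those assumptions, the Bayes factor for "same source" versus "two unrelated sources" reduces to 2/(σ₁² + σ₂²) · exp(−ψ²/[2(σ₁² + σ₂²)]), where ψ is the angular separation. The brute-force double loop is for clarity only; survey-scale catalogs would need spatial indexing.

```python
import math

ARCSEC = math.radians(1.0 / 3600.0)  # one arcsecond in radians


def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in radians (haversine form); inputs in radians."""
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp guards against rounding slightly above 1 for antipodal points.
    return 2.0 * math.asin(min(1.0, math.sqrt(h)))


def log_bayes_factor(psi, sigma1, sigma2):
    """Log Bayes factor for association of two detections.

    Assumes small circular Gaussian errors (sigma1, sigma2 in radians)
    and a uniform prior over the sky; psi is the separation in radians.
    """
    s2 = sigma1 ** 2 + sigma2 ** 2
    return math.log(2.0 / s2) - psi ** 2 / (2.0 * s2)


def cross_match(cat_a, cat_b, log_bf_min=0.0):
    """Return (i, j, log_bf) for every pair whose log Bayes factor exceeds log_bf_min.

    Each catalog is a list of (ra, dec, sigma) tuples in radians.
    Brute force O(N*M); illustrative only.
    """
    matches = []
    for i, (ra1, dec1, s1) in enumerate(cat_a):
        for j, (ra2, dec2, s2) in enumerate(cat_b):
            psi = angular_separation(ra1, dec1, ra2, dec2)
            lbf = log_bayes_factor(psi, s1, s2)
            if lbf > log_bf_min:
                matches.append((i, j, lbf))
    return matches
```

A Bayes factor alone is not a posterior probability: converting it requires a prior probability of association, which depends on the source densities of the catalogs, and the astrophysical and selection effects mentioned in the abstract would enter as additional factors.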






  • Article Type: Review Article