Abstract

How do we recognize objects despite changes in their appearance? The past three decades have been witness to intense debates regarding both whether objects are encoded invariantly with respect to viewing conditions and whether specialized, separable mechanisms are used for the recognition of different object categories. We argue that such dichotomous debates ask the wrong question. Much more important is the nature of object representations: What are features that enable invariance or differential processing between categories? Although the nature of object features is still an unanswered question, new methods for connecting data to models show significant potential for helping us to better understand neural codes for objects. Most prominently, new approaches to analyzing data from functional magnetic resonance imaging, including neural decoding and representational similarity analysis, and new computational models of vision, including convolutional neural networks, have enabled a much more nuanced understanding of visual representation. Convolutional neural networks are particularly intriguing as a tool for studying biological vision in that this class of artificial vision systems, based on biologically plausible deep neural networks, exhibits visual recognition capabilities that are approaching those of human observers. As these models improve in their recognition performance, it appears that they also become more effective in predicting and accounting for neural responses in the ventral cortex. Applying these and other deep models to empirical data shows great promise for enabling future progress in the study of visual recognition.

/content/journals/10.1146/annurev-vision-111815-114621
2016-10-14
2024-12-07

  • Article Type: Review Article