Abstract

People who have lost the ability to speak due to neurological injuries would greatly benefit from assistive technology that provides a fast, intuitive, and naturalistic means of communication. This need can be met with brain–computer interfaces (BCIs): medical devices that bypass injured parts of the nervous system and directly transform neural activity into outputs such as text or sound. BCIs for restoring movement and typing have progressed rapidly in recent clinical trials; speech BCIs are the next frontier. This review covers the clinical need for speech BCIs, surveys foundational studies that point to where and how speech can be decoded in the brain, describes recent progress in both discrete and continuous speech decoding and closed-loop speech BCIs, provides metrics for assessing these systems’ performance, and highlights key remaining challenges on the road toward clinically useful speech neuroprostheses.

DOI: 10.1146/annurev-bioeng-110122-012818
2025-05-01
2025-06-24
Literature Cited

  1. 1.
    Hochberg LR, Serruya MD, Friehs GM, Mukand JA, Saleh M, et al. 2006.. Neuronal ensemble control of prosthetic devices by a human with tetraplegia. . Nature 442:(7099):16471
    [Crossref] [Google Scholar]
  2. 2.
    Guenther FH, Brumberg JS, Wright EJ, Nieto-Castanon A, Tourville JA, et al. 2009.. A wireless brain-machine interface for real-time speech synthesis. . PLOS ONE 4:(12):e8218
    [Crossref] [Google Scholar]
  3. 3.
    Leuthardt EC, Gaona C, Sharma M, Szrama N, Roland J, et al. 2011.. Using the electrocorticographic speech network to control a brain-computer interface in humans. . J. Neural Eng. 8:(3):036004
    [Crossref] [Google Scholar]
  4. 4.
    Moses DA, Leonard MK, Makin JG, Chang EF. 2019.. Real-time decoding of question-and-answer speech dialogue using human cortical activity. . Nat. Commun. 10:(1):3096
    [Crossref] [Google Scholar]
  5. 5.
    Moses DA, Metzger SL, Liu JR, Anumanchipalli GK, Makin JG, et al. 2021.. Neuroprosthesis for decoding speech in a paralyzed person with anarthria. . N. Engl. J. Med. 385:(3):21727
    [Crossref] [Google Scholar]
  6. 6.
    Angrick M, Ottenhoff MC, Diener L, Ivucic D, Ivucic G, et al. 2021.. Real-time synthesis of imagined speech processes from minimally invasive recordings of neural activity. . Commun. Biol. 4:(1):1055
    [Crossref] [Google Scholar]
  7. 7.
    Card NS, Wairagkar M, Iacobacci C, Hou X, Singer-Clark T, et al. 2024.. An accurate and rapidly calibrating speech neuroprosthesis. . N. Engl. J. Med. 391:(7):60918
    [Crossref] [Google Scholar]
  8. 8.
    Slutzky MW. 2019.. Brain-machine interfaces: powerful tools for clinical treatment and neuroscientific investigations. . Neuroscientist 25:(2):13954
    [Crossref] [Google Scholar]
  9. 9.
    He B, Yuan H, Meng J, Gao S. 2020.. Brain–computer interfaces. . In Neural Engineering, ed. B He , pp. 13183. Cham:: Springer Int. Publ.
    [Google Scholar]
  10. 10.
    Pandarinath C, Bensmaia SJ. 2022.. The science and engineering behind sensitized brain-controlled bionic hands. . Physiol. Rev. 102:(2):551604
    [Crossref] [Google Scholar]
  11. 11.
    Luo S, Rabbani Q, Crone NE. 2022.. Brain-computer interface: applications to speech decoding and synthesis to augment communication. . Neurotherapeutics 19:(1):26373
    [Crossref] [Google Scholar]
  12. 12.
    Silva AB, Littlejohn KT, Liu JR, Moses DA, Chang EF. 2024.. The speech neuroprosthesis. . Nat. Rev. Neurosci. 25:(7):47392
    [Crossref] [Google Scholar]
  13. 13.
    Laureys S, Pellas F, Van Eeckhout P, Ghorbel S, Schnakers C, et al. 2005.. The locked-in syndrome: What is it like to be conscious but paralyzed and voiceless?. Prog. Brain Res. 150::495511
    [Crossref] [Google Scholar]
  14. 14.
    Hochberg LR, Cudkowicz ME. 2014.. Locked in, but not out?. Neurology 82:(21):185253
    [Crossref] [Google Scholar]
  15. 15.
    Majaranta P, Aula A, Räihä K-J. 2004.. Effects of feedback on eye typing with a short dwell time. . In ETRA '04: Proceedings of the 2004 Symposium on Eye Tracking Research & Applications, pp. 13946. New York:: Assoc. Comput. Mach.
    [Google Scholar]
  16. 16.
    Linse K, Aust E, Joos M, Hermann A. 2018.. Communication matters—pitfalls and promise of hightech communication devices in palliative care of severely physically disabled patients with amyotrophic lateral sclerosis. . Front. Neurol. 9::603
    [Crossref] [Google Scholar]
  17. 17.
    Chang EF, Anumanchipalli GK. 2020.. Toward a speech neuroprosthesis. . JAMA 323:(5):41314
    [Crossref] [Google Scholar]
  18. 18.
    Huggins JE, Wren PA, Gruis KL. 2011.. What would brain-computer interface users want? Opinions and priorities of potential users with amyotrophic lateral sclerosis. . Amyotrophic Lateral Scleros. 12:(5):31824
    [Crossref] [Google Scholar]
  19. 19.
    Collinger, Boninger ML, Bruns TM, Curley K, Wang W, Weber DJ. 2013.. Functional priorities, assistive technology, and brain-computer interfaces after spinal cord injury. . J. Rehabil. Res. Dev. 50:(2):14560
    [Crossref] [Google Scholar]
  20. 20.
    Blabe CH, Gilja V, Chestek CA, Shenoy KV, Anderson KD, Henderson JM. 2015.. Assessment of brain–machine interfaces from the perspective of people with paralysis. . J. Neural Eng. 12:(4):043002
    [Crossref] [Google Scholar]
  21. 21.
    Slutzky MW, Flint RD. 2017.. Physiological properties of brain-machine interface input signals. . J. Neurophysiol. 118::132943
    [Crossref] [Google Scholar]
  22. 22.
    Tam W, Wu T, Zhao Q, Keefer E, Yang Z. 2019.. Human motor decoding from neural signals: a review. . BMC Biomed. Eng. 1:(1):22
    [Crossref] [Google Scholar]
  23. 23.
    Sorrell E, Rule ME, O'Leary T. 2021.. Brain machine interfaces: closed-loop control in an adaptive system. . Annu. Rev. Control Robot. Auton. Syst. 4::16789
    [Crossref] [Google Scholar]
  24. 24.
    Fedorenko E, Blank IA. 2020.. Broca's area is not a natural kind. . Trends Cogn. Sci. 24:(4):27084
    [Crossref] [Google Scholar]
  25. 25.
    Flinker A, Korzeniewska A, Shestyuk AY, Franaszczuk PJ, Dronkers NF, et al. 2015.. Redefining the role of Broca's area in speech. . PNAS 112:(9):287175
    [Crossref] [Google Scholar]
  26. 26.
    Willett FR, Kunz EM, Fan C, Avansino DT, Wilson GH, et al. 2023.. A high-performance speech neuroprosthesis. . Nature 620:(7976):103136
    [Crossref] [Google Scholar]
  27. 27.
    Glasser MF, Coalson TS, Robinson EC, Hacker CD, Harwell J, et al. 2016.. A multi-modal parcellation of human cerebral cortex. . Nature 536:(7615):17178
    [Crossref] [Google Scholar]
  28. 28.
    Tang J, LeBel A, Jain S, Huth AG. 2023.. Semantic reconstruction of continuous language from non-invasive brain recordings. . Nat. Neurosci. 26:(5):85866
    [Crossref] [Google Scholar]
  29. 29.
    Fedorenko E, Ivanova AA, Regev TI. 2024.. The language network as a natural kind within the broader landscape of the human brain. . Nat. Rev. Neurosci. 25::289312
    [Crossref] [Google Scholar]
  30. 30.
    Takai O, Brown S, Liotti M. 2010.. Representation of the speech effectors in the human motor cortex: somatotopy or overlap?. Brain Lang. 113:(1):3944
    [Crossref] [Google Scholar]
  31. 31.
    Guenther FH. 2016.. Neural Control of Speech Movements. Cambridge, MA:: MIT Press
    [Google Scholar]
  32. 32.
    Correia JM, Caballero-Gaudes C, Guediche S, Carreiras M. 2020.. Phonatory and articulatory representations of speech production in cortical and subcortical fMRI responses. . Sci. Rep. 10:(1):4529
    [Crossref] [Google Scholar]
  33. 33.
    Boto E, Holmes N, Leggett J, Roberts G, Shah V, et al. 2018.. Moving magnetoencephalography towards real-world applications with a wearable system. . Nature 555:(7698):65761
    [Crossref] [Google Scholar]
  34. 34.
    Nguyen CH, Karavas GK, Artemiadis P. 2018.. Inferring imagined speech using EEG signals: a new approach using Riemannian manifold features. . J. Neural Eng. 15:(1):016002
    [Crossref] [Google Scholar]
  35. 35.
    Ghosh R, Sinha N, Phadikar S. 2023.. Identification of imagined Bengali vowels from EEG signals using activity map and convolutional neural network. . In Brain-Computer Interface: Using Deep Learning Applications, , pp. 23154. Hoboken, NJ:: John Wiley & Sons
    [Google Scholar]
  36. 36.
    Kamble A, Ghare PH, Kumar V. 2023.. Optimized rational dilation wavelet transform for automatic imagined speech recognition. . IEEE Trans. Instrum. Meas. 72::4002210
    [Google Scholar]
  37. 37.
    Dash D, Ferrari P, Wang J. 2020.. Decoding imagined and spoken phrases from non-invasive neural (MEG) signals. . Front. Neurosci. 14::290
    [Crossref] [Google Scholar]
  38. 38.
    Défossez A, Caucheteux C, Rapin J, Kabeli O, King J-R. 2023.. Decoding speech perception from non-invasive brain recordings. . Nat. Mach. Intell. 5::1097107
    [Crossref] [Google Scholar]
  39. 39.
    Rabbani Q, Milsap G, Crone NE. 2019.. The potential for a speech brain–computer interface using chronic electrocorticography. . Neurotherapeutics 16:(1):14465
    [Crossref] [Google Scholar]
  40. 40.
    Herff C, Krusienski DJ, Kubben P. 2020.. The potential of stereotactic-EEG for brain-computer interfaces: current progress and future directions. . Front. Neurosci. 14::123
    [Crossref] [Google Scholar]
  41. 41.
    Brandman DM, Cash SS, Hochberg LR. 2017.. Review: Human intracortical recording and neural decoding for brain-computer interfaces. . IEEE Trans. Neural Syst. Rehabil. Eng. 25:(10):168796
    [Crossref] [Google Scholar]
  42. 42.
    Buzsáki G, Anastassiou CA, Koch C. 2012.. The origin of extracellular fields and currents—EEG, ECoG, LFP and spikes. . Nat. Rev. Neurosci. 13:(6):40720
    [Crossref] [Google Scholar]
  43. 43.
    Tankus A, Fried I, Shoham S. 2012.. Structured neuronal encoding and decoding of human speech features. . Nat. Commun. 3::1015
    [Crossref] [Google Scholar]
  44. 44.
    Leuthardt EC, Schalk G, Wolpaw JR, Ojemann JG, Moran DW. 2004.. A brain–computer interface using electrocorticographic signals in humans. . J. Neural Eng. 1:(2):6371
    [Crossref] [Google Scholar]
  45. 45.
    Schalk G, Miller KJ, Anderson NR, Wilson JA, Smyth MD, et al. 2008.. Two-dimensional movement control using electrocorticographic signals in humans. . J. Neural Eng. 5:(1):7584
    [Crossref] [Google Scholar]
  46. 46.
    Wang W, Collinger JL, Degenhart AD, Tyler-Kabara EC, Schwartz AB, et al. 2013.. An electrocorticographic brain interface in an individual with tetraplegia. . PLOS ONE 8:(2):e55344
    [Crossref] [Google Scholar]
  47. 47.
    Flint RD, Rosenow JM, Tate MC, Slutzky MW. 2017.. Continuous decoding of human grasp kinematics using epidural and subdural signals. . J. Neural Eng. 14:(1):016005
    [Crossref] [Google Scholar]
  48. 48.
    Flesher SN, Downey JE, Weiss JM, Hughes CL, Herrera AJ, et al. 2021.. A brain-computer interface that evokes tactile sensations improves robotic arm control. . Science 372:(6544):83136
    [Crossref] [Google Scholar]
  49. 49.
    Willett FR, Avansino DT, Hochberg LR, Henderson JM, Shenoy KV. 2021.. High-performance brain-to-text communication via handwriting. . Nature 593:(7858):24954
    [Crossref] [Google Scholar]
  50. 50.
    Chan AM, Dykstra AR, Jayaram V, Leonard MK, Travis KE, et al. 2014.. Speech-specific tuning of neurons in human superior temporal gyrus. . Cereb. Cortex 24:(10):267993
    [Crossref] [Google Scholar]
  51. 51.
    Roussel P, Le Godais G, Bocquelet F, Palma M, Hongjie J, et al. 2020.. Observation and assessment of acoustic contamination of electrophysiological brain signals during speech production and sound perception. . J. Neural Eng. 17:(5):056028
    [Crossref] [Google Scholar]
  52. 52.
    Stavisky SD, Willett FR, Wilson GH, Murphy BA, Rezaii P, et al. 2019.. Neural ensemble dynamics in dorsal motor cortex during speech in people with paralysis. . eLife 8::e46015
    [Crossref] [Google Scholar]
  53. 53.
    Wilson G, Stavisky S, Willett F, Avansino D, Kelemen J, et al. 2020.. Decoding spoken English from intracortical electrode arrays in dorsal precentral gyrus. . J. Neural Eng. 17:(6):66007
    [Crossref] [Google Scholar]
  54. 54.
    Wandelt SK, Kellis S, Bjånes DA, Pejsa K, Lee B, et al. 2022.. Decoding grasp and speech signals from the cortical grasp circuit in a tetraplegic human. . Neuron 110::177787.e3
    [Crossref] [Google Scholar]
  55. 55.
    Wandelt SK, Bjånes DA, Pejsa K, Lee B, Liu C, Andersen RA. 2024.. Representation of internal speech by single neurons in human supramarginal gyrus. . Nat. Hum. Behav. 8::113649
    [Crossref] [Google Scholar]
  56. 56.
    Wairagkar M, Hochberg LR, Brandman DM, Stavisky SD. 2023.. Synthesizing speech by decoding intracortical neural activity from dorsal motor cortex. . In 2023 11th International IEEE/EMBS Conference on Neural Engineering (NER), pp. 14. New York:: IEEE
    [Google Scholar]
  57. 57.
    Tan X, Lian Q, Zhu J, Zhang J, Wang Y, Qi Y. 2024.. Effective phoneme decoding with hyperbolic neural networks for high-performance speech BCIs. . IEEE Trans. Neural Syst. Rehabil. Eng. 32::343241
    [Crossref] [Google Scholar]
  58. 58.
    Musk E. 2019.. An integrated brain-machine interface platform with thousands of channels. . bioRxiv 703801. https://doi.org/10.1101/703801
  59. 59.
    Sahasrabuddhe K, Khan AA, Singh AP, Stern TM, Ng Y, et al. 2021.. The Argo: a high channel count recording system for neural recording in vivo. . J. Neural Eng. 18:(1):015002
    [Crossref] [Google Scholar]
  60. 60.
    Luan L, Yin R, Zhu H, Xie C. 2023.. Emerging penetrating neural electrodes: in pursuit of large scale and longevity. . Annu. Rev. Biomed. Eng. 25::185205
    [Crossref] [Google Scholar]
  61. 61.
    Hettick M, Ho E, Poole AJ, Monge M, Papageorgiou D, et al. 2024.. The Layer 7 Cortical Interface: a scalable and minimally invasive brain–computer interface platform. . bioRxiv 2022.01.02.474656. https://doi.org/10.1101/2022.01.02.474656
  62. 62.
    Duraivel S, Rahimpour S, Chiang C-H, Trumpis M, Wang C, et al. 2023.. High-resolution neural recordings improve the accuracy of speech decoding. . Nat. Commun. 14:(1):6938
    [Crossref] [Google Scholar]
  63. 63.
    Metzger SL, Littlejohn KT, Silva AB, Moses DA, Seaton MP, et al. 2023.. A high-performance neuroprosthesis for speech decoding and avatar control. . Nature 620:(7976):103746
    [Crossref] [Google Scholar]
  64. 64.
    Geukes SH, Branco MP, Aarnoutse EJ, Bekius A, Berezutskaya J, Ramsey NF. 2024.. Effect of electrode distance and size on electrocorticographic recordings in human sensorimotor cortex. . Neuroinformatics 22::70717
    [Crossref] [Google Scholar]
  65. 65.
    Chen X, Wang R, Khalilian-Gourtani A, Yu L, Dugan P, et al. 2024.. A neural speech decoding framework leveraging deep learning and speech synthesis. . Nat. Mach. Intell. 6::46780
    [Crossref] [Google Scholar]
  66. 66.
    Chartier J, Anumanchipalli GK, Johnson K, Chang EF. 2018.. Encoding of articulatory kinematic trajectories in human speech sensorimotor cortex. . Neuron 98:(5):104254.e4
    [Crossref] [Google Scholar]
  67. 67.
    Dichter BK, Breshears JD, Leonard MK, Chang EF. 2018.. The control of vocal pitch in human laryngeal motor cortex. . Cell 174:(1):2131
    [Crossref] [Google Scholar]
  68. 68.
    Silva AB, Liu JR, Zhao L, Levy DF, Scott TL, Chang EF. 2022.. A neurosurgical functional dissection of the middle precentral gyrus during speech production. . J. Neurosci. 42:(45):841626
    [Crossref] [Google Scholar]
  69. 69.
    Crone NE, Hao L, Hart J, Boatman D, Lesser RP, et al. 2001.. Electrocorticographic gamma activity during word production in spoken and sign language. . Neurology 57:(11):204553
    [Crossref] [Google Scholar]
  70. 70.
    Bouchard KE, Mesgarani N, Johnson K, Chang EF. 2013.. Functional organization of human sensorimotor cortex for speech articulation. . Nature 495:(7441):32732
    [Crossref] [Google Scholar]
  71. 71.
    Bouchard KE, Chang EF. 2014.. Control of spoken vowel acoustics and the influence of phonetic context in human speech sensorimotor cortex. . J. Neurosci. 34:(38):1266277
    [Crossref] [Google Scholar]
  72. 72.
    Cheung C, Hamilton LS, Johnson K, Chang EF. 2016.. The auditory representation of speech sounds in human motor cortex. . eLife 5::e12577
    [Crossref] [Google Scholar]
  73. 73.
    Conant DF, Bouchard KE, Leonard MK, Chang EF. 2018.. Human sensorimotor cortex control of directly-measured vocal tract movements during vowel production. . J. Neurosci. 38::295566
    [Crossref] [Google Scholar]
  74. 74.
    Mugler EM, Tate MC, Livescu K, Templer JW, Goldrick MA, Slutzky MW. 2018.. Differential representation of articulatory gestures and phonemes in precentral and inferior frontal gyri. . J. Neurosci. 4653::120618
    [Google Scholar]
  75. 75.
    Castellucci GA, Kovach CK, Howard MA, Greenlee JDW, Long MA. 2022.. A speech planning network for interactive language use. . Nature 602:(7895):11722
    [Crossref] [Google Scholar]
  76. 76.
    Mugler EM, Patton JL, Flint RD, Wright ZA, Schuele SU, et al. 2014.. Direct classification of all American English phonemes using signals from functional speech motor cortex. . J. Neural Eng. 11:(3):035015
    [Crossref] [Google Scholar]
  77. 77.
    Lotte F, Brumberg JS, Brunner P, Gunduz A, Ritaccio AL, et al. 2015.. Electrocorticographic representations of segmental features in continuous speech. . Front. Hum. Neurosci. 9::97
    [Crossref] [Google Scholar]
  78. 78.
    Brumberg JS, Wright EJ, Andreasen DS, Guenther FH, Kennedy PR. 2011.. Classification of intended phoneme production from chronic intracortical microelectrode recordings in speech-motor cortex. . Front. Neurosci. 5::65
    [Google Scholar]
  79. 79.
    Ramsey NF, Salari E, Aarnoutse EJ, Vansteensel MJ, Bleichner MG, Freudenburg ZV. 2018.. Decoding spoken phonemes from sensorimotor cortex with high-density ECoG grids. . NeuroImage 180::30111
    [Crossref] [Google Scholar]
  80. 80.
    Livezey JA, Bouchard KE, Chang EF. 2019.. Deep learning as a tool for neural data analysis: speech classification and cross-frequency coupling in human sensorimotor cortex. . PLOS Comput. Biol. 15:(9):e1007091
    [Crossref] [Google Scholar]
  81. 81.
    Tankus A, Stern E, Klein G, Kaptzon N, Nash L, et al. 2025.. A speech neuroprosthesis in the frontal lobe and hippocampus: decoding high-frequency activity into phonemes. . Neurosurgery. 96:(2):35664
    [Crossref] [Google Scholar]
  82. 82.
    Lipski WJ, Alhourani A, Pirnia T, Jones PW, Dastolfo-Hromack C, et al. 2018.. Subthalamic nucleus neurons differentially encode early and late aspects of speech production. . J. Neurosci. 38:(24):562031
    [Crossref] [Google Scholar]
  83. 83.
    Tankus A, Fried I. 2018.. Degradation of neuronal encoding of speech in the subthalamic nucleus in Parkinson's disease. . Neurosurgery 84::37887
    [Crossref] [Google Scholar]
  84. 84.
    Tankus A, Solomon L, Aharony Y, Faust-Socher A, Strauss I. 2021.. Machine learning algorithm for decoding multiple subthalamic spike trains for speech brain–machine interfaces. . J. Neural Eng. 18:(6):066021
    [Crossref] [Google Scholar]
  85. 85.
    Tankus A, Rosenberg N, Ben-Hamo O, Stern E, Strauss I. 2024.. Machine learning decoding of single neurons in the thalamus for speech brain-machine interfaces. . J. Neural Eng. 21::036009
    [Crossref] [Google Scholar]
  86. 86.
    Oganian Y, Chang EF. 2019.. A speech envelope landmark for syllable encoding in human superior temporal gyrus. . Sci. Adv. 5::eaay6279
    [Crossref] [Google Scholar]
  87. 87.
    Yi HG, Leonard MK, Chang EF. 2019.. The encoding of speech sounds in the superior temporal gyrus. . Neuron 102:(6):1096110
    [Crossref] [Google Scholar]
  88. 88.
    Bhaya-Grossman I, Chang EF. 2022.. Speech computations of the human superior temporal gyrus. . Annu. Rev. Psychol. 73::79102
    [Crossref] [Google Scholar]
  89. 89.
    Leonard MK, Gwilliams L, Sellers KK, Chung JE, Xu D, et al. 2023.. Large-scale single-neuron speech sound encoding across the depth of human cortex. . Nature 626:(7999):593602
    [Crossref] [Google Scholar]
  90. 90.
    Moses DA, Mesgarani N, Leonard MK, Chang EF. 2016.. Neural speech recognition: continuous phoneme decoding using spatiotemporal representations of human cortical activity. . J. Neural Eng. 13:(5):056004
    [Crossref] [Google Scholar]
  91. 91.
    Akbari H, Khalighinejad B, Herrero JL, Mehta AD, Mesgarani N. 2019.. Towards reconstructing intelligible speech from the human auditory cortex. . Sci. Rep. 9:(1):874
    [Crossref] [Google Scholar]
  92. 92.
    Bush A, Chrabaszcz A, Peterson V, Saravanan V, Dastolfo-Hromack C, et al. 2022.. Differentiation of speech-induced artifacts from physiological high gamma activity in intracranial recordings. . NeuroImage 250::118962
    [Crossref] [Google Scholar]
  93. 93.
    Herff C, Heger D, de Pesters A, Telaar D, Brunner P, et al. 2015.. Brain-to-text: decoding spoken phrases from phone representations in the brain. . Front. Neurosci. 9::217
    [Crossref] [Google Scholar]
  94. 94.
    Salari E, Freudenburg ZV, Vansteensel MJ, Ramsey NF. 2018.. The influence of prior pronunciations on sensorimotor cortex activity patterns during vowel production. . J. Neural Eng. 15:(6):066025
    [Crossref] [Google Scholar]
  95. 95.
    Liu Y, Zhao Z, Xu M, Yu H, Zhu Y, et al. 2023.. Decoding and synthesizing tonal language speech from brain activity. . Sci. Adv. 9:(23):eadh0478
    [Crossref] [Google Scholar]
  96. 96.
    Sun P, Anumanchipalli GK, Chang EF. 2020.. Brain2Char: a deep architecture for decoding text from brain recordings. . J. Neural Eng. 17:(6):066015
    [Crossref] [Google Scholar]
  97. 97.
    Kellis S, Miller K, Thomson K, Brown R, House P, Greger B. 2010.. Decoding spoken words using local field potentials recorded from the cortical surface. . J. Neural Eng. 7:(5):056007
    [Crossref] [Google Scholar]
  98. 98.
    Martin S, Brunner P, Iturrate I, Millán JdR, Schalk G, et al. 2016.. Word pair classification during imagined speech using direct brain recordings. . Sci. Rep. 6:(1):25803
    [Crossref] [Google Scholar]
  99. 99.
    Makin JG, Moses DA, Chang EF. 2020.. Machine translation of cortical activity to text with an encoder–decoder framework. . Nat. Neurosci. 23:(4):57582
    [Crossref] [Google Scholar]
  100. 100.
    Petrosyan A, Voskoboinikov A, Sukhinin D, Makarova A, Skalnaya A, et al. 2022.. Speech decoding from a small set of spatially segregated minimally invasive intracranial EEG electrodes with a compact and interpretable neural network. . J. Neural Eng. 19:(6):066016
    [Crossref] [Google Scholar]
  101. 101.
    Le Godais G, Roussel P, Bocquelet F, Aubert M, Kahane P, et al. 2023.. Overt speech decoding from cortical activity: a comparison of different linear methods. . Front. Hum. Neurosci. 17::1124065
    [Crossref] [Google Scholar]
  102. 102.
    Herff C, Diener L, Angrick M, Mugler E, Tate MC, et al. 2019.. Generating natural, intelligible speech from brain activity in motor, premotor, and inferior frontal cortices. . Front. Neurosci. 13::1267
    [Crossref] [Google Scholar]
  103. 103.
    Angrick M, Herff C, Mugler E, Tate MC, Slutzky MW, et al. 2019.. Speech synthesis from ECoG using densely connected 3D convolutional neural networks. . J. Neural Eng. 16:(3):036019
    [Crossref] [Google Scholar]
  104. 104.
    Kohler J, Ottenhoff MC, Goulis S, Angrick M, Colon AJ, et al. 2022.. Synthesizing speech from intracranial depth electrodes using an encoder-decoder framework. . Neuron. Behav. Data Anal. Theory 6:(1). https://doi.org/10.51628/001c.57524
    [Google Scholar]
  105. 105.
    Berezutskaya J, Freudenburg ZV, Vansteensel MJ, Aarnoutse EJ, Ramsey NF, van Gerven MAJ. 2023.. Direct speech reconstruction from sensorimotor brain activity with optimized deep learning models. . J. Neural Eng. 20::056010
    [Crossref] [Google Scholar]
  106. 106.
    Shigemi K, Komeiji S, Mitsuhashi T, Iimura Y, Suzuki H, et al. 2023.. Synthesizing speech from ECoG with a combination of transformer-based encoder and neural vocoder. . In 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 15. New York:: IEEE
    [Google Scholar]
  107. 107.
    Griffin D, Lim J. 1984.. Signal estimation from modified short-time Fourier transform. . IEEE Trans. Acoust. Speech Signal Proc. 32:(2):23643
    [Crossref] [Google Scholar]
  108. 108.
    van den Oord A, Dieleman S, Zen H, Simonyan K, Vinyals O, et al. 2016.. WaveNet: a generative model for raw audio. . arXiv:1609.03499 [cs.SD]
  109. 109.
    Yamamoto R, Song E, Kim J-M. 2020.. Parallel WaveGAN: a fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram. . In 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6199203. New York:: IEEE
    [Google Scholar]
  110. 110.
    Valin J-M, Skoglund J. 2019.. LPCNet: improving neural speech synthesis through linear prediction. . In 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 589195. New York:: IEEE
    [Google Scholar]
  111. 111.
    Anumanchipalli GK, Chartier J, Chang EF. 2019.. Speech synthesis from neural decoding of spoken sentences. . Nature 568:(7753):49398
    [Crossref] [Google Scholar]
  112. 112.
    Luo S, Angrick M, Coogan C, Candrea DN, Wyse-Sookoo K, et al. 2023.. Stable decoding from a speech BCI enables control for an individual with ALS without recalibration for 3 months. . Adv. Sci. 10::e2304853
    [Crossref] [Google Scholar]
  113. 113.
    Metzger SL, Liu JR, Moses DA, Dougherty ME, Seaton MP, et al. 2022.. Generalizable spelling using a speech neuroprosthesis in an individual with severe limb and vocal paralysis. . Nat. Commun. 13:(1):6510
    [Crossref] [Google Scholar]
  114. 114.
    Silva AB, Liu JR, Metzger SL, Bhaya-Grossman I, Dougherty ME, et al. 2024.. A bilingual speech neuroprosthesis driven by cortical articulatory representations shared between languages. . Nat. Biomed. Eng. 8::97791
    [Crossref] [Google Scholar]
  115. 115.
    Meng K, Goodarzy F, Kim E, Park YJ, Kim JS, et al. 2023.. Continuous synthesis of artificial speech sounds from human cortical surface recordings during silent speech production. . J. Neural Eng. 20::046019
    [Crossref] [Google Scholar]
  116. 116.
    Angrick M, Luo S, Rabbani Q, Candrea DN, Shah S, et al. 2024.. Online speech synthesis using a chronically implanted brain–computer interface in an individual with ALS. . Sci. Rep. 14:(1):9617
    [Crossref] [Google Scholar]
  117. 117.
    Wairagkar M, Card NS, Singer-Clark T, Hou X, Iacobacci C, et al. 2024.. An instantaneous voice synthesis neuroprosthesis. . bioRxiv 2024.08.14.607690. https://doi.org/10.1101/2024.08.14.607690
  118. 118.
    Arik SO, Chen J, Peng K, Ping W, Zhou Y. 2018.. Neural voice cloning with a few samples. . arXiv:1802.06006 [cs.CL]
  119. 119.
    Li YA, Han C, Raghavan V, Mischler G, Mesgarani N. 2023.. StyleTTS 2: towards human-level text-to-speech through style diffusion and adversarial training with large speech language models. . Adv. Neural Inform. Proc. Syst. 36::19594621
    [Google Scholar]
  120. 120.
    Bouton CE, Shaikhouni A, Annetta NV, Bockbrader MA, Friedenberg DA, et al. 2016.. Restoring cortical control of functional movement in a human with quadriplegia. . Nature 533:(7602):24750
    [Crossref] [Google Scholar]
  121. 121.
    Ajiboye AB, Willett FR, Young DR, Memberg WD, Murphy BA, et al. 2017.. Restoration of reaching and grasping movements through brain-controlled muscle stimulation in a person with tetraplegia: a proof-of-concept demonstration. . Lancet 389:(10081):182130
    [Crossref] [Google Scholar]
  122. 122.
    Canny E, Berezutskaya J. 2023.. The feasibility of combining communication BCIs with FES for individuals with locked-in syndrome. . arXiv:2306.03159 [q-bio.NC]
  123. 123.
    Lu J, Li Y, Zhao Z, Liu Y, Zhu Y, et al. 2023.. Neural control of lexical tone production in human laryngeal motor cortex. . Nat. Commun. 14::6917
    [Crossref] [Google Scholar]
  124. 124.
    Soroush PZ, Herff C, Ries SK, Shih JJ, Schultz T, Krusienski DJ. 2023.. The nested hierarchy of overt, mouthed, and imagined speech activity evident in intracranial recordings. . NeuroImage 269::119913
    [Crossref] [Google Scholar]
  125. 125.
    Kunz EM, Meschede-Krasa B, Kamdar F, Avansino D, Nason-Tomaszewski SR, et al. 2024.. Representation of verbal thought in motor cortex and implications for speech neuroprostheses. . bioRxiv 2024.10.04.616375v1. https://www.biorxiv.org/content/10.1101/2024.10.04.616375v1
  126. 126.
    Martin S, Brunner P, Holdgraf C, Heinze H-J, Crone NE, et al. 2014.. Decoding spectrotemporal features of overt and covert speech from the human cortex. . Front. Neuroeng. 7::14
    [Crossref] [Google Scholar]
  127. 127.
    Proix T, Delgado Saa J, Christen A, Martin S, Pasley BN, et al. 2022.. Imagined speech can be decoded from low- and cross-frequency intracranial EEG features. . Nat. Commun. 13:(1):48
    [Crossref] [Google Scholar]
  128. 128.
    Huth AG, De Heer WA, Griffiths TL, Theunissen FE, Gallant JL. 2016.. Natural speech reveals the semantic maps that tile human cerebral cortex. . Nature 532:(7600):45358
    [Crossref] [Google Scholar]
  129. 129.
    Verwoert M, Amigó-Vega J, Gao Y, Ottenhoff M, Kubben P, Herff C. 2025.. Whole-brain dynamics of articulatory, acoustic and semantic speech representations. . Commun. Biol. 8:432
    [Google Scholar]
  130. 130.
    Thompson DE, Quitadamo LR, Mainardi L, Laghari KUR, Gao S, et al. 2014.. Performance measurement for brain-computer or brain-machine interfaces: a tutorial. . J. Neural Eng. 11:(3):035001
    [Crossref] [Google Scholar]
  131. 131.
    Thomas TM, Singh A, Bullock LP, Liang D, Morse CW, et al. 2023.. Decoding articulatory and phonetic components of naturalistic continuous speech from the distributed language network. . J. Neural Eng. 20:(4):046030
    [Crossref] [Google Scholar]
  132. 132.
    Wu X, Wellington S, Fu Z, Zhang D. 2024.. Speech decoding from stereo-electroencephalography (sEEG) signals using advanced deep learning methods. . J. Neural Eng. 21::036055
    [Crossref] [Google Scholar]
  133. 133.
    Heelan C, Lee J, O'Shea R, Lynch L, Brandman DM, et al. 2019.. Decoding speech from spike-based neural population recordings in secondary auditory cortex of non-human primates. . Commun. Biol. 2:(1):466
    [Crossref] [Google Scholar]
  134. 134.
    Taal C, Hendricks R, Heusdens R, Jensen J. 2011.. An algorithm for intelligibility prediction of time–frequency weighted noisy speech. . IEEE Trans. Audio Speech Lang. Proc. 19::212536
    [Crossref] [Google Scholar]
  135. 135.
    Hickok G. 2014.. The architecture of speech production and the role of the phoneme in speech processing. . Lang. Cogn. Neurosci. 29:(1):220
    [Crossref] [Google Scholar]
  136. 136.
    Varshney S, Farias D, Brandman DM, Stavisky SD, Miller LM. 2023.. Using automatic speech recognition to measure the intelligibility of speech synthesized from brain signals. . In 11th International IEEE/EMBS Conference on Neural Engineering (NER), pp. 16. New York:: IEEE
    [Google Scholar]
  137. 137.
    Tremblay S, Shiller DM, Ostry DJ. 2003.. Somatosensory basis of speech production. . Nature 423:(6942):86669
    [Crossref] [Google Scholar]
  138. 138.
    Hickok G. 2012.. Computational neuroanatomy of speech production. . Nat. Rev. Neurosci. 13:(2):13545
    [Crossref] [Google Scholar]
  139. Parrell B, Lammert AC, Ciccarelli G, Quatieri TF. 2019. Current models of speech motor control: a control-theoretic overview of architectures and properties. J. Acoust. Soc. Am. 145(3):1456–81
  140. Dadarlat MC, Canfield RA, Orsborn AL. 2023. Neural plasticity in sensorimotor brain–machine interfaces. Annu. Rev. Biomed. Eng. 25:51–76
  141. Even-Chen N, Stavisky SD, Pandarinath C, Nuyujukian P, Blabe CH, et al. 2018. Feasibility of automatic error detect-and-undo system in human intracortical brain–computer interfaces. IEEE Trans. Biomed. Eng. 65:1771–84
  142. Sussillo D, Stavisky SD, Kao JC, Ryu SI, Shenoy KV. 2016. Making brain–machine interfaces robust to future neural variability. Nat. Commun. 7(1):13749
  143. Koyama S, Chase SM, Whitford AS, Velliste M, Schwartz AB, Kass RE. 2010. Comparison of brain-computer interface decoding algorithms in open-loop and closed-loop control. J. Comput. Neurosci. 29(1–2):73–87
  144. Cunningham JP, Nuyujukian P, Gilja V, Chestek CA, Ryu SI, Shenoy KV. 2011. A closed-loop human simulator for investigating the role of feedback control in brain-machine interfaces. J. Neurophysiol. 105(4):1932–49
  145. Jarosiewicz B, Masse NY, Bacher D, Cash SS, Eskandar E, et al. 2013. Advantages of closed-loop calibration in intracortical brain-computer interfaces for people with tetraplegia. J. Neural Eng. 10(4):046012
  146. Angrick M, Luo S, Rabbani Q, Candrea DN, Shah S, et al. 2023. Online speech synthesis using a chronically implanted brain-computer interface in an individual with ALS. Sci. Rep. 14:9617
  147. Graves A, Fernández S, Gomez F, Schmidhuber J. 2006. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In ICML '06: Proceedings of the 23rd International Conference on Machine Learning, pp. 369–76. New York: Assoc. Comput. Mach.
  148. Rabbani Q, Shah S, Milsap G, Fifer M, Hermansky H, Crone N. 2024. Iterative alignment discovery of speech-associated neural activity. J. Neural Eng. 21(4):046056
  149. Brandman DM, Hosman T, Saab J, Burkhart MC, Shanahan BE, et al. 2018. Rapid calibration of an intracortical brain–computer interface for people with tetraplegia. J. Neural Eng. 15(2):026007
  150. Pandarinath C, Nuyujukian P, Blabe CH, Sorice BL, Saab J, et al. 2017. High performance communication by people with paralysis using an intracortical brain-computer interface. eLife 6:e18554
  151. Collinger JL, Wodlinger B, Downey JE, Wang W, Tyler-Kabara EC, et al. 2013. High-performance neuroprosthetic control by an individual with tetraplegia. Lancet 381(9866):557–64
  152. Hochberg LR, Bacher D, Jarosiewicz B, Masse NY, Simeral JD, et al. 2012. Reach and grasp by people with tetraplegia using a neurally controlled robotic arm. Nature 485(7398):372–75
  153. Agudelo-Toro A, Michaels JA, Sheng W-A, Scherberger H. 2024. Accurate neural control of a hand prosthesis by posture-related activity in the primate grasping circuit. Neuron 112(24):P4115–29.E8
  154. Li J, Guo C, Fu L, Fan L, Chang EF, Li Y. 2024. Neural2Speech: a transfer learning framework for neural-driven speech reconstruction. In 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2200–04. New York: IEEE
  155. Safaie M, Chang JC, Park J, Miller LE, Dudman JT, et al. 2023. Preserved neural dynamics across animals performing similar behaviour. Nature 623(7988):765–71
  156. Silversmith DB, Abiri R, Hardy NF, Natraj N, Tu-Chan A, et al. 2020. Plug-and-play control of a brain–computer interface through neural map stabilization. Nat. Biotechnol. 39:326–35
  157. Wyse-Sookoo K, Luo S, Candrea D, Schippers A, Tippett DC, et al. 2024. Stability of ECoG high gamma signals during speech and implications for a speech BCI system in an individual with ALS: a year-long longitudinal study. J. Neural Eng. 21:046016
  158. Flint RD, Wright ZA, Scheid MR, Slutzky MW. 2013. Long term, stable brain machine interface performance using local field potentials and multiunit spikes. J. Neural Eng. 10(5):056005
  159. Nuyujukian P, Kao JC, Fan JM, Stavisky SD, Ryu SI, Shenoy KV. 2014. Performance sustaining intracortical neural prostheses. J. Neural Eng. 11(6):066003
  160. Perge JA, Homer ML, Malik WQ, Cash S, Eskandar E, et al. 2013. Intra-day signal instabilities affect decoding performance in an intracortical neural interface system. J. Neural Eng. 10(3):036004
  161. Wodlinger B, Downey JE, Tyler-Kabara EC, Schwartz AB, Boninger ML, Collinger JL. 2015. Ten-dimensional anthropomorphic arm control in a human brain–machine interface: difficulties, solutions, and limitations. J. Neural Eng. 12(1):016011
  162. Downey JE, Schwed N, Chase SM, Schwartz AB, Collinger JL. 2018. Intracortical recording stability in human brain–computer interface users. J. Neural Eng. 15(4):046016
  163. Orsborn AL, Moorman HG, Overduin SA, Shanechi MM, Dimitrov DF, Carmena JM. 2014. Closed-loop decoder adaptation shapes neural plasticity for skillful neuroprosthetic control. Neuron 82(6):1380–93
  164. Bishop W, Chestek CA, Gilja V, Nuyujukian P, Foster JD, et al. 2014. Self-recalibrating classifiers for intracortical brain-computer interfaces. J. Neural Eng. 11(2):026001
  165. Degenhart AD, Bishop WE, Oby ER, Tyler-Kabara EC, Chase SM, et al. 2020. Stabilization of a brain–computer interface via the alignment of low-dimensional spaces of neural activity. Nat. Biomed. Eng. 4:672–85
  166. Jarosiewicz B, Sarma AA, Bacher D, Masse NY, Simeral JD, et al. 2015. Virtual typing by people with tetraplegia using a self-calibrating intracortical brain-computer interface. Sci. Transl. Med. 7:313ra179
  167. Fan C, Hahn N, Kamdar F, Avansino D, Wilson G, et al. 2023. Plug-and-play stability for intracortical brain-computer interfaces: a one-year demonstration of seamless brain-to-text communication. Adv. Neural Inform. Proc. Syst. 36:42258–70
  168. Wilson GH, Willett FR, Stein EA, Kamdar F, Avansino DT, et al. 2023. Long-term unsupervised recalibration of cursor BCIs. bioRxiv 2023.02.03.527022. https://doi.org/10.1101/2023.02.03.527022
  169. Dekleva BM, Chowdhury RH, Batista AP, Chase SM, Yu BM, et al. 2024. Motor cortex retains and reorients neural dynamics during motor imagery. Nat. Hum. Behav. 8:729–42
  170. Guan C, Aflalo T, Kadlec K, Gámez de Leon J, Rosario ER, et al. 2023. Decoding and geometry of ten finger movements in human posterior parietal cortex and motor cortex. J. Neural Eng. 20(3):036020
  171. Masse NY, Jarosiewicz B, Simeral JD, Bacher D, Stavisky SD, et al. 2014. Non-causal spike filtering improves decoding of movement intention for intracortical BCIs. J. Neurosci. Methods 236:58–67
  172. Willsey MS, Shah NP, Avansino DT, Hahn NV, Jamiolkowski RM, et al. 2025. A high-performance brain–computer interface for finger decoding and quadcopter game control in an individual with paralysis. Nat. Med. 31:96–104
  173. Nair DR, Laxer KD, Weber PB, Murro AM, Park YD, et al. 2020. Nine-year prospective efficacy and safety of brain-responsive neurostimulation for focal epilepsy. Neurology 95(9):e1244–56
  174. van Stuijvenberg OC, Broekman MLD, Wolff SEC, Bredenoord AL, Jongsma KR. 2024. Developer perspectives on the ethics of AI-driven neural implants: a qualitative study. Sci. Rep. 14(1):7880
  175. Ienca M, Haselager P. 2016. Hacking the brain: brain–computer interfacing technology and the ethics of neurosecurity. Ethics Inf. Technol. 18(2):117–29
  176. Yuste R. 2023. Advocating for neurodata privacy and neurotechnology regulation. Nat. Protoc. 18:2869–75
  177. Burwell S, Sample M, Racine E. 2017. Ethical aspects of brain computer interfaces: a scoping review. BMC Med. Ethics 18(1):60
  178. Vlek RJ, Steines D, Szibbo D, Kübler A, Schneider M-J, et al. 2012. Ethical issues in brain–computer interface research, development, and dissemination. J. Neurol. Phys. Ther. 36(2):94–99