Abstract

Privacy is an important consideration when sharing clinical data, which often contain sensitive information. Adequate protection to safeguard patient privacy and to increase public trust in biomedical research is paramount. This review covers topics in policy and technology in the context of clinical data sharing. We review policy articles related to (a) the Common Rule, HIPAA privacy and security rules, and governance; (b) patients’ viewpoints and consent practices; and (c) research ethics. We identify key features of the revised Common Rule and the most notable changes since its previous version. We address data governance for research in addition to the increasing emphasis on ethical and social implications. Research ethics topics include data sharing best practices, use of data from populations of low socioeconomic status (SES), recent updates to institutional review board (IRB) processes to protect human subjects’ data, and important concerns about the limitations of current policies to address data deidentification. In terms of technology, we focus on articles that have applicability in real-world health care applications: deidentification methods that comply with HIPAA, data anonymization approaches to address well-acknowledged issues in deidentified data, encryption methods to safeguard data analyses, and privacy-preserving predictive modeling. The first two technology topics are mostly relevant to methodologies that attempt to sanitize structured or unstructured data. The third topic includes analysis on encrypted data. The last topic includes various mechanisms to build statistical models without sharing raw data.

/content/journals/10.1146/annurev-biodatasci-080917-013416
2018-07-20
2024-05-04

  • Article Type: Review Article