Predicting the risk of suicide by analyzing the text of clinical notes

PLoS One. 2014 Jan 28;9(1):e85733. doi: 10.1371/journal.pone.0085733. eCollection 2014.

Abstract

We developed linguistics-driven prediction models to estimate the risk of suicide. These models were generated from unstructured clinical notes taken from a national sample of U.S. Veterans Administration (VA) medical records. We created three matched cohorts: veterans who committed suicide, veterans who used mental health services and did not commit suicide, and veterans who did not use mental health services and did not commit suicide during the observation period (n = 70 in each group). From the clinical notes, we generated datasets of single keywords and multi-word phrases, and constructed prediction models using a machine-learning algorithm based on a genetic programming framework. The resulting inference accuracy was consistently 65% or more. Our data therefore suggest that computerized text analytics can be applied to unstructured medical records to estimate the risk of suicide. The resulting system could potentially allow clinicians to screen seemingly healthy patients at the primary care level and to continuously evaluate suicide risk among psychiatric patients.
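As a rough illustration of what "datasets of single keywords and multi-word phrases" might look like in practice, the sketch below counts single words and two-word phrases in a piece of free text. It is not the authors' pipeline: the function names `tokenize` and `extract_features`, the regular-expression tokenizer, the choice of two-word phrases, and the example note are all assumptions made for illustration, and the genetic programming step is not shown.

```python
import re
from collections import Counter


def tokenize(note):
    """Lowercase a note and split it into simple word tokens (assumed scheme)."""
    return re.findall(r"[a-z']+", note.lower())


def extract_features(note, max_n=2):
    """Count single words (n=1) and multi-word phrases (up to max_n words)."""
    tokens = tokenize(note)
    counts = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of n tokens across the note.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Made-up text for illustration only; not taken from any medical record.
example_note = "Patient reports poor sleep. Patient denies pain."
features = extract_features(example_note)
```

In this sketch, phrases are allowed to span sentence boundaries (for example "sleep patient"); a real system would likely segment sentences first and filter out uninformative terms before passing the counts to a classifier.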

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Humans
  • Medical Records / statistics & numerical data
  • Mental Health Services / statistics & numerical data
  • Risk Factors
  • Suicide / statistics & numerical data
  • Suicide Prevention*
  • United States
  • United States Department of Veterans Affairs / statistics & numerical data
  • Veterans / psychology
  • Veterans / statistics & numerical data