AI and ML for Depression and Suicide Screening

From Clinfowiki
Revision as of 23:00, 20 October 2019 by Derekmrichardson (Talk | contribs)


AI/ML for Depression and Suicide Prediction

   Depression is a serious global health issue. The World Health Organization (WHO) estimates that 800,000 people per year die by suicide.1 This equates to one death by suicide every 40 seconds.1 The current clinical standard for depression screening is a positive Patient Health Questionnaire (PHQ-9) score with at least two weeks of symptoms.2
   In recent years there have been multiple efforts to harness the power of Artificial Intelligence (AI) and Machine Learning (ML) to increase detection of those at risk of suicide. This article will detail some recent efforts and their results. 

Stanford Project

   A study conducted by computer scientists at Stanford University [1] utilized 3D facial expressions and spoken language to predict depression.3 The researchers gathered their data from the publicly available Distress Analysis Interview Corpus (DAIC) dataset. This dataset includes 3D facial scans and audio from depressed and non-depressed patients along with each subject’s corresponding PHQ-9 score. The study was able to detect major depressive disorder with a sensitivity of 83.3% and a specificity of 82.6%.3
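   Sensitivity and specificity, the two figures reported for this study, are both read off a confusion matrix. The following is a minimal sketch of that calculation. The function name and the toy labels are hypothetical and are not taken from the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = depressed."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # depressed, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # depressed, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not depressed, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # not depressed, flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels for illustration only (not the DAIC data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

   Sensitivity is the share of depressed subjects the model catches, while specificity is the share of non-depressed subjects it correctly clears. A screening tool generally needs both to be high to be useful.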
   For this study the patient’s speech was broken down into two components. One was an audio recording, visualized as a log-mel spectrogram. The other was a text transcription of the patient’s speech. To improve the model’s prediction capabilities, the researchers used the novel approach of embedding the audio data at the sentence level, as opposed to the typical word- or phoneme-level embeddings. The study states, “This allows us to capture long-term acoustic, visual, and linguistic elements.”3 The study then used a causal convolutional network (C-CNN) to analyze the audio data, as the authors found that this model outperformed a Recurrent Neural Network (RNN) on longer sequences.
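   The defining property of a causal convolution is that the output at time step t depends only on inputs at or before t, never on future inputs. The sketch below illustrates that property on a plain list of numbers. It is not the authors' architecture: the function name, kernel values, and dilation parameter are assumptions chosen for illustration.

```python
def causal_conv1d(signal, kernel, dilation=1):
    """Causal 1-D convolution.

    output[t] = sum_k kernel[k] * signal[t - k * dilation],
    with positions before the start of the signal treated as zero.
    """
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # implicit zero padding on the left only
                acc += weight * signal[idx]
        out.append(acc)
    return out


# Hypothetical inputs: a two-tap averaging kernel over a short sequence.
out = causal_conv1d([1.0, 2.0, 3.0, 4.0], [0.5, 0.5])
print(out)

# Changing the last (future) sample leaves all earlier outputs untouched.
changed = causal_conv1d([1.0, 2.0, 3.0, 99.0], [0.5, 0.5])
print(changed[:3] == out[:3])
```

   Increasing the dilation widens how far back each output can look without adding more weights, which is one common way such networks cover long sequences.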
   The authors claim their study highlights the potential of combining speech recognition, computer vision, and natural language processing (NLP) to accurately predict depression, and they posit that such methods could be deployed cheaply on existing cell phone technology.3
   A notable caveat in this study is that the test subjects were responding to a physician-controlled digital avatar that was attempting to diagnose depression. The authors counter that “the data was collected from human-to-computer interviews and not human-to-human. Compared to a human interviewer, research has shown that patients report lower fear of disclosure and display more emotional intensity when conversing with an avatar.”3
   The authors conclude, “Our work is meant to augment existing clinical methods and not issue a formal diagnosis” and “Mitigating bias is outside the scope of our work but is crucial to providing culturally sensitive diagnosis and treatment.”3 

MIT Project

   A research team at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) [2] published the results of a study in 2018 that used ML to predict depression risk by parsing both text and speech.4 For this study the researchers drew on the previously mentioned DAIC dataset, gathering data on 142 subjects who underwent depression screening.
   The study consisted of three experiments. The first assessed the predictive value of context-free audio and text data. The second used a weighted model that considered what type of question was asked, independent of when in the interview it was asked. The third was run with a Long Short-Term Memory (LSTM) neural network. [3] For this final experiment the researchers created one model with separate LSTMs for audio and text, and another model that combined both types of data input.
   The best performing model in the MIT study was the LSTM that combined both audio and text data. This model yielded an F1 measure of 0.77 and recall of 0.83.4  The authors concluded that this reflects that the audio and text data “provided additional discriminative power” and “that they contained complementary information.”4 
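   Because F1 is the harmonic mean of precision and recall, the two reported figures imply a precision value. The sketch below works through that arithmetic. The function name is hypothetical, and since the inputs are rounded to two decimals, the result is only an approximation of the model's precision.

```python
def precision_from_f1_recall(f1, recall):
    """Solve F1 = 2 * P * R / (P + R) for precision P."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("F1 must be smaller than 2 * recall")
    return f1 * recall / denom


# Figures reported for the combined audio and text LSTM.
implied_precision = precision_from_f1_recall(f1=0.77, recall=0.83)
print(f"implied precision is roughly {implied_precision:.2f}")
```

   A recall above the implied precision means the model leans toward catching depressed subjects at the cost of some false alarms, which is usually the preferred trade-off for a screening tool.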
   Lead researcher Tuka Alhanai told the MIT News Office [4] that a key innovation of this model is the ability to detect patterns of depression without context.5 Alhanai stated, “We call it ‘context-free,’ because you’re not putting any constraints into the types of questions you’re looking for and the type of responses to those questions.”5
   Alhanai continued, “If you want to deploy [depression-detection] models in scalable way … you want to minimize the amount of constraints you have on the data you’re using. You want to deploy it in any regular conversation and have the model pick up, from the natural interaction, the state of the individual.”5
   Another interesting finding from the study was that the model needed over four times as much audio data (average 30 sequences) as compared to text (average 7 sequences) when predicting depression.4  Alhanai explained, “That implies that the patterns in words people use that are predictive of depression happen in shorter time span in text than in audio.”5  

Facebook Project

   Facebook has created an algorithm to detect suicide risk in an attempt to prevent the live-streaming of suicides.6  A Business Insider article from 2019 [5] details how this algorithm combs almost every post on Facebook, and rates all the content on a scale from zero to one, with one expressing the highest likelihood of "imminent harm."6 
   The article details how Health Insurance Portability and Accountability Act (HIPAA) protections do not apply to Facebook’s generated data because Facebook is not providing healthcare services and is therefore exempt from the need to be HIPAA compliant.6

   Per the article, “Facebook did not respond when asked how long and in what form data about higher suicide risk scores and subsequent interventions are stored.”6 If a post is flagged for suicide risk, it is forwarded to their content moderators. However, the article states, “Facebook would not go into specifics on the training content moderators receive around suicide but insist that they are trained to accurately screen posts for potential suicide risk.”6 
   Facebook CEO Mark Zuckerberg wrote a post [6] on the initiative and said, "In the last year, we've helped first responders quickly reach around 3,500 people globally who needed help."  Despite Mr. Zuckerberg’s excitement, the use of this algorithm in the European Union was halted due to special privacy protections under the recently passed General Data Protection Regulation (GDPR).6 [7]

Other Notable Projects

   There have been many other recent studies using machine learning techniques to predict suicide risk. A 2017 study at Vanderbilt University [8] designed an ML algorithm that predicted suicide risk using health and demographic data extracted from Electronic Health Records (EHRs).7 The Vanderbilt model predicted suicide risk with an accuracy rate of 84–92% within 1 week, and 80–86% within 2 years.7
   A 2018 Korean study led by Seunghyong Ryu [9], published in Psychiatry Investigation, detailed a random forest machine learning algorithm to identify individuals at risk of suicide within the general population. This model used 15 features of public health data, including variables such as sex, education, age, and reasons for unemployment, and achieved an area under the curve (AUC) of 0.85.8
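   The AUC reported for this model can be read as the probability that a randomly chosen at-risk individual receives a higher score than a randomly chosen individual who is not at risk. The sketch below computes AUC directly from that pairwise definition. The function name and the toy scores are hypothetical and are unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy risk scores for illustration only.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

   An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a value of 0.85 indicates that the model ranks most at-risk individuals above most others.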
   Another 2018 study, published in BMC Medical Informatics and Decision Making [10], detailed an algorithm based on a convolutional neural network (CNN) [11] that recognizes suicide-related psychiatric stressors in Twitter posts.9
   The Department of Veterans Affairs (VA) is in the process of refining an AI algorithm [12] to predict suicide risk along with another to predict opiate overdose risk.10 

Future Applications and Legal Ramifications

   The increasing accuracy of these depression prediction models makes it likely that they will soon be utilized in the general population. It will be important to ensure that the models do not exhibit bias and that they perform adequately when used outside of a controlled study environment. 
   If these technologies do indeed become more prevalent, questions will inevitably arise about which regulations and legal frameworks apply to such algorithms. In April 2019 the Food and Drug Administration (FDA) released a discussion paper [13] on proposed regulatory frameworks for AI/ML-based software as a medical device.11


1 World Health Organization. Retrieved from:
2 American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Washington, DC: Author.
3 Haque, A., et al. “Measuring Depression Symptom Severity from Spoken Language and 3D Facial Expressions.” November 27, 2018. Retrieved from:
4 Alhanai, T., et al. “Detecting Depression with Audio/Text Sequence Modeling Interviews.” Interspeech 2-6 September 2018, 1716-1720.
5 Matheson, R. (2018, August 29). Model can more naturally detect depression in conversations. MIT News. Retrieved from:
6 Goggin, B. (2019, Jan 6). Inside Facebook’s suicide algorithm: Here’s how the company uses artificial intelligence to predict your mental health state from your posts. Business Insider. Retrieved from:
7 Walsh, C., et al. “Predicting Risk of Suicide Attempts Over Time Through Machine Learning.” Association for Psychological Science. 2017: 1-12.
8 Ryu, S., et al. “Use of Machine Learning Algorithm to Predict Individuals with Suicide Ideation in the General Population.” Psychiatry Investig 2018;15(11):1030-1036.
9 Du et al. “Extracting Psychiatric Stressors for suicide from social media using deep learning.” BMC Medical Informatics and Decision Making 2018, 18(Suppl 2):43.
10 Ravindranath, M. (2019, June 25). How the VA uses algorithms to predict suicide. Politico. Retrieved from:
11 U.S. Food and Drug Administration (2019). Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML) – Based Software as a Medical Device (SaMD). Retrieved from:

Submitted by Derek Richardson