Detecting Depression from Speech through an Attentive LSTM Network. / Zhao, Yan; Xie, Yue; LIANG, Ruiyu; Zhang, Li; Zhao, Li; Liu, Chengyu.

In: IEICE Transactions on Information and Systems, Vol. E104.D, No. 11, 01.11.2021, p. 2019-2023.

Research output: Contribution to journal › Article › peer-review

Published

Standard

Detecting Depression from Speech through an Attentive LSTM Network. / Zhao, Yan; Xie, Yue; LIANG, Ruiyu; Zhang, Li; Zhao, Li; Liu, Chengyu.

In: IEICE Transactions on Information and Systems, Vol. E104.D, No. 11, 01.11.2021, p. 2019-2023.

Research output: Contribution to journal › Article › peer-review

Harvard

Zhao, Y, Xie, Y, LIANG, R, Zhang, L, Zhao, L & Liu, C 2021, 'Detecting Depression from Speech through an Attentive LSTM Network', IEICE Transactions on Information and Systems, vol. E104.D, no. 11, pp. 2019-2023. https://doi.org/10.1587/transinf.2020EDL8132

APA

Zhao, Y., Xie, Y., LIANG, R., Zhang, L., Zhao, L., & Liu, C. (2021). Detecting Depression from Speech through an Attentive LSTM Network. IEICE Transactions on Information and Systems, E104.D(11), 2019-2023. https://doi.org/10.1587/transinf.2020EDL8132

Vancouver

Zhao Y, Xie Y, LIANG R, Zhang L, Zhao L, Liu C. Detecting Depression from Speech through an Attentive LSTM Network. IEICE Transactions on Information and Systems. 2021 Nov 1;E104.D(11):2019-2023. https://doi.org/10.1587/transinf.2020EDL8132

Author

Zhao, Yan ; Xie, Yue ; LIANG, Ruiyu ; Zhang, Li ; Zhao, Li ; Liu, Chengyu. / Detecting Depression from Speech through an Attentive LSTM Network. In: IEICE Transactions on Information and Systems. 2021 ; Vol. E104.D, No. 11. pp. 2019-2023.

BibTeX

@article{920ba78783f4458494a22a4335e933be,
title = "Detecting Depression from Speech through an Attentive LSTM Network",
abstract = "Depression endangers people{\textquoteright}s health conditions and afffects the social order as a mental disorder. As an efficient diagnosis of depression, automatic depression detection has attracted lots of researcher{\textquoteright}s interest. This study presents an attention-based Long Short-Term Memory (LSTM) model for depression detection to make full use of the difference between depression and non-depression between timeframes. The proposed model uses frame-level features, which capture the temporal information of depressive speech, to replace traditional statistical features as an input of the LSTM layers. To achieve more multi-dimensional deep feature representations, the LSTM output is then passed on attention layers on both time and feature dimensions. Then, we concat the output of the attention layers and put the fused feature representation into the fully connected layer. At last, the fully connected layer{\textquoteright}s output is passed on to softmax layer. Experiments conducted on the DAIC-WOZ database demonstratethat the proposed attentive LSTM model achieves an average accuracy rate of 90.2% and outperforms the traditional LSTM network and LSTM with local attention by 0.7% and 2.3%, respectively, which indicates its feasibility.",
author = "Yan Zhao and Yue Xie and Ruiyu LIANG and Li Zhang and Li Zhao and Chengyu Liu",
year = "2021",
month = nov,
day = "1",
doi = "10.1587/transinf.2020EDL8132",
language = "English",
volume = "E104.D",
pages = "2019--2023",
journal = "IEICE Transactions on Information and Systems",
issn = "1745-1361",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "11",

}

RIS

TY - JOUR

T1 - Detecting Depression from Speech through an Attentive LSTM Network

AU - Zhao, Yan

AU - Xie, Yue

AU - LIANG, Ruiyu

AU - Zhang, Li

AU - Zhao, Li

AU - Liu, Chengyu

PY - 2021/11/1

Y1 - 2021/11/1

N2 - As a mental disorder, depression endangers people’s health and affects social order. As an efficient means of diagnosing depression, automatic depression detection has attracted considerable research interest. This study presents an attention-based Long Short-Term Memory (LSTM) model for depression detection that makes full use of the differences between depressed and non-depressed speech across time frames. The proposed model uses frame-level features, which capture the temporal information of depressive speech, instead of traditional statistical features as the input to the LSTM layers. To obtain richer multi-dimensional deep feature representations, the LSTM output is then passed to attention layers along both the time and feature dimensions. The outputs of the attention layers are concatenated, and the fused feature representation is fed into a fully connected layer. Finally, the fully connected layer’s output is passed to a softmax layer. Experiments conducted on the DAIC-WOZ database demonstrate that the proposed attentive LSTM model achieves an average accuracy of 90.2% and outperforms the traditional LSTM network and the LSTM with local attention by 0.7% and 2.3%, respectively, which indicates its feasibility.

AB - As a mental disorder, depression endangers people’s health and affects social order. As an efficient means of diagnosing depression, automatic depression detection has attracted considerable research interest. This study presents an attention-based Long Short-Term Memory (LSTM) model for depression detection that makes full use of the differences between depressed and non-depressed speech across time frames. The proposed model uses frame-level features, which capture the temporal information of depressive speech, instead of traditional statistical features as the input to the LSTM layers. To obtain richer multi-dimensional deep feature representations, the LSTM output is then passed to attention layers along both the time and feature dimensions. The outputs of the attention layers are concatenated, and the fused feature representation is fed into a fully connected layer. Finally, the fully connected layer’s output is passed to a softmax layer. Experiments conducted on the DAIC-WOZ database demonstrate that the proposed attentive LSTM model achieves an average accuracy of 90.2% and outperforms the traditional LSTM network and the LSTM with local attention by 0.7% and 2.3%, respectively, which indicates its feasibility.

U2 - 10.1587/transinf.2020EDL8132

DO - 10.1587/transinf.2020EDL8132

M3 - Article

VL - E104.D

SP - 2019

EP - 2023

JO - IEICE Transactions on Information and Systems

JF - IEICE Transactions on Information and Systems

SN - 1745-1361

IS - 11

ER -
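
Model sketch

The abstract describes a pipeline of frame-level features, an LSTM, attention along both the time and feature dimensions, concatenation of the two attended representations, a fully connected layer, and a softmax. The following PyTorch snippet is a minimal illustration of that pipeline, not the authors' code: the feature dimension (40), hidden size (128), number of classes (2), and the exact form of the two attention branches are assumptions for the sake of a runnable example.

# Hypothetical sketch of the attentive LSTM described in the abstract.
# Sizes and the attention formulations are assumptions, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveLSTM(nn.Module):
    def __init__(self, input_dim=40, hidden_dim=128, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.time_attn = nn.Linear(hidden_dim, 1)           # scores over frames
        self.feat_attn = nn.Linear(hidden_dim, hidden_dim)  # scores over feature dims
        self.fc = nn.Linear(hidden_dim * 2, num_classes)

    def forward(self, x):
        # x: (batch, frames, input_dim) frame-level acoustic features
        h, _ = self.lstm(x)                                  # (batch, frames, hidden_dim)

        # Time attention: weight each frame, then sum over time.
        t_scores = torch.softmax(self.time_attn(h), dim=1)   # (batch, frames, 1)
        time_repr = (t_scores * h).sum(dim=1)                # (batch, hidden_dim)

        # Feature attention: weight each hidden dimension of the mean-pooled output.
        pooled = h.mean(dim=1)                               # (batch, hidden_dim)
        f_scores = torch.softmax(self.feat_attn(pooled), dim=-1)
        feat_repr = f_scores * pooled                        # (batch, hidden_dim)

        # Concatenate the two attended representations and classify with softmax.
        fused = torch.cat([time_repr, feat_repr], dim=-1)
        return F.log_softmax(self.fc(fused), dim=-1)

# Example: 8 utterances, 300 frames each, 40-dimensional features per frame.
model = AttentiveLSTM()
logits = model(torch.randn(8, 300, 40))
print(logits.shape)  # torch.Size([8, 2])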