Stress Recognition in Speech – A Survey of The State of The Art

Authors

  • L Lavanya
  • N Vasavya

DOI:

https://doi.org/10.52783/jns.v14.2153

Keywords:

Survey, Stress Recognition, Stress Speech Database, Features for stress recognition

Abstract

Stress recognition from speech has attracted considerable attention from researchers and from industry, for example call centres that want to recognize a customer's intention from speech. Recognizing stress from visual cues is easier than recognizing it from the speech signal, because the Lombard effect heavily alters normal speech. This paper presents a detailed survey of research devoted specifically to recognizing stress from the speech signal. It also covers the databases built specifically for stress recognition: the recordings in the databases cited here are intended only for recognizing the level of stress from speech. A detailed table summarizes the core contribution of each research work on recognizing stress from the speech signal.

Published

2025-03-15

How to Cite

Lavanya L, Vasavya N. Stress Recognition in Speech – A Survey of The State of The Art. J Neonatal Surg [Internet]. 2025 Mar 15 [cited 2025 Sep 12];14(5S):793-8. Available from: https://www.jneonatalsurg.com/index.php/jns/article/view/2153