Stress Recognition in Speech – A Survey of The State of The Art
DOI: https://doi.org/10.52783/jns.v14.2153
Keywords: Survey, Stress Recognition, Stress Speech Database, Features Recognizing Stress
Abstract
Stress recognition from speech has attracted enormous attention from researchers and from industry, for instance in call centres that seek to recognize a customer's intention from speech. Recognizing stress from visual cues is easier than recognizing it from the speech signal, since the Lombard effect heavily alters normal speech. This paper presents a detailed survey of research works carried out specifically to recognize stress from the speech signal. It also reviews the databases compiled for stress recognition; the databases cited in this paper contain only speech signals intended for recognizing the level of stress. A detailed table summarizes the core contribution of each research work on stress recognition from speech.
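As an illustrative aside (not part of the survey itself): several of the works cited below (e.g., Rahurkar and Hansen; Zhou, Hansen and Kaiser) build stress features on the discrete Teager energy operator, defined sample-by-sample as ψ[x(n)] = x(n)² − x(n−1)·x(n+1). A minimal sketch of that operator, with an illustrative sinusoid as input:

```python
import math

def teager_energy(x):
    """Discrete Teager energy operator: psi[x(n)] = x(n)^2 - x(n-1)*x(n+1).

    Returns a sequence two samples shorter than the input, since the
    boundary samples lack one of their two neighbours.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# For a pure sinusoid A*sin(omega*n), the identity
# sin(a-b)*sin(a+b) = sin^2(a) - sin^2(b) makes the Teager energy
# constant and equal to A^2 * sin^2(omega).
signal = [math.sin(0.2 * math.pi * n) for n in range(100)]
teo = teager_energy(signal)
```

The operator's sensitivity to both amplitude and instantaneous frequency is what makes it attractive for detecting the excitation changes associated with stressed speech; the feature extraction pipelines in the cited works build band-wise statistics on top of this basic operator.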
References
Hansen, J.H.L., 1996. NATO IST-03 (formerly RSG. 10) speech under stress web page. Available from: http://cslr.colorado.edu/rspl/stress.html
McMahon, E., Cowie, R., Kasderidis, S., Taylor, J., Kollias, S., 2003. What chance that a DC could recognise hazardous mental states from sensor outputs? In: Tales of the Disappearing Computer, Santorini, Greece.
Rahurkar, M., Hansen, J.H.L., 2002. Frequency band analysis for stress detection using a Teager energy operator based feature. In: Proc. Internat. Conf. on Spoken Language Processing (ICSLP ’02), Vol. 3, pp. 2021–2024
Scherer, K.R., 2000b. Emotion effects on voice and speech: paradigms and approaches to evaluation. In: Proc. ISCA Workshop on Speech and Emotion, Belfast, invited paper.
Scherer, K.R., 2003. Vocal communication of emotion: a review of research paradigms. Speech Comm. 40, 227–256.
Fernandez, R., & Picard, R. W. (2003). Modeling driver’s speech under stress. Speech Communication, 40, 145–159.
Luig, Johannes, and Alois Sontacchi. "A speech database for stress monitoring in the cockpit." Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering 228.2 (2014): 284-296.
Sabo, Róbert, and Jakub Rajčáni. "Designing the database of speech under stress." Journal of Linguistics / Jazykovedný časopis 68.2 (2017): 326-335.
Rahurkar, Mandar A., et al. "Frequency band analysis for stress detection using a Teager energy operator-based feature." Seventh International Conference on Spoken Language Processing. 2002.
Besbes, Salsabil, and Zied Lachiri. "Classification of speech under stress based on cepstral features and one-class SVM." 2017 International Conference on Control, Automation and Diagnosis (ICCAD). IEEE, 2017.
http://emodb.bilderbar.info/index-1024.html
Yogesh, C. K., Hariharan, M., Yuvaraj, R., Ngadiran, R., Yaacob, S., & Polat, K. (2017). Bispectral features and mean shift clustering for stress and emotion recognition from natural speech. Computers & Electrical Engineering, 62, 676-691.
Yogesh, C. K., Hariharan, M., Ngadiran, R., Adom, A. H., Yaacob, S., & Polat, K. (2017). Hybrid BBO_PSO and higher order spectral features for emotion and stress recognition from natural speech. Applied Soft Computing, 56, 217-232.
Yogesh, C. K., Hariharan, M., Ngadiran, R., Adom, A. H., Yaacob, S., Berkai, C., & Polat, K. (2017). A new hybrid PSO assisted biogeography-based optimization for emotion and stress recognition from speech signal. Expert Systems with Applications, 69, 149-158.
Bou-Ghazale, S. E., & Hansen, J. H. (2000). A comparative study of traditional and newly proposed features for recognition of speech under stress. IEEE Transactions on speech and audio processing, 8(4), 429-442.
Bou-Ghazale, S. E., & Hansen, J. H. (1998). HMM-based stressed speech modeling with application to improved synthesis and recognition of isolated speech under stress. IEEE Transactions on Speech and Audio Processing, 6(3), 201-216.
Aswathi Varsha K T K, S. Lalitha, "Stress Recognition using Sparse Representation of Speech Signal for Deception Detection Applications in Indian Context." IEEE Conference, 2017.
Zhou, G., Hansen, J. H., & Kaiser, J. F. (1998). A new nonlinear feature for stress classification. In Third IEEE Nordic Signal Processing Symposium.
Yakoumaki, T., Kafentzis, G. P., & Stylianou, Y. (2014). Emotional speech classification using adaptive sinusoidal modelling. In Fifteenth Annual Conference of the International Speech Communication Association.
Reddy, G. G. Speech Analysis-Synthesis for Speaker Characteristic Modification.
McAulay, R., & Quatieri, T. (1986). Speech analysis/synthesis based on a sinusoidal representation. IEEE Transactions on Acoustics, Speech, and Signal Processing, 34(4), 744-754.
Sagayama, S., & Itakura, F. (1981). A composite sinusoidal model applied to spectral analysis of speech. Electronics and Communications in Japan (Part I: Communications), 64(2), 1-10.
M. You, C. Chen, J. Bu, J. Liu, J. Tao, Getting started with SUSAS: a speech under simulated and actual stress database, in: EUROSPEECH-97, vol. 4, 1997, pp. 1743–1746.
C. Williams, K. Stevens, Emotions and speech: some acoustical correlates, J. Acoust. Soc. Am. 52 (4 Pt 2) (1972) 1238–1250.
License

This work is licensed under a Creative Commons Attribution 4.0 International License.