LightSAED: A Robust, Lightweight, and Culturally Adaptable Cross-Modal Transformer for Sarcasm-Aware Emotion and Intensity Detection in Multimodal Tweets

Authors

  • Sanjeet Kumar
  • Jameel Ahmad

DOI:

https://doi.org/10.63682/jns.v14i14S.4370

Keywords:

Multimodal Emotion Detection, Cross-Modal Transformer, Sarcasm Detection, LightSAED

Abstract

Detecting emotions on social media is crucial for applications such as mental health monitoring and brand analytics. However, existing models often overlook inter-modal interactions, disregard cultural variation, and rely on computationally expensive architectures. We propose LightSAED, a lightweight cross-modal transformer that fuses textual, visual, and emoji data to detect emotion, sarcasm, and emotional intensity in tweets. LightSAED introduces three key innovations: (1) a dynamic cross-modal attention mechanism for effective multimodal fusion, (2) a dedicated sarcasm detection sub-layer trained with explicit supervision, and (3) a hierarchical cultural adaptation layer leveraging region-specific embeddings based on sociolinguistic features. We also present TwemoInt++, a curated dataset of more than 50,000 tweets annotated for emotion, sarcasm, and intensity, and stratified into ten culturally defined regions. Extensive experiments show that LightSAED outperforms state-of-the-art baselines, improving emotion classification accuracy by 6.2% and sarcasm detection F1-score by 9.8%. Robustness tests against noisy data and adversarial examples further validate its reliability. To enhance efficiency, pruning and 8-bit quantization reduce inference time by 42% and model size by 63%, enabling real-time deployment on resource-constrained edge devices. Despite these advances, challenges remain in handling ambiguous cultural cues and low-resource languages, leaving room for future enhancements.
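
A minimal PyTorch sketch of the dynamic cross-modal attention in innovation (1) is given below: text tokens attend over visual and emoji tokens, and a learned gate weights the three modality summaries per example. The module name, dimensions, and gating scheme here are illustrative assumptions, not the authors' released implementation.

```python
# Hedged sketch of dynamic cross-modal attention fusion.
# Module name, dims, and the softmax gate are assumptions for illustration.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        # Text queries attend over visual and emoji key/value tokens.
        self.text_to_image = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.text_to_emoji = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Learned gate decides how much each modality contributes per example.
        self.gate = nn.Sequential(nn.Linear(3 * dim, 3), nn.Softmax(dim=-1))

    def forward(self, text, image, emoji):
        # text/image/emoji: (batch, seq_len, dim) token embeddings per modality
        img_ctx, _ = self.text_to_image(text, image, image)
        emo_ctx, _ = self.text_to_emoji(text, emoji, emoji)
        # Mean-pool each stream to a (batch, dim) summary vector.
        t, i, e = text.mean(1), img_ctx.mean(1), emo_ctx.mean(1)
        w = self.gate(torch.cat([t, i, e], dim=-1))  # (batch, 3) modality weights
        fused = w[:, 0:1] * t + w[:, 1:2] * i + w[:, 2:3] * e
        return fused  # (batch, dim) representation for downstream heads

# Usage: fused = CrossModalFusion()(torch.randn(2, 12, 256),
#                                   torch.randn(2, 49, 256),
#                                   torch.randn(2, 4, 256))  # -> (2, 256)
```

The per-example softmax gate is what would make such a fusion "dynamic": a tweet whose image carries little affective signal can be down-weighted at inference time rather than fixed at training time.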
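
The efficiency claims (42% faster inference, 63% smaller model) rest on pruning plus 8-bit quantization, which can be prototyped with standard PyTorch utilities as sketched below. The toy model, the 40% sparsity level, and the choice of dynamic quantization are assumptions; the abstract does not specify the exact compression recipe.

```python
# Hedged sketch of a pruning + 8-bit quantization pipeline.
# The model, sparsity level, and quantization mode are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 8))

# 1) Magnitude pruning: zero out the smallest 40% of weights per linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.4)
        prune.remove(module, "weight")  # bake the sparsity into the weights

# 2) Dynamic 8-bit quantization of linear layers for faster CPU inference.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# 3) Export the compressed model for edge deployment.
torch.save(quantized.state_dict(), "lightsaed_int8.pt")
```

Dynamic quantization converts linear-layer weights to int8 and quantizes activations on the fly, which suits the CPU-bound, resource-constrained edge deployment the abstract targets.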


Published

2025-04-23

How to Cite

Kumar S, Ahmad J. LightSAED: A Robust, Lightweight, and Culturally Adaptable Cross-Modal Transformer for Sarcasm-Aware Emotion and Intensity Detection in Multimodal Tweets. J Neonatal Surg [Internet]. 2025 Apr 23 [cited 2025 Oct 28];14(14S):832-41. Available from: https://www.jneonatalsurg.com/index.php/jns/article/view/4370