Modified Grasshopper Optimization Algorithm (MGOA) Based Feature Selection and Convolutional Long Short-Term Memory (Conv-LSTM) for Diagnosis of Thyroid Cancer
DOI:
https://doi.org/10.63682/jns.v14i19S.4789
Keywords:
Thyroid Cancer (TC), Deep Learning (DL), Modified Grasshopper Optimization Algorithm (MGOA), Convolutional Long Short-Term Memory (Conv-LSTM), Synthetic Minority Over-sampling Technique (SMOTE), Data Mining (DM), Boosting, SMOTE, and Tomek links (BST)
Abstract
Thyroid Cancer (TC) is a prevalent endocrine carcinoma that affects the thyroid gland. Despite significant efforts to improve diagnosis, thyroidectomy remains the principal therapeutic option. An effective procedure without unnecessary side effects depends on an accurate preoperative diagnosis. The current human evaluation of Thyroid Nodule (TN) malignancy is prone to error, so it cannot always ensure a precise preoperative diagnosis. Medical Data Analysis (MDA) problems can be addressed with Data Mining (DM) algorithms. DM supports complex data analysis and helps surface problems in the dataset. With its techniques for classification, clustering, association, and so on, DM offers significant assistance with thyroid datasets. Although applying Deep Learning (DL) techniques to predict and identify TC has considerable scope for clinical utility, their development and application present a number of difficulties. DL approaches require data pre-processing to find critical features and remove irrelevant, redundant, or noisy ones, which reduces the size of the feature space. The proposed work has five main stages: data collection (DC), data pre-processing, feature selection (FS), data classification, and performance evaluation. First, the TC risk prediction dataset is collected from the Kaggle online repository. This dataset mimics real-world TC risk factors and includes 212,691 records with 23 features. Data pre-processing techniques then handle Data Encoding (DE), data resampling, data normalisation (DN), and missing data. Normalisation is performed with Min-Max normalisation (MMN), also known as min-max scaling. Boosting, the Synthetic Minority Over-sampling Technique (SMOTE), and Tomek links (together, BST) are used as data balancing techniques to address the class imbalance problem.
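The normalisation and Tomek-link steps mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `min_max_scale` and `tomek_links` are hypothetical, the toy data is invented for illustration, and the sketch assumes purely numeric features and Euclidean distance. Min-max scaling maps each feature to [0, 1] via x' = (x - min) / (max - min); a Tomek link is a pair of samples from different classes that are each other's nearest neighbour.

```python
def min_max_scale(rows):
    """Rescale every column of a list of numeric rows to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        # A constant column has no spread, so it is mapped to 0.0 (an assumption).
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def tomek_links(points, labels):
    """Return pairs (i, j), i < j, that are mutual nearest neighbours
    with different class labels (squared Euclidean distance)."""
    def nearest(i):
        return min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(points[i], points[j])),
        )

    nn = [nearest(i) for i in range(len(points))]
    return [
        (i, j) for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


# Toy data, invented for illustration only.
scaled = min_max_scale([[10, 200], [20, 400], [30, 300]])
links = tomek_links([[0.0], [0.1], [1.0], [1.05]], [0, 1, 1, 1])
```

In a Tomek-link cleaning step, the majority-class member of each returned pair (or both members) is typically removed after SMOTE has generated synthetic minority samples; which variant the paper uses is not stated in the abstract.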
Next, the Modified Grasshopper Optimisation Algorithm (MGOA) selects the most pertinent features (an optimal reduced feature subset) for TC detection. MGOA is a nature-inspired algorithm (NIA) that can effectively explore and exploit the search space (SS). TN malignancy is then predicted by classifying the data with a Convolutional Long Short-Term Memory (Conv-LSTM) network. This improves estimation and makes the parameters easier to interpret. Finally, efficiency is assessed with Accuracy (AC), Specificity (SP), Precision (PR), Recall (R) or Sensitivity (SE), and the Area Under the Receiver Operating Characteristic (AUROC) curve. According to the results, the proposed model outperformed the other methods currently in use in terms of AC.
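The evaluation metrics listed above follow standard definitions, shown here as a small sketch. The function names `confusion_metrics` and `auroc` and the example counts and scores are hypothetical and are not results from the paper. AUROC is computed here as the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as half.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.
    Assumes every denominator is non-zero."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),        # also called sensitivity
    }


def auroc(y_true, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.
    Assumes labels are 0/1 and both classes are present."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Illustrative counts and scores only, not taken from the study.
metrics = confusion_metrics(tp=40, fp=10, tn=45, fn=5)
area = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise AUROC is quadratic in the number of samples, so for a dataset of the size described here a rank-based or library implementation would be the more practical choice.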
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
Terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.