Sequence Entropy And Markov Modelling Of Mutated CFTR Genes: Insights From Multiple Sequence Alignment

Authors

  • S. D. Jeniffer

Keywords:

Cystic Fibrosis, CFTR Gene, Gene Mutation, Multiple Sequence Alignment, Shannon Entropy, Markov Chain Model, Genotype-Phenotype Correlation

Abstract

Mutations in the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) gene are central to the development of cystic fibrosis, a severe genetic disorder affecting epithelial function. This study analyses 35 mutated CFTR gene sequences to explore underlying sequence variation and structural patterns. Multiple sequence alignment was performed to organize the sequences, followed by Shannon entropy analysis to identify regions of high variability and conservation. Hierarchical clustering provided insights into relationships among the mutated sequences, while sequence logo plots visually highlighted nucleotide distribution at each alignment position. To model the sequence behaviour statistically, a first-order Markov chain was constructed, capturing transition probabilities between nucleotides across the aligned sequences. Together, these methods offer a comprehensive view of the mutational landscape within the CFTR gene. The findings enhance our understanding of sequence-level mutation dynamics and provide a foundation for further computational modelling and genotype-phenotype correlation studies in cystic fibrosis research.
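
As a rough illustration of two of the methods named above, the following minimal Python sketch computes per-column Shannon entropy and maximum-likelihood first-order Markov transition probabilities from an alignment. It is not the author's implementation. The toy alignment, the function names, and the gap handling (gaps ignored for entropy and removed before counting transitions) are all assumptions made for this example.

```python
import math
from collections import Counter

# Toy aligned sequences: hypothetical data, NOT the study's 35 CFTR sequences.
ALIGNMENT = [
    "ATGCA",
    "ATGCA",
    "ATGTA",
    "ATG-A",
]


def column_entropy(alignment, gap="-"):
    """Shannon entropy in bits for each alignment column.

    Assumption: gap characters are ignored when estimating frequencies.
    """
    entropies = []
    for column in zip(*alignment):
        residues = [c for c in column if c != gap]
        total = len(residues)
        if total == 0:
            entropies.append(0.0)
            continue
        counts = Counter(residues)
        h = 0.0
        for n in counts.values():
            p = n / total
            h -= p * math.log2(p)
        entropies.append(h)
    return entropies


def transition_matrix(alignment, alphabet="ACGT", gap="-"):
    """Maximum-likelihood estimate of P(next nucleotide | current nucleotide).

    Assumption: gaps are removed from each sequence before adjacent
    pairs are counted; symbols outside `alphabet` are skipped.
    """
    counts = {a: {b: 0 for b in alphabet} for a in alphabet}
    for seq in alignment:
        ungapped = seq.replace(gap, "")
        for cur, nxt in zip(ungapped, ungapped[1:]):
            if cur in counts and nxt in counts[cur]:
                counts[cur][nxt] += 1
    matrix = {}
    for a in alphabet:
        row_total = sum(counts[a].values())
        matrix[a] = {
            b: (counts[a][b] / row_total if row_total else 0.0)
            for b in alphabet
        }
    return matrix


entropies = column_entropy(ALIGNMENT)
matrix = transition_matrix(ALIGNMENT)
```

In this toy alignment only the fourth column varies, so it is the only column with non-zero entropy. Fully conserved columns score zero. Each row of the transition matrix that has at least one observed transition sums to one.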

Published

2025-05-15

How to Cite

1.
Jeniffer SD. Sequence Entropy And Markov Modelling Of Mutated CFTR Genes: Insights From Multiple Sequence Alignment. J Neonatal Surg [Internet]. 2025 May 15 [cited 2025 Sep 22];14(23S):1043-50. Available from: https://www.jneonatalsurg.com/index.php/jns/article/view/5884