Design of an Integrated Model Using R-GCN, TPOT, and Transformers for Efficient NoSQL Data Processing and Analysis

Authors

  • Pragya Lekheshwar Balley
  • Shrikant V. Sonekar

DOI:

https://doi.org/10.52783/jns.v14.2237

Keywords:

NoSQL Databases, Dynamic Schema Inference, Multimodal Feature Selection, Transformer Networks, Query Optimizations

Abstract

The rapid adoption of NoSQL databases for managing large, complex, and unstructured datasets has created significant challenges for efficient data processing and analysis. Current approaches, particularly those that rely on rule-based schemes for schema detection and query optimization, fail to address the dynamic and heterogeneous nature of NoSQL data. To address these shortcomings, this work proposes an integrated framework that combines several advanced machine learning methods to make NoSQL data processing and analysis more efficient. First, a Relational Graph Convolutional Network (R-GCN) dynamically infers the schema from the NoSQL database. This automatically detects intricate relationships within the data and reduces schema processing time by 30%. Second, we extend TPOT with transformers (BERT) and convolutional neural networks (ResNet) for feature selection over multimodal data (text, image, and tabular), improving accuracy by 15-20% relative to the baseline model. Third, MMT fuses disparate data types into a shared latent space, raising multimodal classification accuracy from 78% to 90%. Fourth, optimization based on Deep Q-Learning (DQL) learns from past query performance and reduces the average query execution time by 33%. Finally, Hierarchical Attention Networks analyze nested NoSQL structures, raising the classification F1-score from 0.78 to 0.88. Together, these components improve schema inference, feature selection, multimodal data fusion, and query optimization, yielding significant performance gains for NoSQL-based systems and paving the way for efficient handling of large-scale, heterogeneous datasets.
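The abstract does not say how NoSQL documents are turned into a graph for the R-GCN step, so the following is only a minimal sketch of one standard R-GCN propagation layer (self-connection plus per-relation normalized neighbor sums, followed by ReLU) in pure Python. The node names (`order`, `customer`, `address`), the relation names (`nested_in`, `referenced_by`), and all weights and feature vectors are hypothetical and chosen for illustration; in a real model the weights would be learned.

```python
from collections import defaultdict


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rgcn_layer(features, edges, rel_weights, self_weight):
    """One R-GCN propagation step.

    features:    {node: feature vector}
    edges:       list of (source, relation, target); messages flow source -> target
    rel_weights: {relation: weight matrix W_r}
    self_weight: weight matrix W_0 for the self-connection
    """
    # Group the incoming neighbours of each node by relation type.
    incoming = defaultdict(lambda: defaultdict(list))
    for src, rel, dst in edges:
        incoming[dst][rel].append(src)

    updated = {}
    for node, h in features.items():
        total = matvec(self_weight, h)  # W_0 h_i
        for rel, neighbours in incoming[node].items():
            norm = 1.0 / len(neighbours)  # 1 / c_{i,r}
            for nb in neighbours:
                msg = matvec(rel_weights[rel], features[nb])  # W_r h_j
                total = [t + norm * m for t, m in zip(total, msg)]
        updated[node] = [max(0.0, t) for t in total]  # ReLU
    return updated


# Hypothetical toy graph: fields or collections as nodes, typed links as edges.
features = {
    "order": [1.0, 0.0],
    "customer": [0.0, 1.0],
    "address": [1.0, 1.0],
}
edges = [
    ("address", "nested_in", "customer"),    # address is embedded in customer
    ("customer", "referenced_by", "order"),  # order points at customer
]
identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]
rel_weights = {"nested_in": identity, "referenced_by": swap}

schema_embeddings = rgcn_layer(features, edges, rel_weights, self_weight=identity)
print(schema_embeddings)
```

Each node's new representation mixes its own features with those of its typed neighbors, which is what would let a downstream classifier tell nesting apart from cross-document references. Stacking several such layers and training the weights is left out of this sketch.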

Published

2025-03-17

How to Cite

1. Balley PL, Sonekar SV. Design of an Integrated Model Using R-GCN, TPOT, and Transformers for Efficient NoSQL Data Processing and Analysis. J Neonatal Surg [Internet]. 2025 Mar 17 [cited 2025 Nov 18];14(6S):298-314. Available from: https://www.jneonatalsurg.com/index.php/jns/article/view/2237