Gomathi, B and Vadivel, D and Harish Prasath, V G and Srinivasa Pradeep, S (2026) Enhancing text-to-structured query language translation for seamless electronic medical record access. PeerJ Computer Science, 12: e3467. ISSN 2376-5992
Enhancing text-to-structured query language translation for seamless electronic medical record access.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Traditional models for natural language-to-SQL translation in Electronic Medical Record (EMR) systems struggle to understand medical terminology, handle complex queries, and bridge the syntax-semantics gap, which limits their scalability and accuracy. Large Language Model (LLM) based approaches address these challenges by leveraging deep learning and domain-specific training to improve performance and usability. This article therefore introduces an advanced medical Text-to-Structured Query Language (SQL) paradigm that simplifies access to EMRs by translating natural language queries into SQL commands. The model is built on the Code-T5 (Text-to-Text Transfer Transformer) architecture and further enhanced with Low-Rank Adaptation (LoRA) and Quantized Low-Rank Adaptation (QLoRA) techniques; it addresses the challenges posed by the complexity of traditional SQL queries, enabling seamless access to critical healthcare data. The innovation of the proposed model lies in its performance across multiple evaluation metrics. It achieves a Bilingual Evaluation Understudy (BLEU) score of 81.68, significantly outperforming leading models such as T5, Fine-Tuned Language Net (FLAN) T5, and Bidirectional and Auto-Regressive Transformers (BART), and it also excels in Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics, underscoring its proficiency in generating semantically accurate and coherent SQL queries. Furthermore, the proposed model attains a high token-level F1-score, indicating balanced precision and recall, and a Jaccard similarity score of 0.83, surpassing T5, FLAN T5, and BART. The proposed model excels at handling complex medical queries, bridging natural language and SQL to empower data-driven decisions and advance medical informatics.
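As an illustration of two of the metrics named in the abstract, the sketch below computes a token-level F1-score and a Jaccard similarity between a predicted and a reference SQL string. It is a minimal sketch, not the authors' evaluation code: the lowercase whitespace tokenization, the function names, and the example table and column names (`patients`, `age`) are assumptions made here for illustration, since the abstract does not state how queries are tokenized.

```python
from collections import Counter


def tokenize(sql: str) -> list[str]:
    # Assumption: lowercase, whitespace-delimited tokens.
    # The tokenization used in the article is not given in the abstract.
    return sql.lower().split()


def token_f1(pred: str, ref: str) -> float:
    """Harmonic mean of token precision and recall (multiset overlap)."""
    p, r = tokenize(pred), tokenize(ref)
    if not p or not r:
        # Both empty counts as a perfect match; one empty as no match.
        return float(p == r)
    overlap = sum((Counter(p) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p)
    recall = overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def jaccard(pred: str, ref: str) -> float:
    """Size of the token-set intersection divided by the size of the union."""
    a, b = set(tokenize(pred)), set(tokenize(ref))
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical query pair (not taken from the article's dataset).
reference = "SELECT name FROM patients WHERE age > 60"
predicted = "SELECT name FROM patients WHERE age >= 60"

f1 = token_f1(predicted, reference)
jac = jaccard(predicted, reference)
print(f"token F1 = {f1:.3f}, Jaccard = {jac:.3f}")
```

In this hypothetical pair, the two queries differ in a single token (`>` versus `>=`), so both scores stay high even though the queries would return different rows. This is why overlap metrics of this kind are usually read alongside other measures of query correctness.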
| Item Type: | Article |
|---|---|
| Subjects: | Artificial Intelligence and Data Science > Natural Language Processing |
| Divisions: | Computer Science and Engineering |
| Depositing User: | Dr Krishnamurthy V |
| Date Deposited: | 06 May 2026 08:28 |
| Last Modified: | 06 May 2026 08:28 |
| URI: | https://ir.psgitech.ac.in/id/eprint/1772 |