Logo-bi
Bioimpacts. 2025;15: 31226.
doi: 10.34172/bi.31226
  Abstract View: 66
  PDF Download: 67

Original Article

Hybrid deep learning models for text-based identification of gene-disease associations

Noor Fadhil Jumaa 1 ORCID logo, Jafar Razmara 1* ORCID logo, Sepideh Parvizpour 2, Jaber Karimpour 1

1 Department of Computer Science, Faculty of Mathematics, Statistics, and Computer Science, University of Tabriz, Tabriz, Iran
2 Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
*Corresponding Author: Jafar Razmara, Email: razmara@tabrizu.ac.ir

Abstract

Introduction: Identifying gene-disease associations is crucial for advancing medical research and improving clinical outcomes. Nevertheless, the rapid expansion of biomedical literature poses significant obstacles to extracting meaningful relationships from extensive text collections.
Methods: This study uses deep learning techniques to automate this process, using publicly available datasets (EU-ADR, GAD, and SNPPhenA) to classify these associations accurately. Each dataset underwent rigorous pre-processing, including entity identification and preparation, word embedding using pre-trained Word2Vec and fastText models, and position embedding to capture semantic and contextual relationships within the text. In this research, three deep learning-based hybrid models have been implemented and contrasted, including CNN-LSTM, CNN-GRU, and CNN-GRU-LSTM. Each model has been equipped with attentional mechanisms to enhance its performance.
Results: Our findings reveal that the CNN-GRU model achieved the highest accuracy of 91.23% on the SNPPhenA dataset, while the CNN-GRU-LSTM model attained an accuracy of 90.14% on the EU-ADR dataset. Meanwhile, the CNN-LSTM model demonstrated superior performance on the GAD dataset, achieving an accuracy of 84.90%. Compared to previous state-of-the-art methods, such as BioBERT-based models, our hybrid approach demonstrates superior classification performance by effectively capturing local and sequential features without relying on heavy pre-training.
Conclusion: The developed models and their evaluation data are available at https://github.com/NoorFadhil/Deep-GDAE.
First Name
Last Name
Email Address
Comments
Security code


Abstract View: 60

Your browser does not support the canvas element.


PDF Download: 67

Your browser does not support the canvas element.

Submitted: 08 Apr 2025
Revision: 05 May 2025
Accepted: 28 May 2025
ePublished: 28 Jun 2025
EndNote EndNote

(Enw Format - Win & Mac)

BibTeX BibTeX

(Bib Format - Win & Mac)

Bookends Bookends

(Ris Format - Mac only)

EasyBib EasyBib

(Ris Format - Win & Mac)

Medlars Medlars

(Txt Format - Win & Mac)

Mendeley Web Mendeley Web
Mendeley Mendeley

(Ris Format - Win & Mac)

Papers Papers

(Ris Format - Win & Mac)

ProCite ProCite

(Ris Format - Win & Mac)

Reference Manager Reference Manager

(Ris Format - Win only)

Refworks Refworks

(Refworks Format - Win & Mac)

Zotero Zotero

(Ris Format - Firefox Plugin)