Qi Wu 1, Yannan Wang 2*, Xiaojuan Zhang 1, Hongqiang Zhang 3, Kuanyu Che 4
1 Department of Psychiatry, The Third People's Hospital of Lanzhou, Lanzhou 730030, Gansu, China
2 Department of Pediatric Psychiatry, The Third People's Hospital of Lanzhou, Lanzhou 730030, Gansu, China
3 Department of Traditional Chinese Medicine, The Third People's Hospital of Lanzhou, Lanzhou 730030, Gansu, China
4 Department of Magnetic Resonance Imaging, The First People's Hospital of Lanzhou, Lanzhou 730030, Gansu, China
Abstract
Introduction: Alzheimer's disease (AD) is a progressive neurodegenerative disorder that poses significant challenges for early detection. Advanced diagnostic methods leveraging machine learning techniques, particularly deep learning, have shown great promise in enhancing early AD diagnosis. This paper proposes a multimodal approach combining transfer learning, Transformer networks, and recurrent neural networks (RNNs) for diagnosing AD, utilizing MRI images from multiple perspectives to capture comprehensive features.
Methods: Our methodology integrates MRI images from three views (sagittal, coronal, and axial) to capture both local and global features. First, ResNet50 is used with transfer learning for local feature extraction, which improves feature quality while reducing model complexity. The extracted features are then processed by a Transformer encoder that incorporates positional embeddings to preserve spatial relationships. Finally, 2D convolutional layers combined with long short-term memory (LSTM) networks perform the classification, enabling the model to capture sequential dependencies in the data.
Results: The proposed framework was evaluated on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. It achieved an accuracy of 96.92% on test data and 98.12% on validation data, outperforming existing methods in the field. Integrating the Transformer and LSTM models enhanced feature representation and improved diagnostic performance.
Conclusion: This study demonstrates the effectiveness of combining transfer learning, Transformer networks, and LSTMs for AD diagnosis. The proposed framework provides a comprehensive analysis that improves classification accuracy, offering a valuable tool for early detection and intervention in clinical practice. These findings highlight the potential for advancing neuroimaging analysis and supporting future research in AD diagnostics.