Riel Castro-Zunti, Eun Hae Park, Amol Satsangi, Younhee Choi, Gong Yong Jin, Hee Suk Chae, Seok-bum Ko. A Novel Vision Transformer+InceptionV3 Hybrid Network for Accurate Diagnosis of Ankylosing Spondylitis from Computed Tomography Scans[J]. Machine Intelligence Research. DOI: 10.1007/s11633-024-1539-8

A Novel Vision Transformer+InceptionV3 Hybrid Network for Accurate Diagnosis of Ankylosing Spondylitis from Computed Tomography Scans

Rationale and Objectives: Ankylosing spondylitis (AS) is a lifelong form of arthritis that inflames the sacroiliac joints (SIJs) and spine. Compared to magnetic resonance imaging (MRI), computed tomography (CT) better images bony erosions within SIJs, which are subtle early-stage AS indicators, and CT is considered the “gold standard” AS diagnostic modality. While clinical AS diagnoses via CT yield impressive specificity, sensitivity often lags.

Materials and Methods: Using 2000+ SIJs across the pelvic CT scans of 35 AS patients and 65 control patients, a 3-stage computer vision diagnostic pipeline was developed: (1) SIJ localization/extraction using YOLOv5; (2) classification of SIJs as AS (across various progression levels) or control (across young and old age groups) via a custom Vision Transformer (ViT)+InceptionV3 hybrid architecture; and (3) patient-level AS prediction from aggregated/normalized SIJ classification results using a traditional machine learning model.

Results: For 73 patients with >8 SIJs (a restriction that limits the impact of SIJ misclassifications), the proposed pipeline achieved 98.63% accuracy, 96.43% sensitivity, perfect specificity, and 96.51% ROC AUC. The pipeline is demonstrated to be robust and outperforms retrained related works. Compared to vanilla ViT for SIJ classification, the hybrid model pipeline is 16% more accurate with 14% higher ROC AUC; compared to vanilla InceptionV3, it is 7% more accurate with 4% higher ROC AUC.

Conclusion: AS can be diagnosed with high sensitivity and specificity via the proposed pipeline. ViT+InceptionV3 outperforms both of its baseline architectures while unifying their respective strengths. Because it captures both global and local details, ViT+InceptionV3 is applicable to tasks within computer-assisted diagnostics and elsewhere.
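The third pipeline stage and the reported patient-level metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the aggregation features or the traditional machine learning model, so the fraction-of-AS-labelled-SIJs feature, the fixed `threshold`, and the function names `as_fraction`, `predict_patient`, and `patient_metrics` are all assumptions made for illustration. Only the ">8 SIJs" restriction and the metric definitions come from the text.

```python
from typing import Dict, List, Optional

MIN_SIJS = 8  # from the abstract: only patients with more than 8 SIJs are evaluated


def as_fraction(sij_labels: List[int]) -> float:
    """Normalized aggregate: share of a patient's SIJ crops labelled AS (1)."""
    return sum(sij_labels) / len(sij_labels)


def predict_patient(sij_labels: List[int], threshold: float = 0.5) -> Optional[bool]:
    """Return True (AS), False (control), or None if the patient has too few SIJs.

    ASSUMPTION: a fixed threshold stands in for the trained patient-level
    model, which the abstract does not specify.
    """
    if len(sij_labels) <= MIN_SIJS:
        return None  # excluded by the >8 SIJ restriction
    return as_fraction(sij_labels) >= threshold


def patient_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity (in percent) from confusion counts."""
    return {
        "accuracy": round(100 * (tp + tn) / (tp + fn + fp + tn), 2),
        "sensitivity": round(100 * tp / (tp + fn), 2),
        "specificity": round(100 * tn / (tn + fp), 2),
    }


if __name__ == "__main__":
    # Hypothetical per-SIJ classifier outputs for two patients.
    print(predict_patient([1, 1, 1, 1, 1, 1, 1, 0, 0, 0]))  # 10 SIJs, 70% labelled AS
    print(predict_patient([1] * 8))                          # only 8 SIJs: excluded
    # Confusion counts inferred from the reported percentages, not stated in the abstract.
    print(patient_metrics(tp=27, fn=1, fp=0, tn=45))
```

As a consistency check on the Results, 27 true positives, 1 false negative, 0 false positives, and 45 true negatives (73 patients in total) reproduce the reported 98.63% accuracy, 96.43% sensitivity, and perfect specificity. These counts are inferred from the percentages and are not given in the abstract.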
