- Original research
- Open Access
The predictive value of [18F]FDG PET/CT radiomics combined with clinical features for EGFR mutation status in different clinical staging of lung adenocarcinoma
EJNMMI Research volume 13, Article number: 26 (2023)
This study aims to construct radiomics models based on [18F]FDG PET/CT using multiple machine learning methods to predict the EGFR mutation status of lung adenocarcinoma and evaluate whether incorporating clinical parameters can improve the performance of radiomics models.
A total of 515 patients were retrospectively collected and divided into a training set (n = 404) and an independent testing set (n = 111) according to their examination time. After semi-automatic segmentation of PET/CT images, the radiomics features were extracted, and the best feature sets of CT, PET, and PET/CT modalities were screened out. Nine radiomics models were constructed using logistic regression (LR), random forest (RF), and support vector machine (SVM) methods. According to the performance in the testing set, the best model of the three modalities was kept, and its radiomics score (Rad-score) was calculated. Furthermore, combined with the valuable clinical parameters (gender, smoking history, nodule type, CEA, SCC-Ag), a joint radiomics model was built.
Compared with LR and SVM, the RF Rad-score showed the best performance among the three radiomics models of CT, PET, and PET/CT (training and testing sets AUC: 0.688, 0.666, and 0.698 vs. 0.726, 0.678, and 0.704). Among the three joint models, the PET/CT joint model performed the best (training and testing sets AUC: 0.760 vs. 0.730). The further stratified analysis found that CT_RF had the best prediction effect for stage I–II lesions (training set and testing set AUC: 0.791 vs. 0.797), while PET/CT joint model had the best prediction effect for stage III–IV lesions (training and testing sets AUC: 0.722 vs. 0.723).
Combining with clinical parameters can improve the predictive performance of PET/CT radiomics model, especially for patients with advanced lung adenocarcinoma.
Lung cancer is the second most common cancer worldwide and has the highest mortality rate (21%) [1, 2]; non-small cell lung cancer (NSCLC) accounts for 80–85% of cases. Epidermal growth factor receptor (EGFR) plays an important role in the progression of NSCLC, which makes it an effective therapeutic target; tumors with EGFR mutations are more heterogeneous [4, 5]. The most common histological type of NSCLC is lung adenocarcinoma, which has a higher EGFR mutation rate than other subtypes. In Asian patients with lung adenocarcinoma, the EGFR mutation rate is as high as 50%. Studies have shown that tyrosine kinase inhibitors (TKIs) can effectively prolong the progression-free survival (PFS) of patients with EGFR mutations, and therefore they are widely used in targeted therapy for lung adenocarcinoma [8, 9]. Treatment efficacy and prognosis are closely related to the patient's EGFR mutation status. Accurately identifying EGFR mutation status in lung adenocarcinoma patients is therefore important for selecting therapy and improving prognosis.
Molecular testing of needle biopsy or surgically resected tumor tissue is the "gold standard" for diagnosing EGFR mutation status. However, this technique is invasive and time-consuming, and tumor heterogeneity can easily affect its accuracy [10, 11]. In addition, many patients cannot undergo this test due to poor physical conditions and other reasons (such as fears and anxieties, concerns about potential complications, and suboptimal lesion location). In recent years, some studies used blood samples instead of biopsies to assess EGFR mutation status by analyzing circulating tumor DNA (ctDNA). However, ctDNA testing is expensive and has low sensitivity for detecting EGFR mutations [12,13,14]. Therefore, there is an urgent need for an economical, rapid, and reliable non-invasive detection method to assess EGFR mutation status.
[18F]FDG PET/CT is a non-invasive molecular imaging technique widely used in the clinical diagnosis, staging, prognosis, and efficacy evaluation of lung cancer [15,16,17]. The maximum standardized uptake value (SUVmax) is one of the routine parameters of PET/CT; it reflects the highest uptake of [18F]FDG by the tumor tissue and is often used in PET image analysis. Radiomics is a method that quantitatively evaluates tumor imaging phenotypes via high-throughput feature extraction from medical images. Compared to SUVmax, radiomics features can better reflect the spatial distribution of tumors and more comprehensively evaluate tumor heterogeneity. In recent years, using machine learning methods with high prediction efficiency and strong feasibility to assess radiomic features and predict EGFR mutation status has become a research “hot spot” [5, 19,20,21,22,23,24,25,26,27,28,29,30,31,32]. However, most of these studies had small sample sizes, with no more than 200 cases in total [19,20,21,22,23,24,25,26], and the TNM stages of the enrolled patients varied greatly [21, 27, 31]. These factors may have affected the stability of the results.
Studies have shown that clinical parameters such as gender, smoking history, and the presence of ground-glass opacity (GGO) are closely related to the EGFR mutation status in lung adenocarcinoma. Some studies report that combining radiomics with clinical parameters improves model performance in predicting EGFR mutation status [4, 5, 26, 30], whereas others suggest that adding clinical features to the radiomics model does not improve its predictive performance [34, 35]. Therefore, whether incorporating clinical parameters can improve the performance of the radiomics model is still inconclusive.
In this study, we included 515 patients with all clinical stages. We used LR, RF, and SVM to model CT, PET, and PET/CT radiomics features and assess the predictive power. Then, we included clinical parameters to construct joint models and evaluate whether adding clinical parameters can further improve the model performance in predicting EGFR mutation in lung adenocarcinoma patients with different clinical stages.
We retrospectively and consecutively enrolled lung cancer patients who underwent [18F]FDG PET/CT examinations before treatment in our hospital from January 2018 to April 2022. Inclusion criteria: (1) lung adenocarcinoma was confirmed by surgical or biopsy pathology, and the pathological classification was based on the IASLC/ATS/ERS lung adenocarcinoma classification criteria; (2) the patient completed the [18F]FDG PET/CT examination before surgery, and the interval between examination and surgery was less than 30 days; (3) the EGFR mutation test result was available; (4) the patient had no history of other malignant tumors. Exclusion criteria: (1) lesions with poor image quality or that were difficult to measure; (2) the patient had other subtypes of lung cancer; (3) no chest CT images were available.
According to the above criteria, 515 patients with lung adenocarcinoma were included, and their age, gender, smoking status, clinical stage, tumor markers, SUVmax, and postoperative pathology were recorded. Patients examined from January 2018 to April 2021 formed the training set, and patients examined from May 2021 to April 2022 formed the testing set. The study was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. The study protocol was approved by the Ethics Committee of our institution (No.  KD 087). As this was a retrospective analysis, informed consent was not required. The study flow chart is shown in Fig. 1.
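The time-based split described above can be sketched as follows. This is a minimal illustration, assuming each patient record carries an examination date; the record layout, the `split_by_exam_date` helper, the exact day boundaries, and the example patients are hypothetical and not taken from the study.

```python
from datetime import date

# Cut-off months taken from the text: training = Jan 2018 to Apr 2021,
# testing = May 2021 to Apr 2022. The exact day boundaries are an assumption.
TRAIN_START = date(2018, 1, 1)
TRAIN_END = date(2021, 4, 30)
TEST_END = date(2022, 4, 30)


def split_by_exam_date(records):
    """Assign each (patient_id, exam_date) record to the training or testing set."""
    train, test = [], []
    for patient_id, exam_date in records:
        if TRAIN_START <= exam_date <= TRAIN_END:
            train.append(patient_id)
        elif TRAIN_END < exam_date <= TEST_END:
            test.append(patient_id)
        # Records outside the study window are ignored.
    return train, test


# Hypothetical records for illustration only.
records = [
    ("P001", date(2018, 3, 12)),
    ("P002", date(2021, 4, 30)),
    ("P003", date(2021, 5, 1)),
    ("P004", date(2022, 4, 15)),
]
train_ids, test_ids = split_by_exam_date(records)
```

Splitting by examination time rather than at random means the testing set mimics later, unseen patients, at the cost of possible differences in case mix between the two periods.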
EGFR mutation test
EGFR mutation test was performed on tissue specimens obtained by surgical resection or biopsy. Real-time fluorescent PCR was performed to detect EGFR mutations in exons 18–21 using the EGFR gene mutation detection kit purchased from Shanghai Yuanqi Company. The detailed procedures followed the manufacturer's instructions (see Additional file 1). If the mutation was detected in any of the above exons, the lesion was defined as EGFR mutant; otherwise, the lesion was defined as EGFR wild type.
The image acquisition protocol followed the Imaging Biomarker Standardization Initiative (IBSI) guidelines. Images were acquired on a Siemens Biograph mCT (64) PET/CT scanner (Germany). The patients fasted for 4–6 h before the examination, and their height, weight, and blood glucose were recorded on the examination day. [18F]FDG was injected intravenously at 3.70–5.55 MBq/kg of body weight. The imaging agent was provided by Nanjing Jiangyuan Andico Positron Research and Development Company (radiochemical purity > 95%). The patients underwent whole-body PET/CT imaging after resting in a quiet and comfortable environment for 1 h. The patient was placed in the supine position with both hands holding the head. Imaging lasted 2 min per bed position, and the acquisition range was from the base of the skull to the middle of the femur. Diagnostic chest CT imaging was performed after the PET/CT scan. After image acquisition, the TrueX + TOF (ultraHD-PET) system was used for image reconstruction. A TrueD postprocessing workstation (Siemens) was used for image evaluation. Image acquisition parameters are listed in Additional file 1.
Image analysis and tumor region segmentation
[18F]FDG PET/CT image analysis: PET and CT images were analyzed by two physicians (A and B) with 3 years of experience in nuclear medicine, using 3D Slicer version 4.11.2 (http://www.slicer.org). For PET images, they used a semi-automatic segmentation method developed by Beichel et al. For CT images (3 mm), they used NVIDIA AI-Assisted Annotation (built into 3D Slicer) and the boundary-based CT segmentation models to process lung nodule images. The segmented tumor region of interest (ROI) was checked and proofread by another physician with more than 10 years of experience in PET/CT diagnosis. Four weeks after the ROIs had been completed for all cases, Physician A segmented the tumor region again for 300 patients, of whom 70 were randomly selected for Physician B to segment.
Before feature extraction, the images were normalized and interpolated (sitkBSpline algorithm, third-order B-spline interpolation) to an isotropic voxel spacing, so that texture features would be rotationally invariant and comparable between samples. CT images were resampled to 1 × 1 × 1 mm3, and PET images were resampled to 3 × 3 × 3 mm3. A fixed bin width was used for discretization; the bin widths for CT and PET images were 25 and 0.313, respectively. Bin discretization, Laplacian of Gaussian (LOG) filtering, and wavelet transforms were applied to generate different feature sets. For the LOG filter, different sigma values were used to extract fine, medium, and coarse features; specifically, sigma ranged from 0.5 to 5 with a step size of 0.5. The wavelet transform produced 8 decompositions per level (applying all possible combinations of high-pass or low-pass filters in each of the three dimensions: HHH, HHL, HLH, HLL, LHH, LHL, LLH, and LLL). All intensity, histogram, and texture features were computed on the preprocessed images (discretized, LOG-filtered, and wavelet-transformed).
Using the PyRadiomics module in Python 3.8.8, we extracted multiple features from different feature categories based on three sets of segmented ROIs (two from Physician A and one from Physician B). These categories included shape and morphology-based features (14 shape features), first-order statistics (18 FOS features), gray-level co-occurrence matrix (24 GLCM features), gray-level dependence matrix (14 GLDM features), gray-level run-length matrix (16 GLRLM features), gray-level size zone matrix (16 GLSZM features), and neighboring gray-tone difference matrix (5 NGTDM features). The features extracted from the three sets of ROIs were assessed for intragroup and intergroup intraclass correlation coefficients (ICCs), and features with ICCs greater than 0.75 were considered to have good consistency and were kept for further analysis.
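The preprocessing parameters above can be made concrete with a short sketch. It enumerates the LOG sigma values and wavelet sub-bands named in the text and applies fixed-bin-width discretization using the convention documented by PyRadiomics (bin edges aligned to multiples of the bin width, indices starting at 1). The helper name `discretize_fixed_bin_width` and the sample intensities are illustrative assumptions; this is not the extraction code used in the study.

```python
import math
from itertools import product

# LOG sigma values: 0.5 to 5.0 in steps of 0.5, as stated in the text.
log_sigmas = [0.5 * i for i in range(1, 11)]

# Wavelet sub-bands: every high-pass (H) / low-pass (L) combination over 3 axes.
wavelet_subbands = ["".join(combo) for combo in product("HL", repeat=3)]

# Bin widths reported in the text for each modality.
BIN_WIDTH = {"CT": 25.0, "PET": 0.313}


def discretize_fixed_bin_width(values, bin_width):
    """Map intensities to 1-based bin indices using a fixed bin width.

    Bin edges sit at multiples of bin_width, and indices are shifted so that
    the bin containing the minimum intensity has index 1.
    """
    lowest = math.floor(min(values) / bin_width)
    return [math.floor(v / bin_width) - lowest + 1 for v in values]


# Hypothetical CT intensities (HU) inside an ROI, for illustration only.
ct_values = [-610.0, -595.0, -30.0, 12.0, 40.0]
ct_bins = discretize_fixed_bin_width(ct_values, BIN_WIDTH["CT"])
print(log_sigmas)
print(wavelet_subbands)
print(ct_bins)
```

A fixed bin width keeps the meaning of one gray level constant across patients (25 units for CT, 0.313 units for PET), whereas a fixed bin count would rescale each lesion to its own intensity range.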
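The ICC-based consistency filter can be sketched in a few lines. The text does not state which ICC form was used, so the sketch below assumes ICC(3,1) (two-way mixed effects, consistency, single measurement); the function name `icc_3_1`, the feature names, and the measurements are hypothetical.

```python
def icc_3_1(ratings):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    ratings: one row per lesion; each row holds the same feature measured
    on each segmentation (for example, two repeated ROIs).
    """
    n = len(ratings)        # number of lesions
    k = len(ratings[0])     # number of segmentations per lesion
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical feature values from two segmentations of four lesions.
measurements = {
    "feature_stable": [[1.0, 1.1], [2.0, 2.1], [3.0, 3.0], [4.0, 4.2]],
    "feature_unstable": [[1.0, 3.0], [2.0, 1.0], [3.0, 4.0], [4.0, 1.5]],
}

ICC_THRESHOLD = 0.75  # threshold stated in the text
kept = [name for name, r in measurements.items() if icc_3_1(r) > ICC_THRESHOLD]
print(kept)
```

In the study the filter was applied twice, once to repeated segmentations by the same physician and once to segmentations by different physicians; the same function would serve both comparisons under the stated assumption.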
Screening of radiomic features and model construction
To avoid overfitting, the variance method was used to remove features with small variance (threshold = 0.24). Next, the Mann–Whitney U test was used in the training set to screen out radiomics features with p value < 0.1, which might be associated with EGFR mutation status. Then, the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm was applied to the normalized training set data to select the best predictive features. The LASSO algorithm added an L1 regularization term to the least squares algorithm to avoid overfitting and employed fivefold cross-validation.
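For reference, the LASSO estimator can be written in its standard textbook form. The squared-error loss below follows the "least squares" wording above; the 1/(2n) scaling is one common convention, and the symbols (n training samples, p candidate features, feature vector x_i, label y_i, penalty weight λ) are notation introduced here, not taken from the paper.

```latex
\hat{\beta} \;=\; \arg\min_{\beta_0,\,\beta}\;
\frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^{2}
\;+\; \lambda \sum_{j=1}^{p}\lvert \beta_j \rvert
```

The L1 penalty shrinks some coefficients exactly to zero, so the features with nonzero coefficients at the cross-validated value of λ form the selected set.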
Machine learning models were built using the scikit-learn (sklearn) module in Python 3.8.8. Nine models were constructed based on CT, PET, and PET/CT radiomics features with LR, SVM, and RF, respectively. A grid search with fivefold cross-validation on the training set was used to find the optimal hyperparameters (the specific parameters are listed in Additional file 1: Table S1), and the models were then retrained on the entire training set. Receiver operating characteristic (ROC) curves and the area under the curve (AUC) were used to evaluate model performance in the training and testing sets. The 3 models with the best performance on the testing set were kept to calculate the Rad-score. We used the SHAP module in Python 3.8.8 to interpret the models and to understand the importance of different features in different models. The main advantage of SHAP values is that they reflect the positive or negative influence of each feature in each sample. The radiomics workflow used in this study is shown in Fig. 2.
Statistical analysis was performed using R software (version 3.4.3; http://www.R-project.org/). Continuous variables were expressed as mean ± standard deviation (normal distribution) or median (Q1–Q3) (skewed distribution). Categorical variables were expressed as frequency or rate (%). Differences in clinical data and PET/CT metabolic parameters between EGFR mutation statuses (a binary variable) were tested using the χ2 test (categorical variables), t test (normally distributed variables), or Mann–Whitney U test (skewed variables). Multivariate logistic regression was used to construct a clinical model from the significant clinical parameters, and joint models and the corresponding nomogram were built by combining the clinical parameters with the three Rad-scores. The minimum Akaike information criterion was used to select the best model parameters. The calibration curve was used to assess the agreement between nomogram predictions and observations, and the model's discriminative power was evaluated using the ROC curve and AUC. The model's accuracy, sensitivity, and specificity were calculated to obtain a quantitative performance measurement. Decision curve analysis (DCA) was used to assess the models' clinical utility and net clinical benefit. Pairwise comparison of the model AUCs was performed using the method proposed by DeLong et al. All statistical tests were two-sided, and p < 0.05 was considered statistically significant. Values were missing for carcinoembryonic antigen (CEA) in 49 cases (9.5%), cytokeratin 19 fragment (CYFRA21-1) in 99 cases (19.1%), neuron-specific enolase (NSE) in 74 cases (14.3%), and serum squamous cell carcinoma antigen (SCC-Ag) in 91 cases (17.6%); we imputed the missing data using miceforest (version 5.4.0; https://github.com/AnotherSamWilson/miceforest/).
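Because AUC is the main performance measure throughout, a small worked example may help. The sketch below computes AUC from its rank interpretation: the probability that a randomly chosen EGFR-mutant case receives a higher score than a randomly chosen wild-type case, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical; this is not the evaluation code used in the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = EGFR mutant, 0 = wild type) and model scores.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.81, 0.64, 0.55, 0.60, 0.32, 0.47, 0.47, 0.20]
auc = roc_auc(labels, scores)
print(auc)  # 13.5 winning pairs out of 16, i.e. 0.84375
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the values of roughly 0.58 to 0.76 reported in this study.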
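The threshold-based measures named above (accuracy, sensitivity, specificity, and the net benefit plotted in DCA) can be illustrated with a short sketch. Net benefit uses the standard definition TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The helper names, labels, and predicted probabilities are hypothetical; the study itself performed these analyses in R.

```python
def confusion_counts(labels, probs, threshold):
    """Count true/false positives and negatives at a probability threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def net_benefit(labels, probs, threshold):
    """Net benefit at threshold pt: TP/n - FP/n * pt / (1 - pt)."""
    tp, fp, _, _ = confusion_counts(labels, probs, threshold)
    n = len(labels)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical labels (1 = EGFR mutant) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
probs = [0.9, 0.7, 0.4, 0.6, 0.3, 0.8, 0.2, 0.1]

tp, fp, tn, fn = confusion_counts(labels, probs, 0.5)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / len(labels)
nb = net_benefit(labels, probs, 0.5)
print(sensitivity, specificity, accuracy, nb)
```

A decision curve repeats the net benefit calculation over a range of threshold probabilities and compares each model with the "treat all" and "treat none" strategies.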
Clinical characteristics of patients
The clinical characteristics of the patients are shown in Table 1. A total of 515 patients with lung adenocarcinoma were enrolled in the study, including 264 females and 251 males. The average age was 64.0 ± 9.2 years (ranging from 36 to 87 years). In total, 175 patients (34.0%) had a smoking history, and 348 (67.6%) had solid nodules. The patients’ clinical stages were: stage I: 209 cases (40.6%), stage II: 24 cases (4.7%), stage III: 85 cases (16.5%), and stage IV: 197 cases (38.3%). The EGFR mutation status was pathologically confirmed by surgical resection or biopsy: there were 202 cases (39.2%) with EGFR wild type and 313 cases (60.8%) with EGFR mutant (including 3 cases of exon 18, 127 cases of exon 19, 13 cases of exon 20, 150 cases of exon 21, 1 case of exon 19 + 20, 2 cases of exon 19 + 21, 1 case of exon 20 + 21, and 16 cases of the unknown exon).
There were no significant differences between the training set (n = 404) and testing set (n = 111) in age, gender, smoking history, nodule type, nodule location, tumor markers, and EGFR mutation rate (all p > 0.05). Tumor long axis, tumor short axis, clinical stage, and SUVmax showed significant differences between the two datasets (all p < 0.05), possibly due to the different compositions of patients during different periods (see Additional file 1: Table S2). To account for this difference, we performed a stratified analysis (stratified by clinical stage) in both datasets to verify the robustness of the joint model. Gender, smoking history, nodule type, CEA, SCC-Ag, clinical stage, tumor long axis, and tumor short axis differed significantly between EGFR mutant and wild-type patients (all p < 0.05 in the training set).
Validation of the predictive efficacy of traditional metabolic parameters
SUVmax was significantly different between EGFR mutant and wild type in the training set (p = 0.005) but not in the testing set (p > 0.05). SUVmax had a weak predictive ability for EGFR mutation status in lung adenocarcinoma (training set AUC = 0.582, 95% CI 0.526–0.638; testing set AUC = 0.584, 95% CI 0.475–0.694).
The screening results of the three modality radiomics features
Based on the segmentations of tumor regions on PET/CT images, a total of 3562 radiomics features (1781 PET features, 1781 CT features) were extracted. Among them, 423 CT features and 248 PET features were excluded based on intragroup ICC evaluation, and 109 CT features and 81 PET features were excluded according to intergroup ICC evaluation. Next, we used the variance method, Mann–Whitney U test, and LASSO algorithm (Additional file 1: Figure S1) to further screen out 8 CT features, 4 PET features, and 4 PET/CT fusion radiomics features (2 CT + 2 PET), respectively (Additional file 1: Table S3).
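The feature counts above can be checked with simple arithmetic. The sketch below shows one accounting that reproduces 1781 features per modality. It assumes that shape features are computed once and that the remaining feature classes are computed on the original image, 10 LOG-filtered images (sigma 0.5 to 5.0), and 8 wavelet sub-bands; that breakdown is an inference from the described settings, not something the paper states. The retained counts further assume that the intragroup and intergroup exclusions do not overlap.

```python
# Per-class feature counts as listed in the feature-extraction description.
SHAPE_FEATURES = 14
OTHER_CLASSES = {"FOS": 18, "GLCM": 24, "GLDM": 14,
                 "GLRLM": 16, "GLSZM": 16, "NGTDM": 5}

# Assumed image types: 1 original + 10 LOG (sigma 0.5..5.0) + 8 wavelet sub-bands.
N_IMAGE_TYPES = 1 + 10 + 8

per_image_type = sum(OTHER_CLASSES.values())
features_per_modality = SHAPE_FEATURES + N_IMAGE_TYPES * per_image_type
total_features = 2 * features_per_modality  # CT + PET

# ICC exclusions reported in the text: (intragroup, intergroup).
excluded = {"CT": (423, 109), "PET": (248, 81)}
retained = {modality: features_per_modality - sum(counts)
            for modality, counts in excluded.items()}

print(per_image_type, features_per_modality, total_features)
print(retained)
```

Under these assumptions, 93 features per image type across 19 image types plus 14 shape features gives 1781 per modality and 3562 in total, matching the reported numbers.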
The predictive power of the three modality radiomics features for EGFR mutation status
The corresponding AUCs of the radiomics models are shown in Fig. 3, and the feature weights for each model are shown in Additional file 1: Figure S2. Among the 9 constructed radiomics models, the three models based on the RF algorithm were better than the models based on the LR and SVM algorithms in both the training set and the testing set. The AUCs of CT_RF, PET_RF, and PET/CT_RF were 0.688, 0.666, and 0.698, respectively, in the training set and 0.726, 0.678, and 0.704, respectively, in the testing set. Although the CT_RF model performed better than the PET_RF and PET/CT_RF models in the testing set, the difference was not significant (both p > 0.05). Table 2 lists the diagnostic efficacy of the three best models.
We further compared the performance of CT_RF, PET_RF, PET/CT_RF, and SUVmax for predicting EGFR mutation status. The ROC curves of the three radiomics models and SUVmax in training set and testing set are shown in Additional file 1: Figure S3. The AUCs of three radiomics models were significantly better than SUVmax in both training and testing sets (all p < 0.05).
The performance of radiomics model combined with clinical parameters for predicting EGFR mutation status
We first constructed a clinical prediction model (baseline model) with the 8 clinical parameters in Table 1 that might be related to the EGFR mutation status. Five parameters were finally included, and the model equation is as follows:
Logit(P) = 0.91617 + 0.00156 × CEA − 0.05743 × SCC-Ag − 0.92507 × (Gender = male) − 0.57848 × (Smoking history = positive) + 0.70786 × (Nodule Type = sub-solidity).
Next, the above 5 parameters were combined with the three best radiomics models (CT_RF, PET_RF, and PET/CT_RF) to construct joint prediction models. The three joint models are as follows:
CT joint model:
Logit(P) = − 2.11428 + 0.00177 × CEA − 1.31985 × (Gender = male) + 0.52447 × (Nodule Type = sub-solidity) + 4.96787 × CT_Rad;
PET joint model:
Logit(P) = − 1.01180 + 0.00170 × CEA − 0.86919 × (Gender = male) − 0.58352 × (Smoking history = positive) + 4.02651 × PET_Rad;
PET/CT joint model:
Logit(P) = − 1.66386 + 0.00183 × CEA − 0.89610 × (Gender = male) − 0.53267 × (Smoking history = positive) + 5.20912 × PET/CT_Rad.
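To show how such an equation turns into a predicted probability, the sketch below evaluates the PET/CT joint model and applies the inverse logit, P = 1 / (1 + exp(−Logit(P))). The coefficients are copied from the equation above; the function names and the two example patients are hypothetical, and the CEA value is assumed to be in the same units as in the study, which the text does not specify.

```python
import math

# Coefficients copied from the PET/CT joint model equation in the text.
INTERCEPT = -1.66386
COEF_CEA = 0.00183
COEF_MALE = -0.89610
COEF_SMOKER = -0.53267
COEF_RAD = 5.20912


def petct_joint_logit(cea, is_male, is_smoker, rad_score):
    """Linear predictor of the PET/CT joint model."""
    return (INTERCEPT
            + COEF_CEA * cea
            + COEF_MALE * (1 if is_male else 0)
            + COEF_SMOKER * (1 if is_smoker else 0)
            + COEF_RAD * rad_score)


def logit_to_probability(logit):
    """Inverse logit: convert a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical patients with the same CEA and Rad-score.
logit_female_nonsmoker = petct_joint_logit(5.0, False, False, 0.6)
logit_male_smoker = petct_joint_logit(5.0, True, True, 0.6)
p_female_nonsmoker = logit_to_probability(logit_female_nonsmoker)
p_male_smoker = logit_to_probability(logit_male_smoker)
print(round(p_female_nonsmoker, 3), round(p_male_smoker, 3))
```

With all else equal, the negative coefficients for male gender and smoking history lower the predicted probability of an EGFR mutation, while a higher Rad-score raises it.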
The ROC and DCA curves of the clinical and three joint models are shown in Fig. 4, and the diagnostic efficacy is shown in Table 2. In the training set, the CT joint model had the highest specificity of 0.795, while the PET joint model and the PET/CT joint model had the highest accuracy of 0.713, and the PET joint model had the highest sensitivity of 0.794. In the testing set, CT_RF had the highest specificity of 0.756, while the PET/CT_RF and the PET/CT joint model had the highest accuracy of 0.712, and the PET/CT_RF model had the highest sensitivity of 0.800.
The AUCs of three joint models and clinical model in the training set followed the order of CT joint model > PET/CT joint model > PET joint model > clinical model, but only the AUC of CT joint model was significantly better than the clinical model (p = 0.049). The AUC values in the testing set followed the order of PET/CT joint model > CT joint model > PET joint model > clinical model, but the differences were not significant (p > 0.05). By calculating the net reclassification index (NRI), we found that the PET/CT joint model correctly reclassified 10.0% (95% CI 1.0–19.0%, p = 0.029) of mutant type more than the clinical model, while there was no significant difference in identifying wild type (p = 0.410). Further comparison of DCA showed that the net benefit of the three joint models was higher than that of the clinical model in both training and testing sets (Fig. 4).
We further performed a stratified analysis to verify the diagnostic performance of the radiomics models, clinical model, and joint models in different clinical stages (Table 3). In the training set, clinical stage significantly affected the prediction of EGFR mutation status by the three radiomics models and the PET/CT joint model, suggesting an interaction effect (all p < 0.05); however, this effect was not significant in the testing set (p = 0.067–0.869). The EGFR mutation rates and clinical characteristics of the patients with different clinical stages are shown in Additional file 1: Table S4. For stage I–II nodules, all models except the clinical model showed varying degrees of overfitting. The CT joint model performed best in the training set (AUC: 0.838), while the CT_RF model performed best in the testing set (AUC: 0.797). For stage III–IV nodules, the clinical model performed best in the training set (AUC: 0.729), followed by the PET/CT joint model (AUC: 0.722); in the testing set, the clinical model showed significant overfitting (AUC: 0.675), while the PET/CT joint model showed the best performance (AUC: 0.723). The nomogram and a calibration curve of the PET/CT joint model are shown in Fig. 5.
This study aimed to use machine learning methods to construct radiomics models based on [18F]FDG PET/CT images to predict the EGFR mutation status in lung adenocarcinoma patients and to evaluate the added value of clinical parameters in improving the predictive performance of the radiomics models. CT_RF, PET_RF, and PET/CT_RF achieved moderate predictive performance (training and testing sets AUC: 0.688, 0.666, and 0.698 vs. 0.726, 0.678, and 0.704, respectively). Furthermore, the PET/CT joint model had the best predictive performance (training and testing sets AUCs: 0.760 vs. 0.730), especially in the advanced lung adenocarcinoma subgroup (training and testing sets AUCs: 0.722 vs. 0.723).
Recent meta-analyses confirmed that SUVmax had moderate predictive power for EGFR mutation status (AUC: 0.68–0.69) [40, 41]. Our study found that the predictive ability of SUVmax for EGFR mutation status was weak (AUC: 0.584), possibly because our study only included lung adenocarcinomas. In this study, three radiomics models based on RF were finally selected, and their prediction performance was significantly better than that of SUVmax in all cases, which is consistent with the results from Zhang et al. The possible reason is that radiomics features can better reflect the spatial distribution of tumors than the traditional metabolic parameters, thereby more comprehensively evaluating the tumor heterogeneity. Among the specific radiomics features, the one with the highest weight in CT_RF was original_firstorder_Median, which represents the median grayscale intensity of the CT image, indicating that the lower the nodule density, the higher the probability of EGFR mutation. original_shape_Maximum2DdiameterColumn was the feature with the highest weight in the PET_RF model. It represents the nodule length, and a smaller length is associated with a higher EGFR mutation rate. In the PET/CT_RF model, original_firstorder_Median and original_shape_Maximum2DdiameterColumn were still the two radiomics features with the highest weights, which confirmed the robustness of these features.
The clinical characteristics of lung adenocarcinoma patients are important variables in evaluating EGFR mutation status. We found that the lesion size and clinical stage of the EGFR mutation group were significantly lower than those of the wild-type group, suggesting that the lesions in mutation group were smaller and earlier in stage. Moreover, the subsolid nodules had a higher EGFR mutation rate, which is consistent with other reports [33, 43]. Several other studies have shown that female gender and no smoking history are associated with EGFR mutations [8, 44, 45], which is supported by our findings. In this study, the clinical model had a certain predictive value for EGFR mutation status (AUC: 0.681), of which CEA level and gender were kept in the subsequent joint models, and higher CEA level and female gender were independent predictors for EGFR mutation. In the testing set, the predictive performance of PET joint model and PET/CT joint model was further improved compared to the original radiomics models, which is consistent with previous studies [4, 29].
Lung adenocarcinoma patients with different clinical stages have different treatment options. Patients with advanced stages are often inoperable, and the treatment relies more on traditional chemotherapy and targeted therapy. In this study, we found that the CT_RF model had the best prediction power for stage I–II lesions, and the combination with clinical parameters did not improve its predictive performance, which might be related to the high proportion of subsolid nodules among stage I–II nodules. Yang et al. and Cheng et al. obtained similar results in subsolid nodules. For stage III–IV nodules, the PET/CT joint model was the best at predicting EGFR mutation status, and it was better than PET/CT_RF. Since stage III–IV patients are more dependent on targeted therapy, the PET/CT joint model can assist clinicians in making more precise treatment decisions for advanced patients.
Compared to previous studies, a major advantage of our research is its large sample size covering all stages; through stratified analysis by clinical stage, it identified the population in which each radiomics model is most suitable for clinical practice. This study still has some limitations. First, it is a single-center retrospective study with few patients in stage II; thus, the model needs to be further verified prospectively in external datasets. Second, Beig et al. suggested that CT radiomics features of the area surrounding nodules also have predictive value, but we did not include this area when performing the segmentation; thus, we might have lost some information around the tumor, and further research is needed to determine the optimal parameter values for image reconstruction and preprocessing. Third, the stratified analysis showed that the clinical stage affected the model’s prediction performance, and it is necessary to build a separate model for early lung adenocarcinoma. Fourth, we did not include CT semantic features (such as spiculation, vacuoles, and lobulation) in the study; although some studies suggest that these features are related to EGFR mutation status [47, 48], semantic feature labeling is highly observer-dependent, with significant inter-observer variability [45, 49, 50]. Fifth, to include more samples in the study, we did not use 1-mm CT images, which could affect the performance of the CT radiomics model.
The [18F]FDG PET/CT radiomics models constructed using machine learning algorithms were a potential non-invasive method to identify EGFR mutation status in patients with lung adenocarcinoma. The clinical stage could affect the model’s prediction performance, and the PET/CT joint model was more effective in predicting the EGFR mutation status in patients with advanced lung adenocarcinoma. The different models based on PET/CT radiomics features and clinical parameters can help guide clinical decision-making and promote individualized and precise targeted therapy for patients in different clinical stages.
Availability of data and materials
All data generated or analyzed during this study are available from the corresponding author Xiaonan Shao upon reasonable request.
Abbreviations
- NSCLC: Non-small cell lung cancer
- ctDNA: Circulating tumor DNA
- IBSI: Imaging Biomarker Standardization Initiative
- ROI: Region of interest
- LOG: Laplacian of Gaussian
- LASSO: Least Absolute Shrinkage and Selection Operator
- AUC: Area under the curve
- ROC: Receiver operating characteristic
- DCA: Decision curve analysis
- SCC-Ag: Squamous cell carcinoma antigen
- EGFR: Epidermal growth factor receptor
- CYFRA 21–1: Cytokeratin 19 fragment
- SUVmax: Maximum standardized uptake value
- ICC: Intraclass correlation coefficient
- SVM: Support vector machine
- NRI: Net reclassification index
- TKIs: Tyrosine kinase inhibitors
- PFS: Progression-free survival
Siegel RL, Miller KD, Wagle NS, Jemal A. Cancer statistics, 2023. CA Cancer J Clin. 2023;73:17–48.
Xia C, Dong X, Li H, Cao M, Sun D, He S, et al. Cancer statistics in China and United States, 2022: profiles, trends, and determinants. Chin Med J (Engl). 2022;135:584–90.
Herbst RS, Morgensztern D, Boshoff C. The biology and management of non-small cell lung cancer. Nature. 2018;553:446–54.
Li X, Yin G, Zhang Y, Dai D, Liu J, Chen P, et al. Predictive power of a radiomic signature based on (18)F-FDG PET/CT images for EGFR mutational status in NSCLC. Front Oncol. 2019;9:1062.
Zhang J, Zhao X, Zhao Y, Zhang J, Zhang Z, Wang J, et al. Value of pre-therapy (18)F-FDG PET/CT radiomics in predicting EGFR mutation status in patients with non-small cell lung cancer. Eur J Nucl Med Mol Imaging. 2020;47:1137–46.
Shi Y, Au JS, Thongprasert S, Srinivasan S, Tsai CM, Khoa MT, et al. A prospective, molecular epidemiology study of EGFR mutations in Asian patients with advanced non-small-cell lung cancer of adenocarcinoma histology (PIONEER). J Thorac Oncol. 2014;9:154–62.
McLoughlin EM, Gentzler RD. Epidermal growth factor receptor mutations. Thorac Surg Clin. 2020;30:127–36.
Recondo G, Facchinetti F, Olaussen KA, Besse B, Friboulet L. Making the first move in EGFR-driven or ALK-driven NSCLC: First-generation or next-generation TKI? Nat Rev Clin Oncol. 2018;15:694–708.
Tan CS, Gilligan D, Pacey S. Treatment approaches for EGFR-inhibitor-resistant patients with non-small-cell lung cancer. Lancet Oncol. 2015;16:e447–59.
Devarakonda S, Morgensztern D, Govindan R. Genomic alterations in lung adenocarcinoma. Lancet Oncol. 2015;16:e342–51.
Zhang Y, Chang L, Yang Y, Fang W, Guan Y, Wu A, et al. Intratumor heterogeneity comparison among different subtypes of non-small-cell lung cancer through multi-region tissue and matched ctDNA sequencing. Mol Cancer. 2019;18:7.
Li Z, Zhang Y, Bao W, Jiang C. Insufficiency of peripheral blood as a substitute tissue for detecting EGFR mutations in lung cancer: a meta-analysis. Target Oncol. 2014;9:381–8.
Hur JY, Kim HJ, Lee JS, Choi CM, Lee JC, Jung MK, et al. Extracellular vesicle-derived DNA for performing EGFR genotyping of NSCLC patients. Mol Cancer. 2018;17:15.
Moding EJ, Diehn M, Wakelee HA. Circulating tumor DNA testing in advanced non-small cell lung cancer. Lung Cancer. 2018;119:42–7.
Ettinger DS, Wood DE, Aisner DL, Akerley W, Bauman JR, Bharat A, et al. Non-small cell lung cancer, version 3.2022, NCCN clinical practice guidelines in oncology. J Natl Compr Canc Netw. 2022;20:497–530.
Eberhardt WE, De Ruysscher D, Weder W, Le Pechoux C, De Leyn P, Hoffmann H, et al. 2nd ESMO Consensus Conference in Lung Cancer: locally advanced stage III non-small-cell lung cancer. Ann Oncol. 2015;26:1573–88.
Vansteenkiste J, Crino L, Dooms C, Douillard JY, Faivre-Finn C, Lim E, et al. 2nd ESMO Consensus Conference on Lung Cancer: early-stage non-small-cell lung cancer consensus on diagnosis, treatment and follow-up. Ann Oncol. 2014;25:1462–74.
Mayerhoefer ME, Materka A, Langs G, Haggstrom I, Szczypinski P, Gibbs P, et al. Introduction to radiomics. J Nucl Med. 2020;61:488–95.
Liu Q, Sun D, Li N, Kim J, Feng D, Huang G, et al. Predicting EGFR mutation subtypes in lung adenocarcinoma using (18)F-FDG PET/CT radiomic features. Transl Lung Cancer Res. 2020;9:549–62.
Nair JKR, Saeed UA, McDougall CC, Sabri A, Kovacina B, Raidu BVS, et al. Radiogenomic models using machine learning techniques to predict EGFR mutations in non-small cell lung cancer. Can Assoc Radiol J. 2021;72:109–19.
Jiang M, Zhang Y, Xu J, Ji M, Guo Y, Guo Y, et al. Assessing EGFR gene mutation status in non-small cell lung cancer with imaging features from PET/CT. Nucl Med Commun. 2019;40:842–9.
Li H, Gao C, Sun Y, Li A, Lei W, Yang Y, et al. Radiomics analysis to enhance precise identification of epidermal growth factor receptor mutation based on positron emission tomography images of lung cancer patients. J Biomed Nanotechnol. 2021;17:691–702.
Huang W, Wang J, Wang H, Zhang Y, Zhao F, Li K, et al. PET/CT based EGFR mutation status classification of NSCLC using deep learning features and radiomics features. Front Pharmacol. 2022;13: 898529.
Li S, Li Y, Zhao M, Wang P, Xin J. Combination of (18)F-Fluorodeoxyglucose PET/CT radiomics and clinical features for predicting epidermal growth factor receptor mutations in lung adenocarcinoma. Korean J Radiol. 2022;23:921–30.
Ruan D, Fang J, Teng X. Efficient 18F-Fluorodeoxyglucose Positron Emission Tomography/Computed Tomography-based machine learning model for predicting epidermal growth factor receptor mutations in non-small cell lung cancer. Q J Nucl Med Mol Imaging. 2022. https://doi.org/10.23736/s1824-4785.22.03441-0.
Zhao HY, Su YX, Zhang LH, Fu P. Prediction model based on 18F-FDG PET/CT radiomic features and clinical factors of EGFR mutations in lung adenocarcinoma. Neoplasma. 2022;69:233–41.
Yang L, Xu P, Li M, Wang M, Peng M, Zhang Y, et al. PET/CT radiomic features: a potential biomarker for EGFR mutation status and survival outcome prediction in NSCLC patients treated with TKIs. Front Oncol. 2022;12: 894323.
Shiri I, Amini M, Nazari M, Hajianfar G, Haddadi Avval A, Abdollahi H, et al. Impact of feature harmonization on radiogenomics analysis: prediction of EGFR and KRAS mutations from non-small cell lung cancer PET/CT images. Comput Biol Med. 2022;142: 105230.
Chang C, Zhou S, Yu H, Zhao W, Ge Y, Duan S, et al. A clinically practical radiomics-clinical combined model based on PET/CT data and nomogram predicts EGFR mutation in lung adenocarcinoma. Eur Radiol. 2021;31:6259–68.
Yang B, Ji HS, Zhou CS, Dong H, Ma L, Ge YQ, et al. (18)F-fluorodeoxyglucose positron emission tomography/computed tomography-based radiomic features for prediction of epidermal growth factor receptor mutation status and prognosis in patients with lung adenocarcinoma. Transl Lung Cancer Res. 2020;9:563–74.
Shiri I, Maleki H, Hajianfar G, Abdollahi H, Ashrafinia S, Hatt M, et al. Next-generation radiogenomics sequencing for prediction of EGFR and KRAS mutation status in NSCLC patients using multimodal imaging and machine learning algorithms. Mol Imaging Biol. 2020;22:1132–48.
Koyasu S, Nishio M, Isoda H, Nakamoto Y, Togashi K. Usefulness of gradient tree boosting for predicting histological subtype and EGFR mutation status of non-small cell lung cancer on (18)F FDG-PET/CT. Ann Nucl Med. 2020;34:49–57.
Zhang H, Cai W, Wang Y, Liao M, Tian S. CT and clinical characteristics that predict risk of EGFR mutation in non-small cell lung cancer: a systematic review and meta-analysis. Int J Clin Oncol. 2019;24:649–59.
Yang X, Dong X, Wang J, Li W, Gu Z, Gao D, et al. Computed tomography-based radiomics signature: a potential indicator of epidermal growth factor receptor mutation in pulmonary adenocarcinoma appearing as a subsolid nodule. Oncologist. 2019;24:e1156–64.
Cheng B, Deng H, Zhao Y, Xiong J, Liang P, Li C, et al. Predicting EGFR mutation status in lung adenocarcinoma presenting as ground-glass opacity: utilizing radiomics model in clinical translation. Eur Radiol. 2022;32:5869–79.
Travis WD, Brambilla E, Noguchi M, Nicholson AG, Geisinger KR, Yatabe Y, et al. International association for the study of lung cancer/american thoracic society/european respiratory society international multidisciplinary classification of lung adenocarcinoma. J Thorac Oncol. 2011;6:244–85.
Zwanenburg A, Vallieres M, Abdalah MA, Aerts H, Andrearczyk V, Apte A, et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology. 2020;295:328–38.
Beichel RR, Van Tol M, Ulrich EJ, Bauer C, Chang T, Plichta KA, et al. Semiautomated segmentation of head and neck cancers in 18F-FDG PET scans: a just-enough-interaction approach. Med Phys. 2016;43:2948–64.
DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837–45.
Guo Y, Zhu H, Yao Z, Liu F, Yang D. The diagnostic and predictive efficacy of (18)F-FDG PET/CT metabolic parameters for EGFR mutation status in non-small-cell lung cancer: a meta-analysis. Eur J Radiol. 2021;141: 109792.
Du B, Wang S, Cui Y, Liu G, Li X, Li Y. Can (18)F-FDG PET/CT predict EGFR status in patients with non-small cell lung cancer? A systematic review and meta-analysis. BMJ Open. 2021;11: e044313.
Zhang M, Bao Y, Rui W, Shangguan C, Liu J, Xu J, et al. Performance of (18)F-FDG PET/CT radiomics for predicting EGFR mutation status in patients with non-small cell lung cancer. Front Oncol. 2020;10: 568857.
Fujikawa R, Muraoka Y, Kashima J, Yoshida Y, Ito K, Watanabe H, et al. Clinicopathologic and genotypic features of lung adenocarcinoma characterized by the IASLC grading system. J Thorac Oncol. 2022. https://doi.org/10.1016/j.jtho.2022.02.005.
Locatelli-Sanchez M, Couraud S, Arpin D, Riou R, Bringuier PP, Souquet PJ. Routine EGFR molecular analysis in non-small-cell lung cancer patients is feasible: exons 18–21 sequencing results of 753 patients and subsequent clinical outcomes. Lung. 2013;191:491–9.
Zhou JY, Zheng J, Yu ZF, Xiao WB, Zhao J, Sun K, et al. Comparative analysis of clinicoradiologic characteristics of lung adenocarcinomas with ALK rearrangements or EGFR mutations. Eur Radiol. 2015;25:1257–66.
Beig N, Khorrami M, Alilou M, Prasanna P, Braman N, Orooji M, et al. Perinodular and intranodular radiomic features on lung CT images distinguish adenocarcinomas from granulomas. Radiology. 2019;290:783–92.
Chen L, Zhou Y, Tang X, Yang C, Tian Y, Xie R, et al. EGFR mutation decreases FDG uptake in nonsmall cell lung cancer via the NOX4/ROS/GLUT1 axis. Int J Oncol. 2019;54:370–80.
AlGharras A, Kovacina B, Tian Z, Alexander JW, Semionov A, van Kempen LC, et al. Imaging-based surrogate markers of epidermal growth factor receptor mutation in lung adenocarcinoma: a local perspective. Can Assoc Radiol J. 2020;71:208–16.
Hsu JS, Huang MS, Chen CY, Liu GC, Liu TC, Chong IW, et al. Correlation between EGFR mutation status and computed tomography features in patients with advanced pulmonary adenocarcinoma. J Thorac Imaging. 2014;29:357–63.
Ozkan E, West A, Dedelow JA, Chu BF, Zhao W, Yildiz VO, et al. CT gray-level texture analysis as a quantitative imaging biomarker of epidermal growth factor receptor mutation status in adenocarcinoma of the lung. AJR Am J Roentgenol. 2015;205:1016–25.
Niu R, Wang Y, Shao X, Jiang Z, Wang J, Shao X. Association between (18)F-FDG PET/CT-based SUV index and malignant status of persistent ground-glass nodules. Front Oncol. 2021;11: 594693.
This study was supported by the Major Project of Changzhou Health Commission (Grant No. ZD202109), the Key Laboratory of Changzhou High-tech Research Project (Grant No. CM20193010), the Young Talent Development Plan of Changzhou Health Commission (Grant No. CZQM2020012), the Changzhou Science and Technology Program (Grant No. CJ20220228), the Science and Technology Project of Changzhou Health Commission (Grant No. WZ202108), and the Top Talent of Changzhou “The 14th Five-Year Plan” High-Level Health Talents Training Project (2022260).
Ethics approval and consent to participate
The study was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. The study protocol was approved by the Ethics Committee of our institution (No. KD 087). Because this study was a retrospective analysis, informed consent was not required.
Consent for publication
The authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1. Table S1. List of best parameter configurations for the 9 machine learning algorithms. Table S2. Clinical characteristics and the EGFR mutation rate of patients in the training set and testing set. Table S3. Radiomics features used by the three modality models. Table S4. Clinical characteristics and the EGFR mutation rate of patients in the clinical stage I–II group and III–IV group. Figure S1. The LASSO algorithm and 5-fold cross-validation were used to extract the optimal subset of radiomics features. Figure S2. SHAP value graph of the CT, PET, and PET/CT radiomics models. Figure S3. The ROC curves of the three best radiomics models and SUVmax for identifying EGFR mutation status in the training set and testing set.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Gao, J., Niu, R., Shi, Y. et al. The predictive value of [18F]FDG PET/CT radiomics combined with clinical features for EGFR mutation status in different clinical staging of lung adenocarcinoma. EJNMMI Res 13, 26 (2023). https://doi.org/10.1186/s13550-023-00977-4
- Lung adenocarcinoma
- [18F]FDG PET/CT
- Epidermal growth factor receptor