Prognostic value of visual IMPeTUs criteria and metabolic tumor burden at baseline [18F]FDG PET/CT in patients with newly diagnosed multiple myeloma

Background 2-[18F]fluoro-2-deoxy-D-glucose ([18F]FDG) positron emission tomography combined with low-dose computed tomography (PET/CT) can be used at diagnosis to identify myeloma-defining events and also provides prognostic factors. The aim of this study was to assess the prognostic significance of baseline [18F]FDG PET/CT visual IMPeTUs (Italian myeloma criteria for PET Use)-based parameters and/or total metabolic tumor volume (TMTV) in a single-center population of patients with newly diagnosed multiple myeloma (NDMM) eligible for transplantation.

Methods Patients with MM who underwent a baseline [18F]FDG PET/CT were retrospectively selected from a large internal database of the University Hospital of Liege (Liege, Belgium). Initially, all PET/CT images were visually analyzed using IMPeTUs criteria, followed by delineation of TMTV using a semi-automatic lesion delineation workflow, including [18F]FDG-positive MM focal lesions (FL) with an absolute SUV threshold set at 4.0. In a first step, to ensure accurate reporting of PET/CT scans, the agreement between two nuclear medicine physicians with distinct experience was assessed. In a second step, univariable and multivariable analyses were conducted to determine the prognostic significance of [18F]FDG PET/CT parameters for progression-free survival (PFS) and overall survival (OS).

Results A total of 40 patients with NDMM were included in the study. Interobserver agreement in the analysis of [18F]FDG PET/CT images was substantial for the presence of spine FL, extra spine FL, at least one fracture and paramedullary disease (Cohen’s kappa 0.79, 0.87, 0.75 and 0.64, respectively). For the presence of skull FL and extramedullary disease the agreement was moderate (Cohen’s kappa 0.56 and 0.53, respectively). Among [18F]FDG PET/CT parameters, a high number of delineated volumes of interest (VOI) using the SUV4.0 threshold was the only independent prognostic factor associated with PFS [HR (95% CI): 1.03 (1.004–1.05), P = 0.019] while a high number of FL (n > 10; F group 4) was the only independent prognostic factor associated with OS [HR (95% CI): 19.10 (1.90–191.95), P = 0.01].

Conclusion Our work confirms the reproducibility of the IMPeTUs criteria. Furthermore, it demonstrates that a high number of FL (n > 10; IMPeTUs F group 4), reflecting a high [18F]FDG-avid tumor burden, is an independent prognostic factor for OS. The prognostic value of the TMTV delineated using a SUV4.0 threshold was not significant. Nevertheless, the count of [18F]FDG-avid lesion VOI delineated using a SUV4.0 threshold was an independent prognostic factor for PFS.

Supplementary Information The online version contains supplementary material available at 10.1186/s13550-024-01113-6.


Introduction
Multiple myeloma (MM) is the second most common hematological malignancy, with higher incidence in high-income countries, particularly in Western Europe, North America and Australasia [1]. MM is characterized by a clonal plasma cell proliferative disorder, demonstrating significant clonal heterogeneity and consequently highly variable clinical outcomes [2]. Induction therapy followed by autologous stem-cell transplantation (ASCT) and novel therapeutic agents have improved the prognosis of MM patients, raising the median survival from 2-3 years to 7-15 years over the past two decades [3,4]. In patients with newly diagnosed MM (NDMM), the revised International Staging System (R-ISS) is used as a simple prognostic tool to categorize patients into three stage groups with distinct survival rates [5,6]. Patient age, performance status, comorbidities, and the depth of response to therapy are also well-known prognostic factors [2,7]. Additional biomarkers are still needed to identify high-risk patients for whom therapy could maximize response, as well as standard-risk patients for whom unnecessary treatment-related toxicity can be minimized.
The aim of this study was to evaluate the prognostic significance of baseline [18F]FDG PET/CT visual IMPeTUs-based parameters and/or TMTV in a single-center population of patients with NDMM.

Study population
Approval was obtained from the institutional ethics committee (Comité d'Éthique Hospitalo-Facultaire Universitaire de Liège; reference 2023/9). No written informed consent was obtained due to the retrospective design of the study. Patients diagnosed with plasma-cell disorders who had undergone a PET/CT scan were retrospectively extracted from a large internal database of the University Hospital of Liege (Liege, Belgium), covering the period from 2012 to 2021. A subsequent selection was performed with the following eligibility criteria: NDMM without prior treatment, active MM requiring induction therapy and eligible for transplant; age ≤ 70 years (usual cutoff age for establishing transplant eligibility); availability of baseline [18F]FDG PET/CT within a maximum interval of three months of the MM diagnosis requiring therapy. Exclusion criteria encompassed cases where immediate treatment initiation did not occur, cases where follow-up information was unavailable, and cases where the [18F]FDG PET/CT did not meet the quality criteria, such as glycemia > 150 mg/dL at the time of [18F]FDG injection.

[18F]FDG PET/CT analyses
[18F]FDG PET/CT acquisition parameters are presented in Supplemental Data.
[18F]FDG PET/CT images were independently analyzed by an experienced nuclear medicine physician (18 years of experience) and a trainee in nuclear medicine (2 years of experience) using MIM Software version 7.0.5 (LesionID® tool; MIM Software Inc.). To ensure accurate reporting of PET/CT scans, the agreement between the two nuclear medicine physicians was assessed for the qualitative and quantitative description of PET/CT images. In a second step, consensus between the two nuclear medicine physicians was obtained for the IMPeTUs variables that were then used in the Cox proportional hazards models. The consensus results were also compared with the clinical report provided by a third physician involved in the PET/CT report as part of routine clinical practice. The CT part of the PET/CT was further analyzed by a radiologist experienced in bone imaging and MM.
All PET/CT images were first visually analyzed using IMPeTUs criteria [16,17]. The following parameters were reported: the presence of positive FL with [18F]FDG uptake with or without underlying osteolytic lesion in CT images; the presence of at least one osteolytic lesion of 5 mm or more in size with or without [18F]FDG uptake; the number of FL (F) and osteolytic lesions (L), respectively, merged into the following groups: group 1: no lesion; group 2: 1-3 lesions; group 3: 4-10 lesions; and group 4: > 10 lesions; the site of MM lesions (skull, spinal and/or extraspinal); the intensity of [18F]FDG uptake in the BM and the hottest FL, respectively, quantified according to the five-point Deauville scale and SUVmax; the presence and site of EMD and/or PMD; the presence of fractures; the presence of a diffuse pattern in CT images (innumerable osteolytic lesions distributed diffusely throughout the axial skeleton).
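The grouping of lesion counts described above can be sketched in a few lines of Python. This is an illustration only: the function name `impetus_lesion_group` is hypothetical and is not part of the IMPeTUs publication or of MIM Software; the cut-offs are those stated in the text (no lesion, 1-3, 4-10, > 10).

```python
def impetus_lesion_group(n_lesions: int) -> int:
    """Map a count of focal (F) or osteolytic (L) lesions to groups 1-4.

    Hypothetical helper; the cut-offs follow the grouping given in the text.
    """
    if n_lesions < 0:
        raise ValueError("lesion count cannot be negative")
    if n_lesions == 0:
        return 1  # no lesion
    if n_lesions <= 3:
        return 2  # 1-3 lesions
    if n_lesions <= 10:
        return 3  # 4-10 lesions
    return 4      # > 10 lesions


# Example: a patient with 16 focal lesions falls into F group 4.
for count in (0, 2, 7, 16):
    print(count, "lesions -> group", impetus_lesion_group(count))
```

The same function applies to both the F and the L counts, since the text uses identical cut-offs for the two.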
The liver SUVmax and SUVmean were measured using a 30-mm diameter volume of interest (VOI). Whole-body [18F]FDG-positive tumor volume delineation was performed using a semi-automatic lesion delineation workflow of MIM Software version 7.0.5 (MIM Software Inc.). A rectangular volume of interest including the whole body was manually selected. Then, a fully automated preselection of [18F]FDG-positive MM FL was applied to delineate lesions with an absolute SUV threshold set at 4.0. No minimum or maximum volume threshold was applied. The observer used a clearing option to manually remove areas of physiological uptake/activity, e.g., heart, brain, bladder. The software then provided the SUVmax, SUVmean, the number of VOI delineated using the SUV4.0 threshold, the TMTV and the total lesion glycolysis (TLG).
For patients with no delineated volume (patients with no FL or FL with SUVmax < 4), the SUVmax was estimated manually by drawing a VOI in the hottest FL. Semi-quantitative values of the experienced nuclear medicine physician were used in the Cox proportional hazards models.
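The principle of fixed-threshold delineation can be illustrated with a minimal Python sketch. It is not the MIM Software workflow, whose internal algorithm is not described here. The sketch assumes that voxels with SUV ≥ 4.0 are retained, that face-sharing (6-connected) voxels form one VOI, and that TLG is computed as SUVmean × TMTV; the function name, the toy SUV grid and the voxel volume are hypothetical, and the manual removal of physiological uptake is not modelled.

```python
from collections import deque

# 6-connected neighbourhood (face-sharing voxels); this connectivity is an assumption.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def delineate_suv_threshold(suv, voxel_ml, threshold=4.0):
    """Summarise voxels with SUV >= threshold in a nested list suv[z][y][x]."""
    nz, ny, nx = len(suv), len(suv[0]), len(suv[0][0])
    seen = set()
    values = []   # SUV of every retained voxel
    n_voi = 0     # number of connected regions (VOI)
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if (z, y, x) in seen or suv[z][y][x] < threshold:
                    continue
                # New region: flood-fill all connected supra-threshold voxels.
                n_voi += 1
                seen.add((z, y, x))
                queue = deque([(z, y, x)])
                while queue:
                    cz, cy, cx = queue.popleft()
                    values.append(suv[cz][cy][cx])
                    for dz, dy, dx in NEIGHBOURS:
                        pz, py, px = cz + dz, cy + dy, cx + dx
                        inside = 0 <= pz < nz and 0 <= py < ny and 0 <= px < nx
                        if inside and (pz, py, px) not in seen and suv[pz][py][px] >= threshold:
                            seen.add((pz, py, px))
                            queue.append((pz, py, px))
    if not values:
        # No delineated volume: SUVmax would have to be measured manually.
        return {"n_voi": 0, "tmtv_ml": 0.0, "suv_max": None, "suv_mean": None, "tlg": 0.0}
    tmtv_ml = len(values) * voxel_ml
    suv_mean = sum(values) / len(values)
    return {"n_voi": n_voi, "tmtv_ml": tmtv_ml, "suv_max": max(values),
            "suv_mean": suv_mean, "tlg": suv_mean * tmtv_ml}


# Toy single-slice image (hypothetical values); 0.064 mL corresponds to a 4-mm isotropic voxel.
toy = [[[1.0, 4.5, 5.5, 1.2, 1.0],
        [1.1, 1.0, 1.3, 1.0, 6.0],
        [3.9, 1.0, 1.0, 1.0, 1.0]]]
result = delineate_suv_threshold(toy, voxel_ml=0.064)
print(result)
```

In the toy image, the voxel with SUV 3.9 is left out, which mirrors the situation described for lesions with SUVmax just below the threshold.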

Outcomes
The primary endpoint was the association of baseline [18F]FDG PET/CT IMPeTUs criteria and semi-quantitative parameters (TMTV and SUVs) with PFS and OS. PFS was defined as the duration from the date of diagnosis of active MM requiring treatment until either the date when progression criteria were met or the date of death, whichever occurred earlier. Patients without documented progression after diagnosis and without a recorded death date were censored for PFS at their last contact date. OS was defined as the time from the date of diagnosis of active MM requiring treatment to the time of death from any cause; patients who were still alive at the last follow-up or lost to follow-up were considered as censored.
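The endpoint definitions above translate into a time and an event indicator per patient. The following Python sketch shows one way to derive them; the function name, argument names and dates are hypothetical and are not taken from the study database.

```python
from datetime import date


def survival_endpoints(diagnosis, last_contact, progression=None, death=None):
    """Return (pfs_days, pfs_event, os_days, os_event) for one patient.

    An event indicator of 1 means the event was observed; 0 means censored
    at the last contact date.
    """
    # PFS: earliest of progression or death; otherwise censored at last contact.
    pfs_dates = [d for d in (progression, death) if d is not None]
    if pfs_dates:
        pfs_end, pfs_event = min(pfs_dates), 1
    else:
        pfs_end, pfs_event = last_contact, 0
    # OS: death from any cause; otherwise censored at last contact.
    if death is not None:
        os_end, os_event = death, 1
    else:
        os_end, os_event = last_contact, 0
    return ((pfs_end - diagnosis).days, pfs_event,
            (os_end - diagnosis).days, os_event)


# Hypothetical patient: progression one year after diagnosis, alive at last contact.
example = survival_endpoints(
    diagnosis=date(2018, 1, 1),
    last_contact=date(2021, 1, 1),
    progression=date(2019, 1, 1),
)
print(example)
```

For this hypothetical patient, PFS is an observed event while OS is censored, which is the usual situation when a patient progresses but is still alive at the last follow-up.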

Statistical analyses
Patient characteristics and PET parameters were summarized using median and interquartile range (P25-P75). The normality of the distribution of quantitative variables was assessed through numerical comparison of mean and median values, graphical representations with histograms and quantile-quantile plots, as well as the Shapiro-Wilk normality test. Categorical covariates were reported as absolute and relative frequencies. The agreement between variables measured by both observers was evaluated using Cohen's kappa coefficient for qualitative variables. In the case of quantitative variables, the agreement was assessed using the intra-class correlation coefficient (ICC). Both Cohen's kappa and ICC are presented along with their corresponding 95% confidence intervals (CIs). The values of Cohen's kappa and ICC ranged from 0 to 1, and the guideline for interpreting the degree of agreement was as follows: total disagreement ≤ 0.01, slight agreement = 0.01-0.20, fair agreement = 0.21-0.40, moderate agreement = 0.41-0.60, substantial agreement = 0.61-0.80, and almost perfect agreement = 0.81-1.00. Kaplan-Meier plots were employed to visualize PFS and OS curves. Hazard ratios (HRs) with corresponding 95% CI were derived to assess potential clinical and PET/CT risk factors for time to death and time to progression after diagnosis using Cox proportional hazards models (Cox-ph). Univariable and multivariable analyses were conducted for both PFS and OS. Results were considered significant at the 5% level of significance (P < 0.05). Analyses were carried out on the maximum of data availability; the missing values were not replaced. Calculations, visualization, and modelling were performed using R version 4.2.2.
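For readers less familiar with Cohen's kappa, the following Python sketch computes the unweighted coefficient for two observers and maps it to the agreement categories listed above. The analyses in this study were performed in R; this sketch is only an illustration, the function names and the ratings are hypothetical, confidence intervals are not computed, and rounding the coefficient to two decimals before categorization is an assumption made to match the two-decimal bands.

```python
def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two lists of categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of agreement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from the marginal frequencies.
    categories = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


def agreement_label(value):
    """Map a kappa or ICC value to the categories listed in the text."""
    v = round(value, 2)  # assumption: bands are read at two decimals
    if v <= 0.01:
        return "total disagreement"  # 0.01 itself is assigned to this band
    if v <= 0.20:
        return "slight agreement"
    if v <= 0.40:
        return "fair agreement"
    if v <= 0.60:
        return "moderate agreement"
    if v <= 0.80:
        return "substantial agreement"
    return "almost perfect agreement"


# Hypothetical presence/absence ratings (1 = finding present) for 10 scans.
observer_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
observer_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
kappa = cohen_kappa(observer_1, observer_2)
print(round(kappa, 3), agreement_label(kappa))
```

With a rare finding, a single discordant scan already lowers kappa noticeably, because chance agreement on absent findings is high.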

Patients
The CONSORT flow diagram is presented in Fig. 1. A total of 660 patients with plasma-cell disorders for whom a PET/CT was performed were selected. A total of 620 patients did not meet the eligibility criteria and were excluded for various reasons: age > 70 years (n = 222); absence of baseline [18F]FDG PET/CT within three months before MM diagnosis (n = 315); [18F]FDG PET/CT not meeting the quality criteria (n = 2; glycemia > 150 mg/dL); no active MM and no treatment initiated (n = 79); presence of other malignancy (n = 1); and misdiagnosis of MM (n = 1; Waldenström macroglobulinaemia).

[ 18 F]FDG PET/CT results
The PET/CT interpretation results are illustrated in the CONSORT flow diagram (Fig. 1) and in Suppl. Table 1. The median (range) liver SUVmax and SUVmean were 3.1 (2.5-4.2) and 2.2 (1.7-2.8), respectively. Using the SUV4.0 threshold, it was possible to delineate the TMTV and extract TLG and SUVmean in 20/40 (50.0%) patients who had at least one MM FL and/or diffuse BM uptake with SUVmax ≥ 4 (Fig. 2). No SUVmax was available in the 8/40 (20.0%) patients with no FL.
The agreement between the two nuclear medicine physicians in the analysis of [18F]FDG PET/CT images is presented in Table 2. Almost perfect agreement (ICC value > 0.90) was observed for all semi-quantitative parameters (TMTV, TLG, SUVs and the number of delineated VOI), for the group number of FL, the hottest bone FL DS and PMD DS. Substantial agreement (Cohen's kappa or ICC value > 0.60) was observed for the BM DS, the group number of osteolytic lesions, the presence of fracture and PMD. The agreement was moderate (Cohen's kappa < 0.60) for the presence of skull FL and EMD.

Outcome results
Figure 3

Table 3 Multivariable Cox regression analysis of clinical and baseline [18F]FDG PET/CT parameters predicting for prolonged PFS and OS (notes 1, 2)
VOI, volume of interest
1 An interaction term between the significant parameters in the multivariate analyses was included in the model (P > 0.05)
2 Results are adjusted for the following parameters tested in the univariate analyses: age; gender; hemoglobin; ISS stage; R-ISS stage; serum β2-microglobulin; serum albumin; serum lactate dehydrogenase (LDH); serum calcium; [18F]FDG PET scan status defined positive in case of F group > 1 (at least one FL) and/or diffuse BM uptake with DS ≥ 4; the PET scan status defined negative in case of F group 1 (no FL) and no diffuse BM uptake with DS ≥ 4; F group 2 and 3; number of bone FL > 3; presence of at least one fracture; presence of PMD; presence of PMD with DS ≥ 4; presence of EMD; TMTV; TLG; hottest FL SUVmax; hottest SUVmax > 4.2; SUVmean; more than 3 delineated VOI and CT diffuse pattern (P > 0.05)
3 The multivariate analysis for OS is constrained by the low number of deaths (13 out of 40; 32.5%) observed during the follow-up period

Discussion
The IMPeTUs criteria, as demonstrated by Nanni et al., proved to be highly reproducible among expert nuclear medicine physicians [17]. This work confirms the utility of IMPeTUs criteria and semi-quantitative parameters in generating consistent reports for [18F]FDG PET/CT scans, even when interpreted by less experienced residents in nuclear medicine. This emphasizes the benefit of using IMPeTUs descriptive criteria to standardize PET interpretation not only in clinical trials but also in routine clinical practice. The concordance in interpreting and quantifying [18F]FDG PET/CT scans was substantial or almost perfect for the majority of parameters, with the exception of the presence of skull FL and EMD, for which the agreement was moderate (Cohen's kappa < 0.60). The low prevalence of EMD (n = 2/40; 5.0%) may account for this lack of agreement.
For skull lesions, anatomic variants such as arachnoid granulations and benign lesions can mimic osteolytic MM lesions in CT images (false positive results) while FL can be missed due to high brain [18F]FDG uptake (false negative results) [30,31]. Baseline [18F]FDG PET/CT was positive in the majority of newly diagnosed transplant-eligible MM patients (n = 38/40; 95.0%) included in this study. Half of the patients (n = 20/40) had diffuse BM uptake and/or FL with DS ≥ 4. Therefore, conducting a baseline [18F]FDG PET/CT scan is advisable when planning a post-treatment PET evaluation [32]. These 20 patients with MM involvement and DS ≥ 4 were those for whom the TMTV was delineable using the SUV4.0 threshold.
According to the latest International Myeloma Working Group consensus recommendations, diffuse [18F]FDG uptake in BM is not a myeloma-defining event, to avoid a false diffuse BM pattern related to reactive BM [8-10]. The grading of diffuse BM uptake using the five-point Deauville scale is among the IMPeTUs criteria [16-18]. In our population, 8/40 (20.0%) patients presented with diffuse BM DS ≥ 4, with BM plasma cell infiltration ranging from 25 to 90% (Fig. 4), and patients with diffuse BM [18F]FDG uptake with DS ≥ 4 were at higher risk of progression in the univariable analysis [HR (95% CI): 2.46 (1.04-5.83), P = 0.040]. However, this was no longer the case in the multivariable analysis when considering transplant in the model [HR (95% CI): 1.29 (0.22-7.52), P = 0.779]. This result differs from Deng et al. who showed that baseline BM DS ≥ 4 was independently associated with OS [19]. Similarly to previous studies, the number of FL at baseline was a prognostic factor, with a high number of FL (n > 10; F group 4), reflecting a high tumor burden, being a prognostic factor for OS in our population (Table 3) [12,15,33]. However, the wide 95% CI [F group 4: HR (95% CI): 19.10 (1.90-191.95), P = 0.010] introduces significant uncertainty into the findings. One of the primary limitations is the small sample size, with only 40 patients included, and the low number of observed deaths (13 out of 40; 32.5%) throughout the follow-up period. Suppl. Table 2 describes the differences in patient characteristics between the IMPeTUs F groups.
Tumor burden is a well-known major factor affecting survival in MM patients. MM is an incurable hematological malignancy with variable clinical outcomes, ranging from a few years to more than 10 years, depending on host factors, tumor burden (stage), biology (cytogenetic abnormalities), and depth of response to therapy [34]. In our population, undergoing a transplant was significantly associated with PFS, and the presence of high-risk cytogenetic abnormalities was an independent prognostic factor in terms of OS. The association of ISS stage, R-ISS, LDH and serum β2-microglobulin with survival did not reach statistical significance, most likely due to the small number of patients included in this study. The number of osteolytic lesions, reflecting bone destruction, was not a significant prognostic factor (Table 3).
In our study, EMD was not significantly associated with PFS or OS, likely due to its low prevalence in our population (n = 2/40; 5.0%) [12-14, 35]. Contrary to previous studies, there was no significant association between the presence of PMD and survival [14,21]. Similarly to the research group of Nantes, who recently assessed the prognostic value of baseline [18F]FDG PET biomarkers in NDMM patients included in the prospective multicenter CASSIOPET study, the prognostic value of SUVmax using a threshold of 4.2 was not significant in our sample [14,21]. Previous studies showed an independent prognostic significance of SUVmax when using a higher threshold, SUVmax ≥ 5.3 and SUVmax > 7.1, respectively [13,36].
Using a SUV4.0 threshold, the present work did not show a prognostic value of the delineated TMTV. This finding contradicts prior research indicating TMTV's significant independent prognostic value for both PFS and OS in NDMM patients. However, it is worth noting that many of these studies employed a lower SUV threshold, typically set at 2.5, for delineating TMTV [22-26, 28]. Nonetheless, Jamet et al. demonstrated the lower prognostic significance of volume-derived metabolic parameters such as TMTV or TLG compared to BM SUVmax [29]. The TMTVs tested were derived from three different segmentation methods: a threshold SUV ≥ 2.5, a fixed threshold at 40% of the SUVmax, and a K-means clustering algorithm with 2 clusters [29]. The SUV4.0 threshold used to delineate TMTV in the present work was based on three assumptions. First, the choice of SUV4.0 rather than SUV2.5 was based on the work of Zamagni et al., which showed that FL with SUV > 4.2 at baseline independently adversely affected PFS and OS [12,37]. Secondly, opting for SUV4.0 instead of the SUV4.2 threshold aimed at facilitating the automatic workflow, drawing on the approach of Barrington et al. in lymphoma research, while also preventing the selection and delineation of healthy tissues such as the liver (median liver SUVmax, range: 3.1, 2.5-4.2) [38]. The selected stringent threshold (SUV4.0) surpassed the threshold that would have been determined using the liver SUVmean. In our population, a method based on the liver SUVmean plus 1.5 or 2 times the standard deviation would have generated thresholds ranging from 2.1 to 3.4 and 2.2 to 3.6, respectively. An SUV2.1 threshold would have been inappropriate and would have necessitated more manual adjustment, which is crucial to avoid for future implementation in clinical practice. The SUV4.0 threshold was reinforced by the results of Morales-Lozano et al., who showed that, among different SUV thresholds tested to delineate TMTV, the second most relevant threshold was SUV > 4 [37].
Thirdly, MM is recognized for displaying spatial heterogeneity, with sub-clones from FL with unfavorable genomic profiles being more prone to therapy resistance [11,33,39,40]. In the post-treatment setting, Zamagni et al. demonstrated that persistent FL with SUVmax > 4.2 or, more recently, persistent FL and/or BM [18F]FDG uptake with a Deauville-scale score ≥ 4 were independent prognostic biomarkers [12,32]. In this study, post-treatment [18F]FDG PET/CT scans were unavailable. Instead, we focused on evaluating the potential prognostic value of the potentially clinically relevant high-risk tumor burden ([18F]FDG-avid lesions delineated using the SUV4.0 threshold) based solely on the baseline [18F]FDG PET/CT scans.
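The liver-based alternative mentioned above (liver SUVmean plus 1.5 or 2 times the standard deviation) is a simple per-patient calculation, sketched here in Python for comparison with the fixed SUV4.0 threshold. The function name is hypothetical, and the standard deviation of 0.3 is an invented illustrative value; only the liver SUVmean of 2.2 corresponds to the median reported in this study.

```python
FIXED_THRESHOLD = 4.0  # absolute SUV threshold used in this work


def liver_based_threshold(liver_suv_mean, liver_suv_sd, k):
    """Patient-specific threshold: liver SUVmean + k x standard deviation."""
    return liver_suv_mean + k * liver_suv_sd


# Illustrative patient: SUVmean 2.2 (median in this study), SD 0.3 (hypothetical).
liver_mean, liver_sd = 2.2, 0.3
thresholds = {k: liver_based_threshold(liver_mean, liver_sd, k) for k in (1.5, 2.0)}
for k, value in thresholds.items():
    print(f"k = {k}: threshold {value:.2f} (fixed threshold {FIXED_THRESHOLD})")
```

For such a patient, both liver-based thresholds fall well below 4.0, so more tissue would be preselected and more manual clearing would be needed.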
The absence of prognostic significance of TMTV delineated using the SUV4.0 threshold in our population might be related to the limited number of patients (n = 40), but this threshold might also underestimate tumor burden, which is a well-known prognostic factor (Fig. 2). Deep learning-based tools for automated TMTV delineation might offer a faster and more accurate way to delineate [18F]FDG-positive tumor burden [41]. Sachpekidis et al. showed that the TMTV obtained using a deep learning-based delineation method in previously untreated MM patients was significantly correlated with BM plasma cell infiltration and was associated with PFS and OS [20]. Nevertheless, the number of delineated VOI using the SUV4.0 threshold, i.e. the number of FL with high [18F]FDG uptake (SUV ≥ 4), was an independent prognostic factor associated with worse PFS.
Post-treatment [18F]FDG PET/CT was not available in enough patients to assess the depth of response as an additional biomarker. The present work is mainly limited by the retrospective design, the heterogeneity of first-line treatments, and the small number of included patients (n = 40), which accounts for the low statistical power. With the exception of research groups who investigated TMTV in patients enrolled in the Total Therapy 3A, IFM/DFCI2009, EMN02/HO95 and CASSIOPET trials, the majority of previous studies investigating the prognostic value of IMPeTUs criteria and volumetric parameters at baseline included a limited number of patients [14, 15, 22-25, 28, 29, 37, 40, 42, 43]. The limited number of patients included in the trials poses a challenge for drawing definitive conclusions regarding the prognostic value and clinical utility of baseline [18F]FDG PET and TMTV measurements in MM patients.

Conclusions
Our work revealed that a high number of FL (n > 10; IMPeTUs F group 4), indicative of a high [18F]FDG-avid tumor burden, emerged as a prognostic factor for OS. The prognostic value of the TMTV defined using a SUV4.0 threshold was not statistically significant; nevertheless, the count of [18F]FDG-avid lesions (VOI) delineated using a SUV4.0 threshold was an independent prognostic factor for PFS.

Fig. 2 [18F]FDG PET images [A & B: maximum intensity projection (MIP); C: PET/CT axial slices; SUV scale 0-5] of a patient with multiple FL (F group 4; n = 16 delineated VOI). This 65-year-old patient was diagnosed with IgG kappa MM (ISS and R-ISS stage II; serum M-protein 54.3 g/L; free light-chain Kappa 79.17 mg/L, abnormal free light-chain ratio Kappa/Lambda: 34.88; bone marrow plasma cell infiltration (BMPC): 73%; hemoglobin 11.0 g/dL). Images show the TMTV (70.91 mL) delineated using the SUV4.0 threshold (B). The figure illustrates the underestimation of tumor burden using the SUV4.0 threshold; FL with SUVmax < 4.0 were not delineated, such as the osteolytic lesion of the left clavicle, pointed out by the red arrows, with [18F]FDG uptake but SUVmax 3.9. Note that this patient presented two delineated skull FL, one of which was large with paramedullary extension. The patient underwent induction therapy with bortezomib, thalidomide and dexamethasone followed by ASCT. Time to progression was 22.8 months and time to death was 4.9 years

Fig. 4 [18F]FDG PET/CT images (A: PET MIP; B: PET MIP with SUV4.0 threshold TMTV; C: low-dose CT, PET and fused PET/CT; SUV scale 0-5) of a 49-year-old patient with newly diagnosed IgA kappa MM (ISS and R-ISS stage III; serum M-protein 7.3 g/L; free light-chain Kappa 4187 mg/L, abnormal free light-chain ratio Kappa/Lambda: 1495; hypercalcemia; bone marrow plasma cell infiltration: 25%; presence of t(4;14); hemoglobin 11.3 g/dL). The PET images show a high diffuse BM [18F]FDG uptake with a DS 5 and multiple FL (F group 4; 198 delineated VOI; TMTV 1369.7 mL) with DS 5. The red arrows in PET/CT images C point to PMD with spinal cord compression confirmed by magnetic resonance imaging. The patient underwent involved-field radiation therapy of the spinal lesion and induction therapy with daratumumab, lenalidomide and dexamethasone followed by ASCT. Time to progression was 6.9 months and time to death was 21.8 months

Table 2
Agreement between the two nuclear medicine physicians in the analysis of [18F]FDG PET/CT images
BM, bone marrow; CI, confidence interval; CT, computed tomography; DS, Deauville scale; EMD, extramedullary disease; FL, focal lesions; ICC, intraclass correlation coefficient; PMD, paramedullary disease; SUV, standardized uptake value; TLG, total lesion glycolysis; TMTV, total metabolic tumor volume
† No focal lesion in 8 patients. †† Absence of skull FL but the skull vault was not included in the field of view

Fig. 3 Kaplan-Meier curves for PFS (A) and OS (B)