For avid glucose tumors, the SUV peak is the most reliable parameter for [18F]FDG-PET/CT quantification, regardless of acquisition time

Background: This study assesses the impact of acquisition time on SUV with [18F]FDG-PET/CT in healthy livers (reference organ with stable uptake over time) and in tumors.

Methods: One hundred six [18F]FDG-PET/CT scans were acquired in list mode over a single-bed position, on healthy livers (n = 48) or on tumors (n = 58). Six independent datasets of different durations were reconstructed (from 1.5 to 10 min). SUVmax (hottest voxel), SUVpeak (maximum average SUV within a 1-cm3 spherical volume), and SUVaverage were measured within a 3-cm-diameter volume of interest (VOI) in the right lobe of the liver. For [18F]FDG avid tumors (SUVmax ≥ 5), the SUVmax, SUVpeak, and SUV41% (isocontour threshold method) were computed.

Results: For tumors, SUVpeak values did not vary with acquisition time. SUVmax displayed significant differences between 1.5- and 5–10-min reconstruction times. SUV41% was the most time-dependent parameter. For the liver, the SUVaverage was the sole parameter that did not vary over time.

Conclusions: For [18F]FDG avid tumors, with short acquisition times, i.e., with new generations of PET systems, the SUVpeak may be more robust than the SUVmax. The SUVaverage over a 3-cm-diameter VOI in the right lobe of the liver appears to be a good method for a robust and reproducible assessment of the hepatic metabolism.


Background
Assumed to be more accurate and less operator-dependent than visual analysis, quantification is increasingly used in positron emission tomography (PET) studies, both in routine practice and in clinical trials. This is particularly relevant for treatment monitoring, since objective quantification of changes in [18F]FDG uptake may improve the prognostic value of [18F]FDG-PET compared with visual analysis [1]. Although prone to many sources of error and variability, the semi-quantitative method (standardized uptake value, SUV) is currently preferred to absolute quantification of the glucose metabolic rate, which requires dynamic imaging and measurement of the arterial input function and is thus considered too complex for routine practice. SUVmax (the SUV of the hottest voxel within a defined volume of interest (VOI)) is the most widely used parameter: it is easy to use and operator-independent. However, SUVmax may be affected by noise and may merely reflect statistical fluctuations when the acquisition time is too short [2]. Among the other SUVs, SUVpeak has been suggested as an alternative to SUVmax [3]. SUVpeak is an average SUV computed within a fixed-size VOI, most often containing (but not necessarily centered on) the hottest pixel. Because this VOI encompasses several pixels, SUVpeak is assumed to be less affected by image noise than SUVmax [4,5] and therefore more reliable and appropriate for monitoring tumor response, while remaining easy to use with little or no operator dependency. The major drawback of SUVpeak is that its associated volume of interest (VOIpeak) is not uniquely defined, leading to a few dozen SUVpeak definitions that differ in the shape, size, and location of VOIpeak [6]. On the one hand, VOIpeak should be large enough to prevent SUVpeak from being affected by noise and partial-volume effects; on the other hand, it should not be so large that it includes voxels outside the tumor. These considerations led PERCIST [3] to recommend a fixed 1-cm3 sphere as a standard definition of SUVpeak.
The use of time of flight (TOF) in the reconstruction algorithms of new generations of hybrid PET/CT machines (positron emission tomography scanner/X-ray computed tomography scanner) improves signal-to-noise ratio, spatial resolution, and lesion detectability, theoretically allowing a reduction in injected activity (and thus radiation exposure) and/or acquisition time [7]. Furthermore, point spread function (PSF) reconstruction, also available in new generations of PET systems, is known to improve sensitivity but also to overestimate SUV [8,9]. The quantitative accuracy of these techniques is not fully known [8], especially their impact on the determination of SUVmax when acquisition time is reduced to its lower limit to optimize the acquisition protocol in clinical practice. The implementation of these new techniques therefore challenges centers to define an acquisition protocol that can be used for both visual and quantitative analysis while respecting the European Association of Nuclear Medicine (EANM) guidelines [10], i.e., either by determining the minimum administered FDG dose in relation to PET acquisition duration and patient weight or by choosing to apply a higher activity to reduce the duration of the study. The aim of this study was to evaluate the impact of acquisition time on SUV in healthy livers (reference organs with stable uptake over time) and in tumors.

Materials
One hundred six whole-body PET/CT scans with 2- Whole-body PET/CT scans were performed 62 ± 4.6 min after [18F]FDG injection, and an additional 10-min PET/CT acquisition in list mode (LM) was acquired over a single-bed position 83 ± 5.8 min after [18F]FDG injection.
After the whole-body scan acquisitions (n = 106), additional list mode acquisitions were performed on the most avid tumor (tumor SUVmax > 5; n = 58) or on normal healthy liver (no history of liver metastasis and no evidence of liver lesion on whole-body [18F]FDG-PET/CT scans; n = 48).

Image analysis
Two experienced nuclear medicine physicians analyzed all image datasets on an Imagys® workstation (Keosys®, Saint-Herblain, France), which allows the computation of different SUVs. SUV was corrected for body weight (SUVbw). For tumor [18F]FDG uptake quantification, a manual VOI encompassing the entire tumor was drawn on R10. VOIs were registered and repositioned identically on the five other replays (R1.5 to R5) using their 3D coordinates. SUVmax (SUV of the hottest voxel), SUVpeak (maximum average SUV within a 1-cm3 sphere), SUV41% (threshold-based tumor delineation applying a threshold of 41 % of the SUVmax), and metabolic tumor volume (applying a threshold of 41 % of the SUVmax on R10) were then automatically generated. A threshold of 41 % was chosen following the EANM guidelines [10] and because the most common thresholding value chosen in the clinical setting is 40-43 % of the SUVmax.
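The tumor parameters defined above can be illustrated with a minimal Python sketch. It assumes the VOI has already been extracted as a NumPy array of voxel values, and it takes SUV41% to be the mean SUV inside the 41 % isocontour, which is an assumption about the workstation output. The function names, the `voxel_volume_ml` parameter, and the sample values are hypothetical and are not taken from the Imagys software. SUVpeak requires a moving sphere and is left out of this sketch.

```python
import numpy as np

def suv_bw(conc_kbq_per_ml, injected_mbq, weight_kg):
    """Body-weight SUV: tissue concentration / (injected activity / body weight).

    Assumes the injected activity is already decay-corrected to scan time
    and that 1 g of tissue occupies 1 mL.
    """
    dose_kbq = injected_mbq * 1000.0
    weight_g = weight_kg * 1000.0
    return conc_kbq_per_ml / (dose_kbq / weight_g)

def tumor_metrics(suv_voi, voxel_volume_ml, fraction=0.41):
    """Return (SUVmax, SUV41%, metabolic tumor volume in mL) for one VOI."""
    suv_voi = np.asarray(suv_voi, dtype=float)
    suv_max = float(suv_voi.max())             # hottest voxel
    inside = suv_voi >= fraction * suv_max     # isocontour at 41 % of SUVmax
    suv_41 = float(suv_voi[inside].mean())     # assumed: mean SUV inside the isocontour
    mtv_ml = float(inside.sum() * voxel_volume_ml)
    return suv_max, suv_41, mtv_ml

# Hypothetical voxel values, for illustration only (not study data)
voi = np.array([10.0, 6.0, 5.0, 3.0, 1.0])
print(tumor_metrics(voi, voxel_volume_ml=0.064))
```

Because the isocontour is tied to SUVmax, any noise-driven change in the hottest voxel moves the threshold and therefore the delineated volume as well.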
For the liver, a 14-cm3 VOI was positioned on the right lobe of the liver on R10, as proposed in PERCIST [3]. As for tumors, VOIs were repositioned identically on the five other replays using their 3D coordinates. SUVmax, SUVpeak, and SUVaverage (average SUV within the fixed 14-cm3 VOI) were automatically generated.
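As a quick check, a 3-cm-diameter sphere has a volume of about 14.1 cm3, which matches the 14-cm3 liver VOI. The sketch below verifies this and shows one possible way to average SUV inside a spherical VOI on a voxel grid. The function names, the grid size, and the voxel spacing are hypothetical choices for illustration.

```python
import numpy as np

def sphere_volume_cm3(diameter_cm):
    """Volume of a sphere from its diameter."""
    return (4.0 / 3.0) * np.pi * (diameter_cm / 2.0) ** 3

def suv_average_in_sphere(suv, center_idx, diameter_mm, voxel_mm):
    """Mean SUV over voxels whose centers lie inside a spherical VOI.

    suv        : 3D array of SUV values
    center_idx : (z, y, x) index of the sphere center
    voxel_mm   : (z, y, x) voxel spacing in mm
    Returns the mean SUV and the number of voxels used.
    """
    axes = [(np.arange(n) - c) * s for n, c, s in zip(suv.shape, center_idx, voxel_mm)]
    zz, yy, xx = np.meshgrid(*axes, indexing="ij")
    mask = zz ** 2 + yy ** 2 + xx ** 2 <= (diameter_mm / 2.0) ** 2
    return float(suv[mask].mean()), int(mask.sum())

print(round(sphere_volume_cm3(3.0), 1))  # 14.1

# Hypothetical uniform "liver" with SUV 2.0 on a 2-mm grid
liver = np.full((20, 20, 20), 2.0)
print(suv_average_in_sphere(liver, (10, 10, 10), 30.0, (2.0, 2.0, 2.0)))
```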

Statistical analysis
For tumors, SUVmax, SUVpeak, and SUV41% were analyzed. For livers, SUVmax, SUVpeak, and SUVaverage were analyzed.
Paired t tests were performed to study the inter-observer reproducibility of tumor and liver measurements.
Repeated-measures ANOVA (Tukey post-tests) was performed to test the variations of measurements over time, i.e., between the six replays (R1.5 to R10).
Individual SUV fluctuations Δt over time (i.e., at 1.5, 2, 2.5, 3, and 5 min) were evaluated using the SUV at 10 min as the reference and were calculated as the relative difference, expressed as a percentage, between the SUV at each replay and the reference SUV. For each replay, and for both liver and tumors, the maximum individual fluctuations were recorded.
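The exact formula for Δt is not reproduced in this text. A relative difference with respect to the 10-min value is consistent with the extreme values reported in the Results, as the sketch below shows, assuming the lower value of each reported pair is the R10 reference. The function names are hypothetical, and the use of the absolute value when counting patients above the 10 % limit is an assumption.

```python
def fluctuation_percent(suv_t, suv_ref):
    """Relative difference (%) between a replay SUV and the reference SUV."""
    return 100.0 * (suv_t - suv_ref) / suv_ref

def count_above(suv_replay, suv_ref, limit=10.0):
    """Number of patients whose absolute fluctuation exceeds `limit` percent."""
    return sum(abs(fluctuation_percent(t, r)) > limit
               for t, r in zip(suv_replay, suv_ref))

# Extreme pairs quoted in the Results: (short-replay value, assumed R10 value)
reported = {"SUVpeak": (9.96, 7.22), "SUVmax": (17.1, 10.82), "SUV41%": (9.7, 6.22)}
for name, (short, ref) in reported.items():
    print(name, round(fluctuation_percent(short, ref)))  # 38, 58, 56
```

The three rounded values reproduce the 38 %, 58 %, and 56 % maxima quoted for SUVpeak, SUVmax, and SUV41%.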
Statistical tests were performed using Prism 4 software (GraphPad software, CA, USA). The level of significance was set at 5 %.

Results
No significant differences were observed between the two readers for the evaluated image datasets (tumor and liver).

Tumors
SUVpeak was the only parameter stable over time, with no statistically significant difference between the six replays.
SUVmax and SUV41% decreased significantly with time (p ≤ 0.0005; Table 2). For SUVmax, statistical differences were observed between the shortest acquisition time R1.5 versus the longest R5 and R10 (p = 0.0005).
SUV41% was the most time-dependent parameter, with statistically significant differences between R1.5 and R2 versus R5 and R10 and between R2.5 and R10 (p < 0.0001; Table 2).
SUVpeak was the least variable parameter, with individual fluctuations up to 38 % (from 7.22 to 9.96) versus 58 % for SUVmax (from 10.82 to 17.1) and 56 % for SUV41% (from 6.22 to 9.7) (Table 3). For all tumor SUVs, maximal fluctuations were observed between the shortest replay (R1.5 or R2) and R10. Considering a maximum fluctuation of 10 % as an acceptable level of variation for tumor SUV [11], the number of patients with tumor SUV fluctuations >10 % (compared to R10) was noted. R1.5 and R2 were the replays in which the highest number of patients with SUV fluctuations >10 % was observed, whatever the type of SUV.

Livers
SUVaverage was stable over time with no statistically significant difference, whereas a significant tendency to decrease with time was observed for SUVmax and SUVpeak (p ≤ 0.0005; Table 2).
For SUVpeak, statistical differences were observed between R1.5 and the other replays and between R2, R2.5, and R3 versus R5 and R10 (p < 0.0001).
SUVaverage was the least variable parameter, with individual variations up to 16 % (between R1.5 and R10) versus 19 % for SUVpeak and 41 % for SUVmax (between R1.5 and R10).

Discussion
The SUV definitions used in the present study are usually described as particularly suitable for response-monitoring purposes. Indeed, these semi-automatic and operator-independent methods allow a simple and reproducible evaluation of SUV, which is a basic requirement if quantitative measurement is to add value over visual assessment. There is no actual consensus about the best SUV parameter for assessing response to therapy, but most response assessment studies use SUVmax, primarily because it is easy to implement and operator-independent. Moreover, for small lesions or during an effective treatment with a decrease in tumor size, when the partial-volume effect may result in an underestimation of [18F]FDG uptake, SUVmax may be better suited as a metabolic index than the other SUVs [12]. From a clinical point of view, identifying a suitable SUV for response quantification requires clinical trials with patients' clinical outcomes, and SUVmax has demonstrated positive predictive values and accuracies for outcome prediction in lymphoma [1] as well as in solid tumors [13,14]. Although the increase in counting statistics with new PET/CT systems helps to reduce image noise, SUVmax remains adversely affected by noise, leading to uncertainty in the uptake quantification and thus in the treatment response categorization. These inaccuracies may be more pronounced in the case of tumor heterogeneity, which may moreover change during the course of treatment [15]. Consequently, SUVpeak has recently been proposed as a more robust alternative to SUVmax [3]. SUVpeak is computed over a larger volume than the single pixel value of SUVmax and thus should be less affected by image noise [4,5,16].
Acquisition times recommended by the manufacturer are most often used in clinical routine, even though the actual influence of acquisition time on SUV is not fully known. As different quantifications of the same metabolic process (for a given tumor), SUVmax and SUVpeak are correlated, and the uncertainties of these two parameters are probably comparable, although they arise from different causes [4]. Nevertheless, these considerations do not allow an assessment of the influence of acquisition time on SUV, and substantial differences may exist between SUVmax and SUVpeak in individual tumors that affect the treatment response quantification and the response categorization.
Because the current worldwide trend is to reduce patient exposure to ionizing radiation, it is not conceivable to increase the injected activity to compensate for the low statistical quality of PET images caused by an acquisition time that is too short. Conversely, increasing the acquisition time may lead to patient discomfort and to motion artifacts. In our study, the highest fluctuations of SUV were observed when acquisition times were the shortest, whereas no significant difference was observed for replays of 3 min or longer. Brown et al. [17] investigated the effect of varying acquisition times on phantom and patient PET images on a 3D GE Discovery-STE PET/CT system. Patient data were investigated using list mode acquisition to obtain comparable 2-, 3-, and 4-min frames. As we reported for tumors, no significant difference was observed over 3 min at standard clinical [18F]FDG activities. In two other studies, the image quality was slightly adversely affected by an acquisition time of 1.5 min compared to 3 min [18], and the volume and SUV variability were significantly larger for images with scan times below 3 min [19]. For tumors, we did not observe any significant difference for SUVpeak over time, and regarding individual variations, SUVpeak was also the least variable parameter. Using a reference time of 15 min, Lodge et al. [20] reported similar results, with a significantly lower bias for SUVpeak compared to SUVmax for the 1-, 2-, 3-, and 4-min images.

Δmax Rt-R10 (%) is the maximal individual fluctuation between the SUV value at Rt and the SUV value at R10. Δmax mean value is the mean Δmax from "R1.5 vs R10" to "R5 vs R10".
In our study, large SUVmax variations of up to 58 % (R1.5 versus R10) were observed for the same tumor in the same patient, which is obviously unacceptable for response-monitoring purposes, particularly when combined with other sources of bias [11,21] and if a threshold value is applied to determine treatment response, as with the PERCIST criteria [3]. As previously reported [2,4,22], we noted that fluctuations of SUVmax also affected the threshold-method SUV, since the VOI was delineated by thresholding at 41 % of the maximum pixel value.
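The sensitivity of SUVmax, and of any threshold derived from it, to image noise can be illustrated with a small synthetic simulation. This is only a sketch under stated assumptions: noise is taken as Gaussian with a standard deviation proportional to 1/sqrt(acquisition time), the "tumor" is a one-dimensional list of voxels with a hot core and a falling edge, and all names and values are hypothetical. It is not a model of the scanner and uses no study data.

```python
import numpy as np

def simulate_tumor(duration_min, core_suv=8.0, cov_at_1min=0.15, seed=0):
    """Synthetic voxels: hot core plus a falling edge, with additive Gaussian noise.

    Assumption: noise SD scales as 1/sqrt(duration). The same seed gives the
    same noise pattern at every duration, only scaled in amplitude.
    """
    rng = np.random.default_rng(seed)
    truth = np.concatenate([np.full(500, core_suv),
                            np.linspace(core_suv, 1.0, 1500)])
    sigma = core_suv * cov_at_1min / np.sqrt(duration_min)
    return truth + sigma * rng.standard_normal(truth.size)

def summarize(voxels, fraction=0.41):
    """SUVmax, the 41 % threshold it implies, and the mean of the fixed core."""
    suv_max = float(voxels.max())
    threshold = fraction * suv_max
    return {
        "suv_max": suv_max,
        "threshold": threshold,
        "n_inside": int((voxels >= threshold).sum()),
        "core_mean": float(voxels[:500].mean()),  # fixed region, like an average over a fixed VOI
    }

for minutes in (1.5, 3.0, 10.0):
    print(minutes, summarize(simulate_tumor(minutes)))
```

In this toy setting the maximum, and with it the 41 % threshold, rises as the acquisition shortens, while an average over a fixed group of voxels stays close to its true value.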
For all these reasons, SUVpeak may be a robust alternative for the assessment of [18F]FDG avid tumor uptake at standard clinical activities. We implemented an SUVpeak algorithm using a fixed-size 1-cm3 VOI, automatically positioned so as to maximize the enclosed average (maximum average), typically including but not necessarily centered on the maximum pixel value. For small tumors, automatic placement of the VOI may be impossible; in these cases, the VOI can be placed manually.
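The "maximum average" definition described above can be sketched in a few lines of Python. This is not the implementation used in the study: it is a brute-force illustration that assumes an isotropic or anisotropic voxel grid given in millimetres, approximates the 1-cm3 sphere by the voxels whose centers fall inside it, and only considers sphere positions that fit entirely within the array. The function names and the test volume are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def spherical_mask(radius_mm, voxel_mm):
    """Boolean kernel of voxels whose centers lie within radius_mm of the kernel center."""
    half = [int(np.floor(radius_mm / s)) for s in voxel_mm]
    axes = [np.arange(-h, h + 1) * s for h, s in zip(half, voxel_mm)]
    zz, yy, xx = np.meshgrid(*axes, indexing="ij")
    return zz ** 2 + yy ** 2 + xx ** 2 <= radius_mm ** 2

def suv_peak(suv, voxel_mm, volume_cm3=1.0):
    """Maximum, over all sphere positions, of the mean SUV inside the sphere.

    Returns (SUVpeak, index of the sphere center).
    """
    # Radius of a sphere of the requested volume (1 cm3 = 1000 mm3)
    radius_mm = (3.0 * volume_cm3 * 1000.0 / (4.0 * np.pi)) ** (1.0 / 3.0)
    mask = spherical_mask(radius_mm, voxel_mm)
    # One window per candidate position; the boolean mask selects the sphere voxels
    windows = sliding_window_view(suv, mask.shape)
    means = windows[..., mask].mean(axis=-1)
    idx = np.unravel_index(np.argmax(means), means.shape)
    center = tuple(int(i) + k // 2 for i, k in zip(idx, mask.shape))
    return float(means[idx]), center

# Hypothetical volume: background 1, uptake 8 in a cube, one hot voxel at 12
vol = np.ones((20, 20, 20))
vol[6:15, 6:15, 6:15] = 8.0
vol[10, 10, 10] = 12.0
peak, center = suv_peak(vol, voxel_mm=(2.0, 2.0, 2.0))
print(peak, center, vol.max())
```

In this example the single hot voxel sets SUVmax to 12, whereas SUVpeak stays close to the surrounding uptake of 8 because the hot voxel is averaged with its neighbours. Several sphere positions give the same average, so the selected center need not coincide with the hottest voxel.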
Finally, an important additional result is the absence of a significant difference in liver SUVaverage over time. These results confirm that the liver can be used as the reference organ for quality comparison of repeated [18F]FDG-PET studies in the same patient, or as a reference threshold for evaluating tumor response. The PERCIST criteria specify the size of the VOI and its placement in the right lobe, but not the level at which the measurement should be made. However, a recent study found excellent inter-observer agreement and no significant difference whether the VOI was placed in the upper part, at the portal level, or at the bottom of the right liver lobe [23]. Our results are also in agreement with those of Grohien et al. [24], who showed a larger dispersion of values and a higher variance for hepatic SUVmax compared to hepatic SUVaverage.

Conclusions
Tumor SUVpeak (volume 1 cm3) was the most stable quantitative parameter for acquisition times over 1.5 min at standard clinical [18F]FDG activities.
Although SUVpeak is not yet widely available, its commercial development may increase the reproducibility and accuracy of quantitative PET studies, a quality of measurement that is essential for response-monitoring purposes. Average SUV over a 3-cm-diameter VOI in the right liver lobe was a good method for a robust assessment of hepatic metabolism, confirming the choice of the liver as the reference organ in [18F]FDG studies.