Comparative evaluation of SUV, tumor-to-blood standard uptake ratio (SUR), and dual time point measurements for assessment of the metabolic uptake rate in FDG PET.

BACKGROUND
We have demonstrated recently that the tumor-to-blood standard uptake ratio (SUR) is superior to tumor standardized uptake value (SUV) as a surrogate of the metabolic uptake rate $K_m$ of fluorodeoxyglucose (FDG), overcoming several of the known shortcomings of the SUV approach: an excellent linear correlation of SUR and $K_m$ from Patlak analysis was found using dynamic imaging of liver metastases. However, due to the perfectly standardized and comparatively short uptake period used for SUR determination, these results are not automatically valid for clinical whole-body examinations, in which the uptake periods ($T$) are distinctly longer and can vary considerably. Therefore, the aim of this work was to investigate the correlation between SUR derived from clinical static whole-body scans and a $K_m$ surrogate derived from dual time point (DTP) measurements.


METHODS
DTP $^{18}$F-FDG PET/CT was performed in 90 consecutive patients with histologically proven non-small cell lung cancer (NSCLC). In the PET images, the primary tumor was delineated with an adaptive threshold method. For determination of the blood SUV, an aorta region of interest (ROI) was delineated manually in the attenuation CT and transferred to the PET image. Blood SUV was computed as the mean value of the aorta ROI. SUR values were computed as the ratio of tumor SUV and blood SUV. SUR values from the early time point of each DTP measurement were scan time corrected to 75 min postinjection ($\mathrm{SUR}_{\mathrm{tc}}$). As a surrogate of $K_m$, we used the SUR($T$) slope, $K_{\mathrm{slope}}$, derived from DTP measurements, since it is proportional to $K_m$ under the given circumstances. The correlation of SUV and $\mathrm{SUR}_{\mathrm{tc}}$ with $K_{\mathrm{slope}}$ was investigated. The prognostic value of SUV, $\mathrm{SUR}_{\mathrm{tc}}$, and $K_{\mathrm{slope}}$ for overall survival (OS) and progression-free survival (PFS) was investigated with univariate Cox regression in a homogeneous subgroup (N = 31) treated with primary chemoradiation.


RESULTS
Correlation analysis revealed a clear linear correlation with $K_{\mathrm{slope}}$ for both SUV and $\mathrm{SUR}_{\mathrm{tc}}$ (P < 0.001). The correlation of $\mathrm{SUR}_{\mathrm{tc}}$ with $K_{\mathrm{slope}}$ was considerably stronger than that of SUV with $K_{\mathrm{slope}}$ ($R^2$ = 0.92 and $R^2$ = 0.69, respectively; P < 0.001). Univariate Cox regression revealed $\mathrm{SUR}_{\mathrm{tc}}$ and $K_{\mathrm{slope}}$ as significant prognostic factors for PFS (hazard ratio (HR) = 3.4, P = 0.017 and HR = 4.3, P = 0.020, respectively). For SUV, no significant effect was found. None of the investigated parameters was prognostic for OS.


CONCLUSIONS
Scan-time-corrected SUR is a significantly better surrogate of tumor FDG metabolism in clinical whole-body PET than SUV. The very high linear correlation of SUR and DTP-derived $K_{\mathrm{slope}}$ (which is proportional to actual $K_m$) implies that for histologically proven malignant lesions, FDG-DTP does not provide added value in comparison to the SUR approach in NSCLC.



Background
In a recent publication [1], we have demonstrated that the tumor-to-blood standard uptake ratio (SUR) is superior to tumor standardized uptake value (SUV) as a surrogate of the metabolic uptake rate $K_m$ of [$^{18}$F]fluorodeoxyglucose (FDG), overcoming several of the known shortcomings [2][3][4][5][6] of the SUV approach. In that work, we performed dynamic PET scans of liver metastases and computed lesion $K_m$ using the Patlak method [7,8], which was compared to lesion SUV and SUR in the late time frames. For these perfectly standardized uptake periods prior to SUV and SUR determination, we found that SUR correlated much better than SUV with the true metabolic rate of FDG.
However, in clinical oncological PET, variability of the uptake period is unavoidable [9,10], which directly translates into a corresponding variability of the measured tracer uptake [11]. To avoid this limitation, a method to reliably correct SUR for variations of the FDG uptake period by converting SUR to a preselected fixed scan time point was proposed recently by our group [12]. This scan-time-normalized SUR removes several of the shortcomings of SUV. We found strong evidence in a survival analysis of 130 patients with esophageal carcinoma that, consequently, SUR has a higher prognostic value than SUV [13]. However, these results were achieved in clinical whole-body scans with varying uptake periods. Thus, our previous results [1] regarding the correlation of SUR and $K_m$ (with perfectly standardized and relatively short uptake period) are not necessarily valid for static oncological PET with variable uptake times. It remains an open question whether the improved prognostic value can actually be related to a linear correlation between SUR and $K_m$ that is superior to that between SUV and $K_m$.
Therefore, the aim of the present work was to investigate the correlation of SUV and SUR, respectively, with $K_m$ in clinical whole-body scans. We evaluated 90 dual time point (DTP) measurements of patients with non-small cell lung cancer (NSCLC), which were used to derive the SUR slope $K_{\mathrm{slope}}$ as a surrogate parameter of $K_m$ with a procedure similar to that in [14]. A secondary aim of this study was to test the prognostic value of SUV and SUR in comparison to $K_{\mathrm{slope}}$ in a homogeneous subgroup (N = 31) of patients with NSCLC who underwent primary chemoradiation.

Assessment of metabolic uptake rate in dual time point measurement
In [14], we have shown that the metabolic uptake rate $K_m$ of a tumor lesion can be estimated from a DTP measurement using two time points $T_1$ and $T_2$. There, we interpolated the arterial input function (AIF) $c_a(t)$ in the time window $[T_1, T_2]$ using a single exponential. While mono-exponential interpolation is sufficient in the context of [14] and does not require knowledge or assumptions concerning the AIF outside the considered DTP time window, we have shown [12] that the shape of the AIF can be described globally (i.e., for all times after completion of the bolus passage) quite accurately by an inverse power law (i.e., a hyperbola)
$$c_a(t) = A \cdot t^{-b} \tag{1}$$
where, moreover, $b$ seems only modestly variable (≈ ±10 %) across different investigations. (Precisely speaking, $t$ in Eq. 1 has to be divided by the chosen time unit (e.g., 1 min) to generate a dimensionless number that can be exponentiated; $A$ then represents the AIF value at $t$ = 1 time unit.) Utilizing this information offers an alternative way of estimating $K_m$ (or a proportional surrogate) from a DTP measurement. An immediate consequence of the hyperbolic AIF shape is that the so-called Patlak time
$$\Theta(T) = \frac{\int_0^T c_a(s)\,ds}{c_a(T)} \tag{2}$$
entering the Patlak equation [7,8]
$$\frac{c_t(T)}{c_a(T)} = K_m \cdot \Theta(T) + V_r \tag{3}$$
($c_t$, tissue concentration; $V_r$, apparent volume of distribution) is given by
$$\Theta(T) = \frac{T}{1-b} \tag{4}$$
Recalling that SUR is by definition equal to the left-hand side of Eq. 3, the Patlak equation can be rewritten as
$$\mathrm{SUR}(T) = \frac{K_m}{1-b} \cdot T + V_r \tag{5}$$
In other words, if the AIF obeys Eq. 1 with some fixed value for $b$, SUR varies linearly with time for a given $K_m$ and $V_r$. Furthermore, $V_r$ is numerically small and might be replaced by some group-averaged constant value $\bar{V}_r$ without introducing relevant errors (see [1]). SUR then also varies linearly with $K_m$ (i.e., across different lesions/investigations/patients) at a given $T$. Equation 5 thus has two immediate consequences.
The first one, as discussed in [12], is that scan time correction of SURs from the actual measurement time point $T$ to a reference time $T_0$ is possible according to
$$\mathrm{SUR}(T_0) = \frac{T_0}{T}\left(\mathrm{SUR}(T) - \bar{V}_r\right) + \bar{V}_r \tag{6}$$
which allows scan-time-corrected SURs to be used as a surrogate of $K_m$. The second consequence is that $K_m$ is directly related to the SUR slope,
$$K_{\mathrm{slope}} = \frac{d\,\mathrm{SUR}}{dT} = \frac{K_m}{1-b} \tag{7}$$
which can be derived from a DTP measurement as
$$K_{\mathrm{slope}} = \frac{\mathrm{SUR}(T_2) - \mathrm{SUR}(T_1)}{T_2 - T_1} \tag{8}$$
If it is permissible to neglect inter- and intra-individual variability of the shape parameter $b$ (i.e., assuming a generic AIF shape), we come to the conclusion that the DTP-derived SUR slope is directly proportional to $K_m$ in an investigation-independent way. Of course, the above considerations rest on several assumptions whose validity cannot be taken for granted, especially if results from previous investigations are applied in a different context (e.g., different patient groups, much later scan times). In particular, notable deviations of the AIF from the assumed/extrapolated hyperbola at late times or violation of the assumed irreversible kinetics would affect the relation between instantaneous SUR at some fixed time point and the DTP-derived SUR slope.
In the present investigation, we therefore want to clarify the empirical relation between scan-time-corrected SUR (which is derivable from a standard whole-body investigation) and the SUR slope as derivable from DTP measurements, and the extent to which this correlation agrees with the theory presented above, according to which both quantities can serve as essentially equivalent surrogates of $K_m$.
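The step from the inverse power law to a Patlak time that is proportional to $T$ can be made explicit. The following short derivation is only a sketch: it writes the Patlak time (the time integral of the AIF divided by its current value) as $\Theta(T)$, which is a notational assumption, requires $0 < b < 1$ so that the integral converges at its lower limit, and idealizes the AIF by applying the power law over the whole interval $[0, T]$ rather than only after the bolus passage.

```latex
\begin{align*}
\int_0^T A\,s^{-b}\,ds &= \frac{A\,T^{1-b}}{1-b}, \\
\Theta(T) = \frac{\int_0^T c_a(s)\,ds}{c_a(T)}
  &= \frac{A\,T^{1-b}}{(1-b)\,A\,T^{-b}} = \frac{T}{1-b}, \\
\frac{c_t(T)}{c_a(T)} = K_m\,\Theta(T) + V_r
  &= \frac{K_m}{1-b}\,T + V_r .
\end{align*}
```

The amplitude $A$ cancels, so under these assumptions only the shape parameter $b$ enters the proportionality between the SUR slope and $K_m$.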

Patient group
In the present study, 105 consecutive patients with histologically proven NSCLC were included retrospectively.
Evaluation of the data was approved by the Institutional Ethics Committee, and all subjects provided written informed consent. In all patients, a routine dual time point FDG PET/CT was performed between March 2011 and June 2014 prior to treatment (radio(chemo)therapy (RCT) and/or surgery). Twelve patients were excluded because the time difference between the two scans was too short ($\Delta T$ < 20 min), which compromises reliable determination of the DTP-derived SUR slope. Three patients were excluded because of misalignment of PET and attenuation CT affecting reliable SUR quantitation (see below). Altogether, 90 subjects were included (70 men, 20 women) with a mean age (range) of 67 (45-85) years. Validation of scan time correction of SUR and correlation analysis of $K_{\mathrm{slope}}$ vs. SUV and SUR, respectively, was performed in this group.
Survival analysis (see below) was performed in a homogeneous subgroup. Inclusion criteria were as follows: inoperable primary tumor, curative treatment intent, and no distant metastases. Altogether, 31 subjects were included in the survival analysis (27 men, 4 women) with a mean age (range) of 67 (49-85) years. Characteristics of the tumors are summarized in Table 1.

Image analysis
Coregistration, region of interest (ROI) definition, and ROI analyses were performed using the ROVER software, version 3.0.5 (ABX, Radeberg, Germany). Here and in the following, ROI is used synonymously with "VOI" for denoting a three-dimensional volume of interest. For all PET data, the alignment with the attenuation CT (with focus on the tumor region) was inspected. Data were excluded when substantial parts of the FDG uptake were outside the morphological volume as measured in the CT data.
PET data of the late scan were coregistered to the PET data of the early scan using rigid body transformations. Coregistration was restricted to the tumor region plus a margin of 3-5 cm. Coregistration was visually inspected using the difference image of the late and early scans, which allows detection of misalignments on the order of half a voxel. Alignment was corrected manually when necessary. This was the case in 5 out of 90 cases.
The metabolically active part of the primary tumor was delineated in the early scan by an automatic algorithm based on adaptive thresholding taking the local background into account [15,16]. The result of the automatic delineation was inspected visually by an experienced observer and corrected manually in case of obvious segmentation failure. The resulting ROIs were transferred to the respective late scan (Fig. 1). In both scans, $\mathrm{SUV}_{\mathrm{mean}}$ was computed. In the following, the index "mean" is omitted, since only the mean value of lesion SUV/SUR was considered in the evaluation. Lesion mean rather than maximum or peak (maximum + immediate vicinity) values were used since, in the special case of DTP measurements, mean values can be determined with higher accuracy by performing precise coregistration and using identical delineation in both measurements. The usual problems typical of lesion mean values (systematic errors including partial volume effects due to variable delineation) thus affect both time points of the DTP measurement identically, which minimizes their adverse effects. Maximum and peak values, on the other hand, have distinctly higher statistical errors. Therefore, the minimum required time difference is distinctly larger when using maximum values, which would not have been acceptable for the present retrospective evaluation of available data.
The arterial blood SUV was determined by defining a roughly cylindrical aorta ROI in the attenuation CT data which was then transferred to the PET data. To reduce partial volume effects, a concentric safety margin was used in the transaxial planes, centering the ROI in the aorta. Planes showing high tracer uptake close to the aorta (pathological or otherwise) were excluded. The minimum volume of the aorta ROI was 5 ml. Blood SUV was computed as mean SUV of the aorta ROI. The DTP pairs of blood SUVs were separately analyzed regarding consistency with the assumption of an invariant AIF shape (see Appendix).
SUR was computed as the ratio of lesion SUV and blood SUV. $K_{\mathrm{slope}}$ was computed according to Eq. (8). SUR values from the early time point of each DTP measurement ($\mathrm{SUR}_1$) were scan time corrected to $T_0$ = 75 min ($\mathrm{SUR}_{\mathrm{tc}}$) using Eq. (6). The early time points were chosen since they correspond closely to the uptake times typical for static whole-body investigations.
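The three quantities described above can be sketched in a few lines of Python. This is a minimal illustration rather than the evaluation code used in the study: the function names and the numerical input values are hypothetical, times are taken in minutes, and the scan time correction uses the form SUR(T0) = T0/T × (SUR(T) − Vr) + Vr with a group-averaged Vr that defaults to 0.

```python
def sur(tumor_suv, blood_suv):
    """Tumor-to-blood standard uptake ratio."""
    return tumor_suv / blood_suv


def k_slope(sur1, t1, sur2, t2):
    """SUR slope from a dual time point pair (difference quotient)."""
    return (sur2 - sur1) / (t2 - t1)


def scan_time_correct(sur_t, t, t0=75.0, vr=0.0):
    """Convert SUR measured at time t to the reference time t0.

    vr is an assumed group-averaged intercept; vr = 0 gives t0 / t * sur_t.
    """
    return t0 / t * (sur_t - vr) + vr


# Hypothetical example values (not patient data).
sur1 = sur(tumor_suv=8.0, blood_suv=2.0)   # early scan at 60 min
sur2 = sur(tumor_suv=9.0, blood_suv=1.5)   # late scan at 90 min
slope = k_slope(sur1, 60.0, sur2, 90.0)    # SUR increase per minute
sur_tc = scan_time_correct(sur1, 60.0)     # early SUR converted to 75 min
```

In this constructed example both scans lie on a straight line through the origin, so correcting either the early or the late value to 75 min gives the same result.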
The performance of scan time correction was assessed by mean ± standard deviation (SD) of the fractional difference of SUR between late ($\mathrm{SUR}_2$) and early ($\mathrm{SUR}_1$) scan, $\delta\mathrm{SUR} = (\mathrm{SUR}_2 - \mathrm{SUR}_1)/\mathrm{SUR}_1$, before and after applying scan time correction to $\mathrm{SUR}_2$ from $T_2$ to $T_1$. Linear correlation analysis of SUV and, respectively, $\mathrm{SUR}_{\mathrm{tc}}$ vs. $K_{\mathrm{slope}}$ was performed and visualized through scatterplots. Correlations were compared using a two-tailed z-test of the corresponding (Fisher transformed) correlation coefficients.
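A minimal sketch of the two computations named above follows. It assumes the common form of the Fisher z-test that treats the two correlation coefficients as coming from independent samples; the text does not state which variant was used, so this choice, the function names, and the use of the positive square roots of the reported R² values as inputs are all assumptions made for illustration.

```python
import math


def fractional_difference(sur1, sur2):
    """Fractional SUR difference between late and early scan."""
    return (sur2 - sur1) / sur1


def fisher_z_compare(r_a, r_b, n_a, n_b):
    """Two-tailed z-test on Fisher-transformed correlation coefficients.

    Assumes independent samples of size n_a and n_b.
    Returns the z statistic and the two-tailed P value.
    """
    z_a, z_b = math.atanh(r_a), math.atanh(r_b)
    se = math.sqrt(1.0 / (n_a - 3) + 1.0 / (n_b - 3))
    z = (z_a - z_b) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p


# R^2 = 0.92 and R^2 = 0.69 with N = 90, taking positive correlations.
z, p = fisher_z_compare(math.sqrt(0.92), math.sqrt(0.69), 90, 90)
```

Because both correlations here share the same patients and the same $K_{\mathrm{slope}}$ values, a test for dependent correlations would be the stricter choice; the independent-sample form is shown only because it is the simplest instance of the technique.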

Survival analysis
Survival analysis was performed in the patient subgroup described above (N = 31). In this group, the association of overall survival (OS) and progression-free survival (PFS) with SUV, $\mathrm{SUR}_{\mathrm{tc}}$, and $K_{\mathrm{slope}}$ was analyzed using univariate Cox proportional hazard regression in which the PET parameters were included as binarized parameters. The cutoffs used for binarization were calculated by performing a univariate Cox regression for each measured value. The value leading to the hazard ratio (HR) with the highest significance was used as cutoff. The probability of survival was computed and rendered as Kaplan-Meier curves, and samples were compared using a log-rank test.
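To make the binarization and the Kaplan-Meier step concrete, the following pure-Python sketch shows a product-limit estimator and a simple threshold split. It is illustrative only: the function names and the toy data are hypothetical, the "greater than cutoff" convention is an assumption, and neither the Cox regression nor the cutoff search or log-rank test is reproduced here.

```python
def binarize(values, cutoff):
    """Split a PET parameter at a cutoff (True = above cutoff)."""
    return [v > cutoff for v in values]


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time per subject
    events: 1 if the event was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all subjects sharing the same time (ties).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


# Toy data: four subjects, the second one censored at t = 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In practice, one curve would be computed for each of the two groups returned by `binarize`, and the groups would then be compared with a log-rank test.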
Statistical significance was assumed at a P value of less than 0.05. Statistical analysis was performed with the R language and environment for statistical computing [17] version 3.1.2.

Results
Measured SURs are depicted in Fig. 2. The two respective DTP measurements (black circles) are connected with solid lines. The dashed lines represent linear extrapolations to $T = 0$ yielding individual estimates of $V_r$ whose average is $\bar{V}_r$ = (−0.35 ± 0.83) ml/ml. Consequently, the minor influence of a finite $V_r$ can be neglected and $V_r = 0$ be used during scan time correction of SUR (Eq. 6), which thus reduces to $\mathrm{SUR}(T_0) = T_0/T \times \mathrm{SUR}(T)$.
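The extrapolation to $T = 0$ amounts to taking the intercept of the straight line through the two DTP points. A minimal sketch, with a hypothetical function name and made-up input values, could look like this:

```python
import statistics


def vr_intercept(sur1, t1, sur2, t2):
    """Intercept at T = 0 of the line through (t1, sur1) and (t2, sur2)."""
    slope = (sur2 - sur1) / (t2 - t1)
    return sur1 - slope * t1


# Hypothetical DTP pairs: (SUR1, T1 in min, SUR2, T2 in min).
pairs = [
    (4.0, 60.0, 6.0, 90.0),   # line through the origin, intercept 0
    (4.0, 60.0, 5.5, 90.0),   # shallower slope, positive intercept
    (3.0, 60.0, 5.0, 90.0),   # steeper slope, negative intercept
]
intercepts = [vr_intercept(*p) for p in pairs]
mean_vr = statistics.mean(intercepts)
sd_vr = statistics.stdev(intercepts)
```

The group mean and standard deviation of such intercepts correspond to the kind of summary value quoted above, here computed on toy numbers only.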
δSUR of the uncorrected values (i.e., the fractional difference between actually measured $\mathrm{SUR}_2$ and $\mathrm{SUR}_1$ of the DTP pairs) showed the expected strong dependency on T with a large average value of (73 ± 34) % (Fig. 3). After scan time correction of $\mathrm{SUR}_2$ from $T_2$ to $T_1$, the difference is essentially removed, resulting in δSUR = (1.5 ± 7.6) %. Correlation analysis revealed a clear linear correlation with $K_{\mathrm{slope}}$ for both SUV and $\mathrm{SUR}_{\mathrm{tc}}$ (P < 0.001). The correlation of $\mathrm{SUR}_{\mathrm{tc}}$ with $K_{\mathrm{slope}}$ was considerably stronger than that of SUV with $K_{\mathrm{slope}}$ ($R^2$ = 0.92 and $R^2$ = 0.69, respectively; P < 0.001). While $\mathrm{SUR}_{\mathrm{tc}}$ thus correlates highly with $K_{\mathrm{slope}}$ (and by implication with $K_m$) (Fig. 4b), this is not the case for SUV (Fig. 4a), where large deviations from the regression line occur.
In survival analysis (N = 31), univariate Cox regression revealed $\mathrm{SUR}_{\mathrm{tc}}$ and $K_{\mathrm{slope}}$ as significant prognostic factors for PFS (HR = 3.4, P = 0.017 and HR = 4.3, P = 0.020, respectively). For SUV, no significant effect was found (Table 2). Corresponding Kaplan-Meier curves are shown in Fig. 5. None of the investigated parameters was prognostic for OS.

Discussion
In this work, we investigated the correlation of SUV and scan-time-corrected SUR with the DTP-derived rate of SUR increase (SUR slope), $K_{\mathrm{slope}}$. Accepting the results provided in the Appendix together with those of our previous studies [1,12] as sufficient evidence for an essentially invariant hyperbolic AIF shape described by a unique value of the exponent $b$ valid for the whole investigated patient group, $K_{\mathrm{slope}}$ is a proportional measure of $K_m$. The main result of the current analysis is that $\mathrm{SUR}_{\mathrm{tc}}$ correlates significantly better with $K_{\mathrm{slope}}$ (and thus $K_m$) than SUV does. This finding is a direct consequence of the behavior apparent in Fig. 2, which demonstrates, in accordance with Eq. 5, that all DTP-derived SUR pairs can be described with good accuracy by straight lines with quite small y-axis intercepts (parameter $V_r$ in Eq. 5). While our previous results [1] yielded a mean of $\bar{V}_r$ ≈ 0.53 ml/ml, the data in Fig. 2 rather suggest using $\bar{V}_r$ = 0 ml/ml in the evaluation (which we did in the present paper). It should be emphasized that the precise choice for the (numerically small) value of $\bar{V}_r$ is of no major importance: using the previous best estimate $\bar{V}_r$ = 0.53 ml/ml in the present investigation would just lead to a small bias of six percentage points in the computation of the scan-time-corrected δSUR (blue points in Fig. 3) without any notable effect on the $\mathrm{SUR}_{\mathrm{tc}}$ vs. $K_{\mathrm{slope}}$ correlation and the survival analysis. We further note that there is a visible negative correlation between SUR slope (and thus $K_m$) and $V_r$ in Fig. 2 (correlation coefficient r = −0.75), i.e., $V_r$ tends to be smaller for larger $K_m$, which is also to be expected on theoretical grounds. This correlation leads to a shift of the approximate point of convergence of the different straight lines.
From a purely phenomenological point of view, this time shift might be accounted for in the equation relating $\mathrm{SUR}_{\mathrm{tc}}$ and $K_{\mathrm{slope}}$ (while adjusting $\bar{V}_r$ accordingly), but we prefer to avoid this ad hoc approach and instead just use $\bar{V}_r = 0$. Regarding the $\mathrm{SUR}_{\mathrm{tc}}$ vs. $K_m$ correlation, the current results are of comparable quality to those observed in [1] ($R^2$ = 0.92 compared to $R^2$ = 0.96), where the correlation of SUR and $K_m$ as derived from Patlak analysis of dynamic studies up to 60 min p.i. was investigated (see Figure 4 in [1]). Our results are also in accord with previous findings by Hunter et al., who used a somewhat different but ultimately mostly equivalent approach [18], as well as with a recent study by Grecchi et al. [19], which demonstrates the much improved correlation of SUR (referred to as "ratio method" in that paper) with $K_m$ compared to that of SUV vs. $K_m$ in a different context, namely patients with acute lung injury. The present study further supports the observation that $\mathrm{SUR}_{\mathrm{tc}}$ is a better surrogate of tumor FDG metabolism than SUV. In the current analysis, we especially demonstrated the validity of this assumption for later time points p.i. (and in a different tumor entity).
This finding is also in accord with the performed survival analysis. While there was no significant effect for SUV, both $\mathrm{SUR}_{\mathrm{tc}}$ and $K_{\mathrm{slope}}$ were significant prognostic factors for PFS with comparable effect size. Since the sample size available for this analysis (N = 31) is clearly far too small for conclusive results, further investigations will be necessary to clarify this point. Nevertheless, our results are an indication that the increased correlation of $\mathrm{SUR}_{\mathrm{tc}}$ with $K_{\mathrm{slope}}$ compared to SUV translates into an increased prognostic value. Since $\mathrm{SUR}_{\mathrm{tc}}$ and $K_{\mathrm{slope}}$ showed almost the same prognostic value, it can be stated that for histologically proven NSCLC, DTP measurements do not seem to provide additional information compared to $\mathrm{SUR}_{\mathrm{tc}}$ analysis of static whole-body scans. We presume that this conclusion is generally valid as long as no large deviations from irreversible FDG kinetics are to be expected.
Our analysis rests on the assumption that the AIF can be described by a hyperbola (Eq. 1). Therefore, the results regarding scan time correction (Fig. 3) are of special interest. Scan time correction will work correctly only if the AIF can be described by a power law. Our results thus strongly support this assumption. To some extent, this was already shown in [12]. Here, we were able to confirm these results in a larger patient group and for a larger range of uptake periods. As has already been explained, $K_{\mathrm{slope}}$ is a proportional substitute of the actual $K_m$ across different investigations and patients if the AIF follows Eq. 1 with a unique value of the exponent $b$, i.e., the shape needs to be not only hyperbolic but also invariant across investigations. The actual numerical value taken on by $b$ is irrelevant as long as one is not interested in quantitatively deriving $K_m$ from $K_{\mathrm{slope}}$ (or $\mathrm{SUR}_{\mathrm{tc}}$). It is, however, relevant to ensure that shape invariance of the AIF with a constant $b$ is a valid assumption in order to compare $K_{\mathrm{slope}}$ or $\mathrm{SUR}_{\mathrm{tc}}$ from different investigations. Figures 2 and 4b do not allow any direct conclusions in this respect (and neither does Fig. 3). But the fact that $K_{\mathrm{slope}}$ as well as $\mathrm{SUR}_{\mathrm{tc}}$ indeed perform better than SUV in the survival analysis supports the conjecture that the former two quantities are in fact better correlated with $K_m$, which in turn implies that $b$ ≈ const in the whole patient group. The detailed analysis of the blood SUV data presented in the Appendix supports the hypothesis that the AIF actually adheres to an invariant shape.

Conclusions
Scan-time-corrected SUR is a significantly better surrogate of tumor FDG metabolism in clinical whole-body PET than SUV. The very high linear correlation of SUR and DTP-derived $K_{\mathrm{slope}}$ (proportional to the actual $K_m$) implies that for histologically proven malignant lesions, FDG-DTP does not provide added value in comparison to the SUR approach in NSCLC. The potential benefit of DTP for differentiation of malignant and inflammatory or benign lesions with high uptake should be established in further studies.

Compliance with ethical standards
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Informed consent was obtained from all individual participants included in the study.
work otherwise while being a prerequisite of considering Eq. 5 for a fixed time $T$ and predicting the correlation Fig. 4b). On the other hand, these results do not allow any direct conclusion regarding a generic (investigation-independent constant) value of the power law exponent $b$, since only the ratio $K_{\mathrm{slope}} = K_m/(1 - b)$ appears in Eq. 5 and $K_m$ is not known a priori. Only if $b$ can be assumed to be constant (and thus the AIF to be shape-invariant) across different investigations are $K_{\mathrm{slope}}$ (and $\mathrm{SUR}_{\mathrm{tc}}$) valid surrogates of $K_m$ when comparing data from different patients/investigations. Only the fact that $K_{\mathrm{slope}}$ as well as $\mathrm{SUR}_{\mathrm{tc}}$ indeed perform better than SUV in the survival analysis (and thus should be more closely related to $K_m$ and, ultimately, to glucose consumption) provides indirect evidence for $b$ ≈ const.
The purpose of the present Appendix is to provide direct evidence that the b ≈ const assumption is valid for our patient group. For this purpose, Fig. 6 shows the relative decrease of blood SUV from the first to second measurements for all DTP pairs. The dashed lines connecting both points are the hyperbolas according to Eq. 1. Blue/red arrows indicate the deviation of the first/second measurements from the mean hyperbola (corresponding to the group-averaged b value).
The inset shows the individual b values for all DTP pairs as well as their mean (blue line) and standard deviation (dashed red lines). Individual error bars are determined by Gaussian error propagation using a realistic estimate of 5 % for the statistical accuracy of the separate blood SUV measurements.
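For a single DTP pair, the exponent of an assumed power law and its propagated uncertainty can be sketched as follows. The closed form follows from taking the ratio of two blood SUVs that both obey $c_a(t) = A \cdot t^{-b}$; the function name and the example values are hypothetical, and the same relative error is assumed for both blood SUV measurements.

```python
import math


def b_from_dtp(blood1, t1, blood2, t2, rel_err=0.05):
    """Power law exponent b from two blood SUVs and its 1-sigma error.

    From blood2 / blood1 = (t2 / t1) ** (-b) it follows that
    b = ln(blood1 / blood2) / ln(t2 / t1).
    Gaussian error propagation with equal relative errors on both
    blood SUVs gives sigma_b = sqrt(2) * rel_err / |ln(t2 / t1)|.
    """
    log_t = math.log(t2 / t1)
    b = math.log(blood1 / blood2) / log_t
    sigma_b = math.sqrt(2.0) * rel_err / abs(log_t)
    return b, sigma_b


# Constructed example: blood SUV generated with b = 0.5 at 60 and 120 min.
blood1 = 2.0
blood2 = 2.0 * (120.0 / 60.0) ** (-0.5)
b, sigma_b = b_from_dtp(blood1, 60.0, blood2, 120.0)
```

The error grows as the two scan times move closer together, because the logarithm of their ratio in the denominator shrinks; this is one reason why pairs with a very short time difference are unsuitable for such an analysis.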
As can be seen, the deviation of the individual blood SUVs from the mean hyperbola is quite small. The inset, furthermore, demonstrates that the standard deviation of the $b$ value distribution compares favorably with the error bars of the individual $b$ values. Altogether, the data thus do not provide any evidence at the given level of measurement accuracy that the individual $b$ values are significantly different, i.e., the assumption of a common AIF shape (at least in the accessible time window) is fulfilled in this patient group. As already explained, this suffices to ensure that $K_{\mathrm{slope}}$ as well as $\mathrm{SUR}_{\mathrm{tc}}$ can act as accurate surrogates of $K_m$, which in our view is the underlying explanation for the superior performance of both parameters in comparison to SUV in the survival analysis.
Obviously, knowledge of the actual numerical value of $b$ is irrelevant (as long as it remains constant across investigations) if one is not interested in quantitatively deriving $K_m$ from $K_{\mathrm{slope}}$ or $\mathrm{SUR}_{\mathrm{tc}}$. While the actual numerical value of $b$ thus is of no importance for the present paper, it seems necessary to point out that the average $b$ value derived from the data in Fig. 6 ($b$ = 0.615 ± 0.153) is distinctly larger than our previously published value ($b$ = 0.313 ± 0.030), corresponding to a more pronounced decrease of the AIF over time. While the latter value is more reliable (being based on full, dynamic AIFs), the underlying data were restricted to times ≤ 60 min, whereas the present data cover distinctly later times after injection. For the present data, the substantial discrepancy between both $b$ values still only implies on average a ≈ ±8 % change of the first/second blood SUV of the DTP measurement in comparison to what would have been expected from the previous results. Presently, it cannot be ruled out completely that this effect is real (rather than some unidentified small systematic error in the data, such as contrast dependence of the scanner's image reconstruction software) and that the AIF shape deviates from simple hyperbolic behavior beyond 60 min p.i. Based on data from an independent ongoing investigation, we think this to be improbable, but this question deserves further attention.