Assessment of simplified ratio-based approaches for quantification of PET [11C]PBR28 data

Purpose: Kinetic modelling with metabolite-corrected arterial plasma is considered the gold standard for quantification of [11C]PBR28 binding to the translocator protein (TSPO), since there is no brain region devoid of TSPO that can serve as reference. The high variability in binding observed using this method has motivated the use of simplified ratio-based approaches such as standardised uptake value ratios (SUVRs) and distribution volume (VT) ratios (DVRs); however, the reliability of these measures and their relationship to VT have not been sufficiently evaluated.
Methods: Data from a previously published [11C]PBR28 test-retest study in 12 healthy subjects were reanalysed. VT was estimated using a two-tissue compartment model. SUVR and DVR values for the frontal cortex were calculated using the whole brain and cerebellum as denominators. Test-retest reliability was assessed for all measures. Interregional correlations were performed for SUV and VT, and principal component analysis (PCA) was applied. Lastly, correlations between ratio-based outcomes and VT were assessed.
Results: Reliability was high for VT, moderate to high for SUV and SUVR, and poor for DVR. Very high interregional correlations were observed for both VT and SUV (all R² > 85%). The PCA showed that almost all variance (>98%) was explained by a single component. Ratio-based methods correlated poorly with VT (all R² < 34%, after separating subjects by genotype).
Conclusions: The reliability was good for SUVR, but poor for DVR. Both outcomes showed little to no association with VT, questioning their validity. The high interregional correlations for VT and SUV suggest that after dividing by a denominator region, most of the biologically relevant signal is lost. These observations imply that results from TSPO PET studies using SUVR or DVR estimates should be interpreted with caution.


Introduction
The PET radioligand [11C]PBR28 binds to the translocator protein (TSPO), which is expressed in glial cells and regarded as a marker of brain immune function. Since there is no reference brain region devoid of TSPO [1], kinetic modelling with metabolite-corrected arterial plasma as input function is considered the gold standard for analysis of [11C]PBR28 binding, and the distribution volume (VT) is the commonly used outcome measure. There is, however, a large degree of intra- and interindividual variability in VT, even after accounting for TSPO affinity genotype [2,3]. This variability reduces sensitivity for detection of effects in clinical studies. In attempts to circumvent this shortcoming, simplified ratio-based approaches, including standardised uptake value ratios (SUVRs) or distribution volume ratios (DVRs), have been suggested and applied [4-6].
Recently, a test-retest analysis of [11C]PBR28 SUVR values in Alzheimer's disease patients was reported, showing an apparent high utility of this method [7]. The study observed low absolute percentage variability and high intraclass correlation coefficient (ICC) values in five high-affinity binders (HABs). Apart from reducing variability, this approach would additionally be advantageous from a practical perspective, by omitting the need for arterial blood sampling. However, the study did not examine the association of SUVR to traditional VT values. With regard to DVR, neither reliability nor relation to VT has been examined.
The objectives of this study were to assess the test-retest reliability of [11C]PBR28 ratio-based outcomes and to examine their association with VT in healthy control subjects. We also investigated the interregional correlations for SUV and VT respectively, since the relationships between target and denominator regions, which both contain TSPO, may influence both the reliability and validity of ratio-based outcome measures.

Subjects
PET measurements from 12 healthy subjects (mean age 23.9 years, SD 2.99, 6 females) who had participated in a previous test-retest study of [11C]PBR28 binding [2] were included in the analysis. Six participants were mixed-affinity binders (MABs) and six were high-affinity binders (HABs). The study was approved by the Karolinska University Hospital Radiation Safety Committee and the Regional Ethics Committee in Stockholm. All subjects gave written informed consent prior to participating.

Test-retest study design
Six of the subjects underwent the two PET examinations on the same day, and for the remaining six, the examinations were run 2-5 days apart. Radiosynthesis and production of [11C]PBR28 were performed as described previously [2]. All examinations were performed using the high-resolution research tomograph (Siemens Molecular Imaging, Knoxville, TN). Radioactivity concentration in blood was obtained by arterial measurements, from which a metabolite-corrected arterial input function was derived as described previously [8].
For one individual, the second PET examination was shortened due to technical issues. This subject was excluded from the test-retest analysis, but the first PET examination was included in correlational analyses.
SUVs were calculated between 40 and 60 min in order to allow for a direct comparison with Nair et al. [7]. To derive VT values, kinetic modelling was performed on time-activity curves (TACs) from 0 to 63 min, using the R package kinfitr (version 0.2.0, www.github.com/mathesong/kinfitr).
The fractional volume of blood present in the tissue volume (vB) and the delay between the arterial input function and TACs were fitted using the two-tissue compartment model (2TCM) and the whole brain TAC. Subsequently, the total distribution volume (VT) for each region of interest (ROI) was estimated with the 2TCM, with the delay and vB fixed to the values from the prior whole brain fit. vB values ranged between 2.7 and 6.4% (HABs: mean = 3.9%, SD = 1.0%; MABs: mean = 3.8%, SD = 0.8%).
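As a point of reference, in the standard 2TCM the total distribution volume follows from the fitted rate constants as VT = (K1/k2)(1 + k3/k4). The Python sketch below only illustrates this relationship. The function name and the example rate constants are hypothetical, and the sketch does not reproduce the kinfitr fitting used in the study, which was run in R.

```python
def vt_from_rate_constants(K1, k2, k3, k4):
    """Total distribution volume for a two-tissue compartment model.

    K1 is the influx rate constant and k2, k3, k4 (1/min) are the
    remaining 2TCM rate constants. All inputs are assumed positive.
    """
    return (K1 / k2) * (1.0 + k3 / k4)


# Hypothetical rate constants, chosen only to illustrate the arithmetic.
example_vt = vt_from_rate_constants(K1=0.1, k2=0.05, k3=0.04, k4=0.02)
```

In practice the rate constants are estimated by fitting the model to each regional TAC with the arterial input function; only the final step from rate constants to VT is shown here.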
We used two different denominator regions, i.e. the whole brain (WB) and cerebellum (CBL), to derive the ratio-based outcomes for SUV and VT for the frontal cortex (FC). This produced the following outcome measures: SUVR_WB, SUVR_CBL, DVR_WB and DVR_CBL.
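The four ratio outcomes are simple quotients of a target value and a denominator value. The Python sketch below shows that arithmetic for a single subject; the function name and all numerical values are hypothetical and are not taken from the study data.

```python
def ratio_outcomes(target, whole_brain, cerebellum):
    """Return target-to-denominator ratios for one outcome (SUV or VT)."""
    return {"WB": target / whole_brain, "CBL": target / cerebellum}


# Hypothetical regional values for one subject and one PET examination.
suv = {"FC": 1.20, "WB": 1.10, "CBL": 1.15}
vt = {"FC": 3.0, "WB": 2.8, "CBL": 2.9}

suvr = ratio_outcomes(suv["FC"], suv["WB"], suv["CBL"])  # SUVR_WB, SUVR_CBL
dvr = ratio_outcomes(vt["FC"], vt["WB"], vt["CBL"])      # DVR_WB, DVR_CBL
```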

Statistical analysis
Interregional correlations were derived for ROI VT and SUV values. Principal component analysis (PCA) was used to further examine the correlational structure of the data by identifying the number of independent components required to explain the majority of the variability in ROI VT values. PCAs were performed independently for PET1 and PET2 after z-score standardisation of VT values within genotype groups. For the analyses of variability and reliability, we focused on the frontal cortex as target region. The coefficient of variation (COV) was calculated for all outcome measures, using both PET measurements combined. We used the intraclass correlation coefficient (ICC) as a measure of test-retest reliability. The one-way ANOVA fixed effects ICC was used:
$$\mathrm{ICC} = \frac{MS_B - MS_W}{MS_B + (k - 1)\,MS_W}$$
where MS_B and MS_W represent the between- and within-subject mean sums of squares and k represents the number of measurements per subject (in this case 2). We also calculated the absolute percentage variability (VAR) and the standard error of measurement (SEM), expressed as a ratio to the mean value, which provides the estimated within-subject COV [9]. For VT, genotype groups were analysed separately. For ratio methods, results are also reported for both genotype groups combined, since the genotype effect largely cancels out in the ratio. Finally, we correlated SUV and all ratio-based outcomes against VT. All statistical analysis was performed in R (version 3.4.0).
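The reliability metrics described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the study's R code: the ICC is the one-way ANOVA form built from the between- and within-subject mean squares, VAR is taken as the commonly used definition 2|PET1 - PET2| / (PET1 + PET2) x 100 averaged over subjects, and SEM is taken as SD x sqrt(1 - ICC) divided by the mean. The VAR and SEM definitions are not spelled out in the text and are assumptions here, as are the function names and the example values.

```python
import numpy as np


def icc_oneway(x):
    """One-way ANOVA ICC for an (n_subjects, k_measurements) array."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    subject_means = x.mean(axis=1)
    grand_mean = x.mean()
    ms_between = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((x - subject_means[:, None]) ** 2) / (n * (k - 1))
    return float((ms_between - ms_within) / (ms_between + (k - 1) * ms_within))


def abs_pct_variability(x):
    """Mean absolute test-retest difference as a percentage of the pair mean."""
    x = np.asarray(x, dtype=float)
    pet1, pet2 = x[:, 0], x[:, 1]
    return float(np.mean(2.0 * np.abs(pet1 - pet2) / (pet1 + pet2)) * 100.0)


def sem_ratio(x):
    """Assumed SEM = SD * sqrt(1 - ICC), expressed relative to the mean."""
    x = np.asarray(x, dtype=float)
    sd = x.std(ddof=1)  # SD over all measurements
    return float(sd * np.sqrt(max(1.0 - icc_oneway(x), 0.0)) / x.mean())


# Hypothetical frontal cortex values (rows: subjects, columns: PET1, PET2).
example = np.array([[3.1, 3.3], [2.2, 2.0], [4.0, 4.4], [2.9, 3.0]])
example_icc = icc_oneway(example)
example_var = abs_pct_variability(example)
example_sem = sem_ratio(example)
```

With this form of the ICC, 1 - ICC can be read as the proportion of variance attributed to measurement error, which is how the error percentages in the results are interpreted.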

Interregional correlations and principal component analysis
The interregional correlations were high both for VT and for SUV, in both HABs and MABs (all R² > 0.85) (Fig. 1). Using all six ROIs, the first component of the PCAs explained 98.7 and 99.4% of the total variability for PET1 and PET2 respectively. Using only the frontal cortex, whole brain and cerebellum, the first component explained 99.6 and 99.7% respectively.
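To illustrate what a dominant first component means, the Python sketch below computes the share of variance explained by the first principal component of z-scored regional values. The data are simulated with a single shared between-subject factor plus small regional noise; the function name, sample size and noise levels are assumptions chosen for illustration, and the standardisation here is applied to whatever group of subjects is passed in (the study standardised within genotype groups).

```python
import numpy as np


def first_component_share(values):
    """Fraction of total variance explained by the first principal component.

    `values` is an (n_subjects, n_regions) array; columns are z-scored first.
    """
    values = np.asarray(values, dtype=float)
    z = (values - values.mean(axis=0)) / values.std(axis=0, ddof=1)
    eigenvalues = np.linalg.eigvalsh(np.cov(z, rowvar=False))
    return float(eigenvalues.max() / eigenvalues.sum())


# Simulated data: 12 subjects, 6 regions, one shared factor plus small noise.
rng = np.random.default_rng(0)
shared_factor = rng.normal(size=(12, 1))
regional_values = 3.0 + shared_factor + 0.1 * rng.normal(size=(12, 6))
share = first_component_share(regional_values)
```

When regional values are driven almost entirely by one shared factor, as simulated here, the share approaches 1, which is the pattern reported for the VT data.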

Variability and test-retest reliability in the frontal cortex
High inter- and intraindividual variability was observed for both VT and SUV. VT showed high reliability, with ICC values of 0.89 (HABs) and 0.93 (MABs), corresponding to 11 and 7% of the variance estimated as being attributable to error, respectively. SUV and SUVR showed moderate to high reliability, while DVR showed poor reliability, with on average half of the signal estimated to be attributable to error (Table 1).
Table 1 Mean values, variability and test-retest metrics for VT, SUV, SUVR and DVR using the frontal cortex as the target region and the cerebellum (CBL) or whole brain (WB) as the denominator region for the ratio-based outcomes

Relationships with VT
SUV was found to be moderately associated with VT, but the estimated association differed both between genotypes and between individuals (Fig. 2). SUVR and DVR correlations with VT, after separating individuals by genotype, showed R² values < 34% for all regions (Table 2 and Fig. 3).

Discussion
The reliability of SUVR was moderate to high, as has been reported earlier in Alzheimer's disease patients [7]. For DVR, the reliability was poor. For both SUVR and DVR, associations with the traditional outcome measure VT were weak or non-existent. Hence, if VT is considered to be at least moderately associated with brain TSPO levels in healthy subjects, the validity of ratio-based methods must be questioned. The interregional correlations and PCA showed that almost all variability between ROIs, including denominator ROIs, is attributable to a single underlying dimension of variance. Consequently, dividing the outcome from a target region by that of a highly correlated denominator region leaves minimal residual differences between individuals. This means that a large part of the biologically relevant signal is lost. This will be the case especially when using the whole brain as a denominator, as the target region is included within the reference region. Although the resulting low COV, VAR and SEM values for SUVR and DVR may seem reassuring, the low reliability as well as the weak correlations with VT suggest that the remaining variance is largely attributable to noise.
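The argument that a highly correlated denominator removes most between-subject signal can be illustrated with a small simulation. The Python sketch below is not based on the study data: it assumes that target and denominator share one between-subject factor and differ only by small independent noise, and every parameter (sample size, spread of the shared factor, noise level, variable names) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_subjects = 200  # hypothetical; larger than the study sample for stable estimates

# Shared between-subject signal on the log scale, plus small regional noise.
shared = rng.normal(loc=1.0, scale=0.3, size=n_subjects)
target = np.exp(shared + rng.normal(scale=0.03, size=n_subjects))
denominator = np.exp(shared + rng.normal(scale=0.03, size=n_subjects))
ratio = target / denominator


def r_squared(x, y):
    """Squared Pearson correlation between two vectors."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)


def coefficient_of_variation(x):
    """Sample standard deviation divided by the mean."""
    return float(np.std(x, ddof=1) / np.mean(x))


r2_regions = r_squared(target, denominator)  # interregional association
r2_ratio = r_squared(ratio, target)          # ratio versus the target itself
cov_target = coefficient_of_variation(target)
cov_ratio = coefficient_of_variation(ratio)
```

Under these assumptions the ratio has a much lower COV than the target value but is only weakly related to it, since the shared factor cancels and mostly the independent noise remains.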
Importantly, this study was conducted using young, healthy participants. As such, no regionally specific alterations in TSPO binding are to be expected, which may partially account for the high degree of interregional correlations observed. The present results suggest that SUVR or DVR estimates may be useful when there is already strong evidence for regionally specific changes in TSPO expression (for example, [10]). These approaches have been suggested also for diseases which affect the brain more globally, based on evidence for a region with relatively spared pathology [4,7]. However, in practice, the use of ratio methods is conditional on prior knowledge of both (i) significant changes in TSPO expression in target regions such that interregional correlations are reduced and (ii) significant equivalence [11,12] of TSPO expression in the reference region between groups. These prerequisites have, to our knowledge, not yet been fulfilled for any disease or TSPO radioligand, and results obtained using SUVR or DVR estimates should therefore be interpreted with caution. When the whole brain is used as denominator, (i) and (ii) are particularly unlikely to co-occur due to the overlap between target and reference regions. As shown in our analysis, this leads to further reductions in variability which may result in artificially inflated effect sizes, sometimes even in the direction opposite to that of the raw VT values [5].
Fig. 3 Associations between frontal cortex VT and ratio-based outcomes, using the whole brain and cerebellum as denominator regions. Dotted lines indicate repeated measurements
We found medium to high reliability of SUV, suggesting a potential utility of this method. However, the relationship with VT differed between both genotypes and individuals, which is in line with previous observations in non-human primates [13]. This may indicate a nonlinear relationship between SUV and VT, in which case a potential ceiling effect may lead to loss of sensitivity of SUV to detect increases in binding. More importantly, the use of SUV relies on the assumption of no differences in radioligand delivery to the brain between groups. In patient-control samples, this cannot be safely assumed, since the disorder may involve changes in brain blood flow, and since differences in metabolism, protein binding or peripheral TSPO binding cannot be excluded without arterial sampling.