Reproducibility of 18F-fluoromisonidazole intratumour distribution in non-small cell lung cancer

Background: Hypoxic tumours exhibit increased resistance to radiation, chemical, and immune therapies. 18F-fluoromisonidazole (FMISO) positron emission tomography (PET) is a non-invasive, quantitative imaging technique used to evaluate the presence and spatial distribution of tumour hypoxia. To facilitate the use of FMISO PET for identification of individuals likely to benefit from hypoxia-targeted treatments, we investigated the reproducibility of FMISO PET spatiotemporal intratumour distribution in patients with non-small cell lung cancer (NSCLC).
Methods: Ten patients underwent 18F-fluorodeoxyglucose (FDG) PET/CT scans, followed by two FMISO PET/CT scans 1–2 days apart. Nineteen lesions in total were segmented from co-registered FDG PET image sets. Volumes of interest were also defined on normal contralateral lung and subscapularis muscle. The Pearson correlation coefficient r was calculated for mean standardized uptake values (SUV) within investigated volumes of interest and for voxels within tumour volumes (rTV). The reproducibility of FMISO voxelwise distribution, SUV- and tumour-to-blood ratio (TBR)-derived indices was assessed using correlation and Bland-Altman analyses.
Results: The SUVmax, SUVmean, TBRmax, and TBRmean were highly correlated (r ≥ 0.87, p < 0.001) and were reproducible to within 10–15 %. The mean rTV was 0.84 ± 0.10. 77 % of voxels identified as hypoxic on one FMISO scan were confirmed as such on the other FMISO scan. Mean voxelwise differences between TBR values as calculated from pooled data including all lesions were 0.9 ± 10.8 %.
Conclusions: High reproducibility of FMISO intratumour distribution in NSCLC patients was observed, facilitating its use in determining the topology of the hypoxic tumour sub-volumes for dose escalation, in patient stratification strategies for hypoxia-targeted therapies, and in monitoring response to therapeutic interventions.
Trial registration: Current Controlled Trials NCT02016872

Background
Non-small cell lung cancer (NSCLC) remains the leading cause of cancer-related mortality worldwide [1]. Tumour cell hypoxia, a common feature of solid tumours, is a pivotal determinant of the effectiveness of radiation, chemical, and immune therapies and is associated with poor overall survival [2,3]. The hypoxic microenvironment can be assessed by a variety of approaches, e.g. by measurement of partial pressure of oxygen with polarographic electrodes [4] or by immunohistochemical detection of endogenous and exogenous hypoxia markers [5]. However, such procedures are invasive and potentially hazardous, restricted to accessible lesions, and limited by sampling errors. 18F-fluoromisonidazole (FMISO) positron emission tomography (PET), a non-invasive imaging technique, presents an attractive alternative [6][7][8]. FMISO is clinically the most extensively investigated hypoxia PET tracer. Several studies in lung cancer patients have suggested stratification strategies based on FMISO uptake and kinetics [9][10][11][12]. The hypothesis is that selective dose painting of putative radioresistant hypoxic tumour sub-volumes, as defined by FMISO PET, may improve locoregional control [13]. Numerous efforts also continue to evaluate 18F-fluorodeoxyglucose (FDG) PET for target delineation in radiotherapy [14,15] and explore the utility of intensity-modulated radiotherapy (IMRT) based on FDG voxel intensities [16,17]. However, despite a number of ongoing hypoxia-imaging trials, quantitative assessment of intratumour distribution of hypoxia-specific PET tracers has yet to be widely implemented in the clinical decision-making process.
In order to fully exploit the benefits of incorporating tumour hypoxia information as obtained by FMISO PET into patient management, whether as an IMRT target, in patient stratification strategies for hypoxia-targeting regimens, or for monitoring response to therapeutic interventions, it is essential to examine the spatiotemporal reproducibility of FMISO intratumour distribution. To our knowledge, such studies have been performed only in head and neck cancer (HNC) patients, with discordant results [18,19]. Due to the absence of similar studies in other tumour entities, e.g. lung cancer, it remains unclear to what extent the reproducibility of FMISO will be affected by the lack of rigid immobilization and the presence of respiratory motion. Therefore, the aim of this pilot study was to investigate the reproducibility of FMISO intratumoural distribution in serial baseline FMISO PET scans in a cohort of NSCLC patients.

Ethics statement
This study was approved by Memorial Sloan Kettering Cancer Center's Institutional Review Board (IRB 13-186; registered under www.clinicaltrials.gov identifier number NCT02016872), and all subjects signed a written informed consent regarding the examination and use of anonymized data for research and publication purposes. The methods were carried out in accordance with the approved guidelines.

Patient characteristics
The eligibility criteria were as follows: age > 18 years, pathological confirmation of NSCLC, no prior treatment, primary or nodal tumour measuring ≥2 cm on CT, and a Karnofsky performance status of ≥70 %. Exclusion criteria included pregnancy or breast-feeding and severe diabetes (fasting blood glucose >200 mg/dl). Fifteen patients agreed to participate in the study. Patients were scanned on a flat-top couch insert and immobilized in an alpha cradle (Smithers Medical Products, Inc.). As the second FMISO PET scan was not acquired for five patients due to their inability to continue (n = 3) or technical reasons (n = 2), ten patients in total were included in the reproducibility analysis (Table 1).

18F-fluorodeoxyglucose PET/CT protocol
Each patient underwent an FDG PET/CT scan for radiotherapy simulation purposes. Patients were injected intravenously with 460 ± 17 MBq (range, 429-477 MBq) of FDG, after a fasting period of ≥6 h. PET scans were acquired for 3 min/bed position, at 100 ± 38 min (range, 60-171 min) post-injection (pi). All PET data were acquired in 3D mode on a General Electric Discovery ST PET/CT (GE Health Care Inc.). A CT acquired in cine mode (140 kVp, 10 mA, 5.0-mm slice thickness, 0.5-s tube rotation) was averaged (CTavg) and used for attenuation correction of PET images. The cine duration was set to match the patient breathing period plus 1 s (~6 s on average). PET emission data were corrected for attenuation, scatter, and random events and reconstructed into a 128 × 128 × 47 matrix (voxel dimensions 5.47 × 5.47 × 3.27 mm). The reconstruction was performed using the GE ordered subset expectation maximization (OSEM) algorithm with standard clinical reconstruction parameters: 2 iterations, 16 subsets, and a 6.0-mm full width at half maximum Gaussian post-filter.

18F-fluoromisonidazole PET/CT protocol
Ten patients underwent two FMISO PET studies each (i.e. FMISO1 and FMISO2). FMISO1 was performed 2.4 ± 1.4 days (range, 1-5 days) after the FDG PET/CT, with FMISO2 being performed 1.7 ± 1.6 days (range, 1-6 days) after FMISO1. Patients received an average FMISO bolus injection of 388 ± 15 MBq (range, 356-407 MBq). Data were acquired for 10 min over one field of view (~15 cm; centred on the lesions) at 163 ± 13 min pi (range, 144-183 min). A low-dose cine CT scan (with the same parameters as for the FDG study) was performed and used for attenuation correction and image co-registration. All FMISO

Image analysis
The FDG and FMISO2 tumour volumes were co-registered to those of FMISO1 by means of their corresponding CTavg image sets, using the GE AW Workstation v4.6 General Co-Registration tool (GE Health Care Inc.). Rigid transformation was used, and the results were visually inspected for potential mismatches. The transformation matrices obtained were then applied to the corresponding PET images. Tumour metabolic target volumes (TV) were delineated on the FDG PET images with the adaptive threshold algorithm in the GE AW Workstation PET VCAR™ (Volume Computer-Assisted Reading) semi-automated software (FDA-approved), which uses the companion CT as a fiducial marker together with a count-based edge recognition algorithm. The corresponding target volumes were subsequently copied onto the two co-registered FMISO image sets. Tumour uptake in the target volumes on the two FMISO scans was compared on a voxel-by-voxel basis in PMOD v3.604 (PMOD Technologies GmbH). Activity concentration data were converted into standardized uptake values (SUV; normalized to lean body mass). The blood SUV (SUVblood) was measured by (i) segmenting the descending aorta on the CTavg, (ii) copying the volume of interest (VOI) to the corresponding FMISO PET, (iii) eroding the VOI by 1 voxel in 3D, and (iv) measuring the average SUV within the eroded VOI.
The hypoxic sub-volume (HTV; in cm3) was defined as including voxels within the TV having a tumour-to-blood ratio (TBR) ≥ 1.2 on both FMISO scans [8]. For each lesion, maximum and mean values for voxels within the TV were calculated in units of SUV (SUVmax, SUVmean) and TBR (TBRmax, TBRmean). Reproducibility of FMISO uptake was also assessed in the non-diseased normal lung tissue (by evaluating the mean SUV within a 20-mm-diameter spherical VOI that was placed in the healthy contralateral lung; SUVlung) and in the non-diseased muscle (by evaluating the mean SUV within a VOI manually drawn on the subscapularis muscle on the CTavg and subsequently copied to the corresponding FMISO PET scan; SUVmuscle).
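The TBR and HTV definitions above can be illustrated with a minimal Python sketch. It is not the PMOD workflow used in the study: the function names and example numbers are hypothetical, and the voxel volume assumes that the 5.47 × 5.47 × 3.27 mm grid quoted for the FDG reconstruction also applies to the co-registered FMISO images.

```python
# Illustrative sketch only; function names and example values are hypothetical.

# Voxel volume in cm^3. Assumption: the 5.47 x 5.47 x 3.27 mm grid quoted for
# the FDG reconstruction also applies to the co-registered FMISO images.
VOXEL_VOLUME_CM3 = 5.47 * 5.47 * 3.27 / 1000.0


def tumour_to_blood_ratio(suv_voxels, suv_blood):
    """Divide each tumour-voxel SUV by the blood SUV measured on the same scan."""
    if suv_blood <= 0:
        raise ValueError("blood SUV must be positive")
    return [v / suv_blood for v in suv_voxels]


def hypoxic_subvolume_cm3(tbr_scan1, tbr_scan2, threshold=1.2,
                          voxel_volume_cm3=VOXEL_VOLUME_CM3):
    """Volume of voxels with TBR >= threshold on BOTH co-registered scans."""
    if len(tbr_scan1) != len(tbr_scan2):
        raise ValueError("both scans must cover the same voxels")
    n_hypoxic = sum(
        1 for a, b in zip(tbr_scan1, tbr_scan2)
        if a >= threshold and b >= threshold
    )
    return n_hypoxic * voxel_volume_cm3


if __name__ == "__main__":
    # Made-up SUVs for four tumour voxels on each scan.
    tbr1 = tumour_to_blood_ratio([1.9, 2.6, 1.4, 2.2], suv_blood=1.6)
    tbr2 = tumour_to_blood_ratio([2.0, 2.4, 1.5, 1.7], suv_blood=1.5)
    print(hypoxic_subvolume_cm3(tbr1, tbr2))
```

Requiring the threshold to be met on both scans follows the HTV definition in the text. The voxel volume is exposed as a parameter because the FMISO reconstruction grid is not stated here.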

Statistical analysis
The Pearson correlation coefficient was calculated between the FMISO1 and FMISO2 intratumour distributions (rTV) and between all SUV- and TBR-derived indices. The normality of the distribution of differences in the investigated parameters between the two FMISO studies was verified with a two-sample Kolmogorov-Smirnov test. This was done to validate the applicability of Bland-Altman analysis, which was subsequently performed to calculate the mean differences between voxelwise TBR values and 95 % limits of agreement (LoA) [20]. The latter are defined as the mean difference ± 1.96 × SD of the differences and represent the boundaries within which 95 % of the differences are expected to lie. p < 0.05 was assumed to represent statistical significance. To evaluate the geographic stability of hypoxic sub-volumes, the percentage of intratumour voxels that were identified as hypoxic in both FMISO studies was calculated, as based on the TBR ≥ 1.2, ≥1.4, and ≥1.6 thresholds [6][7][8][9]. All statistical analyses were carried out in MedCalc v15.6 (MedCalc Software bvba).
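As a rough illustration of the two statistics described above, the following Python sketch computes a Pearson correlation coefficient and Bland-Altman limits of agreement for paired voxel values. It does not reproduce the MedCalc analysis: the function names are hypothetical, and expressing each difference as a percentage of the pairwise mean is an assumption, since the exact normalisation of the relative differences is not specified here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between paired samples x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bland_altman_percent(x, y):
    """Return (mean difference, lower LoA, upper LoA), all in percent.

    Assumption: each difference (y - x) is expressed relative to the mean
    of the pair; other normalisations are possible.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    diffs = [100.0 * (b - a) / ((a + b) / 2.0) for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Sample standard deviation of the differences (n - 1 in the denominator).
    sd = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d, mean_d - 1.96 * sd, mean_d + 1.96 * sd


if __name__ == "__main__":
    # Made-up voxelwise TBR values from two co-registered scans.
    scan1 = [1.05, 1.30, 1.62, 0.98, 1.44]
    scan2 = [1.10, 1.25, 1.70, 1.02, 1.38]
    print(pearson_r(scan1, scan2))
    print(bland_altman_percent(scan1, scan2))
```

The limits of agreement are only meaningful if the differences are approximately normally distributed, which is why the text checks normality before applying the Bland-Altman analysis.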

Results
Nineteen FDG-avid lesions were included in the analysis. None of the investigated lesions were located near the edge of the PET field of view. The average lesion volume was 28 cm3 (range, 4-111 cm3). No lesion exhibited uptake on the FMISO scan while being negative on the corresponding FDG scan. As mismatches between PET and CT scans were identified in two patients (#4 and #9), the co-registrations were modified manually based on the PET images. All differences between the FMISO1 and FMISO2 scans in the SUV- and TBR-derived parameters, imaging time pi, and injected dose were normally distributed, as assessed with the Kolmogorov-Smirnov test (p > 0.05). Tumour volume, imaging time pi, SUVblood, SUVlung, SUVmax, SUVmean, TBRmax, TBRmean, HTV, and rTV are summarized in Table 2. High, statistically significant correlations were observed between the first and second FMISO scans for all SUV- and TBR-derived parameters (r ≥ 0.87, p < 0.001) and for HTV (r = 0.99, p < 0.001).
Scatter plots for the co-registered FMISO images display intratumour voxels colour-coded according to their hypoxia status as based on the TBR ≥ 1.2 threshold (Fig. 1). The mean rTV was 0.84 ± 0.10 (range, 0.52-0.95), with all lesions except for one having rTV > 0.70. The hypoxic status (i.e. the presence of intratumour voxels with TBR above the pre-defined threshold) remained unchanged in nine out of ten patients between the two scans, regardless of the implemented threshold. 77 %, 68 %, and 63 % of voxels identified as hypoxic on one FMISO scan were confirmed as such on the other FMISO scan (based on TBR ≥ 1.2, ≥1.4, and ≥1.6, respectively).
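The voxel-level and patient-level comparisons reported above can be sketched as follows in Python. This is only one plausible reading: the function names are hypothetical, and how the reported percentages were pooled across lesions and across the two scan directions is not specified here, so the sketch computes the fraction in a single direction for a single lesion.

```python
# Illustrative sketch only; function names and example values are hypothetical.

THRESHOLDS = (1.2, 1.4, 1.6)  # TBR thresholds used in the text


def fraction_confirmed(tbr_ref, tbr_other, threshold):
    """Fraction of voxels hypoxic on the reference scan that are also
    hypoxic on the other scan. Returns None if the reference scan has
    no hypoxic voxels at this threshold."""
    if len(tbr_ref) != len(tbr_other):
        raise ValueError("both scans must cover the same voxels")
    hypoxic_ref = [i for i, v in enumerate(tbr_ref) if v >= threshold]
    if not hypoxic_ref:
        return None
    confirmed = sum(1 for i in hypoxic_ref if tbr_other[i] >= threshold)
    return confirmed / len(hypoxic_ref)


def is_hypoxic(tbr_voxels, threshold):
    """Hypoxic status: at least one voxel at or above the threshold."""
    return any(v >= threshold for v in tbr_voxels)


if __name__ == "__main__":
    # Made-up TBR values for four co-registered tumour voxels.
    scan1 = [1.30, 1.50, 1.00, 1.70]
    scan2 = [1.25, 1.10, 1.30, 1.90]
    for t in THRESHOLDS:
        print(t, fraction_confirmed(scan1, scan2, t),
              is_hypoxic(scan1, t) == is_hypoxic(scan2, t))
```

Swapping the two arguments of `fraction_confirmed` gives the fraction in the opposite direction. The two values generally differ when the scans contain different numbers of hypoxic voxels.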
No significant correlation could be established between SUVlung or SUVmuscle and SUVmax, SUVmean, TBRmax, or TBRmean. The muscle-to-blood ratio, defined as SUVmuscle / SUVblood, was 0.97 ± 0.11 across the patients, confirming that FMISO uptake for non-diseased normoxic tissues approaches unity. Representative FMISO PET/CT images from both scans are displayed for two patients (Fig. 2).
Bland-Altman analysis revealed that voxelwise SUV- and TBR-derived indices were reproducible to within 10–15 % (Fig. 3). The associated limits of agreement indicate that for 95 % of cases, the relative differences will be within 22 %.

Discussion
It is important to determine the reproducibility of image-based prognostic and predictive parameters, including those deduced from nuclear medicine images, which typically exhibit greater statistical variation (i.e. noise) than other modalities. This is especially true for hypoxia-selective radiotracers such as FMISO in light of its generally low tumour uptakes and tumour-to-background ratios. An evaluation of the reproducibility of FMISO PET in NSCLC is a prerequisite if FMISO images are to be rationally used in stratification of NSCLC patients for hypoxia-targeting regimens, monitoring response to therapeutic interventions, or to determine the topology of the hypoxic tumour sub-volumes for dose escalation. Our data showed strong correlations for both SUV- and TBR-derived metrics between repeated FMISO scans, corroborating results from FDG and 18F-flortanidazole (HX4) PET scans of NSCLC patients [21][22][23]. TBRmax and TBRmean were as reproducible as SUVmax and SUVmean, despite the fact that the definition of a second region of interest to measure the blood activity introduces an additional source of variability. The classification status (i.e. indicating either the presence or absence of tumour hypoxia) as based on one FMISO scan remained unchanged in the majority (9/10) of patients when reassessment was performed using the other FMISO scan. These results are encouraging in the context of using FMISO PET in stratification of NSCLC patients for hypoxia-targeted treatments.
Fig. 1 Reproducibility of FMISO intratumour distribution in patients with NSCLC. Voxelwise scatter plots of tumour-to-blood ratio in FMISO1 (x-axis) vs. FMISO2 (y-axis) are presented for all 19 lesions. Black, blue, and red voxels represent normoxic, hypoxia-ambiguous, and hypoxic tumour sub-volumes, respectively, as based on the TBR ≥ 1.2 threshold (dashed lines). Equality lines (dotted) and rTV are also displayed for all scatter plots. rTV values were significant in all cases
Changes in FMISO uptake were reported to measure the early response to chemoradiotherapy in NSCLC [10]; however, it remains unclear to what extent the spatiotemporal variability of FMISO PET affects the quality of monitoring treatment response.
Data on the reproducibility of FMISO intratumour distributions from serial FMISO PET scans have been presented previously only for HNC patients, in two separate studies by Nehmeh et al. [18] followed by Okamoto et al. [19]. While Nehmeh et al. reported variability in spatial uptake, speculating that the observed differences were possibly due to transient hypoxia, Okamoto et al. reported high reproducibility. Okamoto and colleagues further speculated that another potential reason for the discrepancy between the two studies might have been imaging time post-injection and considered imaging at 4+ h to be more suitable, due to slow clearance of FMISO from the blood [19]. While longer waiting periods should in principle increase the contrast (and image noise), our results indicate that for non-small cell lung cancer, similarly high reproducibility can be obtained when imaging as early as 2.5 h post-injection. The mean rTV (0.84 ± 0.10) is comparable to the results from Okamoto et al. (0.89 ± 0.09 [19]), though not with those from Nehmeh et al. (0.60 ± 0.14 [18]). In the current study, patients were imaged at 163 ± 13 min pi, and there are several differences in the methodology compared to that by Nehmeh and colleagues [18]: (i) variations in scan times were substantially lower (5 ± 4 %), (ii) data were acquired in 3D mode for 10 min, (iii) image acquisition was performed on a more recent version of the GE PET/CT scanner, and (iv) a different (FDA-approved) image co-registration software was used compared to the previous study, which utilized in-house image co-registration software [18]. The quality of co-registrations may have additionally affected the voxelwise correlation (for example, deliberate misregistration by a single voxel in patient #3 resulted in a drop of rTV from 0.72 to 0.17).
More recently, reproducibility of hypoxia imaging using HX4 PET has been investigated by Zegers and colleagues in a multicentre trial in both HNC and NSCLC patients [23]. The authors concluded that HX4 PET imaging is reproducible regarding the spatial uptake in both HNC and NSCLC patients, reporting no major differences in the results between the two cohorts [23]. The mean ΔSUV was 0.02 ± 0.07; high correlations were reported between SUVmax and TBRmax as well as between hypoxic sub-volumes [23]. Our results are also in concordance with this study.
Table 4 Bland-Altman analysis results for differences between voxelwise tumour-to-blood ratio values. KS Kolmogorov-Smirnov, CI confidence interval, LLA lower limit of agreement, ULA upper limit of agreement
Scatter plots indicate systematic differences in voxelwise uptake between the two FMISO scans, also observed in earlier PET reproducibility studies [18,19,23]. Various technical (e.g. incorrect synchronization of time between injection and calibration), biologic (uptake period, presence of acute hypoxia, patient motion, breathing, and comfort), and physical factors (VOI for the calculation of SUVblood) might affect PET quantification [24]. However, the mean difference in voxelwise TBR values from pooled data was 0.9 ± 10.8 %, suggesting no systematic biases. This observation is further supported by the absence of significant correlation between SUV values in normal tissues (contralateral lung and subscapularis muscle) and in the tumour. Approximately 23 % of voxels identified as hypoxic on one FMISO scan did not meet the hypoxia criterion on the other FMISO scan (assuming the TBR ≥ 1.2 threshold). In addition to the aforementioned factors, this could be attributed to relatively low uptake of FMISO that exacerbates the impact of statistical noise, potential mismatch between the PET and the CT images (affecting attenuation correction), CT-CT co-registration of the FMISO1 to FMISO2 image sets, and/or the susceptibility of the lesion to respiratory motion due to its location within the lung. Resampling of FMISO2 resulted on average in <3 % absolute differences in uptake values. When correlation analysis was repeated by co-registering FMISO1 to FMISO2, the change in rTV was <1 % (data not shown). The extent to which the changes in spatial distribution of tumour hypoxia compromise the coverage of hypoxic tumour sub-volumes achievable by IMRT remains to be investigated.
A limitation of the current study is the small sample size. Nevertheless, high reproducibility of FMISO spatiotemporal distribution was confirmed, providing an impetus for the use of FMISO PET imaging in thoracic oncology. Another limitation of this as well as earlier PET reproducibility studies in NSCLC is the absence of respiratory gating [21][22][23]. Motion correction is not yet widely used clinically [22], and uncorrected respiratory motion may alter the accuracy of quantitative uptake measures due to image blurring [25]. However, the reproducibility of FMISO observed here for non-small cell lung cancer patients was similar to that reported for patients with head and neck cancer [19], even though the latter were immobilized during image acquisition, their tumours were not affected by respiratory motion, and their co-registration is expected to be more accurate. Lastly, the clinical significance of the observed variability in FMISO intratumour distribution in the context of patient stratification for hypoxia-targeting therapies, monitoring treatment response, efficacy of biologically conformal radiotherapy, or radiomics warrants further examination in larger datasets.

Conclusions
The results of this pilot study confirm that (i) FMISO intratumour distribution is highly reproducible in NSCLC, facilitating its use in dose escalation of hypoxic tumour sub-volumes, patient stratification strategies, and monitoring treatment response; (ii) high reproducibility can be achieved with shorter imaging times post-injection than those previously suggested, potentially reducing long patient waiting periods; and (iii) the spatiotemporal uptake patterns of FMISO as measured by PET are not expected to be affected by transient hypoxia.