RETRACTED ARTICLE: Inter- and intraobserver agreement of the quantitative assessment of [99mTc]-labelled anti-programmed death-ligand 1 (PD-L1) SPECT/CT in non-small cell lung cancer

Checkpoint inhibition therapy using monoclonal antibodies against programmed cell death protein 1 (PD-1) or its ligand (PD-L1) is now standard management of non-small cell lung cancer (NSCLC). PD-L1 expression is a validated and approved prognostic and predictive biomarker for anti-PD-1/PD-L1 therapy. Technetium-99m ([99mTc])-labelled anti-PD-L1 single-domain antibody (NM-01) SPECT/CT quantification correlates with PD-L1 expression in NSCLC, presenting an opportunity for non-invasive assessment. The aim of this study was to determine the inter- and intraobserver agreement of the quantitative assessment of [99mTc]NM-01 SPECT/CT in NSCLC. [99mTc]NM-01 SPECT/CT studies of 21 consecutive NSCLC participants imaged for the evaluation of PD-L1 expression were analysed. Using a manual technique, three independent observers measured the maximum count within a region of interest (ROImax) for primary lung tumours, metastatic lesions and normal tissue references on anonymised studies acquired at both 1 and 2 h post-injection (n = 42). Intraclass correlation coefficients (ICC) were calculated, and Bland–Altman plot analysis was performed to determine inter- and intraobserver agreement. Interobserver agreement was good to excellent for the primary lung tumour-to-blood pool (T:BP; ICC 0.83, 95% CI 0.73–0.90) and lymph node metastasis-to-blood pool (LN:BP; ICC 0.87, 0.81–0.92) measures of [99mTc]NM-01 uptake. Freehand ROImax of T (ICC 0.94) and LN (ICC 0.97), and of the liver (ICC 0.97) and BP (ICC 0.90) reference tissues, also demonstrated excellent interobserver agreement. ROImax scoring of healthy lung demonstrated moderate to excellent interobserver agreement (ICC 0.84), which improved when the measurement was made consistently at the level of the aortic arch (ICC 0.89). Manual ROImax re-scoring of T, LN, T:BP and LN:BP using [99mTc]NM-01 SPECT/CT following a 42-day interval was consistent, with excellent intraobserver agreement (ICC range 0.95–0.97).
Good to excellent inter- and intraobserver agreement of the quantitative assessment of [99mTc]NM-01 SPECT/CT in NSCLC was demonstrated in this study, including T:BP which has been shown to correlate with PD-L1 status. [99mTc]NM-01 SPECT/CT has the potential to reliably and non-invasively assess PD-L1 expression. ClinicalTrials.gov identifier no. NCT02978196. Registered 30th November 2016.


Background
Lung cancer is the most commonly diagnosed cancer globally and a leading cause of mortality, with over 1.7 million deaths in 2018 alone [1]. Therapeutic molecular-targeting agents have resulted in significant improvements in progression-free and overall survival in advanced non-small cell lung cancer (NSCLC); however, targetable genetic aberrations represent only a small proportion of cases [2]. The introduction of monoclonal antibodies targeting immune checkpoint molecules, including programmed cell death protein 1 (PD-1) and its ligand (PD-L1), has revolutionised the treatment paradigm of NSCLC. An important mechanism of immune escape involves the upregulation of the co-inhibitory molecule PD-L1 by tumour cells, which, on interaction with PD-1 expressed by effector T cells, leads to their dysfunction. Anti-PD-1/PD-L1 therapy improves median overall survival in advanced NSCLC in both first- and second-line settings compared to standard cytotoxic chemotherapy, with durable responses seen in around 20% [3][4][5][6].
PD-L1 expression determined by immunohistochemistry (IHC) is a widely validated biomarker correlating with anti-PD-1/PD-L1 therapeutic response and survival [4][5][6][7]. Despite this correlation, up to 10% of patients deemed 'non-expressers' by IHC respond to anti-PD-1/PD-L1 therapy [4]. Heterogeneity of PD-L1 expression both within and between tumours is well reported, as are changes over time, particularly following exposure to anti-cancer therapies [8,9]. This temporospatial heterogeneity presents a particular challenge because needle biopsy samples only a small area of the tumour, and multiple or serial biopsies are impractical and associated with increased risk to individual patients. Additionally, there are multiple PD-L1 assays available, which may assess PD-L1 expression on tumour cells or infiltrating immune cells, alone or in combination [10]. Considering the potential for false negative results with IHC and the limitations described, non-invasive imaging techniques present a potential solution and an opportunity to improve the predictive value of PD-L1 assessment.
NM-01 is a camelid single-domain antibody against PD-L1 that, when radiolabelled with technetium-99m ([99mTc]), can be detected by single-photon emission computed tomography (SPECT). Recently, we have reported results from a first-in-human study of [99mTc]NM-01 that demonstrated both safety and acceptable dosimetry in the first 16 recruited participants with NSCLC [11]. SPECT/computed tomography (CT) scans were obtained 1 and 2 h following [99mTc]NM-01 injection, with primary tumour-to-blood pool ratio (T:BP) assessment correlating with PD-L1 expression determined by IHC. Additionally, uptake was demonstrated in nodal and bone metastases, with heterogeneity of expression in 30% of cases. This novel single-domain antibody presents an opportunity for the non-invasive total tumoural assessment of PD-L1 that could help clinicians better stratify patients to receive the most appropriate anti-cancer therapy at the right time in their disease course. Our hypothesis was that quantitative measurement of PD-L1 expression using [99mTc]NM-01 SPECT/CT is consistent and reproducible between and within observers. The aim of this study was to determine the reproducibility of and agreement between experienced and less experienced observers within a cohort of patients with NSCLC.

Methods
Participants aged between 18 and 75 years with histologically confirmed, untreated NSCLC and an Eastern Cooperative Oncology Group (ECOG) performance score of 1 or less were eligible to participate and undergo [99mTc]NM-01 SPECT/CT. Exclusion criteria included pregnancy or lactation, severe infection and inability to provide a biopsy sample for assessment of PD-L1. The study was registered with ClinicalTrials.gov (identifier no. NCT02978196). Ethics approval was obtained from Shanghai General Hospital Ethics Committee (approval no. 2016KY220), and all enrolled participants provided written informed consent [11].

SPECT/CT protocol
SPECT/CT examinations were performed on a GE Discovery NM670 SPECT/CT scanner (GE Healthcare; NY, USA). Participants were administered an intravenous bolus of [99mTc]NM-01 at either 3.8-8.4 MBq/kg, equivalent to 100 μg (n = 18; 1.65 ± 0.46 μg/kg; range 1.19-2.11 μg/kg), or 9.1-10.4 MBq/kg, equivalent to 400 μg (n = 3; 5.81 ± 0.25 μg/kg; range 5.56-6.06 μg/kg). Participants were asked to drink 300-500 mL water post-injection and void bladder prior to imaging. Following an uptake time of 60 min, a low-dose CT was performed for anatomical correlation and attenuation correction. SPECT imaging, focusing on primary tumour (thorax) and site(s) of suspected metastases, was performed with the patient supine at 1 and 2 h post-injection at 10 cm/slice/min. Scans were performed as previously described using low-energy high-resolution collimators with a ± 10% energy window centred around 140 keV in a 64 × 64 matrix for tomographic images [11]. A 10% energy window centred at 120 keV was also used for tomographic image acquisition for scatter correction. SPECT was performed over 360° in 60 frames per rotation with 20-s acquisition per frame. Images were reconstructed using OSEM iterative reconstruction (2 iterations, 10 subsets) at a matrix size of 128 × 128 using scatter correction.

Keywords: Technetium, SPECT, Non-small cell lung cancer, Immunotherapy, PD-L1, Single-domain antibody (sdAb)

Image analysis
Images were reviewed by three independent observers blinded to patient details and each other's assessments using Hermes GOLD™ (Hermes Medical Solutions; Stockholm, Sweden). The observers included one nuclear medicine physician, one nuclear medicine clinical fellow in training and one oncology clinical fellow PhD student with 30, 3 and 1 years of experience in nuclear medicine image analysis, respectively. Regions of interest, comprising the primary tumour, metastatic lesions (including lymph nodes) and normal tissue references (lung, liver and blood pool), were identified with CT correlation. Using a freehand manual technique, the maximum count within each region of interest (ROImax) was recorded from the 1- and 2-h SPECT images of each patient (n = 42 studies). ROImax was chosen because ROImean could be affected by differences in the manual segmentation and is more likely to be affected by the partial volume effect. In addition, the method using ROImax was previously shown to correlate with IHC [11]. Freehand ROImax was recorded for normal lung in the right upper lobe (or contralateral upper lobe if pathology was present) for calculation of the tumour-to-lung (T:L) ratio, and for blood pool within the aortic arch for calculation of the tumour-to-blood pool (T:BP) ratio. To evaluate whether rule-based approaches improved consistency of scoring of normal tissue references, ROImax was also recorded using a standardised 3-cm-diameter sphere for normal lung at the level of the aortic arch and carina, and for the liver at the level of the gastroesophageal junction (GOJ) on axial view. Examples of image analysis are provided in Fig. 1.
To determine intraobserver agreement, the two independent observers with least experience (one nuclear medicine and one oncology clinical fellow) repeated their calculations for all measured regions blind to their initial measurements following a 42-day period.
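The ratio calculations described in this section can be illustrated with a minimal sketch. It assumes, as the text implies but does not state as a formula, that each ratio is the lesion ROImax divided by the reference-tissue ROImax. The container `RoiMaxReadings`, the helper `uptake_ratio` and the counts are hypothetical and are not study data.

```python
from dataclasses import dataclass


@dataclass
class RoiMaxReadings:
    """Hypothetical container: ROImax counts from one observer on one study."""
    tumour: float
    blood_pool: float  # reference ROI within the aortic arch
    lung: float        # reference ROI in unaffected upper-lobe lung


def uptake_ratio(lesion_max: float, reference_max: float) -> float:
    """Return lesion ROImax divided by reference-tissue ROImax (assumed definition)."""
    if reference_max <= 0:
        raise ValueError("reference ROImax must be positive")
    return lesion_max / reference_max


# Illustrative counts only, chosen for easy arithmetic.
reading = RoiMaxReadings(tumour=120.0, blood_pool=48.0, lung=30.0)
t_bp = uptake_ratio(reading.tumour, reading.blood_pool)  # tumour-to-blood pool
t_l = uptake_ratio(reading.tumour, reading.lung)         # tumour-to-lung
print(f"T:BP = {t_bp:.2f}, T:L = {t_l:.2f}")
```

The same helper would apply to a lymph node ROImax to obtain LN:BP.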

Statistical analysis
The intraclass correlation coefficient (ICC) is a reliability index that reflects both the degree of correlation and the agreement between measurements. A full description of its application and formulae is available in the literature [12]. ICCs and their 95% confidence intervals (CIs) were calculated using a two-way random-effects consistency model to determine interobserver agreement between all three observers. ICCs and their 95% CIs were calculated using a two-way mixed-effects absolute agreement model to determine intraobserver agreement for two observers. ICC values range from 0 to 1, where values less than 0.5 indicate poor agreement, 0.5-0.75 moderate, 0.75-0.9 good, and greater than 0.9, i.e. close to 1, excellent agreement [12]. As the ICC obtained is an estimate of the true ICC, the levels of agreement are defined by their 95% confidence intervals. Bland-Altman plots and their 95% limits of agreement were used to determine the agreement between observers and their repeat measurements for logarithm-transformed T:BP and LN:BP scores. Linear regression of Bland-Altman plots was performed to determine the β coefficient of the mean difference and demonstrate any proportional bias (where p < 0.05 is significant). Statistical analysis was performed using IBM SPSS Statistics for Windows, version 26.0 (Armonk, NY: IBM Corp.).

Fig. 1 Image analysis using ROImax scoring of [99mTc]NM-01 SPECT/CT of: primary left lower lobe tumour, IHC PD-L1 65%, freehand (a); unaffected lung tissue freehand (b) and using a 3-cm sphere at the level of the aortic arch (c); blood pool reference tissue (d); liver reference tissue freehand (e) and using a 3-cm sphere at the axial level of the gastroesophageal junction (f)
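The difference between the consistency and absolute agreement ICC definitions named in the statistical analysis can be shown with a short sketch. The study used SPSS; the code below is not that implementation. It assumes the single-measures forms, which the text does not specify, computes point estimates only (no confidence intervals), and uses made-up scores. The function names are hypothetical.

```python
from typing import Sequence, Tuple


def _mean_squares(scores: Sequence[Sequence[float]]) -> Tuple[float, float, float, int, int]:
    """Two-way ANOVA mean squares for an n-subjects x k-observers table."""
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 observers")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    msr = ss_rows / (n - 1)              # between-subject mean square
    msc = ss_cols / (k - 1)              # between-observer mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return msr, msc, mse, n, k


def icc_consistency(scores: Sequence[Sequence[float]]) -> float:
    """Single-measures consistency ICC: ignores systematic observer offsets."""
    msr, _, mse, _, k = _mean_squares(scores)
    return (msr - mse) / (msr + (k - 1) * mse)


def icc_absolute_agreement(scores: Sequence[Sequence[float]]) -> float:
    """Single-measures absolute-agreement ICC: penalises observer offsets."""
    msr, msc, mse, n, k = _mean_squares(scores)
    return (msr - mse) / (msr + (k - 1) * mse + (k / n) * (msc - mse))


# Made-up scores: the second observer always reads exactly 1 count higher.
scores = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
print(icc_consistency(scores))         # 1.0: rankings and spacing are identical
print(icc_absolute_agreement(scores))  # about 0.667: the constant offset is penalised
```

A constant offset between observers leaves the consistency ICC at 1 but lowers the absolute agreement ICC, which is why the choice of model matters when interpreting the reported values.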

Participant characteristics
Participants were recruited to the study between March 2018 and April 2019 (n = 21). The median age was 65 years (range 36-75 years); all were of Asian ethnicity. All had a histologically confirmed diagnosis of NSCLC (adenocarcinoma n = 10, squamous cell carcinoma n = 11) with 9 of 21 participants having metastatic disease. A full summary of participant characteristics is provided in Table 1.
Freehand ROImax scoring of non-affected lung background reference tissue demonstrated moderate to excellent interobserver agreement (ICC 0.84; 0.75-0.90). The ICC improved to good to excellent agreement when either rule-based approach was applied, measuring ROImax at the level of the aortic arch (ICC 0.89; 0.82-0.93) or the carina (ICC 0.88; 0.81-0.93). Calculated T:L ratios, when measuring healthy lung ROImax at the level of the aortic arch, also improved to good to excellent agreement (ICC 0.85; 0.77-0.91), compared to the moderate to excellent agreement demonstrated with the freehand (ICC 0.79; 0.68-0.88) and carina rule-based (ICC 0.80; 0.69-0.88) approaches.
Excellent interobserver agreement (ICC 0.97; 0.95-0.98) was also demonstrated for freehand ROImax scores of the liver reference tissue. Applying a consistent rule-based approach to score the liver at the level of the gastroesophageal junction did not improve agreement further (ICC 0.95; 0.92-0.97).
Using a T:BP score of ≥ 2.32 to represent a PD-L1 of ≥ 1%, the interobserver mean sensitivity was 61% and specificity 73% for this cohort (Table 3). Discrepant cases were reviewed, and a consensus was reached between the three observers defining the T:BP as either < or ≥ 2.32 (Table 4). Five cases with PD-L1 expression between 1 and 10% on IHC remained discordant, four of which were considered PD-L1 negative by T:BP score on [99mTc]NM-01 SPECT/CT but positive (≥ 1%) by IHC.
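As an illustration of how sensitivity and specificity follow from the cut-offs quoted above, the sketch below classifies each case by T:BP ≥ 2.32 against IHC PD-L1 ≥ 1% for a single observer. Only the two thresholds come from the text; the function name and the five example cases are hypothetical and do not reproduce the study's Tables 3 or 4.

```python
import math
from typing import Sequence, Tuple

T_BP_CUTOFF = 2.32   # T:BP threshold quoted in the text
IHC_CUTOFF = 1.0     # PD-L1 percentage threshold quoted in the text


def sensitivity_specificity(
    t_bp_ratios: Sequence[float], ihc_percent: Sequence[float]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of T:BP against IHC for one observer."""
    if len(t_bp_ratios) != len(ihc_percent):
        raise ValueError("inputs must have the same length")
    tp = fn = tn = fp = 0
    for ratio, ihc in zip(t_bp_ratios, ihc_percent):
        imaging_positive = ratio >= T_BP_CUTOFF
        ihc_positive = ihc >= IHC_CUTOFF
        if ihc_positive and imaging_positive:
            tp += 1
        elif ihc_positive:
            fn += 1
        elif imaging_positive:
            fp += 1
        else:
            tn += 1
    # NaN signals that a rate is undefined (no IHC-positive or no IHC-negative cases).
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return sensitivity, specificity


# Made-up cases: T:BP ratios and matching IHC PD-L1 percentages.
ratios = [3.0, 2.0, 2.5, 1.5, 2.4]
ihc = [50.0, 5.0, 0.0, 0.0, 1.0]
sens, spec = sensitivity_specificity(ratios, ihc)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

An interobserver mean, as reported in the text, would then be the average of these values across the three observers.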

Intraobserver agreement
Manual ROImax scoring of primary lung tumour, lymph node metastases and blood pool reference tissue using [99mTc]NM-01 SPECT/CT following a 42-day interval was consistent for the two observers analysed (Table 5). The intraobserver ICC for primary lung tumour ROImax scores demonstrated excellent agreement for observer B (ICC 0.96; 95% CI 0.93-0.98) and observer C (ICC 0.95; 0.91-0.97). Scoring of lymph node metastases also demonstrated excellent agreement (observer B ICC 0.97; observer C ICC 0.97; see Table 5 for 95% CIs). The intraobserver ICC for freehand ROImax scores of the blood pool reference tissue (observer B ICC 0.98; observer C ICC 0.97) confirmed excellent agreement. Excellent intraobserver agreement of both T:BP and LN:BP ratios was also demonstrated for observer B (ICC 0.96 and 0.95, respectively) and observer C (ICC 0.95 and 0.95). Bland-Altman plot analysis demonstrated intraobserver agreement with no proportional bias on linear regression for both T:BP and LN:BP scores (Fig. 3).
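The Bland-Altman quantities referred to above can be sketched as follows for two sets of repeat ratio measurements. The sketch assumes a natural-log transform (the base is not stated in the text) and 95% limits of agreement taken as the mean difference ± 1.96 standard deviations. It returns the regression slope of difference on mean as the β coefficient but does not compute its p value. The function name and the data are hypothetical.

```python
import math
import statistics
from typing import Dict, Sequence


def bland_altman_log(first: Sequence[float], second: Sequence[float]) -> Dict[str, float]:
    """Bland-Altman summary for log-transformed paired ratios."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally sized series with >= 2 pairs")
    log_a = [math.log(x) for x in first]
    log_b = [math.log(x) for x in second]
    diffs = [a - b for a, b in zip(log_a, log_b)]
    means = [(a + b) / 2 for a, b in zip(log_a, log_b)]

    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences

    # Least-squares slope of difference on mean: a non-zero slope suggests
    # proportional bias (disagreement that grows or shrinks with uptake).
    mean_of_means = statistics.mean(means)
    sxx = sum((m - mean_of_means) ** 2 for m in means)
    if sxx == 0:
        raise ValueError("all pair means are identical; slope is undefined")
    sxy = sum((m - mean_of_means) * (d - bias) for m, d in zip(means, diffs))

    return {
        "bias": bias,
        "lower_limit": bias - 1.96 * sd,
        "upper_limit": bias + 1.96 * sd,
        "slope": sxy / sxx,
    }


# Made-up repeat readings: the second read is always double the first, so the
# log difference is constant and there is no proportional bias.
result = bland_altman_log([1.0, 2.0, 4.0], [2.0, 4.0, 8.0])
print(result)
```

Because the differences are taken on the log scale, exponentiating the bias and limits expresses them as ratios between the two reads.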
The intraobserver ICC for freehand ROImax scores for healthy lung (observer B ICC 0.87; observer C ICC 0.91) and liver (observer B ICC 0.98; observer C ICC 0.99) demonstrated good to excellent agreement. A trend towards improved intraobserver agreement with rule-based approaches for healthy lung scoring was demonstrated, but no overall difference in the level of agreement was seen. Calculated T:L ratios demonstrated good or excellent intraobserver agreement (ICCs 0.84 to 0.92) irrespective of the healthy lung tissue scoring applied.

Discussion
This study is the first to assess the agreement of SPECT/CT in measuring PD-L1 expression in cancer. Several other radiotracers are currently being developed specifically for imaging the PD-1/PD-L1 axis. Uptake of 18F-BMS-986192 (a fluorine-18-labelled anti-PD-L1 Adnectin) on positron emission tomography (PET) has been shown to correlate with PD-L1 expression in NSCLC, as has uptake of 89Zr-labelled nivolumab with PD-1 expression, both in early phase clinical trials [13]. In both cases, inter- and intra-tumoural heterogeneity was demonstrated, consistent with the findings described in the early phase trial of [99mTc]NM-01 SPECT/CT. An important characteristic of [99mTc]NM-01 is that it is a small (14.3 kDa) antigen-binding fragment radiotracer with rapid blood clearance, with optimal SPECT/CT imaging performed at just 2 h following administration. As [99mTc]NM-01 does not directly block the PD-L1 binding site, it does not interfere with the PD-1/PD-L1 axis and thus has the potential to assess whole-body PD-L1 status before, during and after anti-PD-L1 therapy. Whilst PET/CT provides a higher degree of spatial resolution, there are some notable benefits to SPECT/CT imaging. [99mTc] radioisotope and SPECT imaging are both more widely [...] where applying a set 3-cm sphere to score the unaffected lung at the level of the aortic arch improved the interobserver ICC.
Whilst we did not show any significant improvement in agreement when applying a similar rule to the liver, both inter- and intraobserver ICC remained excellent, suggesting that simple rule-based approaches may be used to standardise and simplify image interpretation without significant impact on quantification. There are some limitations to this study. Firstly, it is limited by its sample size; nevertheless, the relatively narrow confidence intervals suggest a good estimate of the agreement. Despite good to excellent interobserver agreement, the mean sensitivity and specificity were relatively poor, with some discrepant cases in which the PD-L1 assessment determined by T:BP of [99mTc]NM-01 was discordant with that found on IHC. This is not unexpected considering that heterogeneity of PD-L1 measured by IHC is widely reported in the literature and was demonstrated on [99mTc]NM-01 assessment in our previous study [11]. In addition, the cut-off value of T:BP ≥ 2.32 correlating with a PD-L1 of ≥ 1% on IHC was determined on a small sample size and requires further validation in larger cohorts [11]. It is also important to note that the patient cohort was relatively heterogeneous with regard to tumour staging. Due to the low number of measurable extra-nodal (lung and bone) metastases in the cohort (n = 8), statistical analysis using ICC of the quantitative assessment of [99mTc]NM-01 in these lesions was not possible. With further understanding of the relationship between PD-L1 expression by IHC and [99mTc]NM-01 SPECT/CT, it may be possible for both quantitative (as described in this study) and qualitative assessments to be made by observers blind to IHC PD-L1 expression, and their agreement evaluated. SPECT is a highly sensitive imaging modality but has relatively poor resolution; further optimisation with iterative reconstruction methods, along with CT attenuation and scatter corrections, has the potential to further improve and standardise quantification [14].
Novel SPECT reconstruction techniques that enable standardised quantification will be employed in the forthcoming PECan and PELICAN studies [EudraCT 2020-002809-26] to further investigate and validate [99mTc]NM-01 SPECT/CT clinically. This would also enable quantitative comparison with other PD-L1 PET radiotracers, for example the aforementioned 18F-BMS-986192 [13].