The latest proposed update to the EARL recommendations accommodates modern PET technologies, in particular time-of-flight and point spread function (PSF) [6]. Our study compares datasets that are compatible with the current EARL recommendations and with the proposed update. The introduction of time-of-flight and PSF has been shown to have minimal effect on liver and mediastinal uptake [10], but new hardware and reconstruction algorithms yield higher SUV in small lesions [4] and therefore affect DS classification. Our findings show that if the proposed update to the EARL recommendations is accepted, it will have an impact on DS and therapy response evaluation. The studies behind the recommendations on DS and treatment evaluation were performed on older generations of PET-CT scanners, and it is not known whether DS obtained from new state-of-the-art PET-CT scanners will have an impact on patient outcome in large cohorts.
Whether the choice of reconstruction method affects DS was recently investigated by Enilorac et al. [13], who compared one dataset with unfiltered PSF (Siemens HD) and one where a 6-mm Gaussian filter was applied to the PSF images to match the EARL requirements. The proportion of major discordances was comparable to our findings for SUVmax, but our conclusions differ. In their study of 126 patients, no difference in progression-free survival or overall survival was seen between reconstruction methods when patients were classified as responders or non-responders. However, they analysed i-PET and EoT-PET separately, yielding small groups with only a few patients classified differently depending on the reconstruction method.
Several aspects of DS should be considered, such as how SUV in the reference organs is measured, the cut-off for DS 5, and how to handle patients with a higher SUV in the mediastinum than in the liver. In our study, we used automatic segmentation of the liver and mediastinum [6]. The edges around the liver and aortic wall were automatically truncated to avoid uptake from adjacent structures and the vessel wall. Segmentations were manually corrected when needed. This method increases the likelihood of obtaining the true SUVmax. When manual ROIs are placed in reference organs, there is an apparent risk of missing the true SUVmax. In a couple of patients, we found the mediastinal SUVmax to be higher than the liver SUVmax. This was confirmed with manual ROI measurements (data not shown). There is no support in the literature or guidelines on how these cases should be managed in terms of DS classification. However, in our study, no patient had lesion uptake between the uptake in the mediastinum and that in the liver. There is no consensus on where the cut-off for DS 5 should be, and limits of both two and three times the maximum uptake in the liver have been proposed [4]. We classified DS 5 as two times the maximum uptake in the liver. There were few major discordances in DS between reconstruction methods (i.e. cases where a non-responder (QC) is reclassified as a responder (EARLlower/EARLupper)); such discordances have clinical significance in terms of treatment strategy. If a worst-case scenario is preferred, then using settings that adhere to the newly proposed EARL recommendation is more suitable.
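The classification logic discussed above can be illustrated with a minimal sketch. It is not the software used in this study. The function names, the use of `None` to signal no residual uptake, the strict inequality at the DS 5 cut-off, and the convention that DS 1 to 3 counts as responder are all assumptions made for illustration. New lesions, which can also lead to DS 5, are not handled.

```python
def deauville_score(lesion_suv, mediastinum_suv, liver_suv, ds5_factor=2.0):
    """Return a Deauville score (1-5) from lesion and reference-organ SUV.

    lesion_suv is None when there is no residual uptake (assumed convention).
    ds5_factor is the multiple of liver uptake used as the DS 5 cut-off;
    2.0 follows the text, 3.0 is the other proposed limit.
    """
    if lesion_suv is None:
        return 1
    # Unresolved case from the text: mediastinum hotter than liver and the
    # lesion in between. No guideline exists, so this sketch refuses to guess.
    if mediastinum_suv > liver_suv and liver_suv < lesion_suv <= mediastinum_suv:
        raise ValueError("lesion uptake lies between liver and mediastinum")
    if lesion_suv <= mediastinum_suv:
        return 2
    if lesion_suv <= liver_suv:
        return 3
    # Assumption: DS 5 requires uptake strictly above the cut-off.
    if lesion_suv <= ds5_factor * liver_suv:
        return 4
    return 5


def is_major_discordance(ds_a, ds_b):
    """True when two scores fall on opposite sides of the responder boundary.

    Assumption: DS 1-3 is treated as responder and DS 4-5 as non-responder.
    """
    return (ds_a <= 3) != (ds_b <= 3)
```

Applying `deauville_score` to the same lesion measured on two reconstructions and passing both results to `is_major_discordance` flags the pairs that would change the responder classification. Changing `ds5_factor` from 2.0 to 3.0 shows how the choice of DS 5 cut-off shifts patients between DS 4 and DS 5 without affecting responder status.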
We included baseline exams in order to increase the study population, although DS is normally not calculated in baseline examinations. In theory, a follow-up scan could look like a baseline scan. In the retrospective analysis, baseline exams did not show any major discordances: for SUVmax, there were 2 discordances, and for SUVpeak, there were also 2 discordances. If all baseline scans were removed from the study, the results would show an even higher percentage of discordances across all pairwise comparisons between reconstruction algorithms.
No firm recommendation on how to obtain DS exists, although SUVmax appears to be the most commonly used measure. In this study, we investigated both SUVmax and SUVpeak, as proposed by Barrington et al. [4] and by the newly proposed EARL recommendations [9]. SUVmax is more noise dependent [14]; therefore, SUVpeak is a more stable measure. This was also true in our study, where we did not find any significant differences in DS between QC and EARLupper. However, SUVpeak requires a lesion larger than 1 cm to be relevant. SUVmax, on the other hand, has been shown to be unreliable in sub-centimetre lesions when PSF is used [15]. There is no standard definition of SUVpeak, which is reflected in the differing implementations of its calculation in various software packages. Harmonization across vendors is desirable to further increase its reproducibility.
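To make the difference between SUVmax and SUVpeak concrete, the sketch below implements one possible SUVpeak variant: the mean SUV of all voxels whose centres lie within a 1 mL sphere centred on the hottest voxel. This definition, the function name `suv_peak`, and the nested-list volume layout are assumptions chosen for illustration. As noted above, software packages differ, for example in where the sphere is centred and how partially included voxels are weighted.

```python
import math


def suv_peak(volume, voxel_size_mm, sphere_volume_ml=1.0):
    """Mean SUV inside a sphere centred on the hottest voxel.

    volume        : nested lists indexed [z][y][x] holding SUV values
    voxel_size_mm : (dz, dy, dx) voxel spacing in millimetres
    This is one illustrative variant, not a standardised definition.
    """
    dz, dy, dx = voxel_size_mm
    # Radius of a sphere with the requested volume (1 mL = 1000 mm^3).
    radius = (3.0 * sphere_volume_ml * 1000.0 / (4.0 * math.pi)) ** (1.0 / 3.0)

    # Locate the hottest voxel; its value is SUVmax.
    _, (cz, cy, cx) = max(
        (value, (z, y, x))
        for z, plane in enumerate(volume)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
    )

    # Average every voxel whose centre lies within the sphere.
    total, count = 0.0, 0
    for z, plane in enumerate(volume):
        for y, row in enumerate(plane):
            for x, value in enumerate(row):
                dist = math.sqrt(
                    ((z - cz) * dz) ** 2
                    + ((y - cy) * dy) ** 2
                    + ((x - cx) * dx) ** 2
                )
                if dist <= radius:
                    total += value
                    count += 1
    return total / count
```

For a single hot voxel surrounded by background, the value returned is pulled well below SUVmax because background voxels inside the sphere enter the average. This illustrates both why SUVpeak is less sensitive to noise and why it loses relevance for lesions smaller than the averaging sphere.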
Limitations
In this study, we included all patients with lymphoma, regardless of the indication for the PET-CT examination. In clinical routine, DS is only used for therapy assessment and not for initial staging/baseline. However, in order to increase the number of patients and the range of included DS, patients referred for baseline PET-CT were also included. Despite this, we recognize the limitations of the study due to its small sample size and its monocentric nature.
Although we have shown considerable differences in DS between the reconstruction algorithms, it remains to be proven which reconstruction algorithm has the most favourable outcome for the patients. The type of lymphoma and the intensity of stage-adapted chemotherapy add further complexity to the outcome.
It would be of interest to compare the upper and lower limits of the newly proposed EARL recommendations, but for our PET-CT system, longer acquisition times are necessary to reach the new upper limit, which was not feasible for the current study.