Effects of rigid and non-rigid image registration on test-retest variability of quantitative [18F]FDG PET/CT studies

Background [18F]fluoro-2-deoxy-D-glucose ([18F]FDG) positron emission tomography (PET) is a valuable tool for monitoring response to therapy in oncology. In longitudinal studies, however, patients are not scanned in exactly the same position. Rigid and non-rigid image registration can be applied in order to reuse baseline volumes of interest (VOI) on consecutive studies of the same patient. The purpose of this study was to investigate the impact of various image registration strategies on standardized uptake value (SUV) and metabolic volume test-retest variability (TRT).
Methods Test-retest whole-body [18F]FDG PET/CT scans were collected retrospectively for 11 subjects with advanced gastrointestinal malignancies (colorectal carcinoma). Rigid and non-rigid image registration techniques with various degrees of locality were applied to PET, CT, and non-attenuation corrected PET (NAC) data. VOI were drawn independently on both test and retest scans. VOI drawn on test scans were projected onto retest scans and the overlap between projected VOI and manually drawn retest VOI was quantified using the Dice similarity coefficient (DSC). In addition, absolute (unsigned) TRT of SUVmax, SUVmean, metabolic volume and total lesion glycolysis (TLG) was calculated between the test VOI on the one hand and the retest VOI or the projected VOI on the other. Reference values were obtained by delineating VOIs on both scans separately.
Results Non-rigid PET registration showed the best performance (median DSC: 0.82, other methods: 0.71-0.81). Compared with the reference, none of the registration types showed significant absolute differences in TRT of SUVmax, SUVmean and TLG (p > 0.05). Only for absolute TRT of metabolic volume, significantly lower values (p < 0.05) were observed for all registration strategies when compared to delineating VOIs separately, except for non-rigid PET registrations (p = 0.1). Non-rigid PET registration provided good volume TRT (7.7%) that was smaller than the reference (16%).
Conclusion In particular, non-rigid PET image registration showed good performance similar to delineating VOI on both scans separately, and with smaller TRT in metabolic volume estimates.


Background
Positron emission tomography (PET) has now been accepted as a valuable tool in oncology, not only for detecting or staging disease and estimating target volumes for radiotherapy purposes, but also for monitoring response to therapy and predicting prognosis [1,2].
To date, [18F]fluoro-2-deoxy-D-glucose ([18F]FDG) is the most widely used tracer for oncological applications. Especially for monitoring response to therapy, it is likely that quantitative assessment of [18F]FDG uptake will become the standard. One practical issue in longitudinal PET/CT studies is that patients are not scanned in exactly the same position. Therefore, baseline volumes of interest (VOI) cannot be reused directly. Reusing baseline VOI is of interest for measuring changes in tracer uptake (response) compared with baseline and for studying changes in uptake heterogeneity [3]. Rigid or non-rigid image registration needs to be applied to enable reuse of baseline VOI on response scans. Rigid image registration only allows for rotational and translational movements of the entire image, whereas non-rigid image registration allows for any type of local (elastic) deformation. De Moor et al [4] showed that non-rigid image registration of [18F]FDG images could be used for easier and faster therapy assessments.
Validating image registration strategies for response assessments poses several problems. Response monitoring requires a scan before therapy (a baseline scan), followed by one or several scans sometime during treatment. Because the interval between these scans can be several weeks, tumours are likely to change in volume and/or tracer uptake due to either treatment or progression of disease. On the other hand, test-retest studies are acquired within a limited time frame, without administration of therapy between scans, so only small differences in metabolic volume and tracer uptake are expected. Image registration strategies that fail to properly register such test and retest scans can be discarded for future use and, therefore, application to test-retest studies should be regarded as a first attempt to validate image registration strategies for oncological response assessment studies.
The purpose of this study was to investigate the impact of various image registration strategies on standardized uptake value (SUV) and metabolic volume test-retest variability (TRT). To this end, test-retest PET/CT data from patients with advanced gastrointestinal malignancies were collected retrospectively in order to evaluate various image registration strategies as candidates for registration of response assessment studies. Previously, effects of intermodality rigid and non-rigid image registration on maximum SUV (SUVmax) and metabolic tumour volume have been determined for non-small cell lung cancer patients [5]. It was shown that the type of registration had no significant impact on SUVmax. To the best of our knowledge, this is the first study that reports on the impact of various types of intramodality image registration (with different levels of image cropping and various types of input data) on various quantitative measures derived from the PET scan (and in particular on the repeatability of those measures).

Patient data
Two baseline whole-body [18F]FDG PET/CT studies were acquired for 11 subjects (9 male, 2 female; age: 63 ± 11 y (mean ± standard deviation); weight: 90 ± 15 kg) with advanced gastrointestinal malignancies (colorectal carcinoma) at five different sites [6]. Nine patients were scanned on a Biograph PET/CT scanner (CTI/Siemens, Knoxville, TN, USA) and two patients were scanned on a Gemini TF-64 PET/CT scanner (Philips Healthcare, Cleveland, OH, USA). Test and retest studies were performed within 12 days (5.5 ± 3.4 days) of each other. All patients fasted for at least 4 h before scanning and refrained from strenuous activity. Blood glucose levels were obtained for each patient prior to scanning and these levels were within the normal range (5.7 ± 1.4 mmol/l). Patients had received no therapy (chemotherapy, radiotherapy, or surgical treatment) for at least 2 weeks prior to the first baseline [18F]FDG PET/CT scan.
A static whole-body emission scan was started 79 ± 20 min after injection of [18F]FDG (509 ± 98 MBq). Prior to this emission scan, a low dose CT scan was acquired for attenuation correction purposes (120-130 kVp and 54-133 mAs) during normal breathing. All data were acquired and reconstructed according to local guidelines that were in accordance with recently published guidelines for quantitative [18F]FDG PET studies [7,8]. For the Gemini TF-64 PET/CT, PET images were reconstructed onto a 144 × 144 image matrix (voxel size: 4.0 × 4.0 × 4.0 mm) using a row action maximum likelihood algorithm. The corresponding CT images were reconstructed onto a 512 × 512 image matrix with a voxel size of 0.78 × 0.78 × 5.0 mm. For the Biograph PET/CT, PET images were reconstructed onto either 128 × 128 (voxel size: 5.2 × 5.2 × 2.4 mm, n = 2; or 5.3 × 5.3 × 3.4 mm, n = 5) or 168 × 168 (voxel size: 4.1 × 4.1 × 2.0 mm, n = 2) image matrices using an ordered-subsets expectation maximization algorithm. The corresponding CT images were reconstructed onto a 512 × 512 image matrix with a voxel size of 0.98 × 0.98 × 2.4 (n = 2), 0.98 × 0.98 × 2.5 (n = 5) or 0.98 × 0.98 × 4.0 (n = 2) mm. Both attenuation corrected and non-attenuation corrected (NAC) PET images were obtained. After reconstruction, attenuation-corrected PET data were transformed to SUV using:
All PET and CT data were acquired as part of an ongoing clinical study [6], which was approved by an authorised medical ethical review committee, and informed consent was obtained from each patient prior to inclusion in the study.
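The SUV conversion referred to in the reconstruction paragraph above is not reproduced in this text. As a minimal sketch only, the snippet below assumes the commonly used body-weight normalisation (activity concentration divided by injected activity per gram of body weight) and assumes that both activities are decay-corrected to the same time point. The function name `suv_body_weight` and the example activity concentration are hypothetical.

```python
def suv_body_weight(activity_kbq_per_ml, injected_dose_mbq, body_weight_kg):
    """Body-weight normalised SUV in g/ml.

    Assumed definition (not taken from the paper): tissue activity
    concentration divided by injected activity per gram of body weight.
    """
    if injected_dose_mbq <= 0 or body_weight_kg <= 0:
        raise ValueError("injected dose and body weight must be positive")
    dose_kbq = injected_dose_mbq * 1000.0   # MBq -> kBq
    weight_g = body_weight_kg * 1000.0      # kg -> g
    return activity_kbq_per_ml / (dose_kbq / weight_g)


# Illustration with the cohort means reported above (509 MBq, 90 kg)
# and a hypothetical voxel value of 20 kBq/ml.
example_suv = suv_body_weight(20.0, 509.0, 90.0)
```

With this normalisation the SUV carries units of g/ml, which matches the units quoted for SUVmax and SUVmean elsewhere in the article.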

Image registration strategies
Several rigid and non-rigid strategies were attempted, based on various input data:
• PET to PET image registration. This registration type takes functional information into account;
• NAC to NAC image registration, after which the transformation was applied to the PET images. NAC images are not affected by erroneous attenuation correction that might be caused by a possible (small) mismatch between PET and CT;
• CT to CT image registration, after which the transformation was applied to the PET data. This registration takes anatomical information into account. No global mismatches between CT and PET images were observed. The low dose CT scans were downsampled to the PET resolution prior to image registration to increase computational performance and to avoid issues with computer memory;
• CT to CT image registration, after which the transformation was used to initialize PET to PET registration (referred to as CTPET). This registration takes first the anatomical and second the functional information into account. This method was only used for (non-linear) non-rigid transformations, as (linear) rigid CTPET-based image registration would produce identical results to rigid PET-based image registration.
These various types of image registration were applied on:
• Whole-body images as obtained from the PET/CT reconstruction, referred to as 'global';
• Whole-body images as obtained from the PET/CT reconstruction, cropped from the shoulders until just above the bladder, so that they did not include high tracer uptake regions (i.e. the head and bladder). This method is referred to as 'semi-local';
• Whole-body images as obtained from the PET/CT reconstruction, cropped in such a way that they included either only the liver or only the lung region. This method is referred to as 'local'.
In total, 21 different image registration strategies were investigated, summarized in Table 1. In all cases similarity between scans was measured by minimizing summed squared differences (SSD), maximizing normalized cross correlation (NCC) or maximizing mutual information (MI) [9]. For non-rigid PET registration, high uptake regions (regions having an SUV larger than 10, e.g. the bladder or the brain) were capped at an SUV of 10 to enhance performance and to avoid image artefacts. When MI was used, the cap was applied during (joint) histogram calculations to avoid loss of performance due to the limited number of grey level bins. When NCC and SSD were used, the cap was applied to the images before registration to create temporary images that were used as input for the image registration, thereby improving the matching of high uptake regions that can vary strongly between scans, e.g. the bladder.
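The three similarity measures and the SUV cap described above can be illustrated with a short sketch. It is not the implementation used by any of the registration packages: it operates on flat lists of voxel values from two already aligned images, and all function names are hypothetical. The cap is applied to the images beforehand for SSD and NCC, and inside the histogram binning for MI, mirroring the description in the text.

```python
import math
from collections import Counter


def cap_suv(values, cap=10.0):
    """Limit high-uptake voxels (e.g. bladder) to the cap value."""
    return [min(v, cap) for v in values]


def ssd(a, b):
    """Summed squared differences (lower means more similar)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ncc(a, b):
    """Normalized cross correlation (higher means more similar)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0


def mutual_information(a, b, bins=32, cap=10.0):
    """Mutual information (in nats) from a joint histogram.

    Values are clipped to [0, cap] while binning, so the limited number
    of grey level bins is spent on the range below the cap.
    """
    def to_bin(v):
        v = min(max(v, 0.0), cap)
        return min(int(v / cap * bins), bins - 1)

    bins_a = [to_bin(v) for v in a]
    bins_b = [to_bin(v) for v in b]
    n = len(bins_a)
    joint = Counter(zip(bins_a, bins_b))
    marg_a, marg_b = Counter(bins_a), Counter(bins_b)
    mi = 0.0
    for (i, j), count in joint.items():
        p_ij = count / n
        mi += p_ij * math.log(p_ij / ((marg_a[i] / n) * (marg_b[j] / n)))
    return mi


# Hypothetical voxel values; the third voxel mimics a bladder-like hot spot.
fixed = cap_suv([1.0, 2.0, 12.0, 4.0])
moving = cap_suv([1.0, 2.5, 30.0, 4.0])
ssd_value = ssd(fixed, moving)
```

In this toy example the hot spot differs by a factor of 2.5 between the two scans, yet after capping it contributes nothing to the SSD, which is the motivation given in the text for limiting the SUV.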
In the case of Elastix, optimal parameters for the adaptive stochastic gradient descent optimisation method (i.e. step size) were derived for each combination of strategy, similarity measure and type of image registration (rigid or non-rigid) using data from one patient [12]. Four resolution levels were applied, except for rigid PET, for which three resolution levels were applied. In the finest resolution level, the control point spacing of the B-spline transformation was set to 16 mm and 32 mm for rigid and non-rigid transformations, respectively. In each resolution level, 32 grey level bins and 2,000 spatial samples were used to compute the mutual information. A maximum of 500 iterations was applied. During registration and optimization the order of B-spline interpolation was set to linear. Prior to any non-rigid registration, a rigid registration was performed. More details on the implementation of Elastix can be found elsewhere [10].
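For orientation, the settings listed above can be written down in the parenthesised key-value style used by Elastix parameter files. The sketch below is an assumption-laden illustration, not the authors' parameter file: the numeric values come from the text (non-rigid case, mutual information), whereas the component names chosen for the metric, optimizer and transform are assumed, and the tuned step size is omitted because it is not reported. The helper `render_parameter_file` is hypothetical.

```python
# Numeric values are taken from the text; component names are assumptions.
NONRIGID_MI_SETTINGS = {
    "Registration": "MultiResolutionRegistration",
    "Metric": "AdvancedMattesMutualInformation",   # assumed MI metric
    "Optimizer": "AdaptiveStochasticGradientDescent",
    "Transform": "BSplineTransform",
    "NumberOfResolutions": 4,
    "NumberOfHistogramBins": 32,
    "NumberOfSpatialSamples": 2000,
    "MaximumNumberOfIterations": 500,
    "FinalGridSpacingInPhysicalUnits": 32,          # mm, non-rigid case
    "BSplineInterpolationOrder": 1,                 # linear interpolation
}


def render_parameter_file(settings):
    """Render a dict as '(Key value)' lines, quoting string values."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, str):
            lines.append(f'({key} "{value}")')
        else:
            lines.append(f"({key} {value})")
    return "\n".join(lines) + "\n"


parameter_text = render_parameter_file(NONRIGID_MI_SETTINGS)
```

Keeping the settings in a dictionary makes it straightforward to generate one file per combination of strategy and similarity measure, which is how the per-combination tuning described above could be organised.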
In the case of splineMIRIT, stages of resolution and refinements were optimized for each strategy to avoid artefacts and increase performance. In total, 10, 6 or 10 stages of resolutions and refinements were used for non-rigid PET, CT or CTPET registration, respectively. In each resolution level, 32 grey level bins were used to compute the mutual information. No maximum number of iterations was imposed. Quadratic B-splines were used for interpolation. No rigid registration was performed prior to non-rigid registration. More details on the implementation of splineMIRIT can be found elsewhere [11].
No optimization or calibration was required for RegisRigid. RegisRigid makes use of a greedy search algorithm and three different resolution levels to obtain translation and rotation parameters [4]. In each resolution level, 20 grey level bins were used to compute the mutual information. No maximum number of iterations was imposed. During registration and optimization trilinear interpolation was used.

Data analysis
In total, 24 lesions could be identified that were located in liver (n = 13), lung (n = 10) or bone (n = 1). VOIs were drawn on both test (VOItest) and retest (VOIretest) scans using a three-dimensional (semi-)automatic isocontour method at 50% of the maximum pixel value [13]. For each VOI, SUVmax, mean SUV within a VOI (SUVmean), volume of the VOI (metabolic volume) and total lesion glycolysis (TLG, calculated as product of SUVmean and metabolic volume) were obtained. For all these parameters, both relative (signed) and absolute (unsigned) TRT values were calculated as the (absolute) difference between the results of the test and retest scans, divided by the mean of these two values. These TRT values were used as a reference.
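The TRT and TLG definitions given above translate directly into a few lines of code. The sketch below expresses TRT as a percentage and assumes a sign convention of retest minus test for the relative value; the text does not state the direction of the difference, so that choice, like the function names, is an assumption.

```python
def tlg(suv_mean, metabolic_volume_ml):
    """Total lesion glycolysis: SUVmean times metabolic volume."""
    return suv_mean * metabolic_volume_ml


def relative_trt(test, retest):
    """Signed test-retest variability in percent.

    Difference (assumed here as retest minus test) divided by the
    mean of the two measurements.
    """
    mean = (test + retest) / 2.0
    if mean == 0:
        raise ValueError("mean of test and retest must be non-zero")
    return 100.0 * (retest - test) / mean


def absolute_trt(test, retest):
    """Unsigned test-retest variability in percent."""
    return abs(relative_trt(test, retest))


# Hypothetical lesion: metabolic volume of 10 ml on test and 12 ml on retest.
volume_trt = absolute_trt(10.0, 12.0)
```

Dividing by the mean of the two measurements rather than by the test value keeps the measure symmetric: swapping test and retest changes only the sign of the relative TRT and leaves the absolute TRT unchanged.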
To validate the image registration techniques, VOItest were transformed according to the transformation parameters obtained (resulting in VOIreg). Dice similarity coefficients (DSC) were calculated between VOIreg and VOIretest:

$$\mathrm{DSC} = \frac{2\,|X \cap Y|}{|X| + |Y|}$$

where |X| denotes the volume of VOIreg, |Y| the volume of VOIretest, and |X ∩ Y| the overlap between the two volumes. In addition, VOIreg were applied to the retest scans to obtain SUVmax, SUVmean, metabolic volume and TLG. To measure the impact on TRT, for each parameter, absolute (unsigned) TRT was calculated as the absolute difference between results of the test scans (using VOItest) and those of the retest scans using VOIreg, after which these results were compared with the absolute reference TRT data. To assess whether any of the registration strategies showed a systematically different trend compared with the reference, relative (signed) TRT values were also calculated and compared with relative reference TRT values. A two-tailed Wilcoxon signed-rank test was applied to reference TRT values and TRT values obtained after image registration, or between DSC values obtained using Elastix, RegisRigid or splineMIRIT. P-values less than 0.05 were considered significant.
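The overlap measure used for validation can be sketched on VOIs stored as sets of voxel indices. In the example, a pure integer translation stands in for the transformation obtained from registration; this is a deliberate simplification, since the registrations in this study include rotations and elastic deformations. The names `dice` and `translate` are hypothetical.

```python
from itertools import product


def dice(voi_a, voi_b):
    """Dice similarity coefficient of two VOIs given as sets of voxels."""
    a, b = set(voi_a), set(voi_b)
    if not a and not b:
        raise ValueError("both VOIs are empty")
    return 2.0 * len(a & b) / (len(a) + len(b))


def translate(voi, shift):
    """Move every voxel of a VOI by an integer (z, y, x) offset."""
    dz, dy, dx = shift
    return {(z + dz, y + dy, x + dx) for z, y, x in voi}


# Hypothetical 2 x 2 x 2 voxel lesion that sits one voxel further along x
# in the retest scan.
voi_test = set(product(range(2), range(2), range(2)))
voi_retest = translate(voi_test, (0, 0, 1))

dsc_unregistered = dice(voi_test, voi_retest)
voi_reg = translate(voi_test, (0, 0, 1))   # stand-in for the registration result
dsc_registered = dice(voi_reg, voi_retest)
```

Without registration half of the voxels overlap, whereas applying the correct shift brings the projected VOI into full agreement with the retest VOI.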
As there is no reference value for DSC, an estimate was made of the largest possible overlap in volumes between both VOI delineated separately on test and retest scans. This number is hypothetical and, theoretically, can only be achieved when VOItest can be folded exactly onto VOIretest, retaining its original volume.
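One way to read this upper bound, offered here as an interpretation rather than as the authors' exact calculation, is that the overlap can never exceed the smaller of the two volumes. Substituting that maximum overlap into the Dice formula gives a ceiling that depends only on the two volumes. The function name `max_possible_dsc` is hypothetical.

```python
def max_possible_dsc(volume_test_ml, volume_retest_ml):
    """Upper bound on DSC when the smaller VOI lies fully inside the larger."""
    total = volume_test_ml + volume_retest_ml
    if total <= 0:
        raise ValueError("volumes must be positive")
    return 2.0 * min(volume_test_ml, volume_retest_ml) / total


# Hypothetical lesion measuring 10 ml on the test scan and 12 ml on the retest scan.
dsc_ceiling = max_possible_dsc(10.0, 12.0)
```

The bound equals 1 only when both delineations have the same volume, so any test-retest difference in metabolic volume lowers the best DSC that a volume-preserving projection could reach.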

Computation time
Depending on the registration strategy and level of image cropping, the computation times on a PC with a Core 2 Duo 2.53 GHz CPU (Intel, Santa Clara, CA, USA) were around 0.5 to 4 min for Elastix, 1 to 3 min for RegisRigid and 20 to 90 min for splineMIRIT.
Table 2 shows median DSC values for various global registration strategies and using several similarity measures.

Effects of input data on DSC
Figure 1 shows box plots illustrating DSC of global rigid and non-rigid registration strategies using various input data, with corresponding mean, median and range given in Additional file 1: Table S1. Non-rigid image registration outperformed rigid image registration for most input data (23% higher median DSC), except CT (5% lower median DSC). For rigid registration, CT input data provided the highest DSC (median: 0.72 and 0.71 using Elastix and splineMIRIT, respectively), while for non-rigid image registration, both PET and CTPET input data resulted in the highest DSC (median: 0.82 and 0.78 using Elastix and splineMIRIT, respectively). NAC data did not provide an improvement in median DSC for rigid registration strategies and showed more artefacts in the registered images following non-rigid image registration, resulting in an increased number of outliers. Therefore, in the remainder only results using CT, CTPET and PET as input data for registrations will be provided. Both Elastix (Figure 1A) and RegisRigid/splineMIRIT (Figure 1B) provided similar trends and most differences between programs were non-significant (p > 0.12). Elastix, however, showed a small but significant improvement in non-rigid PET and CTPET image registration of 5% in median and 4% in mean DSC values (p < 0.001). Therefore, only data obtained using Elastix will be presented.

Effects of image cropping on DSC
Figure 2 shows box plots illustrating the effects of various levels of image cropping on DSC of rigid and non-rigid registration strategies using various input data. Corresponding mean, median and range are given in Additional file 2: Table S2. None of the changes in performance of semi-local and local compared with global registration were significant (p > 0.05). As the various levels of image cropping produced no improvement and no significant change in DSC for non-rigid image registration strategies (p > 0.36), cropping in combination with non-rigid image registration will not be considered further. A small but non-significant (p > 0.14) improvement in median DSC values was observed for local rigid image registration compared with global rigid image registration. For global rigid PET registration, 80% of lung lesions showed a DSC of less than 0.50. Using local PET registration, this number decreased to 40%. In the case of non-rigid PET registration, however, all lung lesions showed a DSC of more than 0.59. For semi-local rigid PET registration, one subject with a lung lesion showed a decrease in DSC (from 0.38 to 0.14) compared with global and local PET registration (Figure 3). Consequently, only results of local rigid image registration will be reported as an illustration of the effects of cropping on various quantitative measures. Figure 3 also shows a mismatch between CT and PET, causing CT registration to show less overlap between VOI and lesion than PET registration. For rigid CT registration, only 30% of the lung lesions showed a DSC of less than 0.5. Neither non-rigid nor local CT registration changed this number.

Effects on TRT of various quantitative PET measures
Figure 5 shows box plots illustrating the effects of (local) rigid and non-rigid registration strategies on absolute TRT of various quantitative measures derived from final PET images. TRT of SUVmax (Figure 5A, Table 3), SUVmean (Figure 5B, Table 3) and TLG (Figure 5D, Table 3) showed no significant differences between registration strategies and reference (p > 0.20), except for non-rigid CTPET registration with a significantly lower absolute TRT of SUVmean (p < 0.05). Only non-rigid image registration provided the same absolute TRT of SUVmax as the reference. Global rigid CT registration showed one subject with a lung lesion that had a higher absolute TRT of SUVmax (37.3) than other registration strategies (10.3). For all registration strategies, except non-rigid PET registration (p = 0.10), absolute TRT of metabolic volume (Figure 5C, Table 3) was significantly lower than the reference (p < 0.05). For rigid image registration, some lesions showed small absolute TRT values that were larger than zero. Figure 6 shows effects of similarity measures on absolute TRT of various quantitative measures derived for non-rigid PET registration. Both SSD and NCC showed lower absolute TRT of volume and SUVmean than MI. These performance differences corresponded with the results for similarity presented in Table 2. Figure 7 shows box plots illustrating the effects of (local) rigid and non-rigid registration strategies on relative TRT of various quantitative PET measures. For relative TRT of SUVmean (Figure 7B, Table 3), all registrations showed a significantly higher median value compared to the reference (p < 0.05). Only non-rigid CTPET and PET registrations showed a negative median value (-1.4 and -0.7%, respectively) in line with the reference (-5.8%). Other registration strategies showed a small positive relative median TRT (range: 2.4-9.1%).
This was also reflected in relative TRT of TLG (Figure 7D, Table 3), where CTPET and PET registrations were the only registration strategies that showed no significant differences compared to the reference (p > 0.35). For relative TRT of volume (Figure 7C, Table 3) and SUVmax (Figure 7A, Table 3) similar trends were observed as for absolute TRT, except that for all registration strategies the differences in metabolic volume were not significant compared to the reference (p > 0.51).

Figure 3 Example images of one patient with lung lesions. Example of coronal and sagittal images of one patient with lung lesions, illustrating the effects of various rigid and non-rigid registration strategies using CT or PET as input data. The four lower rows show the test scan registered onto the retest scan (reference, shown on the top row). VOIretest is shown in red, indicated by blue arrows. All images are resliced to the same position of VOIretest and all images are shown using the same colour scales.

van Velden et al. EJNMMI Research 2012, 2:10 http://www.ejnmmires.com/content/2/1/10

Discussion
This study investigates the impact of various image registration strategies on test-retest variability of SUV and metabolic volume derived from repeat [18F]FDG PET scans. The main purpose of the study was to identify image registration strategies that can serve as candidates for registration of repeat [18F]FDG PET scans in order to monitor response to treatment. To the best of our knowledge, this is the first study that reports on the impact of various types of image registration (with different levels of image cropping and various types of input data) on quantitative measures derived from [18F]FDG PET scans. Note that this study did not focus on the accuracy of the entire registered images, but on the accuracy of the registered baseline VOIs only. The idea is that not the images but the VOIs will be transformed, such that tracer uptake quantification will always be performed on the original (i.e. non-transformed) images. In this setting, accurate and precise VOI transformation is most important. However, all registered images were checked visually for image artefacts that might have resulted from (non-rigid) image registration. Occasionally, during optimization and calibration of the registration strategies, slightly higher DSCs than those reported in this paper were observed for some patients when other registration parameters were used. However, the use of these parameters was not considered feasible for reuse of baseline VOIs due to (severe) image artefacts that were observed in the registered images. Only those parameters were used that showed a high DSC but did not show image artefacts. Although the parameters were chosen carefully, recalibration of the registration strategies might be required for other purposes (other types of studies or tracers) and final registration results should always be supervised (i.e. visually checked). Two independent software packages were used to investigate the various registration strategies.
Results were very similar (Figure 1), although the small improvement in DSC values obtained with Elastix was significant.
Consistent with a previous study [5] showing that the type of intermodality image registration had no significant impact on SUVmax, the present study showed that the type of intramodality registration had no significant impact on absolute TRT of SUVmax compared with delineating VOIs separately on both scans. The same was true for SUVmean (except for CTPET registration) and TLG. Only for absolute TRT of metabolic volume, significantly lower values were observed for all registration strategies when compared to delineating VOIs separately, except for non-rigid PET and CTPET registrations. Rigid image registration does not allow any changes in volume, so in theory the absolute TRT of volume should always be zero. Some lesions, however, showed small non-zero TRT values due to small sampling errors or VOIs that were moved partly outside the image borders after registration.
For relative TRT values, similar trends were observed. However, the type of intramodality registration had no significant impact on relative TRT of metabolic volume. In addition, for relative TRT of SUVmean and TLG,

Figure 4 Effects of lesion size on DSC. Box plots illustrating the effects of lesion size on DSC obtained using (local) rigid and non-rigid registration strategies. DSC values were obtained from (a) large lesions (n = 8; average: 166 ml, range: 48-749 ml) or (b) small lesions (n = 16; average: 13 ml, range: 1.1-33 ml). Reference shows the largest overlap in volume that could be achieved for VOI delineated on both scans separately. The mean is illustrated by a square, outliers by dots, and minimum and maximum values by crosses.
For most lesions, CT registration provided accurate results. As illustrated in Figure 3, however, some small lung lesions showed small misalignments between PET and CT that could have been caused by respiratory motion. This resulted in a poorer performance of CT registration for these lesions. In these cases, performance of CT image registration could probably be improved by using respiratory gating [14] or intermodality image registration to correct for small residual misalignments between CT and PET [15,16]. In general, performance of CT registration might be improved by using the original CT images that were not downsampled to the PET resolution. Van Herk et al [17] showed that reducing pixel resolution has little effect on performance of rigid CT registration and can be used to speed up the algorithm without loss of accuracy. However, this has yet to be shown for non-rigid CT registrations. NAC image registration was investigated, as (attenuation corrected) PET images could potentially contain errors due to faulty attenuation correction resulting from respiratory motion or from small mismatches between CT and PET. This type of registration, however, provided poorer results than the other registration strategies and, therefore, NAC image registration cannot be recommended.
Despite a small misalignment of a few small lung lesions and the relatively poorer DSC performance compared with PET image registration, CT image registration might still be a good candidate for certain response monitoring studies. Disagreements between CT or PET and clinical response have been observed [18]. When it is of interest to reuse the baseline VOI without changes in volume or shape, i.e. to study changes of [18F]FDG uptake within the anatomical volume [3], then CT image registration could be of interest, because (local) rigid CT image registration showed good similarity (median DSC: 0.72) with no change in absolute volume TRT.
Using various levels of image cropping did not show an effect on non-rigid image registration. Roels et al [19] suggested using more local non-rigid image registration to minimize effects of high uptake regions such as the bladder in patients with rectal cancer. In the present study, however, effects of these regions were already minimized by setting the maximum allowed SUV to 10. This could explain why local non-rigid image registrations did not show an improvement in performance.

Figure 5 Effects on absolute TRT of various quantitative PET measures. Box plots illustrating the effects of (local) rigid and non-rigid registration strategies on the absolute test-retest variability (TRT) of SUVmax (a), SUVmean (b), metabolic volume (c) or total lesion glycolysis (TLG, d). These TRT values were obtained using PET, CT or both (CTPET). Reference shows those TRT values that were obtained by delineating VOIs on both scans separately. Lower values are better than higher values. The mean is illustrated by a square, outliers by dots, and minimum and maximum values by crosses.
Rigid PET image registration showed poor results for most lung lesions. In contrast, non-rigid PET and CTPET registrations showed good performance for all types of lesions, and they provided high similarity and similar trends as delineating VOI separately. In addition, CTPET provided significantly lower absolute TRT of volume and SUVmean. Nevertheless, as CTPET requires two non-rigid registrations (first CT, followed by PET), non-rigid PET image registration is preferred. For both non-rigid PET and CTPET registrations, DSC was somewhat lower for small lesions (0.79) compared to that for large lesions (0.88). Partial volume effects may be responsible for this. Quantitative measures obtained using non-rigid PET image registration were not significantly different from those obtained by drawing VOI independently on both scans (except for relative TRT in SUVmean). This may suggest that there is no additional benefit in using non-rigid image registration. Use of non-rigid image registration, however, will make data analysis easier and faster, because manual search for lesions in the retest scan can be avoided [4]. In addition, drawing VOI on both scans independently is not perfect either, because it still shows an absolute volume TRT of 14%. Although more sophisticated tumour delineation methods may lower this value [13], a smaller absolute volume TRT was observed for non-rigid PET image registration (7.7%). Therefore, in agreement with previous findings [4], non-rigid image registration may be a good alternative for obtaining accurate VOI in response monitoring studies. These results are also consistent with a previous study [16] showing that non-rigid intermodality image registration is a significant improvement over rigid registration for fusion between [18F]FDG PET and CT. The present study should primarily be seen as a first attempt to exclude those registration strategies that perform poorly in cases without change in metabolic volume or tracer uptake. However, because tumours are likely to change in metabolic volume and/or tracer uptake during treatment, the remaining registration strategies need to be validated in clinical response monitoring studies.

Units of absolute values, obtained from the retest scan, were g/ml, g/ml, ml and g for SUVmax, SUVmean, metabolic volume and TLG, respectively.
b P-values were calculated from data obtained with registration strategy and reference (test-retest variability values that were obtained by delineating VOIs on both scans separately).
c Reference illustrates those test-retest variability values that were obtained by delineating VOIs on both scans separately.
d Statistically significant difference (P < 0.05); TLG = total lesion glycolysis; VOI = volume of interest.

Figure 6 Effects of similarity measures on absolute TRT of various quantitative measures. Box plots illustrating the effects of similarity measures derived using non-rigid PET registration on the absolute test-retest variability (TRT) of SUVmax (a), SUVmean (b), metabolic volume (c) or total lesion glycolysis (TLG, d). Reference shows those TRT values that were obtained by delineating VOIs on both scans separately. Lower values are better than higher values. The mean is illustrated by a square, outliers by dots, and minimum and maximum values by crosses.

Conclusion
Most registration types showed no significant differences in absolute test-retest variability of SUVmax, SUVmean and TLG compared with delineating VOI separately on both scans. In particular, non-rigid PET image registration showed good performance similar to delineating VOI on both scans separately, and with smaller absolute TRT for volume estimates.

Additional material
Additional file 1: Table S1. Mean, median and range Dice similarity coefficients (DSCs) for various registration strategies and programs.
Additional file 2: Table S2. Mean, median and range for Dice similarity coefficients (DSCs) obtained with various registration strategies.
Additional file 3: Table S3. Mean, median and range of Dice similarity coefficients (DSCs) obtained with various registration strategies for small (average: 13 ml, range: 1.1-33 ml) and large (average: 166 ml, range: 48-749 ml) lesions.

data interpretation. JN implemented RegisRigid, contributed to the intellectual content (supervision) and critically reviewed the manuscript. LMV provided PET image data and reviewed the manuscript. WH provided/collected PET image data and reviewed the manuscript. AAL critically reviewed the manuscript and approved the final content of the manuscript. RB performed a study design, supervised the project and reviewed and approved the final content of the manuscript. DL implemented splineMIRIT, contributed to the intellectual content (supervision) and critically reviewed the manuscript. All authors reviewed the collected data and interpretation, provided feedback for further research during the study and approved the final submitted version of this manuscript.

Competing interests
The department of Nuclear Medicine & PET Research of the VU University Medical Center was financially sponsored by Bristol-Myers Squibb Co. for participating in the ongoing clinical study and for providing consultancy. Dirk Loeckx was a postdoctoral fellow of the Research Foundation - Flanders (FWO) until April 2011. He is now co-founder and director of icoMetrix (Leuven, Belgium).