Evaluation of image quality with four positron emitters and three preclinical PET/CT systems

Teuho, Jarmo; Riehakainen, Leon; Honkaniemi, Aake; Moisio, Olli; Han, Chunlei; Tirri, Marko; Liu, Shihao; Grönroos, Tove J.; Liu, Jie; Wan, Lin; Liang, Xiao; Ling, Yiqing; Hua, Yuexuan; Roivainen, Anne; Knuuti, Juhani; Xie, Qingguo; Teräs, Mika; D’Ascenzo, Nicola; Klén, Riku

doi:10.1186/s13550-020-00724-z

Original research
Open access
Published: 10 December 2020

Jarmo Teuho ORCID: orcid.org/0000-0001-9401-0725^1,2,
Leon Riehakainen¹,
Aake Honkaniemi²,
Olli Moisio¹,
Chunlei Han²,
Marko Tirri^1,3,
Shihao Liu⁴,
Tove J. Grönroos^1,5,
Jie Liu⁶,
Lin Wan⁷,
Xiao Liang⁶,
Yiqing Ling⁶,
Yuexuan Hua⁶,
Anne Roivainen ORCID: orcid.org/0000-0002-4006-7977^1,2,8,
Juhani Knuuti^1,2,
Qingguo Xie^6,9,10,
Mika Teräs^3,11,
Nicola D’Ascenzo^6,9 &
Riku Klén¹

EJNMMI Research volume 10, Article number: 155 (2020)

Abstract

Background

We investigated the image quality of ¹¹C, ⁶⁸Ga, ¹⁸F and ⁸⁹Zr, which have different positron fractions, physical half-lives and positron ranges. Three small animal positron emission tomography/computed tomography (PET/CT) systems were used in the evaluation: the Siemens Inveon, RAYCAN X5 and Molecubes β-cube. The evaluation was performed on a single scanner level using the National Electrical Manufacturers Association (NEMA) image quality phantom and analysis protocol. Acquisitions were performed with the standard NEMA protocol for ¹⁸F and with a radionuclide-specific acquisition time for ¹¹C, ⁶⁸Ga and ⁸⁹Zr. Images were assessed using percent recovery coefficient (%RC), percentage standard deviation (%STD), image uniformity (%SD), spill-over ratio (SOR) and evaluation of image quantification.

Results

⁶⁸Ga had the lowest %RC (< 62%) across all systems. ¹⁸F had the highest maximum %RC (> 85%) and lowest %STD for the 5 mm rod across all systems. For ¹¹C and ⁸⁹Zr, the maximum %RC was close (> 76%) to the %RC with ¹⁸F. Larger SORs were measured in water with ¹¹C and ⁶⁸Ga than with ¹⁸F on all systems. SOR in air reflected image reconstruction and data correction performance. Large variation in image quantification was observed, with maximal errors of 22.73% (⁸⁹Zr, Inveon), 17.54% (⁸⁹Zr, RAYCAN) and −14.87% (⁶⁸Ga, Molecubes).

Conclusions

The systems performed best in terms of NEMA image quality parameters when using ¹⁸F, while ¹¹C and ⁸⁹Zr performed slightly worse than ¹⁸F. Performance was poorest with ⁶⁸Ga, owing to its large positron range. The large quantification differences call for optimization not only in terms of image quality but also of quantification. Further investigation should be performed to find an appropriate calibration and harmonization protocol, and the evaluation should be conducted on a multi-scanner and multi-center level.

Background

There is an increasing demand for standardization in preclinical positron emission tomography/computed tomography (PET/CT) studies, especially if performed in a multi-center or multi-system setting. Preclinical imaging studies have high requirements for image quality and quantification accuracy, which depend, among other factors, on the accuracy of image reconstruction algorithms and data corrections [1, 2]. The available radiotracers with their unique physical properties, the physics involved in the system hardware, the data acquisition and reconstruction process and the physiology, among others, have a significant impact upon measurements performed in vivo [3]. Ensuring the reliability, reproducibility, validity and translatability of the preclinical data is of utmost importance [4]. Therefore, both the technical and non-technical factors that affect the quality and quantification of PET images should be carefully investigated.

While animal handling protocols have the greatest impact in preclinical studies, the characteristics of the radiotracer used also affect PET image quality [1]. Preferably, image quality should be studied with different systems in a standardized manner. For this purpose, phantom studies have been proposed to quantify the PET/CT system-specific differences in multi-system studies within an imaging center [2] and between several centers and systems [5]. A widely accepted PET protocol for system performance testing has been proposed by the National Electrical Manufacturers Association (NEMA), which includes a phantom and an image acquisition and analysis protocol for image quality evaluation.

The parameters such as image uniformity and recovery coefficients obtained using the NEMA NU4-2008 image quality protocol have been suggested to be used as metrics for performance standardization [2, 6, 7]. Recently, a multi-center study using a standard NEMA image quality phantom and 2-deoxy-2-[¹⁸F]fluoro-D-glucose ([¹⁸F]FDG) was performed, which showed that PET and PET/CT systems from a single vendor achieve comparable recovery coefficients, spill-over ratios and percentage standard deviations regarding performance values [4]. Therefore, it is possible to standardize the performance between single-vendor preclinical PET and PET/CT systems using the NEMA parameters and ¹⁸F.

As preclinical imaging is often performed with a wide variety of radiotracers, the image quality parameters should also be compared using multiple radionuclides with various physical properties. Radionuclides such as ¹¹C, ⁶⁸Ga, ¹⁸F and ⁸⁹Zr are commonly used in small-animal PET imaging, with half-lives ranging from several minutes (20 min for ¹¹C) to several days (3.27 days for ⁸⁹Zr) [1]. Moreover, these radionuclides have different positron energies, positron fractions and positron ranges [8]. For example, positron ranges larger than the intrinsic resolution might reduce both image quality and resolution of the PET images [9]. The differences in half-lives and positron fractions also affect the counting statistics, resulting in noise differences if a standard acquisition time is used. The presence of photons due to single emission could increase spill-over to regions of low activity, resulting in increased image noise [9]. Therefore, it is essential that image quality parameters are also evaluated using different radionuclides in a standardized fashion, to study both radionuclide- and system-dependent effects.

Previously, evaluations with different radionuclides on the Inveon and Focus 120 preclinical PET systems have been performed [9, 10]. Liu et al. studied the loss of resolution due to the different positron ranges of several radionuclides [10], while Disselhorst et al. performed an image quality evaluation of a single small-animal PET/CT with ⁶⁸Ga, ¹⁸F, ⁸⁹Zr and ¹²⁴I [9]. Both reports emphasized that it is relevant to investigate preclinical PET/CT performances for different radionuclides, especially in regard to assessment of overall image quality [9, 10]. Further evaluations have been performed with various systems and radiotracers including ¹⁸F, ⁶⁸Ga, ⁶⁴Cu and ¹¹C [11,12,13,14]. While previous studies have focused on evaluations with preclinical imaging systems of a single vendor, it would be of high interest to extend the investigation of the image quality parameters to a multi-radionuclide setting using different PET/CT systems. In this manner, the reproducibility of image quality parameters across radionuclides and systems on a baseline level using a similar acquisition and analysis protocol could be better investigated.

Our motivation was to extend these evaluations by performing assessment of image quality parameters with several radionuclides on three small animal PET/CT systems from different vendors, on a single scanner level. The evaluation was performed using four radionuclides, including ¹¹C, ⁶⁸Ga, ¹⁸F and ⁸⁹Zr. The standard 20 min acquisition time specified by the NEMA protocol was used for ¹⁸F. Radionuclide-specific acquisition times to account for the physical half-life and positron fraction differences between the nuclei were used for ¹¹C, ⁶⁸Ga, and ⁸⁹Zr. A NEMA image quality phantom was used and the preclinical NEMA NU 4-2008 performance protocol was followed, to determine the radionuclide-specific effects on the resulting image quality of each system separately. It is the first time, to our knowledge, that the effect of physical properties of different radionuclides on image quality parameters is investigated using several preclinical PET/CT imaging systems in a single-center setting.

Methods

Preclinical PET/CT systems

Three small animal PET/CT systems were evaluated, namely the RAYCAN Trans-PET/CT X5 (RAYCAN Technology, Suzhou, China), Inveon Multimodality PET/CT (Siemens Medical Solutions, Knoxville, TN, USA) and Molecubes β-cube (PET) and X-cube (CT) (MOLECUBES NV, Ghent, Belgium). The acquisition software versions at the time of the study were 1.0.1371, 2.0 and 1.5.2 for the RAYCAN, Inveon and Molecubes, respectively. All systems are located within one institute. The RAYCAN and Inveon systems are each physically one system in which the bed is moved automatically between PET and CT. The Molecubes system is based on two separate scanners, and the bed needs to be physically transferred between PET and CT. Co-registration between PET and CT is performed using a rigid registration matrix, which is calculated as part of the system calibration.

The performance aspects of these systems are described in detail elsewhere [15,16,17]. The system peak sensitivities are 1.7%, 9.3% and 12.4%, and the reported resolutions of the systems are 1.9 mm, 1.8 mm and 1.1 mm for RAYCAN, Inveon and Molecubes, respectively [15,16,17]. A summary of the technical characteristics of the systems is provided in Table 1. We refer to these systems as RAYCAN, Inveon and Molecubes throughout the paper.

Table 1 Technical characteristics of the preclinical PET/CT systems involved in the evaluation [15,16,17]

The calibration of the systems is performed regularly according to the protocol provided by the vendor. The calibration protocols use ¹⁸F and no radionuclide-specific calibration is performed. Specifically, the calibration procedure for Inveon was performed using a 50 mL syringe and using 20 MBq of ¹⁸F. For Molecubes, 3 mL, 10 mL and 20 mL syringes and 5 MBq of ¹⁸F are used. For RAYCAN, a 4 cm diameter cylindrical phantom with 13 MBq of ¹⁸F is used. All of the calibrations of the systems are performed using a single Veenstra VDC-405 dose calibrator (Veenstra Instruments, Joure, The Netherlands). All recorded doses were also measured using this dose calibrator.

The default energy windows and coincidence timing windows were used on each system. The energy windows were 350–650 keV on the RAYCAN, 350–650 keV on the Inveon and 358–664 keV on the Molecubes system. The timing windows were 5 ns, 3.44 ns and 10 ns, respectively.

Radionuclides used for PET imaging

The radionuclides used were [¹¹C]acetate, [⁶⁸Ga]chloride, [¹⁸F]FDG, and [⁸⁹Zr]oxalate. The physical characteristics of the selected radionuclides are summarized in Table 2. The radionuclides have physical half-lives ranging from 20.4 min to 3.27 d, while the maximum positron range in water varies from 2.4 to 9.2 mm.

Table 2 Physical properties of different radionuclides used for PET image quality evaluation [8]

NEMA image quality phantom

A standard NEMA image quality phantom was used [6]. The phantom was originally shipped with the RAYCAN system and was manufactured by RAYCAN (RAYCAN Technology, Suzhou, China) according to the NEMA specifications. The phantom has a length of 50 mm and a diameter of 30 mm, and the main compartment is made of a single piece of solid plastic. The phantom consists of three regions, which can be used to analyze different aspects of image quality. A construction scheme with a photograph of the phantom can be found in Additional file 1: Fig. S1.

The first 20 mm of the phantom consists of 5 rods with diameters of 1, 2, 3, 4, and 5 mm, which are embedded in plastic to form a cold background. During phantom preparation, the rods, which are connected to the main cavity, are filled with water and contain the same radioactivity concentration as each other and as the main cavity. The rods are used to determine the recovery coefficient (Eq. 1). The central region of the phantom consists of a large uniform compartment, used to determine image uniformity and changes in the activity distribution due to noise and other effects (Eqs. 2 and 3).

The phantom also contains two cylinders with an inner diameter of 8 mm and a length of 14 mm. One of the cylinders is filled with air while the other is filled with water without added radioactivity. Neither cylinder is connected to the radioactivity in the main phantom volume, so they represent two cold volumes on a hot background. The cylinders are used to define the spill-over ratios in air and water (Eq. 4).

Image acquisition protocol

The standard NEMA protocol is designed specifically for ¹⁸F, where a 20 min emission scan duration with an initial activity of 3.70 MBq (± 5% accuracy) is recommended [6]. To record a total number of coincidence events similar to that for ¹⁸F, either the acquisition duration or the initial activity needs to be adjusted to the different positron fractions and physical half-lives of the other radionuclides [9, 12, 13]. In this manner, the difference in counting statistics between the radionuclides can be minimized. We adjusted the total scan duration for all of the non-¹⁸F radionuclides. We chose to increase the acquisition time instead of the activity, as increased activity might change the scatter and randoms rates and increase system dead-time. These effects could introduce variations in the results, although they are expected to be small at this level of activity.

To determine the radionuclide-specific acquisition times, we first determined the acquisition time on one system (Inveon) and then used the same duration on the other systems (RAYCAN, Molecubes). While this protocol does not account for the sensitivity differences between the systems, it allows a comparative evaluation of image quality parameters with different radionuclides within an individual PET/CT system.

Only the Inveon system allowed the emission scan to be acquired with a predefined number of counts. We set the system to acquire 190 M total events (prompts + delays), which corresponded to 20 min of emission duration for ¹⁸F. For the other radionuclides, the Inveon system was set to acquire 190 M total events (prompts + delays) and the final acquisition time was noted. The acquisition duration determined on the Inveon system was then used for the RAYCAN and Molecubes systems, as these systems did not have an option to acquire a set number of counts. For each system, we extracted the number of total events corresponding to prompts + delays to confirm the number of events collected with each radionuclide. The total number of counts collected with each system and the acquisition duration for each radionuclide can be found in Table 3. The experimental acquisition times were close to the theoretical acquisition times calculated by Disselhorst et al. [9].
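
As an illustration of the count-matching reasoning above, the following Python sketch estimates the scan duration at which another radionuclide yields the same number of positron decays as a 20 min ¹⁸F scan at equal starting activity. It is an idealized sketch and not the procedure used in this study, where the Inveon system acquired to a preset number of events: it ignores system sensitivity, dead-time, randoms and scatter. The function names are hypothetical, and the half-lives and positron fractions are approximate literature values assumed for illustration rather than values taken from Table 2.

```python
import math

# Approximate literature values, assumed for illustration only.
NUCLIDES = {
    "F-18": {"half_life_s": 109.77 * 60, "positron_fraction": 0.967},
    "C-11": {"half_life_s": 20.4 * 60, "positron_fraction": 0.998},
    "Ga-68": {"half_life_s": 67.7 * 60, "positron_fraction": 0.889},
    "Zr-89": {"half_life_s": 3.27 * 24 * 3600, "positron_fraction": 0.227},
}


def positron_decays(a0_bq, half_life_s, positron_fraction, duration_s):
    """Number of positron-emitting decays during a scan of duration_s.

    Integrates A0 * exp(-lambda * t) over the scan and scales the result
    by the positron fraction.
    """
    lam = math.log(2) / half_life_s
    return a0_bq * positron_fraction * (-math.expm1(-lam * duration_s)) / lam


def matched_duration(a0_bq, half_life_s, positron_fraction, target_decays):
    """Scan duration (s) needed to reach target_decays positron decays."""
    lam = math.log(2) / half_life_s
    # Fraction of all positron decays the sample can ever produce.
    fraction_needed = target_decays * lam / (a0_bq * positron_fraction)
    if fraction_needed >= 1.0:
        raise ValueError("target cannot be reached with this starting activity")
    return -math.log1p(-fraction_needed) / lam


if __name__ == "__main__":
    f18 = NUCLIDES["F-18"]
    reference = positron_decays(3.7e6, f18["half_life_s"],
                                f18["positron_fraction"], 20 * 60)
    for name, props in NUCLIDES.items():
        t = matched_duration(3.7e6, props["half_life_s"],
                             props["positron_fraction"], reference)
        print(f"{name}: {t / 60:.1f} min")
```

Under these assumptions, a radionuclide with a low positron fraction such as ⁸⁹Zr needs a scan several times longer than ¹⁸F, while a short-lived radionuclide such as ¹¹C needs a longer scan mainly because of decay during the acquisition.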

Table 3 Number of events recorded on each PET/CT system and the final acquisition times used in the evaluation of different nuclei

For image acquisition, the phantom was positioned on the scanner bed, oriented in the axial direction and centered in the field-of-view (FOV), using built-in lasers for guidance and CT scout images where available. The phantom was centered on the homogeneous compartment with careful positioning to the phantom midline, which was marked to ensure repeatable positioning of the phantom between measurements. On all systems, a CT scan of the phantom was acquired for localization and attenuation correction for PET. The CT scan was acquired using the default parameters on each system, which are as follows: 50 kVp and 1 mA for RAYCAN, 80 kVp and 0.5 mA for Inveon and 50 kVp and 0.1 mA for Molecubes. Thereafter, a PET scan over the entire phantom was performed with the radionuclide-specific acquisition time.

The radiotracer doses and activity concentrations at scan start times are shown in Table 4. For ¹¹C, ¹⁸F and ⁶⁸Ga acquisitions, the phantom was filled multiple times and was left to decay before proceeding to subsequent measurements. Before each measurement, the phantom was checked for and cleaned of any activity remaining from previous use. The total volume of the phantom regions filled with activity was measured as 22 mL according to phantom weight.

Table 4 Doses and activity concentrations at scan start times with acquisition times of the radiotracers on different PET/CT systems

PET image reconstruction

All PET images were reconstructed using three-dimensional (3D) iterative reconstruction algorithms, using the default settings for histogramming and reconstruction. All available data corrections were applied, including dead-time, decay, normalization, geometric effects and attenuation. Molecubes and Inveon apply the single scatter simulation method for scatter correction [18] and a delayed window method for randoms correction [19]. RAYCAN does not apply scatter or randoms correction. All three scanners implement CT-based attenuation correction.

The reconstruction algorithms were 3D ordered-subset expectation maximization (3D-OSEM) with point spread function correction for RAYCAN [20], shifted Poisson model maximum a posteriori (SP-MAP) for Inveon [21] and a graphics processing unit (GPU)-based 3D-OSEM reconstruction for Molecubes [16]. No point spread function correction (3D-OSEM-PSF) was available on Inveon or Molecubes, whereas on the RAYCAN 3D-OSEM with point spread function correction was the only iterative algorithm available. The reconstruction parameters used and data corrections implemented for each system are summarized in Table 5.

Table 5 Image reconstruction parameters and data corrections implemented on the PET/CT systems

Evaluation of image quality using the NEMA image quality phantom

The NEMA image quality phantom data were analyzed using the protocol specified in the NEMA NU 4-2008 standard [6]. The protocol involved evaluation of the recovery coefficient, image uniformity and spill-over effects using in-house developed software in MATLAB R2015b. These parameters have been proposed as metrics for harmonization [1, 2]. A short description of the image quality metrics is given below.

The recovery coefficient determines the ability of an imaging system to recover contrast in small targets and reflects resolution. The recovery coefficient is theoretically limited to a value between 0 and 1, with values closer to 1 representing higher activity recovery, while values over 1 are considered overestimation. The image uniformity and the percentage standard deviation in the uniform region are measures of image noise or other effects affecting the homogeneity of tracer distribution in that region. The smaller the percentage standard deviation, the smaller the variation in the image, representing reduced noise. The spill-over ratio in both water and air represents the remaining contribution of scatter, positron range, randoms and other physical effects in the cold regions, as some activity will be spilled over into these regions. The spill-over ratio is theoretically limited to values of 0 to 1, where values close to 0 indicate the smallest amount of spill-over.

To determine the recovery coefficient, the image slices over the central 10 mm length of the phantom rods were averaged to obtain one average slice of the rods. Circular regions of interest (ROIs) were drawn around each rod with diameters of twice the size of the physical diameter of the rods. From the ROIs, the maximum values were measured and the location of the maximum pixel coordinates was determined. The pixel coordinates were then used to create line profiles along the rods in the axial direction. To calculate the percent recovery coefficient (%RC), the pixel values in each line profile were divided by the mean activity measured from the uniform region to determine the mean %RC for each rod as:

$$\%\text{RC} = \frac{\text{Mean}_{\text{line profile}}}{\text{Mean}_{\text{uniform}}} \times 100\%,$$

(1)

where $\text{Mean}_{\text{line profile}}$ corresponds to the mean activity of the line profile and $\text{Mean}_{\text{uniform}}$ corresponds to the mean activity in the uniform region. Thereafter, the percent standard deviation of the recovery coefficients $(\%\text{STD}_{\text{RC}})$ for each rod was determined from Eq. 2:

$$\%\text{STD}_{\text{RC}} = 100 \times \sqrt{\left(\frac{\text{STD}_{\text{line profile}}}{\text{Mean}_{\text{line profile}}}\right)^{2} + \left(\frac{\text{STD}_{\text{uniform}}}{\text{Mean}_{\text{uniform}}}\right)^{2}},$$

(2)

where the mean and standard deviation were calculated from the individual line profiles ($\text{Mean}_{\text{line profile}}$ and $\text{STD}_{\text{line profile}}$) and from the uniform region of the phantom ($\text{Mean}_{\text{uniform}}$ and $\text{STD}_{\text{uniform}}$).
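
The recovery coefficient steps and Eqs. 1 and 2 can be sketched in Python with NumPy as follows. This is a minimal sketch and not the in-house MATLAB software used in the study. The function name, the `(z, y, x)` array layout, the requirement that the rod centre is supplied by the caller, and the use of the sample standard deviation (`ddof=1`) are all assumptions made for illustration.

```python
import numpy as np


def recovery_coefficient(rod_volume, rod_center_yx, rod_diameter_mm,
                         pixel_mm, uniform_mean, uniform_std):
    """Return (%RC, %STD_RC) for one rod.

    rod_volume is assumed to be a (z, y, x) array that already covers
    only the central 10 mm of the rod section.
    """
    # Average the slices to obtain one mean slice of the rods.
    mean_slice = rod_volume.mean(axis=0)

    # Circular ROI with twice the physical rod diameter, so the ROI
    # radius in pixels equals the rod diameter divided by the pixel size.
    yy, xx = np.indices(mean_slice.shape)
    cy, cx = rod_center_yx
    radius_px = rod_diameter_mm / pixel_mm
    roi = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius_px ** 2

    # Locate the maximum pixel inside the ROI.
    masked = np.where(roi, mean_slice, -np.inf)
    my, mx = np.unravel_index(np.argmax(masked), mean_slice.shape)

    # Axial line profile through the maximum pixel.
    profile = rod_volume[:, my, mx].astype(float)
    mean_lp = profile.mean()
    std_lp = profile.std(ddof=1)  # sample standard deviation (assumption)

    rc = 100.0 * mean_lp / uniform_mean                        # Eq. 1
    std_rc = 100.0 * np.sqrt((std_lp / mean_lp) ** 2
                             + (uniform_std / uniform_mean) ** 2)  # Eq. 2
    return rc, std_rc
```

Because the profile mean rather than the maximum enters Eq. 1, isolated hot pixels have a limited effect on %RC in this sketch.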

Uniformity was measured by drawing a 22.5 mm diameter and 10 mm long cylindrical volume of interest (VOI) over the center of the uniform region. The mean and percentage standard deviation (%SD) of the activity concentration were measured.
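
The uniformity measurement can be sketched in Python with NumPy as shown below. The function names, the `(z, y, x)` array layout, the voxel-centre inclusion rule and the use of the sample standard deviation are assumptions made for illustration. The VOI centre must be supplied by the caller.

```python
import numpy as np


def cylindrical_voi_mask(shape, center_zyx, diameter_mm, length_mm, voxel_mm):
    """Boolean mask of a cylinder aligned with the z (axial) axis.

    A voxel is included when its centre lies inside the cylinder.
    voxel_mm is the (dz, dy, dx) voxel size in millimetres.
    """
    z, y, x = np.indices(shape)
    cz, cy, cx = center_zyx
    dz, dy, dx = voxel_mm
    radial = (((y - cy) * dy) ** 2 + ((x - cx) * dx) ** 2
              <= (diameter_mm / 2.0) ** 2)
    axial = np.abs((z - cz) * dz) <= length_mm / 2.0
    return radial & axial


def uniformity(volume, mask):
    """Return (mean, %SD) of the voxel values inside the mask."""
    values = volume[mask].astype(float)
    mean = values.mean()
    return mean, 100.0 * values.std(ddof=1) / mean


# Example with the VOI dimensions from the text (22.5 mm x 10 mm),
# applied to a synthetic volume with 0.5 mm isotropic voxels.
example = np.full((21, 61, 61), 5.0)
voi = cylindrical_voi_mask(example.shape, (10, 30, 30), 22.5, 10.0,
                           (0.5, 0.5, 0.5))
example_mean, example_sd = uniformity(example, voi)
```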

In addition, while the NEMA standard does not specify a measurement of the absolute quantification accuracy in the phantom, we calculated the percentage difference between the activity concentration measured from the uniform compartment of the phantom and the calculated activity concentration at scan start time. The percentage difference $\% \Delta$ was calculated as:

$$\%\Delta = \frac{\text{Mean}_{\text{uniform}} - A_{0}}{A_{0}} \times 100,$$

(3)

where $A_{0}$ corresponds to the calculated activity concentration in kBq/mL at scan start time.
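
Eq. 3 can be sketched in Python as follows. The helper that derives $A_{0}$ from a dose calibrator reading assumes simple exponential decay between the dose measurement and the scan start, and a known filled volume. The function names and this derivation are illustrative assumptions, not a description of the software used in the study.

```python
import math


def activity_concentration_at_scan_start(dose_bq, half_life_s, delay_s,
                                         volume_ml):
    """Calculated activity concentration A0 in kBq/mL at scan start.

    dose_bq is the dose calibrator reading and delay_s is the time from
    that reading to the start of the scan (both assumed inputs).
    """
    decayed_bq = dose_bq * math.exp(-math.log(2) * delay_s / half_life_s)
    return decayed_bq / 1e3 / volume_ml


def percent_difference(mean_uniform, a0):
    """Eq. 3: signed percentage difference of the measured mean from A0."""
    return (mean_uniform - a0) / a0 * 100.0
```

For example, `percent_difference(110.0, 100.0)` gives a positive bias of 10%, and a negative result indicates underestimation of the activity concentration.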

The spill-over of activity in the water and air-filled cylindrical inserts was defined by drawing VOIs of 4 mm in diameter and 7.5 mm length over the cylindrical inserts. The spill-over ratio (SOR) was calculated as the ratio of the mean of each cold region to the mean of the uniform region, defined as:

$$\text{SOR} = \frac{\text{Mean}_{\text{cold}}}{\text{Mean}_{\text{uniform}}}.$$

(4)

The percent standard deviation $(\%\text{STD}_{\text{SOR}})$ in the water- and air-filled inserts was calculated in the same manner as the $\%\text{STD}_{\text{RC}}$ in Eq. 2, using the standard deviation and the mean calculated from the cold regions versus the uniform region.
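
Eq. 4 and the corresponding percent standard deviation can be sketched in Python with NumPy as follows. The function name is hypothetical, the cold-region voxel values are assumed to have been extracted from the VOI beforehand, and the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np


def spill_over_ratio(cold_values, uniform_mean, uniform_std):
    """Return (SOR, %STD_SOR) for one cold insert.

    cold_values holds the voxel values inside the VOI drawn over the
    water- or air-filled insert.
    """
    cold = np.asarray(cold_values, dtype=float)
    mean_cold = cold.mean()
    std_cold = cold.std(ddof=1)  # sample standard deviation (assumption)

    sor = mean_cold / uniform_mean                                  # Eq. 4
    std_sor = 100.0 * np.sqrt((std_cold / mean_cold) ** 2
                              + (uniform_std / uniform_mean) ** 2)  # as Eq. 2
    return sor, std_sor
```

Because the cold-region mean appears in the denominator of the first term, $\%\text{STD}_{\text{SOR}}$ grows rapidly as the SOR approaches zero, which may partly explain why values above 100% can occur when very little activity spills into the insert.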

Results

Figure 1 shows the transverse, sagittal and coronal views of the phantom with different radionuclides and systems. It can be seen that ¹¹C, ¹⁸F and ⁸⁹Zr have similar image quality across the different imaging systems, while ⁶⁸Ga shows poorer image quality, independent of the system. The rod section of the phantom in particular is blurred with ⁶⁸Ga.

The results from the %RC evaluation with all systems and radionuclides are shown in Fig. 2. Radionuclide-specific differences can be seen, which are also reflected across the systems to a degree. The maximum RCs were measured with ¹⁸F: for rod sizes from 1 to 5 mm on the Inveon system (from 0.16 to 0.92) and on the Molecubes system (from 0.18 to 0.93), and for rod sizes from 4 to 5 mm on the RAYCAN system (from 0.76 to 0.85). The lowest %RC values were measured with the 1 mm rod across all systems with ⁶⁸Ga (range 0.06–0.07). The maximum %RC using ⁶⁸Ga with the 5 mm rod was also the lowest of all radionuclides across all systems (range 0.56–0.62). In terms of %RC with rod sizes from 2 to 5 mm, ¹¹C and ⁸⁹Zr had RCs between those of ¹⁸F and ⁶⁸Ga on nearly all systems (Fig. 2). The RCs were very similar between ¹¹C and ⁸⁹Zr across all rod sizes.

The results from the $\% {\text{STD}}_{{{\text{RC}}}}$ evaluation with all systems and radionuclides are shown in Fig. 3. The largest variability was seen in the $\% {\text{STD}}_{{{\text{RC}}}}$ with ⁶⁸Ga across all systems (0.08 to 0.43). Lowest $\% {\text{STD}}_{{{\text{RC}}}}$ were seen with ¹⁸F and ⁸⁹Zr with the Inveon system (range 0.06–0.05 for ¹⁸F, 0.07 to 0.07 for ⁸⁹Zr) from rod sizes of 2 to 5 mm and Molecubes from rod sizes of 3 to 5 mm (range 0.15–0.11 for ¹⁸F, 0.11 to 0.11 for ⁸⁹Zr). Radionuclide-specific variation was highest with the RAYCAN system, with no clear trend between other radionuclides, with the exception of ⁶⁸Ga producing highest $\% {\text{STD}}_{{{\text{RC}}}}$ nearly across all rod sizes.

Table 6 contains the results from the uniformity evaluation. The differences in %SD between the nuclides were 1.1%, 2.2% and 1.3% for RAYCAN, Inveon and Molecubes. ¹⁸F showed a low %SD with the Inveon (4.85%), Molecubes (7.39%) and RAYCAN system (6.03%). ¹⁸F and ¹¹C had similar %SD for the RAYCAN system (approximately 6%). ⁶⁸Ga, ⁸⁹Zr and ¹⁸F had similar %SD for the Molecubes system (approximately 7%).

Table 6 The mean activity (kBq/mL) and the percentage standard deviation (%SD) measured from the uniform compartment of the phantom

Table 7 contains the results from the evaluation of quantification accuracy, where the results varied considerably across nuclides and systems. Variation was large with ⁸⁹Zr (range −7.82 to 22.73%) and ⁶⁸Ga (range −14.87 to 0.36%), whereas ¹¹C and ¹⁸F showed a positive bias with all systems (range 11.01–13.36% for ¹¹C, range 3.73–8.56% for ¹⁸F). Surprisingly, ⁶⁸Ga showed the best accuracy on the Inveon and the RAYCAN system with maximum errors of 0.36% and −0.38%. The best quantification accuracy with ¹⁸F was seen on the Molecubes system (3.73% error). The largest quantification errors were seen with ⁸⁹Zr on the Inveon system (22.73%), ⁸⁹Zr on the RAYCAN system (17.54%) and ⁶⁸Ga on the Molecubes system (−14.87%). Between the systems, image quantification differences were smallest with ¹⁸F (range 3.73–8.56%) and ¹¹C (range 11.01–13.36%).

Table 7 Percentage difference (%Δ) of the mean activity in the uniform compartment of the phantom compared to the reference activity measured at scan start time

The results from the SOR and $\% {\text{STD}}_{{{\text{SOR}}}}$ evaluation with all systems and radionuclides can be found in Table 8. SOR showed more variation between the systems and radionuclides, especially in the water compartment. ¹¹C and ⁶⁸Ga showed the highest spill-over ratios in water (range 0.16–0.27 for ¹¹C and 0.09 to 0.32 for ⁶⁸Ga), while ¹⁸F and ⁸⁹Zr had the lowest SOR in water (0.05 and < 0.01 on Inveon, 0.26 and 0.25 on RAYCAN, 0.07 and 0.07 on Molecubes). The highest SOR in air was measured with ⁶⁸Ga on RAYCAN (0.25) and Inveon (0.06), and with ¹¹C on Molecubes (0.13). The lowest SORs in air were measured on the Inveon system (range 0.01–0.06) and the highest on the RAYCAN system (range 0.15–0.25). For the RAYCAN system, the data were not corrected for scatter or randoms.

Table 8 Spill-over ratios (SOR) with the percentage standard deviation ($\% {\text{STD}}_{{{\text{SOR}}}}$) in parentheses, measured from the phantom water and air compartments

$\% {\text{STD}}_{{{\text{SOR}}}}$ did not show any clear trend with different radionuclides (Table 8). The largest values of $\% {\text{STD}}_{{{\text{SOR}}}}$ were recorded on the Inveon system using ⁸⁹Zr (106.38% in air, 240.08% in water). The RAYCAN system had the lowest values of $\% {\text{STD}}_{{{\text{SOR}}}}$ across all radionuclides (range 6.83–8.75% for water, range 7.64–10.42% for air).

Discussion

We performed a NEMA image quality evaluation using a well-established measurement and analysis protocol and investigated the variations in image quality and quantification with four different radionuclides in a single-system setting, using three small-animal PET/CT systems from different vendors. We used the standard NEMA protocol with a 20 min acquisition time for ¹⁸F and a radionuclide-specific acquisition time for the non-¹⁸F radionuclides. Radionuclide-specific, acquisition-specific and system-specific effects were shown to affect the PET image. To our knowledge, this was the first time that the effect of the physical properties of different radionuclides on the image quality parameters was investigated using three preclinical PET/CT imaging systems in a single-center setting.

Analysis of the %RC

The behavior of %RC followed a similar trend with different radionuclides on the different small animal PET/CT systems (Fig. 2). Long-range (⁶⁸Ga) and short-range (¹⁸F, ¹¹C, ⁸⁹Zr) positron emitters could be separated on all systems, in agreement with previous results [9]. ⁶⁸Ga showed the lowest %RC on all of the PET/CT systems (Fig. 2), indicating a dependency on positron range. ¹⁸F had the highest recovery for rod sizes from 1 to 5 mm, while ¹¹C and ⁸⁹Zr fell in between on the Inveon and Molecubes systems. This trend was similar for the RAYCAN system with rod sizes of 4 mm to 5 mm. The maximum %RC with different radionuclides varies across the systems for each rod (Fig. 2), which is expected as differences in %RC are also due to the chosen reconstruction algorithms and parameters, especially for the short-range positron emitters [9]. We also saw a dependency of %RC on acquisition time (Additional file 1: Data S1).

PSF reconstruction was applied only on the RAYCAN system, as this was the only iterative algorithm available. The other systems used non-PSF reconstruction. Higher %RC values are expected with PSF reconstruction than with non-PSF reconstruction. However, for most rod sizes, RAYCAN has the lowest %RC (Fig. 2). This is explained by the following factors. First, the resolution of the RAYCAN system is relatively low compared to the other systems, the positron range and the size of the rods. Secondly, the %RC is calculated based on the mean of the line profile across the rod, reducing potential overshoot effects due to PSF. Thirdly, a low number of iterations (2) with a post-filter was used for PSF reconstruction, reducing potential overshoot effects, which are more prominent with a high number of iterations and without filtering.

Thus, the %RC seems to depend on system sensitivity and resolution in relation to the radionuclide positron fraction, physical half-life and positron range. Therefore, the individual differences in maximum %RC between the systems are explained not only by radionuclide-specific qualities but also by system-specific performance qualities, such as the sensitivity, intrinsic resolution and the implemented reconstruction algorithm. The %RC value is then affected by the following: (1) counting statistics, reflecting image noise and system sensitivity; (2) resolution differences and the image reconstruction algorithm; (3) positron range, physical half-life and positron fraction of the radionuclide, which affect the two former factors. The dependency of %RC on the counting statistics is more prominent with systems with low sensitivity and with radionuclides that have a lower positron fraction and a shorter physical half-life than ¹⁸F.

Analysis of the $\% {\text{STD}}_{{{\text{RC}}}}$

The $\% {\text{STD}}_{{{\text{RC}}}}$ reflected the same radionuclide- and system-specific qualities as the %RC evaluation (Fig. 3). ⁶⁸Ga showed the highest variability with the 1 mm rod size across all systems, whereas ¹⁸F and ⁸⁹Zr had the lowest variability on two systems. The systems included have different resolution characteristics, from 1.9 mm in RAYCAN and 1.8 mm in Inveon to 1.1 mm in Molecubes [15,16,17]. The intrinsic resolution of the imaging system relative to the positron range of the radionuclide will then be reflected in this parameter. Noise from the image acquisition, which depends on positron fraction, physical half-life, acquisition time, sensitivity and system hardware design, will affect $\% {\text{STD}}_{{{\text{RC}}}}$, where increased noise will result in an increase of $\% {\text{STD}}_{{{\text{RC}}}}$. A related parameter is the phantom %SD measured from the uniform compartment: the higher the noise, the higher the %SD and the higher the $\% {\text{STD}}_{{{\text{RC}}}}$ (Table 6, Fig. 3). An additional contribution comes from the reconstruction algorithm and its noise handling properties, although a lower $\% {\text{STD}}_{{{\text{RC}}}}$ can be achieved only if the total number of events is large enough to exclude the contribution of Poisson processes.

In summary, the following factors contribute to the radionuclide-specific variability in the $\% {\text{STD}}_{{{\text{RC}}}}$ parameter. The factors include positron range, resolution of the system, positron fraction and physical half-life, acquisition time and sensitivity and noise originating from the reconstruction. Some variability is introduced by the measurement and analysis procedure, as parameter includes measurement of activity across the whole rod.

Analysis of the %SD

We detected the smallest variation in %SD within each system, with all values within a range of 2% among the different radionuclides (Table 6). Disselhorst et al. found that the largest differences in %SD originate from various reconstruction algorithms for the same radionuclide, when using radionuclide-specific acquisition times and a single PET/CT system [9]. Similarly to %RC, we also saw a dependency of %SD on the acquisition time, as expected (Additional file 1: Data S1). Thus, the %SD remains relatively stable between radionuclides as long as sufficient counting statistics are collected and the same reconstruction algorithm is applied (Table 6). If the total activity or the counting statistics are too low, the standard deviation in the uniform region and other phantom regions will be affected by the relative sensitivity of the systems.

As this parameter reflects the positron fraction, the physical half-life differences and the system sensitivity together with the reconstruction algorithm, a large variation was seen between the systems (range 4.85–8.46%). We also noticed a bias in the Molecubes system, which showed the lowest mean activity for ⁶⁸Ga, ¹⁸F and ⁸⁹Zr. Given that the %SD values are higher than for the other systems, we suspect that this is caused not by a difference in noise but by a calibration offset. However, we were unable to verify this, as we do not currently have access to the raw calibration factors on the system.

Analysis of image quantification

We detected large differences in quantification between the systems and different radionuclides (Table 7). One system (Inveon) showed a systematic overestimation of the activity, whereas on the other two systems (RAYCAN and Molecubes) the bias fluctuated in both the positive and negative direction.

The large variation in quantification accuracy is attributable to several factors. The first is the need for a proper calibration protocol specifically for each radionuclide across the systems, to minimize the over- and underestimations between the systems. Two systems (RAYCAN and Inveon) also showed absolute fluctuations over 5% with ¹⁸F. One would expect that, since the systems are calibrated with ¹⁸F, the absolute errors would be within the expected fluctuations of ± 5%. This deviation might be caused by differences in the calibration procedures between the systems, error in the measurement of the exact target dose in the calibrator and limited accuracy of the activity measurement in images from the small compartment of the phantom. Other potential factors would be a drift in the system since the time point of the calibration or a difference in the reconstruction, activity or geometry between the calibration phantom and the NEMA phantom.

The system-specific calibration procedures use different sizes of phantoms or syringes, which differ in geometry and volume from the NEMA phantom, and the activities used for calibration (5 MBq to 20 MBq) differ from the NEMA-specified activity (3.7 MBq); fluctuations are therefore expected. It would be beneficial to use a single phantom with the same geometry to derive calibration factors between the systems to minimize the variation. Finally, changes in, for example, reconstruction parameters can result in differences larger than the expected fluctuation of ± 8.4 kBq/mL. This expected fluctuation can be calculated using the NEMA-recommended initial activity (3.7 MBq), the volume of the phantom (22 mL) and the allowed fluctuation (± 5%) in activity. Together, these produce an expected activity concentration of 168 kBq/mL, from which the allowed fluctuation in units of kBq/mL can be derived by using the allowed fluctuation in percentage units.
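The arithmetic behind the ± 8.4 kBq/mL figure can be written out as a short sketch. The numbers are those quoted in the paragraph above; the function and variable names are illustrative and do not come from any of the vendor software packages.

```python
# Expected activity concentration and allowed fluctuation for the NEMA phantom,
# using the values quoted in the text. All names are illustrative.

NEMA_ACTIVITY_MBQ = 3.7    # NEMA-recommended initial activity
PHANTOM_VOLUME_ML = 22.0   # phantom volume quoted in the text
ALLOWED_FRACTION = 0.05    # allowed fluctuation of +/- 5%


def expected_concentration_kbq_per_ml(activity_mbq, volume_ml):
    """Activity concentration in kBq/mL for a uniformly filled volume."""
    return activity_mbq * 1000.0 / volume_ml


def percent_error(measured, expected):
    """Signed quantification error as a percentage of the expected value."""
    return 100.0 * (measured - expected) / expected


expected = expected_concentration_kbq_per_ml(NEMA_ACTIVITY_MBQ, PHANTOM_VOLUME_ML)
tolerance = ALLOWED_FRACTION * expected

print(f"expected concentration: {expected:.1f} kBq/mL")    # 168.2 kBq/mL
print(f"allowed fluctuation: +/- {tolerance:.1f} kBq/mL")  # +/- 8.4 kBq/mL
```

The same `percent_error` helper expresses a measured concentration as a signed percentage error of the kind reported in Table 7, provided the expected value is decay-corrected to the scan time first.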

Analysis of SOR

The behavior of the SOR varied more between the systems (Table 8). Both ¹¹C and ⁶⁸Ga had a larger SOR than ¹⁸F and ⁸⁹Zr in water on all systems. Thus, differentiation of short- and long-range positron emitters similarly to [9] could be established in the water compartment across all systems. For SOR in air, ¹¹C and ⁶⁸Ga had a larger SOR than ¹⁸F and ⁸⁹Zr on Molecubes and RAYCAN. On RAYCAN, only ⁶⁸Ga could be differentiated clearly from the other radionuclides based on SOR (SOR in air 0.25, SOR in water 0.32). ¹⁸F and ⁸⁹Zr, the radionuclides with the shortest positron ranges, showed the lowest spill-over ratios on all three systems.

Disselhorst et al. distinguished three factors which affect SOR: positron range, system-specific data corrections and the dimensions of the cold cylinder regions of the phantom. The authors suggested not using the SOR in water as a measure of data correction performance when assessing radionuclides with long positron range (9.2 mm for ⁶⁸Ga), as with these radionuclides the SOR in water is caused by positrons emitted from the main body of the phantom which annihilate in the water-filled compartment. The same effect is seen in our measurements with both ⁶⁸Ga and ¹¹C, which show a higher SOR in water compared to the other radionuclides (Table 8). Thus, the SOR in water reflects a mixture of the effects of the data corrections and the positron range, although variations between radionuclides and systems are evident.

There is a large variation between the radionuclides system-wise in the SOR in air, explained by the differences in system-specific implementation of corrections for randoms and scatter in the image reconstruction. As the air compartment can be used as an indicator of the system-specific data correction performance [9], Inveon seems to be the most effective in correcting the activity in the air compartment, showing the lowest SOR in air across all radionuclides. The large SOR values measured on the RAYCAN system are due to missing data corrections for randoms and scatter, making this performance value not comparable with the two systems that have data corrections implemented. These factors indicate that, for standardization purposes, the SOR in air would be the most challenging to match between the systems, as data correction implementations and their effects are very system-dependent.

Analysis of $\% {\text{STD}}_{{{\text{SOR}}}}$

The last parameter which we quantified was $\% {\text{STD}}_{{{\text{SOR}}}}$. This parameter measures the variation in the cold compartments of the phantom relative to the uniform compartment, and showed no radionuclide-specific trend (Table 8). Based on our results, this parameter is challenging to use as an indicator of radionuclide-specific qualities and might be biased when the activity in the cold region is low. This is shown in our measurements with the Inveon system, which had the highest $\% {\text{STD}}_{{{\text{SOR}}}}$ of all the measurements, with very high values (106.38% and 240.08%) for ⁸⁹Zr. However, based on the SOR value in air, the Inveon should have the most efficient data corrections of all the systems evaluated.

The high $\% {\text{STD}}_{{{\text{SOR}}}}$ on the Inveon system is caused by a very low mean value combined with a high standard deviation inside the cylinder VOI. The low mean is caused by effective data correction, as shown by the low SOR with this radionuclide, whereas small regions with high activity in otherwise cold regions increase the standard deviation. This effect results in a high $\% {\text{STD}}_{{{\text{SOR}}}}$, as can be seen from Eq. (2).

Moreover, Eq. (2) shows that $\% {\text{STD}}_{{{\text{SOR}}}}$ increases in magnitude either with large standard deviation or low mean value in the cold compartment. In this case, a region with high SOR (high mean) but good uniformity (low standard deviation) would result in lower $\% {\text{STD}}_{{{\text{SOR}}}}$ than in the opposite case. This is evident with the RAYCAN system, which showed the lowest $\% {\text{STD}}_{{{\text{SOR}}}}$ for all radionuclides (Table 8), across all systems, although the data was not corrected for scatter or randoms.

Theoretically, an efficient data correction would result in both a low mean value and a low standard deviation in the cold region, effectively negating the spill-over of activity and any residual activity inside the compartment. In our measurements with the Inveon system, however, a low mean and a large standard deviation seem to occur simultaneously, due to high- and low-activity regions, which increases $\% {\text{STD}}_{{{\text{SOR}}}}$. This explains the difference in $\% {\text{STD}}_{{{\text{SOR}}}}$ between Inveon and the other systems. However, the standard deviation and the mean value in the compartments are not guaranteed to be connected, that is, to always increase or decrease by a similar amount or in the same direction. Therefore, given these factors affecting the $\% {\text{STD}}_{{{\text{SOR}}}}$ calculation, we recommend that this parameter always be investigated in connection with the SOR whenever the effectiveness of data correction is evaluated with radionuclides whose positron ranges differ from that of ¹⁸F.
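The interplay between the mean and the standard deviation described above can be illustrated numerically. The sketch below assumes that Eq. (2) has the error-propagation form used in NEMA NU 4-2008, in which the relative standard deviations of the cold region and of the uniform region are added in quadrature. Eq. (2) is not reproduced in this section, so that form is an assumption, and all input values are hypothetical rather than measured.

```python
import math


def spill_over_ratio(mean_cold, mean_uniform):
    """SOR: mean of the cold-region VOI divided by the uniform-region mean."""
    return mean_cold / mean_uniform


def percent_std_sor(mean_cold, std_cold, mean_uniform, std_uniform):
    """%STD of the SOR, assuming relative errors add in quadrature."""
    relative_cold = std_cold / mean_cold
    relative_uniform = std_uniform / mean_uniform
    return 100.0 * math.sqrt(relative_cold ** 2 + relative_uniform ** 2)


# Hypothetical values in arbitrary units, NOT taken from Table 8.
UNIFORM_MEAN, UNIFORM_STD = 100.0, 5.0

# Case A: well-corrected cold region. The mean is very low, but a few
# residual hot spots keep the standard deviation comparatively high.
sor_a = spill_over_ratio(1.0, UNIFORM_MEAN)
std_a = percent_std_sor(1.0, 2.0, UNIFORM_MEAN, UNIFORM_STD)

# Case B: uncorrected cold region. The mean is high, the region is uniform.
sor_b = spill_over_ratio(25.0, UNIFORM_MEAN)
std_b = percent_std_sor(25.0, 2.0, UNIFORM_MEAN, UNIFORM_STD)

print(f"case A: SOR = {sor_a:.2f}, %STD_SOR = {std_a:.1f}")
print(f"case B: SOR = {sor_b:.2f}, %STD_SOR = {std_b:.1f}")
```

With identical absolute standard deviations, the case with the lower SOR yields the far larger %STD, which is why the two values are best read together.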

Comparison of results to previous studies

Previously, multi-radionuclide evaluations on preclinical PET/CT systems have been performed on the ALBIRA II, ARGUS, Mediso nanoScan, Inveon and Molecubes systems. We have collected the %RC and %SD results from these evaluations in Table 9, where available. Attarwala et al. compared ¹⁸F, ⁶⁸Ga and ⁶⁴Cu and reported a lower %RC for ⁶⁸Ga (< 60%) with a similar %SD (~ 6%) as compared to ¹⁸F [11]. Cañadas et al. evaluated ⁶⁸Ga and ¹⁸F on the ARGUS system, reporting %RC for ⁶⁸Ga in the range of 0.17 to 0.72 in comparison to 0.28 to 0.92 with ¹⁸F [12]. The reported %SD values were high for both radionuclides (> 15%), possibly due to applying a large number of image updates. Using an increased acquisition time for both ¹⁸F and ⁶⁸Ga, Gaitanis et al. reported %RC for ⁶⁸Ga in the range of 0.09 to 0.60 compared to 0.18 to 0.87 with ¹⁸F [13]. For the Inveon system, Disselhorst et al. reported %RC in the range of 0.1 to 0.6 with ⁶⁸Ga, 0.2 to 1.0 with ¹⁸F and 0.2 to 0.9 with ⁸⁹Zr, with lower %SD than in our study (2% to 3%) [9]. The %RCs measured in [9] agree very well with our results with ⁶⁸Ga, ¹⁸F and ⁸⁹Zr for the Inveon system.

Table 9 Comparison of %RC and %SD values reported from previous multi-radionuclide studies


Although there are differences in the system performance, the reconstruction algorithms, reconstruction parameters, acquisition times and activities between this study and previous investigations (Table 9), some comparisons can be made. Of note are the %RC results for ⁶⁸Ga with other systems (Table 9), where the %RCs are very comparable (maximum values 0.56 to 0.72) to our results. In terms of %SD, the ALBIRA II and Mediso nanoScan show comparable values (approximately 5% to 6.7%) to our results with both ⁶⁸Ga and ¹⁸F. However, the %SD values measured on the ARGUS system and on the Inveon system differ from ours, due to the number of iterations used for reconstruction on both systems. ARGUS uses a relatively high number of iterations (48), whereas on Inveon the number of image updates is lower (2/18) than in our study (18/16).

For the Molecubes system, Presotto et al. compared three radionuclides (¹⁸F, ¹¹C and ⁶⁸Ga) and, similarly to our study, reported higher %RC for ¹¹C and ¹⁸F and the lowest %RC for ⁶⁸Ga (< 0.6) [14]. The difference in %RC from our results is explained by the number of iterations used, as seen from our supplemental data for the Molecubes system (Additional file 1: Fig. S4), where a %RC of 0.3 for ¹⁸F and the 1 mm rod can be reached when using 100 iterations, as in Presotto et al. For the RAYCAN system, only one previous evaluation exists [17], performed with ¹⁸F only, with values in agreement with those obtained in this paper. Thus, the results presented in this paper agree well with the studies published for the previous systems concerning ⁶⁸Ga and ¹⁸F. The reported %RC and %SD nevertheless vary due to the different acquisition times, reconstruction parameters and algorithms used in the measurements, which calls for a standardized approach to conducting the measurements, image reconstructions and evaluations.

Limitations

There are limitations imposed by the phantom, by using the NEMA protocol to quantify different parameters with different radionuclides and by applying the phantom measurements to in-vivo data. A recent paper has discussed the challenges in the NEMA measurements in detail [22]; here we focus only on the parts which concern the image quality measurements. The first limitation is the construction of the phantom, where the rods used to quantify the %RC are embedded in a cold background, whereas %SD and SOR are quantified from regions with a hot background. This means that the phantom does not perfectly mimic in-vivo situations, where hot targets are usually located in a region with background activity. This also affects the evaluation and comparison of %RC among different reconstruction algorithms in the phantom and in vivo, as reconstruction convergence also depends on the level of background activity [23].

The second limitation is imposed by the NEMA measurement protocol used to quantify the %RC of the rods. As Hallen et al. discussed, the %RC actually measures a combination of recovery and variance over the rods [22]. The calculation of %RC includes measurement of the maximum activity in a ROI, which causes a positive correlation of noise and recovery coefficients. The NEMA protocol specifically states to search for the maximum pixel value in each of the rods from an averaged image, and then draw line profiles over the rods from which the %RC is calculated [22]. This may introduce inaccuracies in the presence of low counting statistics or noise, which will in turn create a positive bias for the %RC measurement with high-noise, low-statistics data.
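The selection bias introduced by the maximum-pixel search can be reproduced with a toy simulation. The sketch below is a deliberate simplification and not the NEMA analysis itself: every pixel of a hypothetical rod ROI is given the same true value, Gaussian noise is added to each slice, and the recovery is read from the line profile at the pixel that is brightest in the slice-averaged image. The ROI size, number of slices and noise levels are arbitrary assumptions.

```python
import random
import statistics


def rc_from_max_pixel(noise_sd, rng, n_pixels=25, n_slices=10, true_value=1.0):
    """Estimate the recovery of one rod the way a maximum-pixel search would.

    Every pixel has the same true value, so any estimate above
    `true_value` is pure selection bias caused by noise.
    """
    # One axial line profile (a list of slice values) per in-plane pixel.
    profiles = [
        [rng.gauss(true_value, noise_sd) for _ in range(n_slices)]
        for _ in range(n_pixels)
    ]
    # Average the slices, then locate the pixel with the maximum value.
    averaged = [statistics.fmean(profile) for profile in profiles]
    hottest = max(range(n_pixels), key=lambda i: averaged[i])
    # The recovery estimate is the mean of the line profile at that pixel.
    return statistics.fmean(profiles[hottest])


def mean_rc(noise_sd, n_trials=1000, seed=0):
    """Average the estimate over many simulated noise realisations."""
    rng = random.Random(seed)
    return statistics.fmean(
        rc_from_max_pixel(noise_sd, rng) for _ in range(n_trials)
    )


low_noise = mean_rc(0.1)
high_noise = mean_rc(0.3)
print(f"mean estimate, noise SD 0.1: {low_noise:.3f}")
print(f"mean estimate, noise SD 0.3: {high_noise:.3f}")
```

Both estimates exceed the true value of 1.0, and the excess grows with the noise level, which is the positive correlation between noise and recovery described above.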

To study this positive bias in %RC, we repeated the acquisitions with ¹¹C and ⁶⁸Ga using a 20 min acquisition time (Additional file 1: Data S1). The effect of lower counting statistics can be seen well with ¹¹C and ⁶⁸Ga which show increased %RC when using 20 min acquisition time (Additional file 1: Fig. S3), indicating that %RC is positively biased due to lower counting statistics. When using radionuclide-specific acquisition times, the %RCs are lower (Additional file 1: Fig. S3, Table S1). The effect of producing considerably smoother images with low variance will also positively bias the %RC with the larger rod sizes, as seen from the Supplemental Data (Additional file 1: Tables S5 to S7).

In addition, the measurement of absolute quantification should be performed using a different type of phantom (e.g. a large uniform phantom), as the NEMA protocol does not specify any measurement of absolute quantification from the uniform compartment of the phantom. Using the uniform compartment as a measure of absolute recovery might be limited in accuracy, giving only a rough estimate of the quantitative accuracy of the systems. There are also additional measurements which could be considered, e.g. acquiring the phantom at multiple off-axis positions to study the effect of resolution non-uniformity with multiple radionuclides. In these cases, the measurements would need to be repeated with PSF correction turned on and off, as PSF correction tends to reduce the resolution non-uniformity across the FOV. However, the NEMA standard does not specifically recommend performing off-axis image quality measurements.

Finally, the comparison between systems in this study is hampered by their different sensitivities and by the fact that specific calibration factors would need to be applied between the systems to enable an unbiased, more straightforward comparison of system-to-system performance. Thus, the protocol used in this study allows the effect of radionuclides to be studied reliably only within a specific system. To account for the different sensitivities of the systems in the study, the acquisition time might need to be adjusted experimentally according to the sensitivity of each system. As can be noted from the total counts collected afterward from each of the systems, decreasing the acquisition time on Molecubes and increasing the acquisition time on RAYCAN by a suitable factor might compensate for the sensitivity differences between the systems. Another option would be for the manufacturer of each system to implement a protocol to acquire by counts. A specific calibration protocol should be applied beforehand to take into account not only the different sensitivities but also the calibration differences, reconstruction and data corrections implemented in the systems.
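A first-order version of the proposed acquisition time adjustment can be sketched as follows. It assumes that the collected counts scale linearly with sensitivity and acquisition time, ignoring decay during the scan, dead time and randoms. The system labels and sensitivity values are hypothetical placeholders, not the sensitivities of the three systems studied.

```python
def matched_acquisition_time(reference_time_min, reference_sensitivity,
                             system_sensitivity):
    """Acquisition time that yields the same counts as the reference system,
    assuming counts are proportional to sensitivity times acquisition time."""
    if reference_sensitivity <= 0 or system_sensitivity <= 0:
        raise ValueError("sensitivities must be positive")
    return reference_time_min * reference_sensitivity / system_sensitivity


# Hypothetical peak sensitivities (fraction of decays detected), for illustration.
SENSITIVITY = {"system A": 0.02, "system B": 0.06, "system C": 0.12}
REFERENCE = "system B"
REFERENCE_TIME_MIN = 20.0

matched = {
    name: matched_acquisition_time(REFERENCE_TIME_MIN, SENSITIVITY[REFERENCE], value)
    for name, value in SENSITIVITY.items()
}

for name, minutes in matched.items():
    print(f"{name}: {minutes:.1f} min")
```

In this example the least sensitive system scans three times longer and the most sensitive one half as long as the reference. In practice the factor would have to be verified against the total counts actually collected.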

Furthermore, as all of the systems are calibrated routinely using ¹⁸F only, a radionuclide-specific calibration protocol might be desired and the systems should be cross calibrated before assessment of quantification results between the systems. This is reflected by our quantification results (Table 7), where variation is high between systems even with ¹⁸F. However, in routine preclinical or clinical imaging the calibrations are generally performed with either ¹⁸F only or a ⁶⁸Ge solid phantom and there are no specific calibration factors for each radionuclide, although they might be beneficial for quantitative accuracy.

Future directions

In summary, the factors that contributed most to PET image quality and to the estimated parameters were the radionuclide-specific positron fraction, physical half-life and positron range. Variation is introduced by system-specific qualities, including system sensitivity, spatial resolution, reconstruction algorithm and the implemented data corrections for randoms and scatter. In our supplemental data (Additional file 1: Data S1), we have highlighted that %RC measurements with non-¹⁸F radionuclides will be biased unless the differences in positron fraction and physical half-life are accounted for. Moreover, in line with the studies of Disselhorst et al. and Liu et al., we determined that the positron range is a limiting factor for the recovery of small targets (%RC) and for the resulting spill-over into cold regions surrounded by activity [9, 10]. These effects were most evident with ⁶⁸Ga.
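One way to account for the positron fraction and physical half-life is to choose the acquisition time so that the number of emitted positrons matches that of a reference ¹⁸F scan. The sketch below illustrates the principle only. It assumes the same starting activity for every radionuclide and a 20 min ¹⁸F reference acquisition, and it uses approximate literature values for half-lives and positron fractions that are not taken from this paper. The durations it returns are therefore not the radionuclide-specific acquisition times used in the study.

```python
import math

# Approximate literature values (assumptions, not taken from this paper):
# (physical half-life in minutes, positron fraction).
RADIONUCLIDES = {
    "11C": (20.4, 0.998),
    "18F": (109.8, 0.967),
    "68Ga": (67.7, 0.889),
    "89Zr": (78.4 * 60.0, 0.227),
}


def emitted_positrons(half_life_min, positron_fraction, duration_min):
    """Time-integrated positron emission per unit starting activity (minutes)."""
    lam = math.log(2.0) / half_life_min
    return positron_fraction * (1.0 - math.exp(-lam * duration_min)) / lam


def matched_duration(nuclide, reference="18F", reference_duration_min=20.0):
    """Duration giving the same number of positrons as the reference scan."""
    ref_half_life, ref_fraction = RADIONUCLIDES[reference]
    target = emitted_positrons(ref_half_life, ref_fraction, reference_duration_min)
    half_life, fraction = RADIONUCLIDES[nuclide]
    lam = math.log(2.0) / half_life
    decayed_share = target * lam / fraction  # share of nuclei that must decay
    if decayed_share >= 1.0:
        raise ValueError("target cannot be reached with this starting activity")
    return -math.log(1.0 - decayed_share) / lam


durations = {name: matched_duration(name) for name in RADIONUCLIDES}
for name, minutes in durations.items():
    print(f"{name}: {minutes:.1f} min")
```

Under these assumptions ⁸⁹Zr needs by far the longest scan because of its low positron fraction, while ¹¹C needs a longer scan than ¹⁸F mainly because of its short half-life.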

The limitations imposed by positron range become more evident with the increasing resolution of modern preclinical PET/CT systems and with radionuclides with long positron range such as ⁶⁸Ga. Therefore, a performance benefit would be gained from a method for positron range correction. Currently, none of the systems has such a method available. So far, positron range corrections have been used in research settings, where the methods are based on deblurring with an appropriate kernel in the reconstruction. For a short review of the methodologies, we refer to two recent papers [24, 25]. In short, methods using spatially-invariant kernels are simple to implement, but are mainly effective for uniform media [24]. For heterogeneous media, spatially-variant anisotropic kernels need to be implemented, in addition to taking into account the different densities of tissues [25]. To date, the authors are aware of only one study [26] which applied a positron range correction method on a preclinical PET/CT system with in-vivo data.

The results presented indicate that there is still considerable room for both optimization and standardization when using different radionuclides. In terms of optimization, the selected reconstruction parameters, e.g. image matrix size, filtering and the number of iterations, will affect the %RC, %SD and SOR parameters (Additional file 1: Data S2). For our evaluation, we applied the default parameters, although they varied between the systems. We did not tune these parameters specifically for different radionuclides or systems, and we believe that our results therefore better reflect imaging performance at baseline. Further optimization studies to achieve optimal image quality with different radionuclides are encouraged. In practice, tuning the reconstruction options across systems might be limited by the options available on each system. In general, few modifiable options are available, such as matrix or pixel size, algorithm, the number of iterations and filtering.

Concerning standardization, we believe that the variation seen in the maximal %RC and in the %SD in the uniform compartment could be greatly reduced if a specific harmonization protocol was applied. By harmonizing calibrations, acquisition and image reconstruction parameters, and possibly by post-processing (e.g. filtering) the images, the noise structure, contrast recovery and resolution properties could be made more uniform among different imaging systems. For example, ensuring that sufficient counting statistics are collected, reconstructing images with a smaller matrix size and applying filters would result in reduced %SD. However, as stated by Hallen et al., the optimal %RC would need to be determined carefully in comparison with uniformity, as the overall image quality performance is a trade-off between uniformity and recovery [22]. This can be seen from the Molecubes data reconstructed with different reconstruction options, two different algorithms and a variable number of iterations (Additional file 1: Fig. S4, Data S2). As this will require a careful study of the effect of different combinations of reconstruction parameters on the NEMA image quality metrics for each system and each radionuclide specifically, further studies on the effect of harmonizing both the acquisition and reconstruction parameters are encouraged, to minimize the variation between different preclinical systems in multi-radionuclide studies.

Conclusions

Our study has highlighted several factors that affect the image quality parameters when using preclinical PET/CT systems for multi-radionuclide imaging and NEMA image quality measurements at baseline performance. These factors need to be addressed in further standardization attempts between different radionuclides and different PET/CT systems, and when using the NEMA image quality protocol. System-, acquisition- and radionuclide-dependent qualities were identified as affecting the image quality parameters measured by the NEMA image quality protocol and should be accounted for by applying a specific calibration and harmonization protocol when conducting multi-center or multi-system studies in preclinical imaging. As this study was performed only in a single-center, single-system evaluation setting, further attempts at harmonizing system performance in multi-system and multi-center studies are needed.

In general, we noted that most of the systems performed best in terms of NEMA image quality parameters when using ¹⁸F, that ¹¹C and ⁸⁹Zr performed slightly worse than ¹⁸F, and that performance was poorest when using ⁶⁸Ga. As ⁶⁸Ga produced the lowest %RC, the largest $\% {\text{STD}}_{{{\text{RC}}}}$ and an increased SOR, further optimization of system performance would be beneficial for ⁶⁸Ga as well as for other radionuclides with long positron ranges. As modern preclinical PET/CT systems are close to sub-millimeter resolution, it is important to take the positron range into account to improve image quality with these radionuclides. A large variation in image quantification was also seen between the systems, which calls for further optimization not only of image quality but also of image quantification. This would require assessment of the accuracy of system calibration, data corrections and image quantification with different radionuclides.

Availability of data and materials

The datasets used and analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

%RC: Percent recovery coefficient
%STD, %SD: Percentage standard deviation
[¹⁸F]FDG: [¹⁸F]Fluorodeoxyglucose
3D: Three-dimensional
FOV: Field-of-view
FBP: Filtered backprojection
GPU: Graphics processing unit
LSO: Lutetium oxyorthosilicate
LYSO: Lutetium–yttrium oxyorthosilicate
NEMA: National Electrical Manufacturers Association
OSEM: Ordered subsets expectation maximization algorithm
PET/CT: Positron emission tomography/computed tomography
PSF: Point spread function
SOR: Spill-over ratio
SP-MAP: Shifted Poisson maximum a posteriori reconstruction
VOI: Volume of interest

References

1. Mannheim JG, Schmid AM, Pichler BJ. Influence of Co-57 and CT transmission measurements on the quantification accuracy and partial volume effect of a small animal PET scanner. Mol Imaging Biol. 2017;19:825–36.
2. Mannheim JG, Kara F, Doorduin J, et al. Standardization of small animal imaging-current status and future prospects. Mol Imaging Biol. 2018;20:716–31.
3. Kuntner C, Stout DB. Quantitative preclinical PET imaging: opportunities and challenges. Front Phys. 2014;2:12. https://doi.org/10.3389/fphy.2014.00012.
4. Mannheim JG, Mamach M, Reder S, et al. Reproducibility and comparability of preclinical PET imaging data: a multi-center small animal PET study. J Nucl Med. 2019;60:1483–91.
5. Goertzen AL, Bao Q, Bergeron M, et al. NEMA NU 4–2008 comparison of preclinical PET imaging systems. J Nucl Med. 2012;53:1300–9.
6. National Electrical Manufacturers Association. NEMA standards publication NU 4–2008: performance measurements of small animal positron emission tomographs. Rosslyn, VA: National Electrical Manufacturers Association; 2008.
7. Osborne DR, Kuntner C, Berr S, Stout D. Guidance for efficient small animal imaging quality control. Mol Imaging Biol. 2017;19:485–98.
8. Conti M, Eriksson L. Physics of pure and non-pure positron emitters for PET: a review and a discussion. EJNMMI Phys. 2016;3:8.
9. Disselhorst JA, Brom M, Laverman P, et al. Image-quality assessment for several positron emitters using the NEMA NU 4–2008 Standards in the Siemens Inveon small-animal PET scanner. J Nucl Med. 2010;51:610–7.
10. Liu X, Laforest R. Quantitative small animal PET imaging with nonconventional nuclides. Nucl Med Biol. 2009;36:551–9.
11. Attarwala AA, Karanja YW, Hardiansyah D, et al. Untersuchung der bildgebenden Eigenschaften des ALBIRA II Kleintier-PET-Systems für 18F, 68Ga und 64Cu. Z Med Phys. 2017;27:132–44.
12. Cañadas M, Sanz ER, Vives MO, et al. Performance evaluation for 68Ga and 18F of the ARGUS small-animal PET scanner based on the NEMA NU-4 standard. In: Nuclear science symposium conference record 2010; 3454–7.
13. Gaitanis A, Kastis GA, Vlastou E, Bouziotis P, Verginis P, Anagnostopoulos CD. Investigation of image reconstruction parameters of the Mediso nanoScan PC small-animal PET/CT scanner for two different positron emitters under NEMA NU 4–2008 Standards. Mol Imaging Biol. 2017;19:550–9.
14. Presotto L, Spangler-Bickell M, Belloli S, et al. 3D Spatial resolution proprieties of Molecubes β-Cube: characterization with different isotopes. In: 2019 IEEE nuclear science symposium and medical imaging conference NSS/MIC 2019. 2019;2019–20.
15. Bao Q, Newport D, Chen M, Stout DB, Chatziioannou AF. Performance evaluation of the Inveon dedicated PET preclinical tomograph based on the NEMA NU-4 standards. J Nucl Med. 2009;50:401–8.
16. Krishnamoorthy S, Blankemeyer E, Mollet P, Surti S, Holen RV, Karp JS. Performance evaluation of the MOLECUBES β-CUBE—a high spatial resolution and high sensitivity small animal PET scanner utilizing monolithic LYSO scintillation detectors. Phys Med Biol. 2018;63:155013.
17. Teuho J, Han C, Riehakainen L, et al. NEMA NU 4–2008 and in vivo imaging performance of RAYCAN trans-PET/CT X5 small animal imaging system. Phys Med Biol. 2019;64:115014.
18. Watson CC, Newport D, Casey ME. A single scatter simulation technique for scatter correction in 3D PET. Three-dimensional image reconstruction in radiology and nuclear medicine. Dordrecht: Springer; 1996. p. 255–68.
19. Brasse D, Kinahan PE, Lartizien C, Comtat C, Casey M, Michel C. Correction methods for random coincidences in fully 3D whole-body PET: impact on data and image quality. J Nucl Med. 2005;46:859–67.
20. Liu J, Kao C-M, Gu S, Xiao P, Xie Q. A PET system design by using mixed detectors: resolution properties. Phys Med Biol. 2014;59:3517.
21. Qi J, Leahy RM, Cherry SR, Chatziioannou A, Farquhar TH. High-resolution 3D Bayesian image reconstruction using the microPET small-animal scanner. Phys Med Biol. 1998;43:1001–13.
22. Hallen P, Schug D, Schulz V. Comments on the NEMA NU 4–2008 Standard on performance measurement of small animal positron emission tomographs. EJNMMI Phys. 2020;7.
23. Gong K, Cherry SR, Qi J. On the assessment of spatial resolution of PET systems with iterative image reconstruction. Phys Med Biol. 2016;61:193–202.
24. Cal-González J, Pérez-Liva M, Herraiz JL, Vaquero JJ, Desco M, Udías JM. Tissue-dependent and spatially-variant positron range correction in 3D PET. IEEE Trans Med Imaging. 2015;34:2394–403.
25. Emond EC, Groves AM, Hutton BF, Thielemans K. Effect of positron range on PET quantification in diseased and normal lungs. Phys Med Biol. 2019;64.
26. Cal-Gonzalez J, Vaquero JJ, Herraiz JL, et al. Improving PET quantification of small animal [68Ga]DOTA-Labeled PET/CT studies by using a CT-based positron range correction. Mol Imaging Biol. 2018;20:584–93.


Acknowledgements

The authors want to express their thanks to RAYCAN, Siemens Healthcare (Martha Moryson) and Molecubes NV (Pieter Mollet) for the technical assistance in NEMA measurements and analysis.

Funding

This work was supported in part by the Academy of Finland, Finnish Centre of Excellence in Cardiovascular and Metabolic Diseases, supported by the Academy of Finland, University of Turku, Turku University Hospital, and Åbo Akademi University. This study was also financially supported by grants from the Jane and Aatos Erkko Foundation, National Natural Science Foundation of China and Ministry of Science and Technology of the People’s Republic of China. These funding sources had no role in the design of this study and will not have any role during its execution, analyses, interpretation of the data, or decision to submit results.

Author information

Authors and Affiliations

Turku PET Centre, University of Turku, Turku, Finland
Jarmo Teuho, Leon Riehakainen, Olli Moisio, Marko Tirri, Tove J. Grönroos, Anne Roivainen, Juhani Knuuti & Riku Klén
Turku PET Centre, Turku University Hospital, Turku, Finland
Jarmo Teuho, Aake Honkaniemi, Chunlei Han, Anne Roivainen & Juhani Knuuti
Department of Biomedicine, University of Turku, Turku, Finland
Marko Tirri & Mika Teräs
RaySolution Digital Medical Imaging Co., Ltd, Ezhou, People’s Republic of China
Shihao Liu
MediCity Research Laboratory, University of Turku, Turku, Finland
Tove J. Grönroos
School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, People’s Republic of China
Jie Liu, Xiao Liang, Yiqing Ling, Yuexuan Hua, Qingguo Xie & Nicola D’Ascenzo
School of Software Engineering, Huazhong University of Science and Technology, Wuhan, People’s Republic of China
Lin Wan
Turku Center for Disease Modeling, University of Turku, Turku, Finland
Anne Roivainen
Department of Medical Physics and Engineering, Istituto Neurologico Mediterraneo NEUROMED I.R.C.C.S., Pozzilli, Italy
Qingguo Xie & Nicola D’Ascenzo
Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
Qingguo Xie
Department of Medical Physics, Turku University Hospital, Turku, Finland
Mika Teräs

Contributions

JT and LR participated in the conception and design of the study, data analysis and interpretation. LR, AH and MTi conducted the data collection. OM was responsible for the production of ⁸⁹Zr. CH, TG and AR participated in study design and interpretation of the results. SL was responsible for technical assistance in conducting the measurements on the RAYCAN system, with support in use of the system. TG participated in study design, interpretation of the results and communication with PET system vendors. JL and LW maintained the software installed on the RAYCAN system. XL initially verified the RAYCAN system against the NEMA performance standard. YL maintained the acquisition software of the RAYCAN system. YH was responsible for technical assistance in conducting the measurements on the RAYCAN system, with support in use of the system. JK, QX, MTe, ND and RK were involved in the drafting and revision of the work and participated as principal investigators of the study. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Jarmo Teuho.

Ethics declarations

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

SL is an employee of RaySolution Digital Medical Imaging Co. The authors JT, LR, AH, OM, CH, MTi, TG, JL, LW, XL, YL, YH, AR, JK, QX, MTe, ND and RK declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1: Fig. S1. Construction scheme of the NEMA phantom with a photograph of the phantom. Data S1. Effect of using the standard 20 minute acquisition time for ¹¹C and ⁶⁸Ga. Data S1, Table S1. Recovery coefficients of the phantom rods with different radionuclides and acquisition times. Data S1, Table S2. Percentage standard deviation of recovery coefficients of the phantom rods with different radionuclides and acquisition times. Data S1, Table S3. The mean activity and the percentage standard deviation measured from the uniform compartment of the phantom with different acquisition times. Data S1, Fig. S2. Image quality comparison for ¹¹C and ⁶⁸Ga using the 20 minute acquisition time. Data S1, Fig. S3. Recovery coefficients with ¹¹C and ⁶⁸Ga only. Data S2. Effect of different reconstruction algorithms and parameters. Data S2, Table S4. Image reconstruction parameters and data corrections used for the reconstruction evaluation. Data S2, Fig. S4. Dependency of image quality parameters on the amount of iterations. Data S2, Table S5. Recovery coefficients of the phantom rods with different radionuclides and reconstruction options. Data S2, Table S6. Percentage standard deviation of recovery of the phantom rods with different radionuclides and reconstruction options. Data S2, Table S7. The mean radioactivity with percentage standard deviation and relative difference in the uniform compartment with the spill-over-ratios and their percentage standard deviations with different reconstruction schemes.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Teuho, J., Riehakainen, L., Honkaniemi, A. et al. Evaluation of image quality with four positron emitters and three preclinical PET/CT systems. EJNMMI Res 10, 155 (2020). https://doi.org/10.1186/s13550-020-00724-z

Received: 05 March 2020
Accepted: 21 October 2020
Published: 10 December 2020
DOI: https://doi.org/10.1186/s13550-020-00724-z
