Quantitative and clinical impact of MRI-based attenuation correction methods in [18F]FDG evaluation of dementia

Background: Positron emission tomography/magnetic resonance imaging (PET/MRI) is a promising imaging tool for the diagnosis of dementia, as PET can add complementary information to the routine imaging examination with MRI. The purpose of this study was to evaluate the influence of MRI-based attenuation correction (MRAC) on diagnostic assessment of dementia with [18F]FDG PET. Quantitative differences in both [18F]FDG uptake and z-scores were calculated for three clinically available (DixonNoBone, DixonBone, UTE) and two research MRAC methods (UCL, DeepUTE) compared to CT-based AC (CTAC). Furthermore, diagnoses based on visual evaluations were made by three nuclear medicine physicians and one neuroradiologist (PETCT, PETDeepUTE, PETDixonBone, PETUTE, PETCT + MRI, PETDixonBone + MRI). In addition, pons and cerebellum were compared as reference regions for normalization.
Results: The mean absolute difference in z-scores between MRAC and CTAC was smallest with cerebellum as reference region: 0.15 ± 0.11 σ (DeepUTE), 0.15 ± 0.12 σ (UCL), 0.23 ± 0.20 σ (DixonBone), 0.32 ± 0.28 σ (DixonNoBone), and 0.54 ± 0.40 σ (UTE). In the visual evaluation, the diagnoses agreed with PETCT in 74% (PETDeepUTE), 67% (PETDixonBone), and 70% (PETUTE) of the patients, while PETCT + MRI agreed with PETDixonBone + MRI in 89% of the patients.
Conclusion: The research MRAC methods performed close to CTAC in the quantitative evaluation of [18F]FDG uptake and z-scores. Among the clinically implemented MRAC methods, DixonBone should be preferred for diagnostic assessment of dementia with [18F]FDG PET/MRI. However, as artifacts occur in DixonBone attenuation maps, the maps must be visually inspected to ensure proper quantification.
Electronic supplementary material: The online version of this article (10.1186/s13550-019-0553-2) contains supplementary material, which is available to authorized users.


Background
Magnetic resonance imaging (MRI) is today the preferred imaging modality in the clinical workup of suspected neurodegenerative disease due to its high spatial resolution and high soft tissue contrast. MRI can identify atrophy in dementia and exclude other diseases like vascular disease, cerebral amyloid angiopathy, brain tumors, and traumatic as well as inflammatory brain changes [1]. Positron emission tomography (PET) with fluorodeoxyglucose ([18F]FDG) is however increasingly used to support the clinical diagnosis of patients with suspected dementia, as hypometabolism in certain brain regions can help identify specific types of dementia, including Alzheimer's disease (AD) and frontotemporal dementia (FTD) [2]. PET has a higher sensitivity for detecting early metabolic changes, which take place prior to the morphological changes visible on MRI [1]. Hybrid PET/MRI systems have opened up the opportunity for simultaneous PET/MRI acquisitions, enabling fast and convenient examinations for patients with dementia. The information from PET and MRI is complementary, and detection of dementia with the combination of [18F]FDG PET and MRI is more accurate than with either of the imaging modalities alone [3].
As a complement to the visual assessment of hypometabolism in PET images performed by nuclear medicine physicians, PET data can be compared to databases of age-matched healthy controls. Z-score maps are then calculated, which represent the number of standard deviations (σ) separating the [18F]FDG uptake of the patient from the average of the healthy controls. Moderate hypometabolism is defined as a z-score between −2 σ and −3 σ, and severe hypometabolism as a z-score below −3 σ [4]. A prerequisite for using such quantitative comparisons clinically is quantitatively accurate PET images, which are heavily dependent on attenuation correction (AC). AC is one of the most important corrections that need to be performed on PET images, but it is still challenging when using a PET/MRI system.
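The z-score definition and the thresholds quoted above can be illustrated with a minimal sketch. The control values, the function names, and the use of the sample standard deviation are assumptions made for illustration; they are not taken from the software used in this study.

```python
from statistics import mean, stdev

def z_score(patient_uptake, control_uptakes):
    """Number of control standard deviations separating patient and control mean."""
    # Assumption: sample standard deviation of the control group.
    return (patient_uptake - mean(control_uptakes)) / stdev(control_uptakes)

def classify_hypometabolism(z):
    """Thresholds from the text: -2 to -3 sigma is moderate, below -3 sigma is severe."""
    if z < -3.0:
        return "severe"
    if z <= -2.0:
        return "moderate"
    return "none"

# Hypothetical normalized uptake in one brain region for five healthy controls.
controls = [1.00, 1.10, 0.90, 1.05, 0.95]
z = z_score(0.80, controls)
label = classify_hypometabolism(z)
print(f"z = {z:.2f} sigma, hypometabolism: {label}")
```

In a real database comparison the control statistics would be age-matched and computed per region or per voxel.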
For PET/computed tomography (CT) systems, AC is based on CT images (CTAC), which are scaled by a bilinear function to represent the linear attenuation coefficients (LACs) of the 511 keV photons. For PET/MRI systems, alternative methods had to be developed in order to calculate attenuation maps from MRI data since there is no direct relation between the MRI signal and the electron density of tissue. Several proposed brain MRI-based AC (MRAC) methods have demonstrated a small and acceptable bias from CTAC (regional difference within ± 5%) [5]. Most of these promising methods are however not implemented in clinical systems, except for Dixon with bone model, which recently became available on the Siemens PET/MRI system (software VE11P). A few studies have compared clinically implemented and research MRAC methods with CTAC in the evaluation of cognitive impairment. Cabello et al. [6] compared Dixon-based (without bone) AC and ultrashort echo time (UTE) AC with four novel MRAC methods. They concluded that Dixon- and UTE-based AC were inferior to the research MRAC methods, both when measuring [18F]FDG uptake and in z-score accuracy for identifying regions with reduced metabolism, compared to CTAC. These findings need to be re-evaluated after the recent software upgrade with modifications to the Dixon and UTE sequences.
The most relevant clinical issue is whether MRAC has an impact on clinical neurodegenerative diagnosis. Werner et al. [7] found that the pattern of hypometabolism remained largely unchanged with Dixon and that the clinical impact was negligible compared to CTAC. Franceschi et al. [8] found similar performance for Dixon and the prototype of Dixon with bone model in visually identifying hypometabolism without z-scores, and also concluded that even Dixon is acceptable for routine clinical evaluation of dementia. Still, the quantitative errors should be further reduced, and MRAC methods that better imitate CTAC are warranted.
Another factor that can potentially impact the presence of hypometabolism is the choice of reference region. In the comparison to the database of healthy controls, the [18F]FDG uptake is normalized to a reference region, which should be unaffected by the disease. The most commonly used reference regions in dementia evaluations are cerebellum and pons, and incorrect AC in these regions can induce a bias in the [18F]FDG uptake affecting z-scores throughout the brain. The accuracy of the MRAC methods in the reference region is thus important and should be investigated further.
The aim of this study was to assess the quantitative and clinical impact of the implemented MRAC methods (Dixon, Dixon with bone model, UTE) in [18F]FDG PET evaluation of dementia on the Siemens Biograph PET/MRI scanner. Two research MRAC methods (DeepUTE and UCL) presented in the literature were also included for comparison, in addition to CTAC as reference. A secondary aim was to investigate how the choice of reference region influenced the z-scores quantitatively.

Patients
Twenty-seven consecutive patients with suspected dementia were referred to brain PET/CT and PET/MRI examinations. Nine patients were excluded from this study due to incorrect anatomical position during the PET/CT examination (n = 5), misregistration of bone in Dixon Bone MRAC (n = 2) (the artifacts could not be removed manually and a new Dixon acquisition was not performed), aliasing in MRI scans (n = 1), and problems with co-registration of PET images to the MNI PET template (n = 1). The 18 patients included had a mean age of 69 ± 9 years and a mean weight of 75 ± 16 kg. Patient characteristics and the proposed diagnosis made by a nuclear medicine physician and a neuroradiologist based on PET/CT and MR imaging and clinical referral text are given in Table 1. The study was approved by the Regional Committee for Ethics in Medical Research (REC Central) (ref. number: 2013/1371) and all patients gave written informed consent.

Image acquisition and reconstruction
Image acquisition was performed on a Biograph mCT PET/CT system (software version VG51C), and subsequently on a Biograph mMR PET/MRI system (software version VE11P) (Siemens Healthcare GmbH, Erlangen, Germany). All patients fasted at least 6 h prior to intravenous injection of [18F]FDG (210 ± 46 MBq). The patients were kept blindfolded in a quiet room during the uptake phase prior to the PET/CT examination, which was performed 35 ± 1 min post injection (p.i.), followed by the PET/MRI examination, performed 64 ± 9 min p.i.
Only the low-dose CT scan and the corresponding attenuation map were used (as reference) from the PET/CT examination. The PET (20 min) and MRI (17 min) acquisitions were performed simultaneously, and the MRI protocol consisted of the same sequences as in the clinical MRI protocol for patients with suspicion of dementia (sagittal 3D T1 MPRAGE, coronal T2, transversal FLAIR, GRE T2* (microhemorrhage), and diffusion weighted imaging) in addition to the MRI sequences for MRAC: a high-resolution two-point Dixon VIBE and a UTE sequence. All PET reconstructions were performed on the mMR system using 3D OSEM reconstruction (three iterations and 21 subsets, 344 × 344 image matrix, 4 mm Gaussian filter) and corrections for scatter, randoms, detector normalization, decay, and attenuation.

Attenuation maps
PET data acquired at the PET/MRI system were reconstructed with the following five MR attenuation maps and a CT attenuation map (presented in Fig. 4) for each patient:
1. Dixon NoBone: Implemented at the mMR system. Segmentation-based method that relies on the two-point Dixon VIBE sequence (Brain HiRes), where air, fat, and soft tissue are segmented and assigned predefined discrete LACs (air: 0 cm⁻¹, fat: 0.0854 cm⁻¹, fat/soft tissue mix: 0.0927 cm⁻¹, and soft tissue: 0.1000 cm⁻¹).
2. Dixon Bone: Implemented at the mMR system (product in the latest software, VE11P). Similar to Dixon NoBone, but includes continuous bone information from an integrated bone atlas by registration of MR images of the subject to MR images of the atlas [9,10]. The atlas contains sets of pre-aligned MR image and bone mask pairs with bone densities as LACs in cm⁻¹ at the PET energy level of 511 keV.
3. UTE: Implemented at the mMR system.
4. UCL: All MRI data sets of the atlas are non-rigidly registered to the patient's MRI data and normalized correlation coefficients are calculated at each voxel. A pseudo CT is then calculated from averaged weights of the CT data sets based on the correlation coefficients. In this study, T1-weighted MPRAGE was used as input in a web-based tool, after bias correction with FMRIB Software Library (FSL, Oxford Centre for Functional MRI of the Brain, UK), as recommended by the distributor. The returned UCL attenuation map in Hounsfield units (HU) was converted to LACs [14] and smoothed with a 4 mm Gaussian filter.
5. DeepUTE: Artificial intelligence approach to MRAC, using a deep learning algorithm [15]. Briefly, the method uses a modified 3D U-net architecture [16] for image-to-image learning of paired UTE and CT data. Compared to [15], the network was here trained using data from 832 adult examinations.
6. CT: Attenuation map generated by converting a low-dose CT scan on the mCT scanner to LACs [14]. The bed and head holder were excluded from the CT attenuation map by making a semi-automatic head mask (CT head mask) with the software MRIcron [17], and the attenuation map was multiplied by 10,000 to get the same order of magnitude as the MRAC maps at the mMR system.
The CT attenuation maps did not cover the neck region sufficiently for attenuation correction of the PET data from the PET/MRI system due to differences in the axial field of view of the PET detectors. The area outside the CT head mask was therefore substituted with the Dixon Bone attenuation map for each patient. The CT image was rigidly registered to the Dixon in-phase image and the same transformation was performed on the CT attenuation map.
The same voxels that were substituted by Dixon Bone in the CT attenuation maps were also substituted by Dixon Bone in all evaluated MR attenuation maps. In order to perform this voxel substitution, the UTE TE 2 image and the T1w MPRAGE image were registered to the Dixon in-phase image, and the resulting transformations were applied to the respective attenuation maps. All registrations were performed with Aliza Medical Imaging 1.35.3 (Bonn, Germany) (using elastix version 4.8 [18,19]) [20]. To enable import of the modified attenuation maps at the PET/MRI system, all attenuation maps used the header file of Dixon Bone with exchange of the pixel data.
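The scaling and voxel substitution described above can be sketched as follows. This is an illustrative outline only: the function name, the flat voxel lists, and the example values are hypothetical, and the maps are assumed to be already co-registered to the Dixon in-phase image.

```python
def composite_attenuation_map(ct_lac, dixon_bone_scaled, head_mask, scale=10000):
    """Keep scaled CT values inside the head mask, Dixon Bone values outside it.

    ct_lac            -- CT-derived LACs in cm^-1 (hypothetical flat voxel list)
    dixon_bone_scaled -- Dixon Bone map, assumed already stored as LAC x 10,000
    head_mask         -- True where the voxel lies inside the CT head mask
    """
    combined = []
    for ct_value, dixon_value, inside_head in zip(ct_lac, dixon_bone_scaled, head_mask):
        if inside_head:
            combined.append(round(ct_value * scale))  # same magnitude as MRAC maps
        else:
            combined.append(dixon_value)              # substituted voxel
    return combined

# Four hypothetical voxels: air, soft tissue, bone, and a neck voxel outside the mask.
ct_lac = [0.0, 0.096, 0.151, 0.0]
dixon_bone = [0, 1000, 1000, 854]
head_mask = [True, True, True, False]
mu_map = composite_attenuation_map(ct_lac, dixon_bone, head_mask)
print(mu_map)
```

Applying the same mask to every MR attenuation map ensures that the compared reconstructions differ only inside the head.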

Quantitative analysis
Bone artifacts
After the software upgrade (from VB20P to VE11P) of the PET/MRI system, bone artifacts have been observed in the Dixon Bone and UTE attenuation maps. Two of the most severe artifacts seen in the attenuation maps are misplacement of bone segments from other parts of the body in the Dixon Bone maps and bone present inside the brain near the anterior ventricles in the UTE maps. The Dixon Bone and UTE attenuation maps were therefore visually inspected for these artifacts.
The [18F]FDG uptake in all PET reconstructions was measured in 15 brain regions that were chosen to match the brain regions in the software used for z-score analysis and visual assessment. The regions were in MNI space and taken from the Harvard-Oxford Cortical Structural Atlas, MNI Structural Atlas, and Talairach Daemon Labels in FSL (Oxford Centre for Functional MRI of the Brain, UK). The PET images of the patients were converted to MNI space by co-registration to a dementia-specific [18F]FDG-PET template [21,22]. The PET DixonBone was first registered with elastix to the PET template in a two-step process (rigid and non-rigid registration), and the resulting transform was used on the other five PET images of the same patient for transformation to MNI space. Relative difference (RD) was calculated in each brain region, and was defined as
\[ \mathrm{RD}\,(\%) = \frac{\mathrm{PET}_{\mathrm{MRAC}} - \mathrm{PET}_{\mathrm{CTAC}}}{\mathrm{PET}_{\mathrm{CTAC}}} \times 100 \]
where PET MRAC and PET CTAC are the average activity measured in a brain region in PET MRAC and PET CTAC, respectively. The results are presented by using the boxplot function in MATLAB (R2017b). Absolute RDs were also calculated and averaged over patients and brain regions as $\overline{\mathrm{RD}}_{\mathrm{abs}}$.
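As a worked illustration, the sketch below computes the regional relative difference and its mean absolute value, assuming RD is the percentage deviation of the MRAC-based regional mean from the CTAC-based one. The function names and uptake values are hypothetical.

```python
def relative_difference(pet_mrac, pet_ctac):
    """Relative difference in percent between MRAC- and CTAC-based regional uptake."""
    return 100.0 * (pet_mrac - pet_ctac) / pet_ctac

def mean_absolute_rd(mrac_by_patient, ctac_by_patient):
    """Average |RD| over all patients and all brain regions."""
    abs_rds = [
        abs(relative_difference(mrac, ctac))
        for mrac_regions, ctac_regions in zip(mrac_by_patient, ctac_by_patient)
        for mrac, ctac in zip(mrac_regions, ctac_regions)
    ]
    return sum(abs_rds) / len(abs_rds)

# Hypothetical mean regional activity: two patients, two brain regions each.
pet_mrac = [[10.5, 7.6], [9.0, 8.0]]
pet_ctac = [[10.0, 8.0], [10.0, 8.0]]
rd_abs = mean_absolute_rd(pet_mrac, pet_ctac)
print(f"mean absolute RD = {rd_abs:.1f}%")
```

Averaging absolute values prevents positive and negative regional errors from cancelling, which is why both the signed and the absolute measure are reported.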

Z-scores
The visual evaluations were performed with the software Cortex ID (GE Healthcare, Waukesha WI, USA), where z-scores were calculated in 26 brain regions. The database consists of 294 healthy controls divided into six age groups, imaged with [18F]FDG PET and using a transmission scan with 68 Ge for attenuation correction. Both cerebellum and pons were used as reference regions in the quantitative analysis. Quantitative comparison of z-scores between PET MRAC (PET DixonBone, PET DixonNoBone, PET UTE, PET UCL, PET DeepUTE) and PET CTAC was performed by calculating the difference, D, and absolute difference, D abs, in each brain region, where
\[ D = z_{\mathrm{MRAC}} - z_{\mathrm{CTAC}}, \qquad D_{\mathrm{abs}} = \left| z_{\mathrm{MRAC}} - z_{\mathrm{CTAC}} \right| \]
and D abs was averaged over patients and brain regions as $\overline{D}_{\mathrm{abs}}$. The boxplot function in MATLAB was used to present the differences in z-scores.
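A minimal sketch of the z-score comparison follows, assuming D is the MRAC z-score minus the CTAC z-score in the same region. The regional z-scores are made-up values, not study data.

```python
def z_differences(z_mrac, z_ctac):
    """Signed (D) and absolute (D_abs) z-score differences per brain region."""
    d = [mrac - ctac for mrac, ctac in zip(z_mrac, z_ctac)]
    d_abs = [abs(value) for value in d]
    return d, d_abs

# Hypothetical z-scores (in sigma) for three brain regions of one patient.
z_ctac = [-2.4, -0.5, 0.3]
z_mrac = [-2.9, -0.4, 0.3]

d, d_abs = z_differences(z_mrac, z_ctac)
mean_d_abs = sum(d_abs) / len(d_abs)
print(f"mean absolute z-score difference = {mean_d_abs:.2f} sigma")
```

A negative D means that the MRAC-based image shows a lower z-score than the CTAC-based image in that region, that is, more apparent hypometabolism.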

Visual evaluation
To limit the number of images in the visual evaluation, MRAC methods were chosen based on the z-score analysis. The best and worst of the clinically implemented MRAC methods were included, in addition to the best research MRAC method. PET CT was used as reference. Three nuclear medicine physicians (brain PET experience; reader 1: 3 years, reader 2: 10 years, reader 3: 1 year) performed the visual assessments individually. Based on PET images and z-scores, the patients were either categorized as normal or diagnosed with AD, FTD, or non-specific pathology (other subtypes of dementia, like DLB, and other patterns of hypometabolism that cannot be explained by image artifacts). The physicians were blinded to AC method and patient ID, and had no information regarding patient history or MRI.
A second reading was made based on PET images, z-scores, and MR images by a nuclear medicine physician (reader 3) and a neuroradiologist (4 years of experience in neuroradiology and European Diploma in NeuroRadiology (EDiNR)) in conjunction. The best clinically implemented MRAC method based on the z-score analysis was chosen for this second visual evaluation, and PET CT was used as reference. The first and second visual evaluations were performed 2 months apart.
PET images and z-scores were evaluated in Cortex ID, while MR images were assessed with the hospital's Picture Archiving and Communication System (PACS; Sectra IDS 7). Cerebellum was chosen as reference region in all visual evaluations.

Quantitative analysis
Bone artifacts
Bone artifacts were observed in 22% (4/18) of the Dixon Bone attenuation maps, while no artifacts were seen in the corresponding Dixon images. New Dixon sequences were acquired for two patients with large bone segments from other parts of the body infiltrating the head (Fig. 1a), resulting in artifact-free attenuation maps (Fig. 1b). Artifacts positioned outside the head (Fig. 1c) were manually removed (Fig. 1d) for the other two patients. Hence, only artifact-free Dixon Bone attenuation maps were included in the study. Furthermore, in 89% (16/18) of the UTE attenuation maps, minor bone artifacts were observed inside the brain close to the anterior ventricles (Fig. 1e). The UTE artifacts were not removed.

[18F]FDG uptake
The mean absolute relative difference (RD abs) in [18F]FDG uptake compared to PET CT was smallest for PET DeepUTE and largest when omitting bone information in PET DixonNoBone (Table 2). PET DixonBone performed similarly to the research MRAC methods, but had a slightly larger range of RD. The relative differences in [18F]FDG uptake for the different brain regions are presented in Fig. 2. Patient 3, with abnormal anatomy (an arachnoid cyst in the posterior fossa), caused most of the outliers seen in Fig. 2. The attenuation maps with corresponding PET images for all reconstructions are shown for this patient in Additional file 1: Figure S1.

Z-scores
The mean absolute difference (D abs) in z-score between CTAC and MRAC was smallest with the research methods (PET DeepUTE and PET UCL), which also had the smallest range. Among the clinically implemented methods, PET DixonBone performed best, closely followed by PET DixonNoBone. The largest D abs was found with PET UTE (Table 3). For all MRAC methods, smaller differences were found for cerebellum than for pons as reference region. Figure 3 shows that the difference in z-scores between CTAC and the MRAC methods was more stable across brain regions for the research methods than for the clinical methods. PET DeepUTE slightly overestimated and PET UCL slightly underestimated the z-scores compared to PET CT for most brain regions for both reference regions (Fig. 3). The clinical MRAC methods (PET DixonBone, PET DixonNoBone, and PET UTE) yielded lower z-scores than PET CT with pons as reference region, and both over- and underestimated z-scores with cerebellum as reference region (Fig. 3). Examples of z-score maps for one patient with dementia are presented in Fig. 4, where increased hypometabolism is especially pronounced for PET UTE with pons as reference region.
In the first visual evaluation (Table 4), the agreement in diagnosis between PET CT and PET DeepUTE, PET DixonBone, and PET UTE was on average 74%, 67%, and 70% for the three readers, respectively (Table 5), and the κ-statistics indicated mostly moderate agreement between PET CT and PET MRAC. The inter-reader agreement was fair for PET CT (κ = 0.30) and slight for PET DeepUTE (κ = 0.17), PET DixonBone (κ = 0.19), and PET UTE (κ = 0.10).
In the second visual evaluation, which also included MRI, PET DixonBone was compared to PET CT. When MRI was included in the assessment, the agreement increased to 89% and the κ-statistics indicated almost perfect agreement (κ = 0.82) (Table 6) according to Landis et al. [23].
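For readers unfamiliar with the agreement measures, the sketch below computes percent agreement and Cohen's κ for two sets of readings and maps κ to the Landis and Koch categories. Cohen's two-rater κ is used here as an assumption for illustration; the exact κ variant applied in the study, in particular for the three-reader comparison, is not specified above. The diagnoses are invented example data.

```python
from collections import Counter

def percent_agreement(readings_a, readings_b):
    """Fraction of patients given the same diagnosis in both readings."""
    return sum(a == b for a, b in zip(readings_a, readings_b)) / len(readings_a)

def cohens_kappa(readings_a, readings_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(readings_a)
    observed = percent_agreement(readings_a, readings_b)
    counts_a, counts_b = Counter(readings_a), Counter(readings_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)  # undefined if expected == 1

def landis_koch(kappa):
    """Verbal interpretation of kappa following Landis and Koch."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"

# Hypothetical diagnoses for ten patients read on PET CT and on PET MRAC.
pet_ct = ["AD", "AD", "FTD", "normal", "normal", "AD", "FTD", "normal", "AD", "normal"]
pet_mrac = ["AD", "AD", "FTD", "normal", "AD", "AD", "FTD", "normal", "AD", "normal"]
kappa = cohens_kappa(pet_ct, pet_mrac)
print(percent_agreement(pet_ct, pet_mrac), round(kappa, 2), landis_koch(kappa))
```

The κ values reported above fall into the expected categories under this mapping, for example 0.30 as fair and 0.82 as almost perfect.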

Discussion
The impact of MRAC on dementia assessment was evaluated in this study by comparing [18F]FDG uptake, z-scores, and clinical interpretation between PET MRAC and PET CT. The mean absolute differences in z-scores were small relative to the definition of hypometabolism for most MRAC methods with cerebellum as reference region, and especially for the research methods. Interpretation with PET alone yielded high uncertainties, while assessment with both PET and MRI resulted in almost perfect agreement between PET CT and PET DixonBone.
The bone artifacts found in the clinically available MRAC methods highlight the need for careful inspection of the attenuation maps in all brain examinations. In the Dixon Bone attenuation maps, the artifacts were caused by misregistration between the Dixon images and the bone template, misplacing large bone segments from other parts of the body in the brain. Due to the severity of these artifacts, they were removed either by performing a new Dixon acquisition free of the artifact, or manually when found outside the brain. Although not evaluated quantitatively, this artifact would likely induce large errors in the attenuation-corrected PET images. The minor bone artifacts observed in most UTE attenuation maps were caused by changes in the UTE sequence and/or attenuation map algorithm after the software upgrade, and persisted even after acquiring new UTE images. Since the UTE attenuation maps are used clinically, they were not excluded from the current study. In clinical routine, a reliable and stable MRAC method is crucial and these problems need to be solved. The attenuation map errors have been reported to Siemens Healthcare, and will hopefully be solved in the near future. In the meantime, some of the artifacts can be avoided by implementing better procedures among radiographers to detect the artifacts and acquire new MR-based attenuation maps in such cases before the patient leaves the scanner table.
The research MRAC methods, as well as PET DixonBone, all demonstrated small absolute differences compared to PET CT regarding [18F]FDG uptake, although the research methods had a smaller RD range. Some outliers were observed in the analysis, and most of them were caused by a patient with abnormal anatomy (arachnoid cyst in the posterior fossa). DeepUTE gave the fewest outliers for this patient (2/15 brain regions), while Dixon Bone and Dixon NoBone yielded the most (10/15 brain regions). For absolute differences in [18F]FDG uptake, the trend was the same as in previous studies [5][6][7]24], with descending performance for PET UCL, PET DixonBone, PET UTE, and PET DixonNoBone (DeepUTE has not been included in previous studies). PET UTE yielded particularly large variations in the pons, probably due to misclassification of bone in that region, which makes pons unsuited as a reference region with UTE AC. Furthermore, we found that the LACs for soft tissue were slightly higher with UCL AC and slightly lower with DeepUTE AC compared to CTAC, which probably caused the general over- and underestimation of [18F]FDG uptake for the two methods, respectively.
For the z-score evaluation, the research MRAC methods yielded the best performance and the differences in z-scores between PET MRAC and PET CT were generally small compared to the definition of hypometabolism, except for PET UTE, when using cerebellum as reference region. Of note, PET DixonBone and PET DixonNoBone yielded similar results for the z-scores, indicating that the missing bone information did not have a marked impact on z-scores. Despite small average differences in z-scores to PET CT for most MRAC methods with cerebellum as reference region, large outliers were present for the clinical MRAC methods with a deviation from PET CT > 1 σ, which can have a considerable impact on a z-score assessment. Hence, it is highly desirable to implement the research methods at the clinical PET/MRI systems as soon as possible to avoid large biases in the z-score assessment.
Fig. 4 Examples of z-score maps for one patient (number 16) for the included AC methods, with pons and cerebellum as reference regions, and the corresponding attenuation maps
Since the calculation of z-scores uses a reference region for normalization, the accuracy of AC in the reference region is of particular importance, as bias in this region will affect the hypometabolism globally. Available reference regions in the software used for visual evaluation in the current study were pons, cerebellum, and global cerebral cortex. The extent of hypometabolism can however be underestimated with global normalization [25]. Therefore, only cerebellum and pons were used in this study, and differences in z-scores between MRAC and CTAC were found to be smaller with cerebellum than pons as reference region for all MRAC methods. Although glucose metabolism in the pons has been found to be least affected by dementia among several reference regions [26], the small size makes this region prone to bias and the surrounding inhomogeneous bone affects both attenuation and scatter [25]. Cerebellum is larger and less prone to bias, and the cerebellar glucose metabolism is not significantly reduced for AD patients, except for severe AD [25,27].
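The global effect of a biased reference region can be illustrated with a small numerical sketch. It assumes a simple ratio normalization (regional uptake divided by reference-region uptake) followed by comparison to control statistics; the function name and all numbers are hypothetical, and the commercial software may differ in detail.

```python
def normalized_z_score(region_uptake, reference_uptake, control_mean, control_sd):
    """z-score of uptake normalized to a reference region (assumed ratio normalization)."""
    ratio = region_uptake / reference_uptake
    return (ratio - control_mean) / control_sd

# Hypothetical control statistics for the normalized uptake in one cortical region.
control_mean, control_sd = 0.90, 0.05

# Same cortical uptake, but the reference region is overestimated by 5%
# in the second case, for example due to an attenuation map error.
z_unbiased = normalized_z_score(8.0, 10.0, control_mean, control_sd)
z_biased = normalized_z_score(8.0, 10.5, control_mean, control_sd)
print(f"unbiased: {z_unbiased:.2f} sigma, biased reference: {z_biased:.2f} sigma")
```

Because every region is divided by the same reference value, a 5% error in the reference shifts all z-scores in the same direction, here by about 0.8 σ in a region that was itself measured correctly.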
Based on our results, Dixon Bone with cerebellum as reference should be preferred among the clinically implemented MRAC methods when assessing z-scores. However, for patients with abnormal anatomy and/or unusual tissue density, atlas-based methods should be used with caution [28]. In these cases, or when bone artifacts are present, Dixon NoBone could probably be used as an alternative for z-score assessment in the evaluation of dementia.
The visual evaluations with PET only yielded moderate agreement between PET CT and PET MRAC in general. The highest agreement was found for PET DeepUTE, but the other MRAC methods performed quite similarly. The fewest false positive errors were found for PET DeepUTE compared to PET CT, while false negative errors were highest for PET DeepUTE and PET DixonBone. These results seem to be in agreement with the quantitative results, as PET UTE underestimated z-scores, inducing false positive errors, while PET DeepUTE slightly overestimated z-scores and tended to change pathology to normal more often than the opposite. However, the inter-reader agreement was low, which indicates that the visual assessment of [18F]FDG PET in dementia is difficult and subjective, and that these evaluations were influenced by factors other than the different AC methods. Another study by Werner et al. [7], evaluating the clinical impact of different AC methods, demonstrated a higher agreement between the readers; however, the categorization of diagnosis was not the same as in our study, which could have caused less discrepancy in their results.
Due to the large discrepancies in the PET-only evaluations, another assessment including MRI was performed. Adding MRI information yielded almost perfect agreement between MRAC and CTAC readings according to the κ-statistics, and in the two cases of discrepancy between PET CT + MRI and PET DixonBone + MRI, the readings differed only in the subtype of dementia. The improvement by including MRI was probably due to the ability to discard areas of hypometabolism caused by other pathologies and normal variants (e.g., age-related atrophy, enlarged ventricles, and mega cisterna magna). Furthermore, information on neurodegenerative processes such as hippocampal atrophy (as seen in AD), focal cortical atrophy (as seen in FTD), and white matter hyperintensities (as seen in microvascular disease) was important complementary information to the PET findings. In a clinical setting with all clinical information and imaging available, the discrepancies between MRAC and CTAC would probably be further decreased, but this should be verified in studies with larger patient cohorts.
A limitation of this study is the small number of patients, and hence few patients having dementia and low diversity in diagnoses and severity. Furthermore, PET images suffer from partial volume effects due to the limited resolution, which causes spill-out from one region to another. This was not corrected for and could cause a significant effect on hypometabolism from normal aging [29]. However, the aim of this study was to compare MRAC and CTAC, and not to establish the exact diagnosis. Another factor that may affect the z-scores is that the PET images in the database of Cortex ID were acquired and reconstructed differently than the PET images in this study. Still, the relative differences between CTAC and MRAC should be unaffected.