Towards performant and reliable undersampled MR reconstruction via diffusion model sampling
arXiv:2203.04292
1: Johns Hopkins University, MD, USA
2: Medical Imaging, Robotics, and Analytic Computing Laboratory and Engineering (MIRACLE) Center, School of Biomedical Engineering & Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China
3: Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, China
Email: {cpeng26,pguo4,vpatel36,rchella4}@jhu.edu, s.kevin.zhou@gmail.com
Towards performant and reliable undersampled MR reconstruction via diffusion model sampling

Cheng Peng (1), Pengfei Guo (1), S. Kevin Zhou (2,3), Vishal Patel (1), Rama Chellappa (1)
Abstract
Magnetic Resonance (MR) image reconstruction from under-sampled acquisition promises faster scanning time. To this end, current State-of-The-Art (SoTA) approaches leverage deep neural networks and supervised training to learn a recovery model. While these approaches achieve impressive performance, the learned model can be fragile on unseen degradation, e.g. when given a different acceleration factor. These methods are also generally deterministic and provide a single solution to an ill-posed problem; as such, it can be difficult for practitioners to understand the reliability of the reconstruction. We introduce DiffuseRecon, a novel diffusion model-based MR reconstruction method. DiffuseRecon guides the generation process based on the observed signals and a pre-trained diffusion model, and does not require additional training on specific acceleration factors. DiffuseRecon is stochastic in nature and generates results from a distribution of fully-sampled MR images; as such, it allows us to explicitly visualize different potential reconstruction solutions. Lastly, DiffuseRecon proposes an accelerated, coarse-to-fine Monte-Carlo sampling scheme to approximate the most likely reconstruction candidate. DiffuseRecon achieves SoTA performance when reconstructing from raw acquisition signals in fastMRI and SKM-TEA. Code will be open-sourced at www.github.com/cpeng93/DiffuseRecon.
1 Introduction
Magnetic Resonance Imaging (MRI) is a widely used medical imaging technique. It offers several advantages over other imaging modalities, such as providing high contrast on soft tissues and introducing no harmful radiation during acquisition. However, MRI is also limited by its long acquisition time due to the underlying imaging physics and machine quality. This leads to various issues ranging from patient discomfort to limited accessibility of the machines.
An approach to shorten MR scanning time is by under-sampling the signal in k-space during acquisition and recovering it by performing a post-process reconstruction algorithm. Recovering unseen signal is a challenging, ill-posed problem, and there has been a long history of research in addressing undersampled MR reconstruction. In general, this problem is formulated as:
$$y_{\text{recon}} = \arg\min_{y} \lVert \mathcal{M}\mathcal{F}y - x_{\text{obs}} \rVert + \lambda R(y), \quad \text{s.t.} \quad x_{\text{obs}} = \mathcal{M}x_{\text{full}}, \tag{1}$$
where $x_{\text{full}}$ and $x_{\text{obs}}$ denote the fully-sampled and under-sampled k-space signals, $\mathcal{M}$ denotes the undersampling mask, and $\mathcal{F}$ denotes the Fourier operator. The goal is to find an MR image $y$ such that its k-space content $\mathcal{M}\mathcal{F}y$ agrees with $x_{\text{obs}}$; this is often known as the data consistency term. Furthermore, $y_{\text{recon}}$ should follow certain prior knowledge about MR images, as expressed by the regularization term $R(\cdot)$. The design of $R$ is subject to many innovations. Lustig et al. [12] first proposed a Compressed-Sensing-motivated $\ell_1$-minimization algorithm for MRI reconstruction, assuming that the undersampled MR images have a sparse representation in a transform domain. Ravishankar et al. [15] applied more adaptive sparse modelling through Dictionary Learning, where the transformation is optimized over a set of data, leading to improved sparsity encoding. As interest in this field has grown, more MR data has become publicly available. Consequently, advances in Deep Learning (DL), specifically supervised learning, have been widely applied to MR reconstruction. Generally, DL-based methods [5, 24, 6, 17, 16, 1, 9, 21, 22, 8, 14, 19, 7] train Convolutional Neural Networks (CNNs) with paired data $\{y_{\text{und}}, y_{\text{full}}\} = \{\mathcal{F}^{-1}x_{\text{obs}}, \mathcal{F}^{-1}x_{\text{full}}\}$. Following the formulation of Eq. (1), data consistency can be explicitly enforced within the CNN by replacing $\mathcal{M}\mathcal{F}y$ with $x_{\text{obs}}$ [17]. The resulting CNN serves as a parameterized $R(\cdot, \theta)$, and regularizes test images based on the $\theta$ learned from the training distribution.
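The hard data-consistency replacement described above can be made concrete with a short numpy sketch. The function name and the orthonormal FFT convention are our own assumptions for illustration, not specified by the paper:

```python
import numpy as np

def enforce_data_consistency(y, x_obs, mask):
    """Hard data consistency: overwrite the observed k-space entries of the
    image estimate y with the measured values x_obs, keeping the rest.

    y     : (H, W) image estimate (real or complex)
    x_obs : (H, W) under-sampled k-space signal (zeros off the mask)
    mask  : (H, W) binary under-sampling mask M
    """
    k = np.fft.fft2(y, norm="ortho")        # F y
    k = (1 - mask) * k + mask * x_obs       # replace M F y with x_obs
    return np.fft.ifft2(k, norm="ortho")    # back to image space
```

After this projection, the result satisfies $\mathcal{M}\mathcal{F}y = x_{\text{obs}}$ exactly, which is how a data consistency layer is typically realized inside a reconstruction CNN.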
While supervised DL-based methods have led to impressive results, they generally train CNNs under specific degradation conditions; e.g., the under-sampling mask $\mathcal{M}$ that generates $y_{\text{und}}$ follows a particular acceleration factor. As a consequence, when the acceleration factor changes, model performance often degrades significantly, making $R(\cdot, \theta)$ less reliable in general. Furthermore, while the trained CNNs infer the most likely estimate of $y_{\text{full}}$ given $y_{\text{und}}$, they do not provide possible alternative solutions. Since Eq. (1) is an under-constrained problem, $y_{\text{und}}$ can have many valid solutions. The ability to observe different solutions can help practitioners understand the potential variability in reconstructions and make more robust decisions. As such, finding a stochastic $R$ that is generally applicable across all $x_{\text{obs}}$ is of high interest.
We leverage recent progress in a class of generative methods called diffusion models [10, 13, 4], which use a CNN to perform progressive reverse diffusion, mapping a prior Gaussian distribution $\mathcal{N}$ to a learned image distribution, e.g. $p_\theta(y_{\text{full}})$. Based on a pre-trained $\theta$, we propose to guide the iterative reverse diffusion by gradually introducing $x_{\text{obs}}$ to the intermediate results, as shown in Fig. 1. This allows us to generate reconstructions in the marginal distribution $p_\theta(y_{\text{full}} \mid x_{\text{obs}})$, where any sample $y_{\text{recon}} \sim p_\theta(y_{\text{full}} \mid x_{\text{obs}})$ agrees with the observed signal and lies on the MR image distribution $p_\theta(y_{\text{full}})$.
We make three contributions. (i) Our proposed DiffuseRecon performs MR reconstruction by gradually guiding the reverse-diffusion process with the observed k-space signal, and is robust to changing sampling conditions using a single model. (ii) We propose a coarse-to-fine sampling algorithm, which allows us to estimate the most likely reconstruction and its variability within $p_\theta(y_{\text{full}} \mid x_{\text{obs}})$ while leading to an approximately 40$\times$ speed-up compared to naive sampling. (iii) We perform extensive experiments using raw acquisition signals from fastMRI [23] and SKM-TEA [3], and demonstrate superior performance over SoTA methods.
Figure 1: DiffuseRecon gradually incorporates $x_{\text{obs}}$ into the denoising process through a k-Space Guidance (KSG) module; as such, we can directly generate samples from $p_\theta(y_{\text{full}} \mid x_{\text{obs}})$. Visualizations are based on 8$\times$ undersampling.
2 DiffuseRecon
Background. Diffusion models [10, 13, 4] are a class of unconditional generative methods that aim to transform a Gaussian distribution into the empirical data distribution. Specifically, the forward diffusion process is a Markov chain that gradually adds Gaussian noise to a clean image, which can be expressed as:
$$q(y_t \mid y_{t-1}) = \mathcal{N}\big(y_t; \sqrt{1-\beta_t}\,y_{t-1}, \beta_t\mathbf{I}\big); \quad q(y_t \mid y_0) = \mathcal{N}\big(y_t; \sqrt{\bar{\alpha}_t}\,y_0, (1-\bar{\alpha}_t)\mathbf{I}\big), \tag{2}$$
where $y_t$ denotes the intermediate noisy images, $\beta_t$ denotes the noise variance, $\alpha_t = 1 - \beta_t$, and $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$. For simplicity, $\beta_t$ follows a fixed schedule. When $T$ is sufficiently large, $y_T$ is dominated by noise. A CNN model is introduced to gradually reverse the forward diffusion process, i.e. denoise $y_t$, by estimating
$$p_\theta(y_{t-1} \mid y_t) = \mathcal{N}\big(y_{t-1}; \epsilon_\theta(y_t, t), \sigma_t^2\mathbf{I}\big), \tag{3}$$
where $\sigma_t^2 = \bar{\beta}_t = \frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\beta_t$ in this work. With fixed variance, $\epsilon_\theta$ in Eq. (3) is trained by mean-matching the diffusion noise through an $\mathcal{L}_2$ loss. We follow [13] with some modifications to train a diffusion model that generates MR images.
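The closed-form marginal $q(y_t \mid y_0)$ from Eq. (2) is easy to sample directly; a minimal numpy sketch follows. The constant $\beta$ schedule here is illustrative only (the paper uses a cosine schedule), and the function name is our own:

```python
import numpy as np

def sample_forward(y0, t, betas, rng):
    """Sample y_t ~ q(y_t | y_0) = N(sqrt(abar_t) * y0, (1 - abar_t) * I),
    where abar_t = prod_{s<=t} (1 - beta_s)."""
    abar_t = np.cumprod(1.0 - betas)[t]
    noise = rng.standard_normal(y0.shape)
    return np.sqrt(abar_t) * y0 + np.sqrt(1.0 - abar_t) * noise
```

As $t$ grows, $\bar{\alpha}_t \to 0$ and the sample approaches pure Gaussian noise, which is why $y_T$ is dominated by noise for large $T$.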
After training on MR images $y_{\text{full}}$, $\epsilon_\theta$ can be used to gradually transform noise into images that follow the distribution $p_\theta(y_{\text{full}})$, as shown in Fig. 1. We note that diffusion models generate images unconditionally by their original design. To reconstruct high-fidelity MR images conditioned on $x_{\text{obs}}$, DiffuseRecon is proposed to gradually modify $y_t$ such that $y_0$ agrees with $x_{\text{obs}}$. DiffuseRecon consists of two parts: k-Space Guidance and Coarse-to-Fine Sampling.
K-Space Guidance. We note that $R(\cdot)$ is naturally enforced by following the denoising process with a pre-trained $\theta$, as the generated images already follow the MR data distribution. As such, k-Space Guidance (KSG) is proposed to ensure that the generated images also satisfy the data consistency term. In a diffusion model, a denoised image is generated at every step $t$ by subtracting the estimated noise from the previous $y_t$, specifically:
$$y'_{t-1} = \frac{1}{\sqrt{\alpha_t}}\Big(y_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}}\,\epsilon_\theta(y_t, t)\Big) + \sigma_t\mathbf{z}, \quad \mathbf{z} \sim \mathcal{N}(0, \mathbf{I}). \tag{4}$$
For unconditional generation, $y_{t-1} = y'_{t-1}$, and the process repeats until $t = 0$. For DiffuseRecon, KSG gradually mixes the observed k-space signal $x_{\text{obs}}$ with $y'_{t-1}$. To do so, KSG first adds zero-mean noise to $x_{\text{obs}}$ to simulate its diffused condition at step $t$. The noisy observation is then mixed with $y'_{t-1}$ in k-space based on the undersampling mask $\mathcal{M}$. This process is expressed as follows:
$$y_t = \mathcal{F}^{-1}\big((1-\mathcal{M})\mathcal{F}y'_t + \mathcal{M}x_{\text{obs},t}\big), \quad \text{where } x_{\text{obs},t} = x_{\text{obs}} + \mathcal{F}\big(\mathcal{N}(0, (1-\bar{\alpha}_t)\mathbf{I})\big). \tag{5}$$
The resulting $y_{t-1}$ is iteratively denoised with decreasing $t$ until $y_0$ is obtained, where $x_{\text{obs},0} = x_{\text{obs}}$. As such, $y_0$ shares the same observed k-space signal and satisfies the data consistency term $\mathcal{M}\mathcal{F}y_0 = x_{\text{obs}}$. Since all generated samples fall on the data distribution $p_\theta(y_{\text{full}})$, KSG allows us to stochastically sample from the marginal distribution $p_\theta(y_{\text{full}} \mid x_{\text{obs}})$ through a pre-trained diffusion model. As demonstrated in Fig. 2a, we can observe the variations in $y_0$ for a given $x_{\text{obs}}$ and assess the reliability of the reconstructed results. Furthermore, KSG is parameter-free and applicable to any undersampling mask $\mathcal{M}$ without finetuning.
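One KSG mixing step as in Eq. (5) can be sketched in numpy. The helper name `ksg_mix` and the orthonormal FFT convention are our additions; the trained denoiser $\epsilon_\theta$ is assumed to have produced `y_prime` beforehand via Eq. (4):

```python
import numpy as np

def ksg_mix(y_prime, x_obs, mask, abar_t, rng):
    """k-Space Guidance (Eq. 5): diffuse the clean observation x_obs to the
    current noise level, then overwrite the observed k-space entries of the
    denoised estimate y_prime."""
    # Image-domain Gaussian noise with variance (1 - abar_t), mapped to k-space
    noise_k = np.fft.fft2(
        np.sqrt(1.0 - abar_t) * rng.standard_normal(y_prime.shape), norm="ortho"
    )
    x_obs_t = x_obs + noise_k
    k = np.fft.fft2(y_prime, norm="ortho")
    return np.fft.ifft2((1 - mask) * k + mask * x_obs_t, norm="ortho")
```

At the final step, $\bar{\alpha}_t = 1$, so no noise is added and the output k-space agrees exactly with $x_{\text{obs}}$ on the mask, which is the data consistency property claimed above.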
[Figure 2a panels: Mask, Init. 1, Init. 2, Init. 3, Var.]
Figure 2: (a) Visualizations on various under-sampled k-space signals (8$\times$, 16$\times$, 32$\times$, and a 1D Gaussian mask). (b) To accelerate the sampling process for MC simulation, multiple coarse samples are generated using $\frac{T}{k}$ steps. These samples are averaged to $y_0^{\text{avg}}$ and reintroduced to the denoising network for $T_{\text{refine}}$ steps.
Coarse-to-Fine Sampling. While sampling from $p_\theta(y_{\text{full}} \mid x_{\text{obs}})$ allows us to generate different reconstruction candidates, selecting the best reconstruction out of all candidates is a challenge. Since we do not have an analytic solution for $p_\theta(y_{\text{full}} \mid x_{\text{obs}})$, Monte-Carlo (MC) sampling can be used to estimate $\mathbb{E}(p_\theta(y_{\text{full}} \mid x_{\text{obs}}))$ in an unbiased way given sufficient samples. However, as sampling from a diffusion model already requires many steps, generating multiple samples is computationally costly. In practice, the denoising process can be accelerated by evenly re-spacing the schedule $\{T, T-1, \ldots, 1\}$ to a shorter schedule $\{T, T-k, \ldots, 1\}$ [13], where $k > 1$, and modifying the respective weights $\beta, \alpha$ based on $k$. However, this acceleration tends to produce less denoised results when $k$ is large.
We propose a Coarse-to-Fine (C2F) sampling algorithm to greatly accelerate the MC sampling process without loss in performance. Specifically, we note that the diffusion noise added in Eq. (2) is zero-mean with respect to $y_0$, and can be reduced by averaging multiple samples. Since multiple samples are already required to estimate $\mathbb{E}(p_\theta(y_{\text{full}} \mid x_{\text{obs}}))$, we leverage the multi-sample scenario to more aggressively reduce the number of denoising steps. As shown in Fig. 2b, C2F sampling creates $N$ instances of $y_T^{i,\frac{T}{k}} \sim \mathcal{N}(0, \mathbf{I})$ and denoises each individually for $\frac{T}{k}$ steps based on a re-spaced schedule. The noisy results are averaged to produce $y_0^{\text{avg}} = \frac{1}{N}\sum_{i=0}^{N} y_0^{i,\frac{T}{k}}$. Finally, $y_0^{\text{avg}}$ is refined by going through additional $T_{\text{refine}}$ steps with $\epsilon_\theta$. To control the denoising strength in $\epsilon_\theta$, $\{T_{\text{refine}}, T_{\text{refine}}-1, \ldots, 1\} \subset \{T, T-1, \ldots, 1\}$, with $T_{\text{refine}} \ll T$. The final refinement steps help remove the blurriness introduced by averaging multiple samples and lead to more realistic reconstruction results. During the refinement steps, $x_{\text{obs}}$ directly replaces the corresponding k-space signals in the reconstructions, consistent with $\mathcal{M}\mathcal{F}y_0^{i,\frac{T}{k}}$.
Compared to a naive approach, which requires $TN$ total steps to estimate $\mathbb{E}(p_\theta(y_{\text{full}} \mid x_{\text{obs}}))$, C2F sampling introduces an approximately $k$-fold speed-up. Furthermore, while the samples $y_0^{i,\frac{T}{k}}$ are noisy compared to their fully-denoised versions $y_0^{i,T}$, their noise is approximately Gaussian and introduces only a constant offset between $Var(y_0^{i,\frac{T}{k}})$ and $Var(y_0^{i,T})$. As such, given a reasonable $N$, the variance of $p_\theta(y_{\text{full}} \mid x_{\text{obs}})$ can still be estimated well from coarse samples.
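The variance-reduction argument behind C2F aggregation can be checked numerically: averaging $N$ samples whose residual noise is zero-mean Gaussian shrinks that noise by a factor of $\sqrt{N}$. A toy sketch, with synthetic Gaussian noise standing in for incompletely denoised coarse samples (our own simplification, not the trained model):

```python
import numpy as np

def c2f_average(coarse_samples):
    """Aggregate N coarse samples y_0^{i, T/k} into y_0^avg; the leftover
    zero-mean diffusion noise averages out as N grows."""
    return np.mean(coarse_samples, axis=0)

# Toy demonstration: 100 "coarse samples" = truth + unit Gaussian noise.
rng = np.random.default_rng(2)
samples = np.zeros((100, 64, 64)) + rng.standard_normal((100, 64, 64))
avg = c2f_average(samples)  # residual noise std drops from 1 to ~0.1
```

In DiffuseRecon the averaged image is then passed through $T_{\text{refine}}$ extra denoising steps to remove the blur that averaging introduces.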
3 Experiments
Dataset. In our experiments, we use two large knee MR datasets that contain raw, complex-valued acquisition signals. FastMRI [23] contains 1172 subjects with approximately 35 slices per subject, split such that 973 subjects are used for training and 199 subjects for evaluation. SKM-TEA [3] contains 155 subjects with approximately 160 slices per subject; 134 subjects are used for training and 21 subjects for evaluation. Single-coil data is used for both datasets. Undersampling masks $\mathcal{M}$ are generated using the masking function provided in the fastMRI challenge with 6$\times$ and 8$\times$ acceleration. To avoid slices with little information, the first five and ten slices are removed from evaluation for fastMRI and SKM-TEA, respectively. To accommodate methods based on over-complete CNNs, which require upsampling and a larger memory footprint, images from SKM-TEA are center-cropped to $\mathbb{R}^{448\times 448}$.
Implementation Details. All baselines are implemented in PyTorch and trained with an $\mathcal{L}_1$ loss and the Adam optimizer. Following [6], the learning rate is set to $1.5\times10^{-4}$ and reduced by 90% every five epochs; to ensure a fair comparison, U-Net [16] is implemented with a data consistency layer at the end. For DiffuseRecon, we follow [13] by using a cosine noise schedule and a U-Net architecture with multi-head attention as the denoising model. The model is modified to generate two consecutive slices; as such, the input and output channel sizes are 4, accounting for complex values. For C2F sampling, $\{T, k, N, T_{\text{refine}}\} = \{4000, 40, 10, 20\}$, which gives a $\frac{TN}{\frac{TN}{k} + T_{\text{refine}}} \approx 39$ times speed-up compared to naive MC sampling.
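The quoted $\approx 39\times$ figure follows directly from the step counts; a quick arithmetic check with the paper's $\{T, k, N, T_{\text{refine}}\}$ values:

```python
# C2F step budget vs. naive Monte-Carlo sampling (values from the paper)
T, k, N, T_refine = 4000, 40, 10, 20
naive_steps = T * N                    # 40000: N full-length sampling runs
c2f_steps = (T // k) * N + T_refine    # 1020: N coarse runs + one shared refinement
speedup = naive_steps / c2f_steps
print(round(speedup, 1))               # ≈ 39.2
```

Note that the $T_{\text{refine}} = 20$ refinement steps are counted once, since they run on the single averaged image rather than on each of the $N$ samples.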
[Figure 4(b) panels: columns D5C5, OUCR, Ours, GT; fastMRI 8×→4× row, PSNR/SSIM: 33.1/.791, 34.3/.827, 35.6/.864; SKM-TEA 6×→10× row: 28.1/.510, 28.2/.519, 33.4/.736.]
Figure 4: (a) Ablation study on DiffuseRecon with different parameters. (b) Visualizations of recovery from unseen acceleration factors.
Ablation Study. We examine the effectiveness of the following design choices:
• DiffuseRecon$^{k,N}_{\text{nonoise}}$: $\mathcal{M}\mathcal{F}y'_t$ is directly replaced with $x_{\text{obs}}$ in KSG instead of $x_{\text{obs},t}$; $k = 40$ and $N = \{1, 2, 5, 10\}$ are tested.
• DiffuseRecon$^{k,N}_{\text{norefine}}$: combinations of $\{k, N\}$ are tested without the refining steps; specifically, $k = \{8, 40\}$ and $N = \{1, 2, 5, 10\}$.
• DiffuseRecon$^{k,N}_{\text{refine}}$: $T_{\text{refine}} = 20$ steps of further denoising are applied to the aggregate results from DiffuseRecon$^{k,N}_{\text{norefine}}$, with $\{k, N\} = \{\{8, 40\}, 10\}$.
The PSNR comparisons are visualized in Fig. 4(a) and are made based on the middle fifteen slices of all subjects in fastMRI's evaluation set. There are several interesting takeaways. Firstly, PSNR increases significantly with larger $N$ for all instances, demonstrating the importance of multi-sample aggregation. We note that $k = 8$ applies 500 steps and yields sufficiently denoised results; as such, the low PSNR value for DiffuseRecon$^{8,1}_{\text{norefine}}$ is due to the geometric variability between the sampled and ground-truth images. Secondly, when no gradual noise is added to $x_{\text{obs}}$, the reconstruction results are significantly better when $N = 1$ and worse when $N = 10$ compared to the proposed KSG; such a direct replacement approach is used in the concurrent work POCS [2]. This indicates that, while the clean $x_{\text{obs}}$ introduced at $t = T$ helps accelerate the denoising process, the aggregate results do not estimate the ground truth well, i.e. they are more likely to be biased. Finally, the refinement steps significantly reduce the blurriness caused by averaging multiple samples. We note that performances are very similar for DiffuseRecon$^{8,10}_{\text{refine}}$ and DiffuseRecon$^{40,10}_{\text{refine}}$; as such, $k = 40$ leads to a significant speed-up without sacrificing reconstruction quality.
Table 1: Quantitative volume-wise evaluation of DiffuseRecon against SoTA methods. Dedicated models are trained for 6$\times$ and 8$\times$ acceleration; model robustness is tested by applying the $\{6\times, 8\times\}$ models to $\{10\times, 4\times\}$ downsampled inputs. All baselines achieve performance similar to the original papers.
Each cell reports PSNR/SSIM.

| Method | fastMRI 6× | fastMRI 8× | fastMRI 8×→4× | fastMRI 6×→10× | SKM-TEA 6× | SKM-TEA 8× | SKM-TEA 8×→4× | SKM-TEA 6×→10× |
|---|---|---|---|---|---|---|---|---|
| UNet [16] | 29.79/0.621 | 28.89/0.577 | 30.67/0.666 | 21.98/0.296 | 31.62/0.713 | 30.47/0.669 | 32.06/0.728 | 24.34/0.382 |
| KIKI-Net [5] | 29.51/0.607 | 28.09/0.546 | 30.18/0.650 | 22.31/0.313 | 31.67/0.711 | 30.14/0.655 | 32.20/0.732 | 24.67/0.422 |
| D5C5 [17] | 29.88/0.622 | 28.63/0.565 | 30.90/0.675 | 23.07/0.349 | 32.22/0.732 | 30.86/0.683 | 32.99/0.763 | 25.99/0.512 |
| OUCR [6] | 30.44/0.644 | 29.56/0.600 | 31.33/0.689 | 23.41/0.371 | 32.52/0.742 | 31.27/0.696 | 33.11/0.766 | 26.17/0.516 |
| DiffuseRecon | 30.56/0.648 | 29.94/0.614 | 31.70/0.708 | 27.23/0.515 | 32.58/0.743 | 31.56/0.706 | 33.76/0.795 | 28.40/0.584 |
Quantitative and Visual Evaluation. We compare reconstruction results from DiffuseRecon with current SoTA methods and summarize the quantitative results in Table 1. KIKI-Net [5] applies convolutions on both the image and k-space data for reconstruction. D5C5 [17] is a seminal DL-based MR reconstruction work that incorporates data consistency layers into a cascade of CNNs. OUCR [6] builds on [17] and uses a recurrent over-complete CNN [20] architecture to better recover fine details. We note that these methods use supervised learning and train dedicated models for a fixed acceleration factor. In Table 1, DiffuseRecon is compared with these specifically trained models for acceleration factors of 6$\times$ and 8$\times$ in k-space. Although the model for DiffuseRecon is trained for a denoising task and has not observed any down-sampled MR images, DiffuseRecon obtains top performance compared to the dedicated models from current SoTAs; the performance gap widens as the acceleration factor increases. To examine the robustness of dedicated models on lower- and higher-quality inputs, we apply models trained at 6$\times$ and 8$\times$ acceleration to 10$\times$ and 4$\times$ undersampled inputs, respectively. In these cases, the performance of DiffuseRecon is significantly higher, demonstrating that 1. models obtained from supervised training are less reliable on images with unseen levels of degradation, and 2. DiffuseRecon is a general and performant MR reconstruction method.
Visualizations of the reconstructed images are provided in Fig. 4(b) and Fig. 5 to illustrate the advantages of DiffuseRecon. Due to limited space, we focus on the top three methods: D5C5 [17], OUCR [6], and DiffuseRecon. We observe that many significant structural details are lost in D5C5 and OUCR but can be recovered by DiffuseRecon. The errors in D5C5 and OUCR tend to have a vertical pattern, likely because the under-sampled images suffer from more vertical aliasing artifacts due to phase-encoding under-sampling. As such, it is more difficult for D5C5 and OUCR to correctly recover these vertical details, which leads to blurry reconstructions under a pixel-wise loss function. DiffuseRecon, on the other hand, outputs realistic MR images that obey the distribution $p_\theta(y_{\text{full}} \mid x_{\text{obs}})$; it can better recover vertical details that fit the distribution $p_\theta(y_{\text{full}})$. This is particularly pronounced in the 8$\times$ fastMRI results in Fig. 5, where the lost vertical knee pattern in D5C5 and OUCR renders the image highly unrealistic. Such an error is avoided in DiffuseRecon, as each of its samples has a complete knee structure based on the learned prior knowledge. The uncertainty introduced by phase-encoding under-sampling is also captured by the variance map, which demonstrates that the exact placement of details may vary. For more visualizations and detailed comparisons, please refer to the supplementary material.
While DiffuseRecon holds many advantages over current SoTA methods, it does require more computation due to the multi-sampling process. The one-time inference speed is comparable to U-Net [16] at 20 ms; however, DiffuseRecon performs 1000 inference steps. We note that 20 s per slice still yields significant acceleration compared to raw acquisition, and there is much potential in accelerating diffusion models [18, 11] for future work. DiffuseRecon can also be used in conjunction with deterministic methods, e.g., when variance analysis is needed only selectively, to balance speed against performance and reliability.
[Figure 5 panels, PSNR/SSIM: (a) 29.61/.6974, (b) 31.34/.7472, (c) 35.30/.8622, (d) variance; (e) 30.03/.6160, (f) 30.50/.6765, (g) 31.28/.7011, (h) variance; (i) 25.61/.5167, (j) 27.84/.6148, (k) 30.42/.7087, (l) variance; (m) 31.59/.7399, (n) 31.97/.7513, (o) 33.26/.7896, (p) variance. Columns: D5C5 [17], OUCR [6], DiffuseRecon, GT; rows: 6× and 8× for fastMRI and SKM-TEA.]
Figure 5: Visual comparison of DiffuseRecon with other SoTA reconstruction methods. Error maps are provided for reference.
4 Conclusion
We propose DiffuseRecon, an MR image reconstruction method that achieves high performance, is robust to different acceleration factors, and allows users to directly observe reconstruction variations. Inspired by diffusion models, DiffuseRecon incorporates the observed k-space signals in the reverse-diffusion process and can stochastically generate realistic MR reconstructions. To obtain the most likely reconstruction candidate and its variance, DiffuseRecon performs a coarse-to-fine sampling scheme that significantly accelerates the generation process. We carefully evaluate DiffuseRecon on raw acquisition data and find that it outperforms current SoTA methods, which require dedicated data generation and training for different sampling conditions. The reconstruction variance provided by DiffuseRecon can help practitioners make more informed clinical decisions. Future work includes further acceleration of the inference process and examination of different MR sampling patterns.
References
[1] Akçakaya, M., Moeller, S., Weingärtner, S., Uğurbil, K.: Scan-specific robust artificial-neural-networks for k-space interpolation (RAKI) reconstruction: Database-free deep learning for fast imaging. Magn. Reson. Med. 81(1), 439–453 (Jan 2019)
[2] Chung, H., Ye, J.C.: Score-based diffusion models for accelerated MRI (2021)
[3] Desai, A.D., Schmidt, A.M., Rubin, E.B., Sandino, C.M., Black, M.S., Mazzoli, V., Stevens, K.J., Boutin, R., Re, C., Gold, G.E., et al.: SKM-TEA: A dataset for accelerated MRI reconstruction with dense image labels for quantitative clinical evaluation. In: Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2) (2021)
[4] Dhariwal, P., Nichol, A.: Diffusion models beat GANs on image synthesis. CoRR abs/2105.05233 (2021), https://arxiv.org/abs/2105.05233
[5] Eo, T., Jun, Y., Kim, T., Jang, J., Lee, H.J., Hwang, D.: KIKI-net: Cross-domain convolutional neural networks for reconstructing undersampled magnetic resonance images. Magn. Reson. Med. 80(5), 2188–2201 (Nov 2018)
[6] Guo, P., Valanarasu, J.M.J., Wang, P., Zhou, J., Jiang, S., Patel, V.M.: Over-and-under complete convolutional RNN for MRI reconstruction. In: de Bruijne, M., Cattin, P.C., Cotin, S., Padoy, N., Speidel, S., Zheng, Y., Essert, C. (eds.) Medical Image Computing and Computer Assisted Intervention - MICCAI 2021 - 24th International Conference, Strasbourg, France, September 27 - October 1, 2021, Proceedings, Part VI. Lecture Notes in Computer Science, vol. 12906, pp. 13–23. Springer (2021)
[7] Guo, P., Wang, P., Zhou, J., Jiang, S., Patel, V.M.: Multi-institutional collaborations for improving deep learning-based magnetic resonance image reconstruction using federated learning. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, virtual, June 19-25, 2021. pp. 2423–2432. Computer Vision Foundation / IEEE (2021)
[8] Hammernik, K., Klatzer, T., Kobler, E., Recht, M.P., Sodickson, D.K., Pock, T., Knoll, F.: Learning a variational network for reconstruction of accelerated MRI data. Magnetic Resonance in Medicine 79(6), 3055–3071 (2018)
[9] Han, Y., Sunwoo, L., Ye, J.C.: k-space deep learning for accelerated MRI. IEEE Transactions on Medical Imaging 39(2), 377–386 (2020). https://doi.org/10.1109/TMI.2019.2927101
[10] Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., Lin, H. (eds.) Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual (2020)
[11] Kong, Z., Ping, W.: On fast sampling of diffusion probabilistic models. CoRR abs/2106.00132 (2021), https://arxiv.org/abs/2106.00132
[12] Lustig, M., Donoho, D., Pauly, J.M.: Sparse MRI: The application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine 58(6), 1182–1195 (2007)
[13] Nichol, A.Q., Dhariwal, P.: Improved denoising diffusion probabilistic models. In: Meila, M., Zhang, T. (eds.) Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event. Proceedings of Machine Learning Research, vol. 139, pp. 8162–8171. PMLR (2021), http://proceedings.mlr.press/v139/nichol21a.html
[14] Qin, C., Schlemper, J., Caballero, J., Price, A.N., Hajnal, J.V., Rueckert, D.: Convolutional recurrent neural networks for dynamic MR image reconstruction. IEEE Transactions on Medical Imaging 38(1), 280–290 (2019). https://doi.org/10.1109/TMI.2018.2863670
[15] Ravishankar, S., Bresler, Y.: MR image reconstruction from highly undersampled k-space data by dictionary learning. IEEE TMI 30(5), 1028–1041 (2011)
[16] Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention - MICCAI 2015 - 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III. pp. 234–241 (2015)
[17] Schlemper, J., Caballero, J., Hajnal, J.V., Price, A.N., Rueckert, D.: A deep cascade of convolutional neural networks for dynamic MR image reconstruction. IEEE Trans. Medical Imaging 37(2), 491–503 (2018)
[18] Song, J., Meng, C., Ermon, S.: Denoising diffusion implicit models. In: 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net (2021)
[19] Sriram, A., Zbontar, J., Murrell, T., Zitnick, C.L., Defazio, A., Sodickson, D.K.: GrappaNet: Combining parallel imaging with deep learning for multi-coil MRI reconstruction. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13-19, 2020. pp. 14303–14310. Computer Vision Foundation / IEEE (2020)
[20] Valanarasu, J.M.J., Sindagi, V.A., Hacihaliloglu, I., Patel, V.M.: KiU-Net: Towards accurate segmentation of biomedical images using over-complete representations. In: Martel, A.L., Abolmaesumi, P., Stoyanov, D., Mateus, D., Zuluaga, M.A., Zhou, S.K., Racoceanu, D., Joskowicz, L. (eds.) Medical Image Computing and Computer Assisted Intervention - MICCAI 2020 - 23rd International Conference, Lima, Peru, October 4-8, 2020, Proceedings, Part IV. Lecture Notes in Computer Science, vol. 12264, pp. 363–373. Springer (2020)
[21] Wang, S., Su, Z., Ying, L., Peng, X., Zhu, S., Liang, F., Feng, D., Liang, D.: Accelerating magnetic resonance imaging via deep learning. In: 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI). pp. 514–517 (2016). https://doi.org/10.1109/ISBI.2016.7493320
[22] Yang, G., Yu, S., Dong, H., Slabaugh, G., Dragotti, P.L., Ye, X., Liu, F., Arridge, S., Keegan, J., Guo, Y., Firmin, D.: DAGAN: Deep de-aliasing generative adversarial networks for fast compressed sensing MRI reconstruction. IEEE Transactions on Medical Imaging 37(6), 1310–1321 (2018). https://doi.org/10.1109/TMI.2017.2785879
[23] Zbontar, J., Knoll, F., Sriram, A., Muckley, M.J., Bruno, M., Defazio, A., Parente, M., Geras, K.J., Katsnelson, J., Chandarana, H., Zhang, Z., Drozdzal, M., Romero, A., Rabbat, M.G., Vincent, P., Pinkerton, J., Wang, D., Yakubova, N., Owens, E., Zitnick, C.L., Recht, M.P., Sodickson, D.K., Lui, Y.W.: fastMRI: An open dataset and benchmarks for accelerated MRI. CoRR abs/1811.08839 (2018)
[24] Zhou, B., Zhou, S.K.: DuDoRNet: Learning a dual-domain recurrent network for fast MRI reconstruction with deep T1 prior. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13-19, 2020. pp. 4272–4281. Computer Vision Foundation / IEEE (2020)