MRgRT Real-Time Target Localization Using Foundation Models for Contour Point Tracking and Promptable Mask Refinement
Physics in Medicine and Biology (2025)
Ludwig Maximilians Univ Munchen | Department of Radiation Oncology | Ludwig Maximilians Univ LMU
Abstract
Objective. This study aimed to evaluate two real-time target tracking approaches for magnetic resonance imaging (MRI) guided radiotherapy (MRgRT) based on foundation artificial intelligence models. Approach. The first approach used a point-tracking model that propagates points from a reference contour. The second approach used a video-object-segmentation model based on Segment Anything Model 2 (SAM2). Both approaches were compared against each other, against inter-observer variability, and against a transformer-based image registration model, TransMorph, with and without patient-specific (PS) fine-tuning. The evaluation was carried out on 2D cine MRI datasets from two institutions, containing scans from 33 patients with 8060 labeled frames, with annotations from 2 to 5 observers per frame, totaling 29179 ground truth segmentations. The segmentations produced were assessed using the Dice similarity coefficient (DSC), the 50% and 95% Hausdorff distances (HD50/HD95), and the Euclidean center distance (ECD). Main results. The contour tracking (median DSC 0.92 ± 0.04 and ECD 1.9 ± 1.0 mm) and SAM2-based (median DSC 0.93 ± 0.03 and ECD 1.6 ± 1.1 mm) approaches produced target segmentations comparable or superior to TransMorph without PS fine-tuning (median DSC 0.91 ± 0.07 and ECD 2.6 ± 1.4 mm) and slightly inferior to TransMorph with PS fine-tuning (median DSC 0.94 ± 0.03 and ECD 1.4 ± 0.8 mm). Between the two novel approaches, the one based on SAM2 performed marginally better at a higher computational cost (inference times of 92 ms for contour tracking and 109 ms for SAM2). Both approaches and TransMorph with PS fine-tuning exceeded inter-observer variability (median DSC 0.90 ± 0.06 and ECD 1.7 ± 0.7 mm). Significance. This study demonstrates the potential of foundation models to achieve high-quality real-time target tracking in MRgRT, offering performance that matches state-of-the-art methods without requiring PS fine-tuning.
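For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how DSC, ECD, and a percentile Hausdorff distance could be computed for a pair of 2D binary masks. It is not the authors' evaluation code. The function names, the 4-neighbour boundary extraction, the `spacing` parameter, and the choice to take the larger of the two directed percentile distances are all assumptions made for illustration; other definitions of HD50/HD95 exist, and the abstract does not say which one was used. Both masks are assumed to be non-empty.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient of two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def center_distance(a, b, spacing=(1.0, 1.0)):
    """Euclidean distance between mask centroids, in units of `spacing`."""
    ca = np.argwhere(a).mean(axis=0) * np.asarray(spacing)
    cb = np.argwhere(b).mean(axis=0) * np.asarray(spacing)
    return float(np.linalg.norm(ca - cb))

def boundary_points(mask, spacing=(1.0, 1.0)):
    """Coordinates of mask pixels with at least one 4-neighbour outside the mask."""
    m = mask.astype(bool)
    p = np.pad(m, 1)  # pad with False so image edges count as background
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    return np.argwhere(m & ~interior) * np.asarray(spacing)

def hausdorff_percentile(a, b, q, spacing=(1.0, 1.0)):
    """Percentile Hausdorff distance (q=50 or q=95 for HD50 / HD95).

    Assumption: the symmetric value is the larger of the two directed
    q-th percentile boundary-to-boundary distances.
    """
    pa, pb = boundary_points(a, spacing), boundary_points(b, spacing)
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    d_ab = d.min(axis=1)  # each boundary point of a to its nearest point of b
    d_ba = d.min(axis=0)  # each boundary point of b to its nearest point of a
    return float(max(np.percentile(d_ab, q), np.percentile(d_ba, q)))

# Toy example: a 10x10 square and a copy shifted by two pixels.
ref = np.zeros((32, 32), dtype=bool)
ref[5:15, 5:15] = True
pred = np.zeros((32, 32), dtype=bool)
pred[5:15, 7:17] = True

print(dice(ref, pred))                      # overlap 80 of 100+100 pixels
print(center_distance(ref, pred))           # centroid shift in pixels
print(hausdorff_percentile(ref, pred, 95))  # HD95 in pixels
```

Passing the pixel spacing of the cine MRI frames as `spacing` gives ECD and Hausdorff distances in millimetres rather than pixels.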
Key words
deep learning, respiratory motion, MRI-linac, MRI-guidance, motion management