Accurate dental caries detection from panoramic X-rays plays a pivotal role in preventing lesion progression. However, current detection methods often yield suboptimal accuracy due to subtle contrast variations and diverse lesion morphology of dental caries. In this work, inspired by the clinical workflow where dentists systematically combine whole-image screening with detailed tooth-level inspection, we present DVCTNet, a novel Dual-View Co-Training network for accurate dental caries detection. Our DVCTNet starts with employing automated tooth detection to establish two complementary views: a global view from panoramic X-ray images and a local view from cropped tooth images. We then pretrain two vision foundation models separately on the two views. The global-view foundation model serves as the detection backbone, generating region proposals and global features, while the local-view model extracts detailed features from corresponding cropped tooth patches matched by the region proposals. To effectively integrate information from both views, we introduce a Gated Cross-View Attention (GCV-Atten) module that dynamically fuses dual-view features, enhancing the detection pipeline by integrating the fused features back into the detection model for final caries detection. To rigorously evaluate our DVCTNet, we test it on a public dataset and further validate its performance on a newly curated, high-precision dental caries detection dataset, annotated using both intra-oral images and panoramic X-rays for double verification. Experimental results demonstrate DVCTNet's superior performance against existing state-of-the-art (SOTA) methods on both datasets, indicating the clinical applicability of our method. Our code and labeled dataset are available at https://github.com/ShanghaiTech-IMPACT/DVCTNet.
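The gated cross-view fusion described above can be illustrated with a minimal NumPy sketch. This is not the paper's GCV-Atten implementation: all projection weights are random stand-ins for learned parameters, and the function name and shapes are assumptions for illustration. The idea shown is cross-attention with queries from global (panoramic) region features and keys/values from local (cropped-tooth) features, followed by a sigmoid gate that decides how much local detail to inject per feature.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_cross_view_attention(global_feat, local_feat, seed=0):
    """Illustrative sketch of gated cross-view fusion (not the paper's module).

    global_feat: (N, D) region-proposal features from the panoramic view.
    local_feat:  (N, D) features from the matched cropped-tooth patches.
    """
    N, D = global_feat.shape
    rng = np.random.default_rng(seed)
    # Random projections stand in for learned Q/K/V weights in this sketch.
    Wq, Wk, Wv = (rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(3))
    q, k, v = global_feat @ Wq, local_feat @ Wk, local_feat @ Wv
    attn = softmax(q @ k.T / np.sqrt(D), axis=-1)  # global queries attend to local keys
    fused = attn @ v
    # Gate on the concatenated [global, fused] features decides the mixing ratio.
    Wg = rng.standard_normal((2 * D, D)) / np.sqrt(2 * D)
    gate = 1.0 / (1.0 + np.exp(-(np.concatenate([global_feat, fused], axis=1) @ Wg)))
    return gate * fused + (1.0 - gate) * global_feat
```

The gated residual form lets the model fall back to the global feature when the local patch adds no useful detail, which matches the intuition of combining whole-image screening with tooth-level inspection.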
We collected a new benchmark dataset, named the DVCT dataset, comprising 500,000 panoramic X-ray images acquired from eight clinical centers. Of these, 498,000 remain unlabeled, and 2,000 are annotated with cross-verification by four dental radiologists, each with over ten years of experience, covering 5,311 instances of dental caries across dentitions at different stages, as shown in Figure 1. The labeled set is randomly divided into 1,500/300/200 images for training, validation, and testing, respectively. Our new dataset offers two advantages over existing ones: 1) a higher-quality gold standard, annotated with dual verification from both panoramic X-ray images and intra-oral images, and broader coverage of subjects across age groups; and 2) a large-scale unlabeled set that enables self-supervised pretraining, especially of foundation models, applicable to any downstream task in dental panoramic X-ray analysis.
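The 1,500/300/200 split of the 2,000 labeled images described above can be reproduced with a simple seeded shuffle. This is an illustrative sketch, not the authors' released split script; the image IDs and seed are placeholders.

```python
import random

# Placeholder IDs for the 2,000 labeled images; the seed is arbitrary.
image_ids = list(range(2000))
random.Random(42).shuffle(image_ids)

train_ids = image_ids[:1500]    # 1,500 training images
val_ids = image_ids[1500:1800]  # 300 validation images
test_ids = image_ids[1800:]     # 200 test images
```

Fixing the seed keeps the split reproducible across runs, which matters when comparing detectors on the same benchmark.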
Our dataset is available for research purposes only. To apply for access to the DVCT dataset, please visit the GitHub repository.
| Methods | Public AP | Public AP50 | Public AP75 | DVCT AP | DVCT AP50 | DVCT AP75 |
|---|---|---|---|---|---|---|
| RetinaNet | 13.0 | 30.5 | 10.2 | 11.1 | 32.9 | 3.3 |
| YOLOX | 40.5 | 81.3 | 36.1 | 15.4 | 42.4 | 8.4 |
| DINO | 37.8 | 75.3 | 29.4 | 22.2 | 50.4 | 14.4 |
| Faster RCNN | 39.9 | 78.0 | 37.8 | 14.3 | 30.8 | 9.8 |
| FPCL | 48.2 | 84.1 | 50.6 | 17.0 | 42.7 | 10.2 |
| DVCTNet (ours) | 48.9 | 84.7 | 52.2 | 31.3 | 57.4 | 31.9 |
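The AP, AP50, and AP75 columns above follow the standard average-precision protocol for object detection, where a detection counts as a true positive if its IoU with an unmatched ground-truth box exceeds a threshold (0.5 for AP50, 0.75 for AP75). A minimal single-image sketch, assuming detections as `(score, box)` tuples and boxes as `(x1, y1, x2, y2)` (this simplifies the full COCO evaluation, which averages over IoU thresholds and images):

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def average_precision(dets, gts, thr=0.5):
    """dets: list of (score, box) for one image; gts: list of boxes.
    Greedily match by descending score, then take the area under the P-R curve."""
    dets = sorted(dets, key=lambda d: -d[0])
    matched, hits = set(), []
    for _, box in dets:
        best, best_iou = None, thr
        for i, g in enumerate(gts):
            o = iou(box, g)
            if i not in matched and o >= best_iou:
                best, best_iou = i, o
        hits.append(best is not None)
        if best is not None:
            matched.add(best)
    ap, prev_recall, tp = 0.0, 0.0, 0
    for n, hit in enumerate(hits, 1):
        tp += hit
        recall = tp / len(gts)
        ap += (recall - prev_recall) * (tp / n)  # precision at this recall step
        prev_recall = recall
    return ap
```

A single perfect detection yields AP = 1.0, while a detection that misses every ground-truth box at the chosen threshold yields AP = 0.0.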
@InProceedings{LuoTao_Adapting_MICCAI2025,
author = { Luo, Tao and Wu, Han and Yang, Tong and Shen, Dinggang and Cui, Zhiming},
title = { { Adapting Foundation Model for Dental Caries Detection with Dual-View Co-Training } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15975},
month = {September},
pages = {44--53}
}