000309648 001__ 309648
000309648 005__ 20260205151719.0
000309648 0247_ $$2doi$$a10.1038/s41746-026-02403-0
000309648 0247_ $$2pmid$$apmid:41639385
000309648 037__ $$aDKFZ-2026-00280
000309648 041__ $$aEnglish
000309648 082__ $$a610
000309648 1001_ $$aYang, Shu$$b0
000309648 245__ $$aLarge-scale self-supervised video foundation model for intelligent surgery.
000309648 260__ $$a[Basingstoke]$$bMacmillan Publishers Limited$$c2026
000309648 3367_ $$2DRIVER$$aarticle
000309648 3367_ $$2DataCite$$aOutput Types/Journal article
000309648 3367_ $$0PUB:(DE-HGF)16$$2PUB:(DE-HGF)$$aJournal Article$$bjournal$$mjournal$$s1770300994_2392998
000309648 3367_ $$2BibTeX$$aARTICLE
000309648 3367_ $$2ORCID$$aJOURNAL_ARTICLE
000309648 3367_ $$00$$2EndNote$$aJournal Article
000309648 500__ $$a#NCTZFB26# / epub
000309648 520__ $$aComputer-Assisted Intervention has the potential to revolutionize modern surgery, with surgical scene understanding serving as a critical component in supporting decision-making and improving procedural efficacy. While existing AI-driven approaches alleviate annotation burdens via self-supervised spatial representation learning, their lack of explicit temporal modeling during pre-training fundamentally restricts the capture of dynamic surgical contexts, resulting in incomplete spatiotemporal understanding. In this work, we introduce the first video-level surgical pre-training framework that enables joint spatiotemporal representation learning from large-scale surgical video data. To achieve this, we constructed a large-scale surgical video dataset comprising 3650 videos and 3.55 million frames, spanning more than 20 surgical procedures and over 10 anatomical structures. Building upon this dataset, we propose SurgVISTA (Surgical Video-level Spatial-Temporal Architecture), a reconstruction-based pre-training method that jointly captures intricate spatial structures and temporal dynamics. Additionally, SurgVISTA incorporates image-level knowledge distillation guided by a surgery-specific expert model to enhance the learning of fine-grained anatomical and semantic features. To validate its effectiveness, we established a comprehensive benchmark comprising 13 video-level datasets spanning six surgical procedures across four tasks. Extensive experiments demonstrate that SurgVISTA consistently outperforms both natural- and surgical-domain pre-trained models, underscoring its strong potential to advance intelligent surgical systems in clinically meaningful scenarios.
000309648 536__ $$0G:(DE-HGF)POF4-315$$a315 - Bildgebung und Radioonkologie (POF4-315)$$cPOF4-315$$fPOF IV$$x0
000309648 588__ $$aDataset connected to CrossRef, PubMed, Journals: inrepo02.dkfz.de
000309648 7001_ $$aZhou, Fengtao$$b1
000309648 7001_ $$0P:(DE-He78)bdba4dfeb11892e9f37641db00ef0534$$aMayer, Leon$$b2$$udkfz
000309648 7001_ $$aHuang, Fuxiang$$b3
000309648 7001_ $$aChen, Yiliang$$b4
000309648 7001_ $$aWang, Yihui$$b5
000309648 7001_ $$aHe, Sunan$$b6
000309648 7001_ $$aNie, Yuxiang$$b7
000309648 7001_ $$aWang, Xi$$b8
000309648 7001_ $$aJin, Yueming$$b9
000309648 7001_ $$aSun, Huihui$$b10
000309648 7001_ $$aXu, Shuchang$$b11
000309648 7001_ $$aLiu, Alex Qinyang$$b12
000309648 7001_ $$aLi, Zheng$$b13
000309648 7001_ $$aQin, Jing$$b14
000309648 7001_ $$aTeoh, Jeremy Yuen-Chun$$b15
000309648 7001_ $$0P:(DE-He78)26a1176cd8450660333a012075050072$$aMaier-Hein, Lena$$b16$$udkfz
000309648 7001_ $$aChen, Hao$$b17
000309648 773__ $$0PERI:(DE-600)2925182-5$$a10.1038/s41746-026-02403-0$$pnn$$tnpj digital medicine$$vnn$$x2398-6352$$y2026
000309648 9101_ $$0I:(DE-588b)2036810-0$$6P:(DE-He78)bdba4dfeb11892e9f37641db00ef0534$$aDeutsches Krebsforschungszentrum$$b2$$kDKFZ
000309648 9101_ $$0I:(DE-588b)2036810-0$$6P:(DE-He78)26a1176cd8450660333a012075050072$$aDeutsches Krebsforschungszentrum$$b16$$kDKFZ
000309648 9131_ $$0G:(DE-HGF)POF4-315$$1G:(DE-HGF)POF4-310$$2G:(DE-HGF)POF4-300$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$aDE-HGF$$bGesundheit$$lKrebsforschung$$vBildgebung und Radioonkologie$$x0
000309648 9141_ $$y2026
000309648 915__ $$0StatID:(DE-HGF)0100$$2StatID$$aJCR$$bNPJ DIGIT MED : 2022$$d2025-11-05
000309648 915__ $$0StatID:(DE-HGF)0200$$2StatID$$aDBCoverage$$bSCOPUS$$d2025-11-05
000309648 915__ $$0StatID:(DE-HGF)0300$$2StatID$$aDBCoverage$$bMedline$$d2025-11-05
000309648 915__ $$0StatID:(DE-HGF)0320$$2StatID$$aDBCoverage$$bPubMed Central$$d2025-11-05
000309648 915__ $$0StatID:(DE-HGF)0501$$2StatID$$aDBCoverage$$bDOAJ Seal$$d2025-08-21T14:06:20Z
000309648 915__ $$0StatID:(DE-HGF)0500$$2StatID$$aDBCoverage$$bDOAJ$$d2025-08-21T14:06:20Z
000309648 915__ $$0StatID:(DE-HGF)0030$$2StatID$$aPeer Review$$bDOAJ : Anonymous peer review$$d2025-08-21T14:06:20Z
000309648 915__ $$0StatID:(DE-HGF)0199$$2StatID$$aDBCoverage$$bClarivate Analytics Master Journal List$$d2025-11-05
000309648 915__ $$0StatID:(DE-HGF)1110$$2StatID$$aDBCoverage$$bCurrent Contents - Clinical Medicine$$d2025-11-05
000309648 915__ $$0StatID:(DE-HGF)0160$$2StatID$$aDBCoverage$$bEssential Science Indicators$$d2025-11-05
000309648 915__ $$0StatID:(DE-HGF)0113$$2StatID$$aWoS$$bScience Citation Index Expanded$$d2025-11-05
000309648 915__ $$0StatID:(DE-HGF)0150$$2StatID$$aDBCoverage$$bWeb of Science Core Collection$$d2025-11-05
000309648 915__ $$0StatID:(DE-HGF)9915$$2StatID$$aIF >= 15$$bNPJ DIGIT MED : 2022$$d2025-11-05
000309648 915__ $$0StatID:(DE-HGF)0561$$2StatID$$aArticle Processing Charges$$d2025-11-05
000309648 915__ $$0StatID:(DE-HGF)0700$$2StatID$$aFees$$d2025-11-05
000309648 9201_ $$0I:(DE-He78)E130-20160331$$kE130$$lE130 Intelligente Medizinische Systeme$$x0
000309648 9201_ $$0I:(DE-He78)HD02-20160331$$kHD02$$lKoordinierungsstelle NCT Heidelberg$$x1
000309648 980__ $$ajournal
000309648 980__ $$aVDB
000309648 980__ $$aI:(DE-He78)E130-20160331
000309648 980__ $$aI:(DE-He78)HD02-20160331
000309648 980__ $$aUNRESTRICTED