Automated curation of large-scale cancer histopathology image datasets using deep learning.

Hilgers, Lars; Yuan, Tanwei; West, Nicholas P; Ghaffari Laleh, Narmin; Brobeil, Alexander; Loeffler, Chiara M L; Brenner, Hermann; Brinker, Titus J; Westwood, Alice; Hewitt, Katherine J; Quirke, Philip; Kather, Jakob Nikolas; Grabsch, Heike I; Matthaei, Emylou; Hoffmeister, Michael; Carrero, Zunamys I
doi:10.1111/his.15159
000288710 001__ 288710
000288710 005__ 20250528125727.0
000288710 0247_ $$2doi$$a10.1111/his.15159
000288710 0247_ $$2pmid$$apmid:38409878
000288710 0247_ $$2ISSN$$a0309-0167
000288710 0247_ $$2ISSN$$a1365-2559
000288710 0247_ $$2altmetric$$aaltmetric:160182880
000288710 037__ $$aDKFZ-2024-00429
000288710 041__ $$aEnglish
000288710 082__ $$a610
000288710 1001_ $$aHilgers, Lars$$b0
000288710 245__ $$aAutomated curation of large-scale cancer histopathology image datasets using deep learning.
000288710 260__ $$aOxford [u.a.]$$bWiley-Blackwell$$c2024
000288710 3367_ $$2DRIVER$$aarticle
000288710 3367_ $$2DataCite$$aOutput Types/Journal article
000288710 3367_ $$0PUB:(DE-HGF)16$$2PUB:(DE-HGF)$$aJournal Article$$bjournal$$mjournal$$s1714035834_1931
000288710 3367_ $$2BibTeX$$aARTICLE
000288710 3367_ $$2ORCID$$aJOURNAL_ARTICLE
000288710 3367_ $$00$$2EndNote$$aJournal Article
000288710 500__ $$a2024 Jun;84(7):1139-1153
000288710 520__ $$aArtificial intelligence (AI) has numerous applications in pathology, supporting diagnosis and prognostication in cancer. However, most AI models are trained on highly selected data, typically one tissue slide per patient. In reality, especially for large surgical resection specimens, dozens of slides can be available for each patient. Manually sorting and labelling whole-slide images (WSIs) is very time-consuming, which hinders the direct application of AI to the tissue samples collected from large cohorts. In this study we addressed this issue by developing a deep-learning (DL)-based method for automatic curation of large pathology datasets with several slides per patient. We collected multiple large multicentric datasets of colorectal cancer histopathological slides from the United Kingdom (FOXTROT, N = 21,384 slides; CR07, N = 7985 slides) and Germany (DACHS, N = 3606 slides). These datasets contained multiple types of tissue slides, including bowel resection specimens, endoscopic biopsies, lymph node resections, immunohistochemistry-stained slides, and tissue microarrays. We developed, trained, and tested a deep convolutional neural network model to predict the type of slide from the slide overview (thumbnail) image. The primary statistical endpoint was the macro-averaged area under the receiver operating characteristic curve (AUROC) for detection of the type of slide. In the primary dataset (FOXTROT), the algorithm achieved a high classification performance, with an AUROC of 0.995 (95% confidence interval [CI]: 0.994-0.996), and was able to accurately predict the type of slide from the thumbnail image alone. In the two external test cohorts (CR07, DACHS), AUROCs of 0.982 (95% CI: 0.979-0.985) and 0.875 (95% CI: 0.864-0.887) were observed, which indicates the generalizability of the trained model to unseen datasets. With a confidence threshold of 0.95, the model reached an accuracy of 94.6% (7331 classified cases) in the CR07 cohort and 85.1% (2752 classified cases) in the DACHS cohort. Our findings show that the low-resolution thumbnail image is sufficient to accurately classify the type of slide in digital pathology. This can help researchers make the vast resource of existing pathology archives accessible to modern AI models with only minimal manual annotation.
000288710 536__ $$0G:(DE-HGF)POF4-313$$a313 - Krebsrisikofaktoren und Prävention (POF4-313)$$cPOF4-313$$fPOF IV$$x0
000288710 588__ $$aDataset connected to CrossRef, PubMed, Journals: inrepo02.dkfz.de
000288710 650_7 $$2Other$$acolorectal cancer
000288710 650_7 $$2Other$$adeep learning
000288710 650_7 $$2Other$$adigital pathology
000288710 650_7 $$2Other$$aquality control
000288710 7001_ $$aGhaffari Laleh, Narmin$$b1
000288710 7001_ $$00000-0002-0346-6709$$aWest, Nicholas P$$b2
000288710 7001_ $$aWestwood, Alice$$b3
000288710 7001_ $$aHewitt, Katherine J$$b4
000288710 7001_ $$aQuirke, Philip$$b5
000288710 7001_ $$aGrabsch, Heike I$$b6
000288710 7001_ $$aCarrero, Zunamys I$$b7
000288710 7001_ $$aMatthaei, Emylou$$b8
000288710 7001_ $$aLoeffler, Chiara M L$$b9
000288710 7001_ $$0P:(DE-He78)1e33961c8780aca9b76d776d1fdc1ebb$$aBrinker, Titus J$$b10$$udkfz
000288710 7001_ $$0P:(DE-He78)b9e439a1aa1244925f92d547c0919349$$aYuan, Tanwei$$b11$$udkfz
000288710 7001_ $$0P:(DE-He78)90d5535ff896e70eed81f4a4f6f22ae2$$aBrenner, Hermann$$b12$$udkfz
000288710 7001_ $$aBrobeil, Alexander$$b13
000288710 7001_ $$0P:(DE-He78)6c5d058b7552d071a7fa4c5e943fff0f$$aHoffmeister, Michael$$b14$$udkfz
000288710 7001_ $$aKather, Jakob Nikolas$$b15
000288710 773__ $$0PERI:(DE-600)2006447-0$$a10.1111/his.15159$$gp. his.15159$$n7$$p1139-1153$$tHistopathology$$v84$$x0309-0167$$y2024
000288710 8564_ $$uhttps://inrepo02.dkfz.de/record/288710/files/Histopathology%20-%202024%20-%20Hilgers%20-%20Automated%20curation%20of%20large%E2%80%90scale%20cancer%20histopathology%20image%20datasets%20using%20deep.pdf
000288710 8564_ $$uhttps://inrepo02.dkfz.de/record/288710/files/Histopathology%20-%202024%20-%20Hilgers%20-%20Automated%20curation%20of%20large%E2%80%90scale%20cancer%20histopathology%20image%20datasets%20using%20deep.pdf?subformat=pdfa$$xpdfa
000288710 909CO $$ooai:inrepo02.dkfz.de:288710$$pVDB
000288710 9101_ $$0I:(DE-588b)2036810-0$$6P:(DE-He78)1e33961c8780aca9b76d776d1fdc1ebb$$aDeutsches Krebsforschungszentrum$$b10$$kDKFZ
000288710 9101_ $$0I:(DE-588b)2036810-0$$6P:(DE-He78)b9e439a1aa1244925f92d547c0919349$$aDeutsches Krebsforschungszentrum$$b11$$kDKFZ
000288710 9101_ $$0I:(DE-588b)2036810-0$$6P:(DE-He78)90d5535ff896e70eed81f4a4f6f22ae2$$aDeutsches Krebsforschungszentrum$$b12$$kDKFZ
000288710 9101_ $$0I:(DE-588b)2036810-0$$6P:(DE-He78)6c5d058b7552d071a7fa4c5e943fff0f$$aDeutsches Krebsforschungszentrum$$b14$$kDKFZ
000288710 9131_ $$0G:(DE-HGF)POF4-313$$1G:(DE-HGF)POF4-310$$2G:(DE-HGF)POF4-300$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$aDE-HGF$$bGesundheit$$lKrebsforschung$$vKrebsrisikofaktoren und Prävention$$x0
000288710 9141_ $$y2024
000288710 915__ $$0StatID:(DE-HGF)0420$$2StatID$$aNationallizenz$$d2023-08-23$$wger
000288710 915__ $$0StatID:(DE-HGF)3001$$2StatID$$aDEAL Wiley$$d2023-08-23$$wger
000288710 915__ $$0StatID:(DE-HGF)0100$$2StatID$$aJCR$$bHISTOPATHOLOGY : 2022$$d2023-08-23
000288710 915__ $$0StatID:(DE-HGF)0200$$2StatID$$aDBCoverage$$bSCOPUS$$d2023-08-23
000288710 915__ $$0StatID:(DE-HGF)0300$$2StatID$$aDBCoverage$$bMedline$$d2023-08-23
000288710 915__ $$0StatID:(DE-HGF)0600$$2StatID$$aDBCoverage$$bEbsco Academic Search$$d2023-08-23
000288710 915__ $$0StatID:(DE-HGF)0030$$2StatID$$aPeer Review$$bASC$$d2023-08-23
000288710 915__ $$0StatID:(DE-HGF)0199$$2StatID$$aDBCoverage$$bClarivate Analytics Master Journal List$$d2023-08-23
000288710 915__ $$0StatID:(DE-HGF)1050$$2StatID$$aDBCoverage$$bBIOSIS Previews$$d2023-08-23
000288710 915__ $$0StatID:(DE-HGF)0113$$2StatID$$aWoS$$bScience Citation Index Expanded$$d2023-08-23
000288710 915__ $$0StatID:(DE-HGF)0150$$2StatID$$aDBCoverage$$bWeb of Science Core Collection$$d2023-08-23
000288710 915__ $$0StatID:(DE-HGF)1030$$2StatID$$aDBCoverage$$bCurrent Contents - Life Sciences$$d2023-08-23
000288710 915__ $$0StatID:(DE-HGF)1190$$2StatID$$aDBCoverage$$bBiological Abstracts$$d2023-08-23
000288710 915__ $$0StatID:(DE-HGF)0160$$2StatID$$aDBCoverage$$bEssential Science Indicators$$d2023-08-23
000288710 915__ $$0StatID:(DE-HGF)1110$$2StatID$$aDBCoverage$$bCurrent Contents - Clinical Medicine$$d2023-08-23
000288710 915__ $$0StatID:(DE-HGF)9905$$2StatID$$aIF >= 5$$bHISTOPATHOLOGY : 2022$$d2023-08-23
000288710 9201_ $$0I:(DE-He78)C140-20160331$$kC140$$lNWG Digitale Biomarker in der Onkologie$$x0
000288710 9201_ $$0I:(DE-He78)C070-20160331$$kC070$$lC070 Klinische Epidemiologie und Alternf.$$x1
000288710 9201_ $$0I:(DE-He78)C120-20160331$$kC120$$lPräventive Onkologie$$x2
000288710 9201_ $$0I:(DE-He78)HD01-20160331$$kHD01$$lDKTK HD zentral$$x3
000288710 980__ $$ajournal
000288710 980__ $$aVDB
000288710 980__ $$aI:(DE-He78)C140-20160331
000288710 980__ $$aI:(DE-He78)C070-20160331
000288710 980__ $$aI:(DE-He78)C120-20160331
000288710 980__ $$aI:(DE-He78)HD01-20160331
000288710 980__ $$aUNRESTRICTED