Journal Article DKFZ-2019-02225

Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC.

2019
Schattauer Stuttgart

Applied Clinical Informatics 10(4), 679-692 (2019) [doi:10.1055/s-0039-1695793]

Abstract: High-quality clinical data and biological specimens are key for medical research and personalized medicine. The Biobanking and Biomolecular Resources Research Infrastructure-European Research Infrastructure Consortium (BBMRI-ERIC) aims to facilitate access to such biological resources. The accompanying ADOPT BBMRI-ERIC project kick-started BBMRI-ERIC by collecting colorectal cancer data from European biobanks. To transform these data into a common representation, a uniform approach for data integration and harmonization had to be developed. This article describes the design and the implementation of a toolset for this task. Based on the semantics of a metadata repository, we developed a lexical bag-of-words matcher, capable of semiautomatically mapping local biobank terms to the central ADOPT BBMRI-ERIC terminology. Its algorithm supports fuzzy matching, utilization of synonyms, and sentiment tagging. To process the anonymized instance data based on these mappings, we also developed a data transformation application. The implementation was used to process the data from 10 European biobanks. The lexical matcher automatically and correctly mapped 78.48% of the 1,492 local biobank terms, and human experts were able to complete the remaining mappings. We used the expert-curated mappings to successfully process 147,608 data records from 3,415 patients. A generic harmonization approach was created and successfully used for cross-institutional data harmonization across 10 European biobanks. The software tools were made available as open source.
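Illustration: the abstract names a lexical bag-of-words matcher with fuzzy matching and synonym support but gives no implementation details. The following is a minimal sketch of that general technique, not the authors' algorithm. The synonym table, the thresholds, the scoring formula, and all function names are assumptions made for illustration; fuzzy word comparison uses Python's standard `difflib.SequenceMatcher`, and sentiment tagging is not covered.

```python
import re
from difflib import SequenceMatcher

# Hypothetical synonym table (assumption); a real one would be curated.
SYNONYMS = {"sex": "gender", "dob": "birthdate"}


def tokenize(term):
    """Lower-case a term, split it into words, and normalize synonyms."""
    words = re.findall(r"[a-z0-9]+", term.lower())
    return {SYNONYMS.get(w, w) for w in words}


def word_similarity(a, b):
    """Fuzzy similarity of two words in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def bag_score(local, central, fuzzy_threshold=0.8):
    """Symmetric bag-of-words score between two terms.

    Each word contributes its best fuzzy match in the other bag,
    or 0 if that match falls below fuzzy_threshold.
    """
    a, b = tokenize(local), tokenize(central)
    if not a or not b:
        return 0.0

    def directed(src, dst):
        total = 0.0
        for w in src:
            best = max(word_similarity(w, v) for v in dst)
            if best >= fuzzy_threshold:
                total += best
        return total / len(src)

    return (directed(a, b) + directed(b, a)) / 2


def best_match(local, central_terms, min_score=0.6):
    """Return (central_term, score), or (None, score) if no candidate
    reaches min_score and the term should go to a human expert."""
    score, term = max((bag_score(local, c), c) for c in central_terms)
    return (term, score) if score >= min_score else (None, score)


# Hypothetical central terminology, for demonstration only.
central = ["Age at diagnosis", "Gender of patient", "Tumor location"]
print(best_match("Patient sex", central))
print(best_match("Tumour location", central))
print(best_match("Blood pressure", central))
```

In this sketch, a local term whose best score stays below `min_score` is returned unmapped, which mirrors the semiautomatic workflow in the abstract, where human experts completed the mappings the matcher could not resolve.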

Contributing Institute(s):
  1. Medizinische Informatik in der Translationalen Onkologie (E240)
  2. Verbundinformationssysteme (E260)
Research Program(s):
  1. 315 - Imaging and radiooncology (POF3-315)

Appears in the scientific report 2019
Database coverage:
Medline ; Clarivate Analytics Master Journal List ; IF < 5 ; JCR ; PubMed Central ; SCOPUS ; Science Citation Index Expanded ; Web of Science Core Collection

The record appears in these collections:
Document types > Articles > Journal Article
Public records
Publications database

 Record created 2019-09-17, last modified 2024-02-29