TY - JOUR
AU - Kalinin, Alexandr A
AU - Arevalo, John
AU - Serrano, Erik
AU - Vulliard, Loan
AU - Tsang, Hillary
AU - Bornholdt, Michael
AU - Muñoz, Alán F
AU - Sivagurunathan, Suganya
AU - Rajwa, Bartek
AU - Carpenter, Anne E
AU - Way, Gregory P
AU - Singh, Shantanu
TI - A versatile information retrieval framework for evaluating profile strength and similarity
JO - Nature Communications
VL - 16
IS - 1
SN - 2041-1723
CY - [London]
PB - Springer Nature
M1 - DKFZ-2025-01152
SP - 5181
PY - 2025
AB - Large-scale profiling assays capture a cell population's state by measuring thousands of biological properties per cell or sample. However, evaluating profile strength and similarity remains challenging due to the high dimensionality and non-linear, heterogeneous nature of measurements. Here, we develop a statistical framework using mean average precision (mAP) as a single, data-driven metric to address this challenge. We validate the mAP framework against established metrics through simulations and real-world data, revealing its ability to capture subtle and meaningful biological differences in cell state. Specifically, we use mAP to assess a sample's phenotypic activity relative to controls, as well as the phenotypic consistency of groups of perturbations (or samples). We evaluate the framework across diverse datasets and on different profile types (image, protein, mRNA), perturbations (CRISPR, gene overexpression, small molecules), and resolutions (single-cell, bulk). The mAP framework, together with our open-source software package copairs, is useful for evaluating high-dimensional profiling data in biological research and drug discovery.
KW - Software
KW - Humans
KW - Gene Expression Profiling: methods
KW - Information Storage and Retrieval: methods
KW - Computational Biology: methods
KW - Phenotype
LB - PUB:(DE-HGF)16
C6 - pmid:40467541
DO - 10.1038/s41467-025-60306-2
UR - https://inrepo02.dkfz.de/record/301772
ER -