Journal Article DKFZ-2023-01700

Immune cell type signature discovery and random forest classification for analysis of single cell gene expression datasets.

2023
Frontiers Media Lausanne

Frontiers in Immunology 14, 1194745 (2023) [10.3389/fimmu.2023.1194745]

Abstract: Robust immune cell gene expression signatures are central to the analysis of single-cell studies. Nearly all known sets of immune cell signatures have been derived from only a single gene expression dataset. Integrating multiple datasets could yield high-quality immune cell signatures that serve as superior inputs to machine learning-based cell type classification approaches. We established a novel workflow for the discovery of immune cell type signatures based primarily on gene-versus-gene expression similarity. It leverages multiple datasets, here seven single-cell expression datasets from six different cancer types, and resulted in eleven immune cell type-specific gene expression signatures. We used these signatures to train random forest classifiers for immune cell type assignment in single-cell RNA-seq datasets. In independent benchmarking datasets, we obtained similar or better prediction results compared to commonly used methods for cell type assignment. Our gene signature set yields higher prediction scores than other published immune cell type gene sets in random forest-based cell type classification. By re-analyzing a published IFN stimulation experiment, we further demonstrate how our approach helps to avoid bias in downstream statistical analyses. We demonstrated the quality of our immune cell signatures and their strong performance in a random forest-based cell typing approach. We argue that classifying cells based on our comparatively slim sets of genes with a random forest-based approach not only matches or outperforms widely used published approaches but also facilitates unbiased downstream statistical analyses of differential gene expression between cell types for significantly more genes than previous cell classification algorithms.

Keyword(s): cell clustering ; cell type classification ; gene signature discovery ; machine learning ; single-cell RNA sequencing ; tumor microenvironment

Contributing Institute(s):
  1. Angewandte Bioinformatik (B330)
  2. DKTK HD zentral (HD01)
Research Program(s):
  1. 312 - Funktionelle und strukturelle Genomforschung (POF4-312)

Appears in the scientific report 2023
Database coverage:
Medline ; Creative Commons Attribution CC BY (No Version) ; DOAJ ; Article Processing Charges ; Clarivate Analytics Master Journal List ; DOAJ Seal ; Essential Science Indicators ; Fees ; IF >= 5 ; JCR ; PubMed Central ; SCOPUS ; Science Citation Index Expanded ; Web of Science Core Collection

The record appears in these collections:
Document types > Articles > Journal Article
Public records
Publications database

 Record created 2023-08-23, last modified 2024-02-29

