Learning interpretable representations of single-cell multi-omics data with multi-output Gaussian processes.

Moslehi, Zahra; Buettner, Florian; de Azevedo, Kevin; AmeriFar, Sareh
doi:10.1093/nar/gkaf630
% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@ARTICLE{Moslehi:303096,
      author       = {Z. Moslehi and S. AmeriFar and K. de Azevedo and F.
                      Buettner},
      title        = {{L}earning interpretable representations of single-cell
                      multi-omics data with multi-output {G}aussian processes},
      journal      = {Nucleic Acids Research},
      volume       = {53},
      number       = {14},
      issn         = {0305-1048},
      address      = {Oxford},
      publisher    = {Oxford Univ. Press},
      reportid     = {DKFZ-2025-01521},
      pages        = {gkaf630},
      year         = {2025},
      note         = {ISSN 1362-4962},
      abstract     = {Learning representations of single-cell genomics data is
                      challenging due to the nonlinear and often multi-modal
                      nature of the data on one hand and the need for
                      interpretable representations on the other hand. Existing
                      approaches tend to focus either on interpretability aspects
                      via linear matrix factorization or on maximizing expressive
                      power via neural network-based embeddings using black-box
                      variational autoencoders or graph embedding approaches. We
                      address this trade-off between expressive power and
                      interpretability by introducing a novel approach that
                      combines highly expressive representation learning via an
                      embedding layer with interpretable multi-output Gaussian
                      processes within a unified framework. In our model, we learn
                      distinct representations for samples (cells) and features
                      (genes) from multi-modal single-cell data. We demonstrate
                      that even a few interpretable latent dimensions can
                      effectively capture the underlying structure of the data.
                      Our model yields interpretable relationships between groups
                      of cells and their associated marker genes: leveraging a
                      gene relevance map, we establish connections between cell
                      clusters (e.g. specific cell types) and feature clusters
                      (e.g. marker genes for those specific cell types) within the
                      learned latent spaces of cells and features.},
      keywords     = {Single-Cell Analysis: methods / Normal Distribution /
                      Genomics: methods / Humans / Neural Networks, Computer /
                      Algorithms / Machine Learning / Multiomics},
      cin          = {FM01},
      ddc          = {570},
      cid          = {I:(DE-He78)FM01-20160331},
      pnm          = {899 - ohne Topic (POF4-899)},
      pid          = {G:(DE-HGF)POF4-899},
      typ          = {PUB:(DE-HGF)16},
      pubmed       = {pmid:40694853},
      doi          = {10.1093/nar/gkaf630},
      url          = {https://inrepo02.dkfz.de/record/303096},
}