001     296154
005     20250112014745.0
024 7 _ |a 10.1080/19420862.2024.2442750
|2 doi
024 7 _ |a pmid:39772905
|2 pmid
024 7 _ |a 1942-0862
|2 ISSN
024 7 _ |a 1942-0870
|2 ISSN
024 7 _ |a altmetric:172892558
|2 altmetric
037 _ _ |a DKFZ-2025-00082
041 _ _ |a English
082 _ _ |a 610
100 1 _ |a Ramon, Aubin
|0 0009-0002-5502-5961
|b 0
245 _ _ |a Prediction of protein biophysical traits from limited data: a case study on nanobody thermostability through NanoMelt.
260 _ _ |a London
|c 2025
|b Taylor & Francis
336 7 _ |a article
|2 DRIVER
336 7 _ |a Output Types/Journal article
|2 DataCite
336 7 _ |a Journal Article
|b journal
|m journal
|0 PUB:(DE-HGF)16
|s 1736433578_15823
|2 PUB:(DE-HGF)
336 7 _ |a ARTICLE
|2 BibTeX
336 7 _ |a JOURNAL_ARTICLE
|2 ORCID
336 7 _ |a Journal Article
|0 0
|2 EndNote
520 _ _ |a In-silico prediction of protein biophysical traits is often hindered by the limited availability of experimental data and their heterogeneity. Training on limited data can lead to overfitting and poor generalizability to sequences distant from those in the training set. Additionally, inadequate use of scarce and disparate data can introduce biases during evaluation, leading to unreliable model performances being reported. Here, we present a comprehensive study exploring various approaches for protein fitness prediction from limited data, leveraging pre-trained embeddings, repeated stratified nested cross-validation, and ensemble learning to ensure an unbiased assessment of the performances. We applied our framework to introduce NanoMelt, a predictor of nanobody thermostability trained with a dataset of 640 measurements of apparent melting temperature, obtained by integrating data from the literature with 129 new measurements from this study. We find that an ensemble model stacking multiple regression using diverse sequence embeddings achieves state-of-the-art accuracy in predicting nanobody thermostability. We further demonstrate NanoMelt's potential to streamline nanobody development by guiding the selection of highly stable nanobodies. We make the curated dataset of nanobody thermostability freely available and NanoMelt accessible as a downloadable software and webserver.
536 _ _ |a 312 - Funktionelle und strukturelle Genomforschung (POF4-312)
|0 G:(DE-HGF)POF4-312
|c POF4-312
|f POF IV
|x 0
588 _ _ |a Dataset connected to CrossRef, PubMed, Journals: inrepo02.dkfz.de
650 _ 7 |a Biological sciences – biophysics and computational biology
|2 Other
650 _ 7 |a Protein fitness
|2 Other
650 _ 7 |a antibody design
|2 Other
650 _ 7 |a antibody engineering
|2 Other
650 _ 7 |a ensemble model
|2 Other
650 _ 7 |a machine learning
|2 Other
650 _ 7 |a nanobody
|2 Other
650 _ 7 |a semi-supervised learning
|2 Other
650 _ 7 |a thermostability
|2 Other
650 _ 7 |a Single-Domain Antibodies
|2 NLM Chemicals
650 _ 2 |a Single-Domain Antibodies: chemistry
|2 MeSH
650 _ 2 |a Single-Domain Antibodies: immunology
|2 MeSH
650 _ 2 |a Protein Stability
|2 MeSH
650 _ 2 |a Humans
|2 MeSH
650 _ 2 |a Software
|2 MeSH
650 _ 2 |a Computer Simulation
|2 MeSH
700 1 _ |a Ni, Mingyang
|b 1
700 1 _ |a Predeina, Olga
|b 2
700 1 _ |a Gaffey, Rebecca
|b 3
700 1 _ |a Kunz, Patrick
|0 P:(DE-He78)c4e25fa3671791de6626f8aab98a31e5
|b 4
700 1 _ |a Onuoha, Shimobi
|b 5
700 1 _ |a Sormanni, Pietro
|0 0000-0002-6228-2221
|b 6
773 _ _ |a 10.1080/19420862.2024.2442750
|g Vol. 17, no. 1, p. 2442750
|0 PERI:(DE-600)2537838-7
|n 1
|p 2442750
|t mAbs
|v 17
|y 2025
|x 1942-0862
909 C O |o oai:inrepo02.dkfz.de:296154
|p VDB
910 1 _ |a Deutsches Krebsforschungszentrum
|0 I:(DE-588b)2036810-0
|k DKFZ
|b 4
|6 P:(DE-He78)c4e25fa3671791de6626f8aab98a31e5
913 1 _ |a DE-HGF
|b Gesundheit
|l Krebsforschung
|1 G:(DE-HGF)POF4-310
|0 G:(DE-HGF)POF4-312
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-300
|4 G:(DE-HGF)POF
|v Funktionelle und strukturelle Genomforschung
|x 0
914 1 _ |y 2025
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0200
|2 StatID
|b SCOPUS
|d 2023-10-26
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0300
|2 StatID
|b Medline
|d 2023-10-26
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0320
|2 StatID
|b PubMed Central
|d 2023-10-26
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0501
|2 StatID
|b DOAJ Seal
|d 2023-07-18T15:26:08Z
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0500
|2 StatID
|b DOAJ
|d 2023-07-18T15:26:08Z
915 _ _ |a Peer Review
|0 StatID:(DE-HGF)0030
|2 StatID
|b DOAJ : Anonymous peer review
|d 2023-07-18T15:26:08Z
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0199
|2 StatID
|b Clarivate Analytics Master Journal List
|d 2023-10-26
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1050
|2 StatID
|b BIOSIS Previews
|d 2023-10-26
915 _ _ |a WoS
|0 StatID:(DE-HGF)0113
|2 StatID
|b Science Citation Index Expanded
|d 2023-10-26
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0150
|2 StatID
|b Web of Science Core Collection
|d 2023-10-26
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1190
|2 StatID
|b Biological Abstracts
|d 2023-10-26
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0160
|2 StatID
|b Essential Science Indicators
|d 2023-10-26
915 _ _ |a JCR
|0 StatID:(DE-HGF)0100
|2 StatID
|b MABS-AUSTIN : 2022
|d 2023-10-26
915 _ _ |a IF >= 5
|0 StatID:(DE-HGF)9905
|2 StatID
|b MABS-AUSTIN : 2022
|d 2023-10-26
915 _ _ |a Article Processing Charges
|0 StatID:(DE-HGF)0561
|2 StatID
|d 2023-10-26
915 _ _ |a Fees
|0 StatID:(DE-HGF)0700
|2 StatID
|d 2023-10-26
920 1 _ |0 I:(DE-He78)B070-20160331
|k B070
|l B070 Funktionelle Genomanalyse
|x 0
980 _ _ |a journal
980 _ _ |a VDB
980 _ _ |a I:(DE-He78)B070-20160331
980 _ _ |a UNRESTRICTED

