ProtT5 and random forests-based viscosity prediction method for therapeutic mAbs

Eur J Pharm Sci. 2024 Mar 1;194:106705. doi: 10.1016/j.ejps.2024.106705. Epub 2024 Jan 19.

ABSTRACT

Viscosity is a key characteristic of therapeutic antibodies intended for subcutaneous administration, which requires low-volume, high-concentration formulations. Accurately predicting the viscosity of newly developed therapeutic antibodies in the early stages of development would be highly beneficial. In this work, a prediction method based on ProtT5-XL-UniRef50 (ProtT5) and Random Forests (RF) was proposed for predicting the viscosity of monoclonal antibodies from their sequences alone. Starting from the given heavy and light chain V-region sequences, features were first extracted with the pretrained ProtT5 model. Kernel principal component analysis (Kernel-PCA) was then used to reduce the extracted 2048-D feature vector (1024-D for each sequence) to a size suitable for efficient training of the RF regressor. The RF model was then constructed on 40 commercially available therapeutic antibodies and tested with 3-fold cross-validation. Test results show that the model reproduces the viscosity values closely (Pearson correlation coefficient (PCC) = 0.928). Performance on classifying high (>30 cP) and low (<30 cP) viscosity is even better: the accuracy (ACC) and the area under the precision-recall curve (AUC) of the classification model in validation tests are 0.975 and 1.000, respectively. Compared to 5 existing state-of-the...
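The pipeline described in the abstract (concatenated 2048-D features, Kernel-PCA reduction, RF regression, 3-fold cross-validation, PCC and a 30 cP classification threshold) can be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the authors' implementation: the embeddings and viscosity values are random placeholders, and the kernel choice, the number of retained components (`n_components=10`), the number of trees, and the shuffled fold assignment are all assumptions not stated in the abstract.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Placeholder data: random stand-ins for per-antibody ProtT5 embeddings
# (1024-D for the heavy chain, 1024-D for the light chain) of 40 antibodies.
# Real features would come from the pretrained ProtT5 model.
heavy = rng.normal(size=(40, 1024))
light = rng.normal(size=(40, 1024))
X = np.hstack([heavy, light])        # shape (40, 2048)
y = rng.uniform(5.0, 60.0, size=40)  # synthetic viscosities in cP

# Kernel-PCA followed by a random forest regressor. The kernel, the number
# of components, and the number of trees are assumed values.
model = make_pipeline(
    KernelPCA(n_components=10, kernel="rbf"),
    RandomForestRegressor(n_estimators=200, random_state=0),
)

# 3-fold cross-validation: every antibody is predicted by a model that
# never saw it during training (Kernel-PCA is refit inside each fold).
cv = KFold(n_splits=3, shuffle=True, random_state=0)
y_pred = cross_val_predict(model, X, y, cv=cv)

# Pearson correlation between measured and predicted viscosity.
pcc = np.corrcoef(y, y_pred)[0, 1]

# High/low classification at the 30 cP threshold used in the abstract.
acc = np.mean((y > 30.0) == (y_pred > 30.0))

print(f"PCC = {pcc:.3f}, ACC = {acc:.3f}")
```

Because the inputs here are random, the printed PCC and ACC are meaningless; the sketch only shows how the stages fit together. Wrapping Kernel-PCA and the regressor in one pipeline keeps the dimensionality reduction inside each cross-validation fold, which avoids leaking test-fold information into training.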
Source: European Journal of Pharmaceutical Sciences - Category: Drugs & Pharmacology - Source Type: research