Benchmarking Machine Learning Methods for Peptide Activity Predictions

Knutson, Boel; Meskini Moudi, Lida

dc.contributor.author	Knutson, Boel
dc.contributor.author	Meskini Moudi, Lida
dc.contributor.department	Göteborgs universitet/Institutionen för data- och informationsteknik	swe
dc.contributor.department	University of Gothenburg/Department of Computer Science and Engineering	eng
dc.date.accessioned	2022-10-14T07:33:59Z
dc.date.available	2022-10-14T07:33:59Z
dc.date.issued	2022-10-14
dc.description.abstract	One of the main challenges in the drug discovery process is to find a suitable compound for further analysis. The compound must affect the target relevant to the specific disease while also having the properties that make it a safe and efficient drug candidate. Finding and optimizing these compounds is a long and expensive process. Using machine learning algorithms to predict the properties of compounds can therefore speed up the process and reduce the cost. To use the algorithms, the information about the compounds must be translated into a numerical representation. The choice of representation and algorithm is of great importance, since the predictions must be reliable to avoid late-stage failures in the drug discovery process. The objective of this thesis was to investigate whether a molecular representation, together with a machine learning model, could be found that accurately predicts the potency of peptides. This was done through a benchmarking study in which different sequence-based descriptors and predictive models were combined to see whether one combination worked well for various types of peptides. The descriptors were Z-scales, pseudo amino acid composition, and one-hot representation, and they were combined with two machine learning models, namely a support vector classifier and a random forests classifier. The results show that one-hot representation outperforms Z-scales and pseudo amino acid composition; however, the choice of predictive model depends on the characteristics of the peptides.	en
dc.identifier.uri	https://hdl.handle.net/2077/73888
dc.language.iso	eng	en
dc.setspec.uppsok	Technology
dc.subject	Drug discovery	en
dc.subject	peptide	en
dc.subject	classification	en
dc.subject	molecular representation	en
dc.subject	Z-scales	en
dc.subject	pseudo amino acid composition	en
dc.subject	one-hot representation	en
dc.subject	random forests	en
dc.subject	support vector machines	en
dc.title	Benchmarking Machine Learning Methods for Peptide Activity Predictions	en
dc.type	text
dc.type.degree	Student essay
dc.type.uppsok	H2
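
The abstract names one-hot representation as the best-performing descriptor. As a rough illustration of what such a sequence-based descriptor looks like, the sketch below encodes a peptide as a fixed-length vector of zeros and ones. The alphabet order, the zero-padding to `max_length`, and the function name `one_hot_encode` are assumptions made for this example; the record does not state how the thesis implemented its encoding.

```python
# Illustrative sketch only: the alphabet order and zero-padding scheme are
# assumptions, not details taken from the thesis.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence, max_length):
    """Return a flat list of max_length * 20 zeros and ones for one peptide.

    Each position in the sequence gets a block of 20 entries with a single 1
    marking the residue; positions beyond the end of the sequence stay all zero.
    """
    if len(sequence) > max_length:
        raise ValueError("sequence is longer than max_length")
    vector = [0] * (max_length * len(AMINO_ACIDS))
    for position, residue in enumerate(sequence.upper()):
        if residue not in INDEX:
            raise ValueError(f"unknown residue: {residue!r}")
        vector[position * len(AMINO_ACIDS) + INDEX[residue]] = 1
    return vector


# Hypothetical three-residue peptide, padded to four positions.
features = one_hot_encode("ACD", max_length=4)
```

A vector of this kind can be passed as one feature row to a classifier such as a support vector classifier or a random forest, which is the kind of descriptor and model pairing the study benchmarks.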

Files

Original bundle

Name: CSE 22-28 Knutson Meskini Moudi.pdf
Size: 2.2 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 876 B
Format: Item-specific license agreed upon to submission

Collections

Masteruppsatser