
dc.contributor.author: Knutson, Boel
dc.contributor.author: Meskini Moudi, Lida
dc.date.accessioned: 2022-10-14T07:33:59Z
dc.date.available: 2022-10-14T07:33:59Z
dc.date.issued: 2022-10-14
dc.identifier.uri: https://hdl.handle.net/2077/73888
dc.description.abstract: One of the main challenges in the drug discovery process is finding a suitable compound for further analysis. The compound must affect the target relevant to the specific disease while also having the properties that make it a safe and efficient drug candidate. Finding and optimizing such compounds is a long and expensive process, so using machine learning algorithms to predict compound properties can speed up the process and reduce its cost. To use these algorithms, the information about the compounds must be translated into a numerical representation. The choice of representation and algorithm is of great importance, since the predictions must be reliable to avoid late-stage failures in the drug discovery process. The objective of this thesis was to investigate whether a molecular representation could be found that, together with a machine learning model, accurately predicts the potency of peptides. This was done through a benchmarking study in which different sequence-based descriptors and predictive models were combined to see whether one combination worked well for various types of peptides. The descriptors were Z-scales, pseudo amino acid composition, and one-hot representation; they were combined with two machine learning models, a support vector classifier and a random forest classifier. The results show that one-hot representation outperforms Z-scales and pseudo amino acid composition; however, the choice of predictive model depends on the characteristics of the peptides. (en_US)
dc.language.iso: eng (en_US)
dc.subject: Drug discovery (en_US)
dc.subject: peptide (en_US)
dc.subject: classification (en_US)
dc.subject: molecular representation (en_US)
dc.subject: Z-scales (en_US)
dc.subject: pseudo amino acid composition (en_US)
dc.subject: one-hot representation (en_US)
dc.subject: random forests (en_US)
dc.subject: support vector machines (en_US)
dc.title: Benchmarking Machine Learning Methods for Peptide Activity Predictions (en_US)
dc.type: text
dc.setspec.uppsok: Technology
dc.type.uppsok: H2
dc.contributor.department: Göteborgs universitet/Institutionen för data- och informationsteknik (swe)
dc.contributor.department: University of Gothenburg/Department of Computer Science and Engineering (eng)
dc.type.degree: Student essay
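
The abstract states that peptide information must be translated into a numerical representation and names one-hot representation as one of the descriptors. As a rough illustration of that idea only, the sketch below encodes a peptide sequence as a flat 0/1 vector. The helper name `one_hot_encode`, the 20-letter alphabet ordering, and the zero-padding to a fixed length are assumptions made for this example; the record does not describe how the thesis implemented its encoding.

```python
# Illustrative sketch only; names and padding scheme are assumptions,
# not taken from the thesis.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence, max_length):
    """Return a flat list of 0/1 values of length max_length * 20.

    Each residue sets exactly one entry in its 20-wide slot. Positions
    past the end of the sequence stay all-zero (assumed padding scheme).
    """
    if len(sequence) > max_length:
        raise ValueError("sequence is longer than max_length")
    vector = [0] * (max_length * len(AMINO_ACIDS))
    for position, residue in enumerate(sequence.upper()):
        if residue not in AA_INDEX:
            raise ValueError(f"unknown residue: {residue!r}")
        vector[position * len(AMINO_ACIDS) + AA_INDEX[residue]] = 1
    return vector


encoded = one_hot_encode("ACD", max_length=4)
print(len(encoded))  # 80
print(sum(encoded))  # 3
```

Vectors of this kind have a fixed length, so they could be passed as feature rows to a classifier such as a support vector classifier or a random forest; which library and settings the authors used is not stated in this record.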

