Abstract:
Protein structure and sequence analysis is an important problem in bioinformatics,
where machine learning techniques are now widely used. In this research we analyze
protein structure and sequence data, predict the class of each protein sequence, and
compare the accuracy of different machine learning algorithms on that prediction task.
We apply exploratory data analysis (EDA) to the data set and extract 278866 protein
features from it. We then classify these features and measure the accuracy of three
machine learning algorithms on the protein sequences: Support Vector Machine (SVM),
Naive Bayes, and Random Forest (RF).
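
The classify-and-score workflow summarized above can be illustrated with a minimal sketch. It is not the thesis implementation: it covers only a Naive Bayes classifier (not SVM or RF), it assumes overlapping k-mer counts as features (the abstract does not say how features were derived), and the sequences, class labels, and names such as KmerNaiveBayes are hypothetical.

```python
import math
from collections import Counter, defaultdict


def kmer_counts(sequence, k=2):
    """Count overlapping k-mers in an amino-acid sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


class KmerNaiveBayes:
    """Multinomial Naive Bayes over k-mer counts with Laplace smoothing.

    Hypothetical helper for illustration; k-mer features are an assumption.
    """

    def __init__(self, k=2, alpha=1.0):
        self.k = k
        self.alpha = alpha

    def fit(self, sequences, labels):
        self.class_counts = Counter(labels)       # sequences per class
        self.kmer_totals = defaultdict(Counter)   # k-mer counts per class
        self.vocab = set()                        # all k-mers seen in training
        for seq, label in zip(sequences, labels):
            counts = kmer_counts(seq, self.k)
            self.kmer_totals[label].update(counts)
            self.vocab.update(counts)
        return self

    def predict_one(self, sequence):
        counts = kmer_counts(sequence, self.k)
        n_train = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, class_count in self.class_counts.items():
            total = sum(self.kmer_totals[label].values())
            denom = total + self.alpha * len(self.vocab)
            score = math.log(class_count / n_train)  # log prior
            for kmer, c in counts.items():
                if kmer in self.vocab:  # skip k-mers never seen in training
                    p = (self.kmer_totals[label][kmer] + self.alpha) / denom
                    score += c * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def predict(self, sequences):
        return [self.predict_one(s) for s in sequences]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up toy data, for illustration only (not from the thesis data set).
train_seqs = ["AAAAAA", "AAAGAA", "KKKKKK", "KKRKKK"]
train_labels = ["A_RICH", "A_RICH", "K_RICH", "K_RICH"]
test_seqs = ["AAAA", "KKKK"]
test_labels = ["A_RICH", "K_RICH"]

model = KmerNaiveBayes(k=2).fit(train_seqs, train_labels)
predictions = model.predict(test_seqs)
print(predictions, accuracy(test_labels, predictions))
```

The same accuracy function could score any other classifier trained on the same features, which is how a comparison across SVM, Naive Bayes, and RF would be set up in practice.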
Description:
This thesis was submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering at East West University, Dhaka, Bangladesh.