Abstract:
Online news portals are spreading like a web all over the world. The spread of misleading information through channels such as social media feeds, news blogs, and online newspapers has made it challenging to identify reliable news sources, increasing the need for computational tools that provide insight into the trustworthiness of online content. Determining the authenticity of an article's content directly is much harder than assessing the reliability of its source. To distinguish reliable from unreliable media sources, this paper proposes an approach for scaling the reliability of online news portals using website metrics collected through the Alexa website traffic statistics tool. First, all Bengali online news portal websites were listed, and a dataset was created containing the relevant website metrics for each of them. Using domain knowledge and context analysis, the key features were extracted for scaling the reliability of those websites with an unsupervised learning algorithm. Specifically, the k-means algorithm was used to partition the unlabeled dataset into several clusters. Each cluster was then assigned a label based on how its metrics correlate with real-world conditions. A comparison of the experimental results with theoretical expectations indicates that the proposed approach met the research objective.
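The clustering step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the feature names (global rank, bounce rate, daily minutes on site), the site names, the numeric values, and the choice of k = 2 are all hypothetical placeholders, and the farthest-first seeding is used only to keep the toy example deterministic.

```python
import math

def standardize(rows):
    """Scale each column to zero mean and unit variance so no metric dominates."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))

def kmeans(points, k, max_iter=100):
    """Plain Lloyd's k-means with deterministic farthest-first seeding."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids

# Hypothetical toy data: (global_rank, bounce_rate_pct, daily_minutes_on_site).
# These values are invented for illustration, not taken from the thesis dataset.
sites = {
    "portal-a.example": (1200, 35.0, 9.5),
    "portal-b.example": (2500, 40.0, 8.0),
    "portal-c.example": (1800, 38.0, 8.8),
    "portal-d.example": (450000, 78.0, 1.5),
    "portal-e.example": (600000, 82.0, 1.2),
    "portal-f.example": (520000, 75.0, 1.9),
}

labels, centroids = kmeans(standardize(list(sites.values())), k=2)
for name, label in zip(sites, labels):
    print(name, "-> cluster", label)
```

The sketch only produces cluster indices. Mapping each index to a reliability level, as the abstract describes, would still require inspecting the per-cluster metrics against domain knowledge.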
Description:
This thesis was submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering at East West University, Dhaka, Bangladesh.