A novel framework is presented that helps users extract summaries and keywords from long texts in real time, using a hybrid approach based on feature extraction and unsupervised learning to generate quality summaries.
With the explosion of data in the digital age, extracting meaningful information from long texts is an important yet challenging task. In this paper, a novel framework is presented that helps users extract summaries and keywords from long texts in real time. It uses a hybrid approach based on feature extraction and unsupervised learning to generate quality summaries. In addition, it integrates machine learning with semantic methods to extract keywords and phrases from the source text. The framework is deployed as a mobile app that allows users to manage, share, and listen to the extracted information, improving the user experience. To test the effectiveness of the work, experimental and research evaluations are carried out on the DUC 2002 dataset using ROUGE metrics. Results demonstrate a higher F1-score than the state-of-the-art methods used for extractive summarization on the same dataset. Experiments also show an accuracy of 70% for the keyword extraction method, which is on par with other work in the field.
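
To make the reported evaluation measure concrete, the following is a minimal sketch of how a ROUGE-1 F1-score can be computed from unigram overlap between a generated summary and a reference summary. The abstract does not state which ROUGE variants or preprocessing steps were used, so the choice of ROUGE-1, the lowercase whitespace tokenization, and the function name `rouge_1` are assumptions made for illustration only; this is not the evaluation code used in the paper.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: unigram precision, recall, and F1 (illustrative sketch)."""
    # Assumption: lowercase whitespace tokenization, no stemming or stopword removal.
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: a unigram is credited at most as often as it occurs in both texts.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    # Precision is overlap relative to the candidate, recall relative to the reference.
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example sentences, not taken from DUC 2002.
scores = rouge_1("the cat sat on the mat", "the cat lay on the mat")
print(scores)
```

In this toy example five of the six unigrams match, so precision, recall, and F1 all equal 5/6. Published results are normally produced with an established ROUGE toolkit, which typically also applies stemming and reports several variants, so scores from this sketch are not directly comparable with those in the paper.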