Text Summarization Technique for Punjabi Language Using Neural Networks

  • Written by Ghadeer
  • Update: 02/11/2021

Arti Jain¹, Anuja Arora¹, Divakar Yadav², Jorge Morato³, and Amanpreet Kaur¹

¹Department of Computer Science and Engineering, Jaypee Institute of Information Technology, India

²Department of Computer Science and Engineering, National Institute of Information Technology, India

³Department of Computer Science and Engineering, Universidad Carlos III de Madrid, Spain

Abstract: In the contemporary world, the use of digital content has risen exponentially. Newspaper and web articles, status updates, advertisements, and similar content have become an integral part of our daily routine. Thus, there is a need for an automated system that summarizes such large text documents in order to save time and effort. Summarizers for languages such as English have reached a mature stage, since work on them started in the 1950s, but several languages, such as Punjabi, still need special attention. The Punjabi language is morphologically rich compared to English and other foreign languages. In this work, we present a three-phase extractive summarization methodology using neural networks that produces a concise summary of a single Punjabi text document. The methodology comprises a pre-processing phase that cleans the text; a processing phase that extracts statistical and linguistic features; and a classification phase. The classification-based neural network applies a sigmoid activation function and gradient descent optimization for weighted error reduction to generate the output summary. The proposed summarization system is applied to the monolingual Punjabi text corpus from the Indian Languages Corpora Initiative Phase-II. The precision, recall, and F-measure achieved are 90.0%, 89.28%, and 89.65% respectively, which is reasonably good in comparison to the performance of other existing summarizers for Indian languages.
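The classification phase described in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible network, a single sigmoid unit trained by batch gradient descent on the cross-entropy error; the abstract does not state the paper's actual architecture, loss, or hyperparameters. The three feature columns, the labels, and the placeholder sentence strings are hypothetical and are not taken from the paper or its corpus.

```python
import math
import random


def sigmoid(z):
    """Numerically safe logistic activation."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(x, weights, bias):
    """Probability that a sentence with feature vector x belongs in the summary."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(features, labels, lr=0.5, epochs=2000, seed=0):
    """Fit a single sigmoid unit with batch gradient descent (assumed setup)."""
    rng = random.Random(seed)
    n = len(features[0])
    m = len(features)
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n)]
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n
        grad_b = 0.0
        for x, y in zip(features, labels):
            # For cross-entropy loss with a sigmoid output, the error signal
            # at the unit's input is simply (prediction - target).
            err = predict(x, weights, bias) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        # Step against the mean gradient to reduce the error.
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


def select_summary(sentences, features, weights, bias, threshold=0.5):
    """Extractive step: keep sentences scored at or above the threshold, in order."""
    return [s for s, x in zip(sentences, features)
            if predict(x, weights, bias) >= threshold]


# Hypothetical toy data: each row is [position score, term-frequency score,
# contains-keyword flag]; label 1 means "include in summary".
SENTENCES = ["sentence 1", "sentence 2", "sentence 3",
             "sentence 4", "sentence 5", "sentence 6"]
FEATURES = [[1.0, 0.9, 1.0], [0.2, 0.1, 0.0], [0.8, 0.7, 1.0],
            [0.1, 0.2, 0.0], [0.9, 0.8, 0.0], [0.3, 0.1, 1.0]]
LABELS = [1, 0, 1, 0, 1, 0]

if __name__ == "__main__":
    w, b = train(FEATURES, LABELS)
    print(select_summary(SENTENCES, FEATURES, w, b))
```

In this sketch the summary is simply the subset of sentences whose predicted probability reaches the threshold, kept in document order. A real system would compute the feature vectors from the pre-processed Punjabi text rather than hard-coding them.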

Keywords: Extractive method, Indian languages corpora initiative, natural language processing, neural networks, Punjabi language, text summarization.

Received May 31, 2020; accepted January 6, 2021

https://doi.org/10.34028/iajit/18/6/8
