A Learning-Classification Based Approach for Word Prediction
Computer Science Department, University of Houston-Clear Lake, USA
Abstract: Word prediction is an important NLP problem in which the goal is to predict the correct word in a given context. Word completion utilities, predictive text entry systems, writing aids, and language translation are some common applications of word prediction. This paper presents a new word prediction approach based on context features and machine learning. The proposed method casts the problem as a learning-classification task: word predictors are trained on highly discriminating features chosen by feature selection techniques. The contribution of this work lies in this new formulation of the problem and in the combination of a top-performing machine learning method, support vector machines (SVM), with several feature selection techniques, including mutual information (MI) and chi-square (χ²). The method is implemented and evaluated on several datasets. The experimental results show that the method predicts the correct words effectively while using only small contexts. The results compare favorably with similar work; in some experiments, accuracy approaches 91% correct predictions.
Keywords: Word prediction, word completion, machine learning, natural language processing.
Received January 7, 2006; accepted June 6, 2006
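As an informal illustration of the feature selection step mentioned in the abstract, the following minimal sketch scores context words with the chi-square (χ²) statistic for one candidate word. It is not the authors' implementation: the function names (`context_features`, `chi_square`, `rank_features`), the window size of two words, and the four toy sentences are all assumptions made for this example, and the SVM training stage is omitted.

```python
from collections import Counter


def context_features(tokens, index, window=2):
    """Return the set of words within `window` positions of tokens[index]."""
    left = tokens[max(0, index - window):index]
    right = tokens[index + 1:index + 1 + window]
    return set(left + right)


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic of a 2x2 table (feature present/absent vs. target/other)."""
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        return 0.0  # degenerate table: the feature carries no information
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def rank_features(examples, target):
    """Rank context words by chi-square against the label `target`.

    `examples` is a list of (feature_set, label) pairs.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    vocab = set().union(*(feats for feats, _ in examples))
    scores = {}
    for feat in vocab:
        counts = Counter((feat in feats, label == target)
                         for feats, label in examples)
        scores[feat] = chi_square(counts[(True, True)], counts[(True, False)],
                                  counts[(False, True)], counts[(False, False)])
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data (not from the paper): sentence and the word to predict.
toy = [
    ("a piece of cake", "piece"),
    ("a piece of paper", "piece"),
    ("war and peace treaty", "peace"),
    ("a lasting peace treaty", "peace"),
]

examples = []
for sentence, word in toy:
    tokens = sentence.split()
    examples.append((context_features(tokens, tokens.index(word)), word))

ranked = rank_features(examples, target="piece")
top_two = [feat for feat, _ in ranked[:2]]
print(top_two)
```

In a full pipeline of the kind the abstract describes, the highest-ranked context words would presumably become the feature vector on which a classifier such as an SVM is trained; that stage is left out here to keep the sketch self-contained.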