Supervised by: Ministry of Culture of PRC

Sponsored by: National Library of China
  Library Society of China

ISSN 1001-8867    CN 11-2746/G2

Automatic Subject Classification of Public Messages in E-government Affairs

Abstract: Public messages on Internet political inquiry platforms are still classified manually, which entails a heavy workload, low efficiency, and a high error rate. This paper proposes a bidirectional long short-term memory (Bi-LSTM) network model with an attention mechanism to classify public messages automatically. Using the Internet political inquiry data set provided by the BdRace platform as samples, the Bi-LSTM network is used to capture dependencies between the preceding and following context of each message during training, and the attention mechanism is used to increase the weight given to semantically important text features. The weighted features are then integrated by a fully connected layer to compute the classification. The experimental results show that the F1 value of the proposed message classification model reaches 0.886 on the long-text data set and 0.862 on the short-text data set. Compared with three other algorithms, namely long short-term memory (LSTM), logistic regression, and naive Bayes, the Bi-LSTM model achieves better results in the automatic subject classification of public messages.

Keywords: Internet political inquiry, public message, subject classification, Bi-LSTM model, attention mechanism
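
Illustrative sketch: the abstract describes attention over Bi-LSTM outputs followed by a fully connected classification layer. The short Python example below shows one plausible form of that step on toy numbers. The abstract does not specify the attention scoring function, so the dot-product scoring against a query vector is an assumption, as are the names attention_pool and classify, the toy hidden states, and all weights. The Bi-LSTM itself is not implemented here; its per-token outputs are taken as given.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by its dot product with a query vector.

    hidden_states: list of T vectors, each standing in for the
    concatenated forward and backward Bi-LSTM state of one token.
    query: vector of the same size (assumed learned during training).
    Returns the attention weights and the weighted-sum context vector.
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


def classify(context, class_weights, class_bias):
    """Fully connected layer followed by softmax over subject classes."""
    logits = [
        sum(w * c for w, c in zip(row, context)) + b
        for row, b in zip(class_weights, class_bias)
    ]
    return softmax(logits)


# Toy input: three tokens with four-dimensional states (hypothetical values).
hidden_states = [
    [0.1, 0.0, 0.2, 0.1],
    [0.9, 0.8, 0.7, 0.9],
    [0.0, 0.1, 0.1, 0.0],
]
query = [1.0, 1.0, 1.0, 1.0]  # hypothetical query vector

weights, context = attention_pool(hidden_states, query)

# Hypothetical fully connected layer mapping the context to three classes.
class_weights = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
class_bias = [0.0, 0.0, 0.0]
probs = classify(context, class_weights, class_bias)
```

In this toy case the second token has the largest dot product with the query, so it receives the largest attention weight and dominates the context vector passed to the classification layer.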