Text classification based on deep belief network and softmax regression (original) (raw)
Related papers
A comparative review on deep learning models for text classification
Indonesian Journal of Electrical Engineering and Computer Science
Text classification is a fundamental task in several areas of natural language processing (NLP), including words semantic classification, sentiment analysis, question answering, or dialog management. This paper investigates three basic architectures of deep learning models for the tasks of text classification: Deep Belief Neural (DBN), Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN), these three main types of deep learning architectures, are largely explored to handled various classification tasks. DBN have excellent learning capabilities to extracts highly distinguishable features and good for general purpose. CNN have supposed to be better at extracting the position of various related features while RNN is modeling in sequential of long-term dependencies. This paper work shows the systematic comparison of DBN, CNN, and RNN on text classification tasks. Finally, we show the results of deep models by research experiment. The aim of this paper to provides basic ...
A Step towards the Improvement in the Performance of Text Classification
KSII Transactions on Internet and Information Systems, 2019
The performance of text classification is highly related to the feature selection methods. Usually, two tasks are performed when a feature selection method is applied to construct a feature set; 1) assign score to each feature and 2) select the top-N features. The selection of top-N features in the existing filter-based feature selection methods is biased by their discriminative power and the empirical process which is followed to determine the value of N. In order to improve the text classification performance by presenting a more illustrative feature set, we present an approach via a potent representation learning technique, namely DBN (Deep Belief Network). This algorithm learns via the semantic illustration of documents and uses feature vectors for their formulation. The nodes, iteration, and a number of hidden layers are the main parameters of DBN, which can tune to improve the classifier's performance. The results of experiments indicate the effectiveness of the proposed method to increase the classification performance and aid developers to make effective decisions in certain domains.
On Optimality of Long Document Classification using Deep Learning
International Journal on Recent and Innovation Trends in Computing and Communication
Document classification is effective with elegant models of word numerical distributions. The word embeddings are one of the categories of numerical distributions of words from the WordNet. The modern machine learning algorithms yearn on classifying documents based on the categorical data. The context of interest on the categorical data is posed with weights and the sense and quality of the sentences is estimated for sensible classification of documents. The focus of the current work is on legal and criminal documents extracted from the popular news channels, particularly on classification of long length legal and criminal documents. Optimization is the essential instrument to bring the quality inputs to the document classification model. The existing models are studied and a feasible model for the efficient document classification is proposed. The experiments are carried out with meticulous filtering and extraction of legal and criminal records from the popular news web sites and p...
SURVEY ON DOCUMENT CLASSIFICATION USING DEEP LEARNING TECHNIQUES
IJCSIS Vol 17 No 4 April Issue, 2019
Document classification focuses to allocate at least one class or category to a document, making it easier to to find the relevant information at the right time and for filtering and routing documents directly to users. Documents can be classified according to their attributes or subjects. Document classification methods involve: Concept Mining, tf-idf, Support vector machines (SVM), , Naive Bayes classifier, Artificial neural network, , Instantaneously trained neural networks, K-nearest neighbor algorithms, Natural language processing and different methodologies. The main advantage of deep learning over other technique is because deep learning techniques can outperform other techniques when data size is large, reduces the need for feature engineering and has high performance on complex problems. The main objective of the survey is to analyse the various deep learning techniques for document classification. Keywords: Shallow neural network, Convolutional neural network, Recurrent neural network, Recurrent convolutional neural network, Bi-directional neural network.
Research on the Classification Ability of Deep Belief Networks on Small and Medium Datasets
Recent theoretical advances in the learning of deep artificial neural networks have made it possible to overcome a vanishing gradient problem. This limitation has been overcome using a pre-training step, where deep belief networks formed by the stacked Restricted Boltzmann Machines perform unsupervised learning. Once a pre-training step is done, network weights are fine-tuned using regular error back propagation while treating network as a feed-forward net. In the current paper we perform the comparison of described approach and commonly used classification approaches on some well-known classification data sets from the UCI repository as well as on one mid-sized proprietary data set.
A Study on a Joint Deep Learning Model for Myanmar Text Classification
2020 IEEE Conference on Computer Applications(ICCA)
Text classification is one of the most critical areas of research in the field of natural language processing (NLP). Recently, most of the NLP tasks achieve remarkable performance by using deep learning models. Generally, deep learning models require a huge amount of data to be utilized. This paper uses pre-trained word vectors to handle the resource-demanding problem and studies the effectiveness of a joint Convolutional Neural Network and Long Short Term Memory (CNN-LSTM) for Myanmar text classification. The comparative analysis is performed on the baseline Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN) and their combined model CNN-RNN.
Internal News Classification Using Deep Learning
For the last few years, text mining has been gaining significant importance. Since Knowledge is now available to users through variety of sources i.e. electronic media, digital media, print media, and many more. Due to huge availability of text in numerous forms, a lot of unstructured data has been recorded by research experts and have found numerous ways in literature to convert this scattered text into defined structured volume, commonly known as text classification. Focus on full text classification i.e. full news, huge documents, long length texts etc. is more prominent as compared to the short length text. In this paper, we have discussed text classification process, classifiers, and numerous feature extraction methodologies but all in context of short texts i.e. news classification based on their headlines. Existing classifiers and their working methodologies are being compared and results are presented effectively.
Improved Hybrid Model for Classification of Text Documents
European Journal of Artificial Intelligence and Machine Learning
All universities in and around the globe have senate members whose responsibility is to deliberate on matters that affect the smooth running of the university in senate meetings, such matters include, personnel, management, and student matters. Reports are generated at the end of each senate meeting on these matters and are printed on paper or stored in the system without proper grouping of the matters as a result of lack of efficient classification model. This paper proposes hybrid machine learning and deep learning models for the development of efficient classification model for textual documents and tested with reports from senate deliberations from university of Port Harcourt. The dataset for over ten years was collected and pre-processed, noise and other non-alphanumeric values removed by tokenization. Principal component analysis algorithm which is a machine learning approach was used extensively for feature selection and LSTM a deep learning architecture was used to build the...
IRJET-Online news classification using Deep Learning Technique
- A news classification task begins with a data set in which the class assignments are known. Classification are discrete and do not imply order. The goal of classification is to accurately predict the target class for each case in the data. Due to the Web expansion, the prediction of online news popularity is becoming a trendy research topic. Many researches have been done on this topic but the best result was provided by a Random Forest with a discrimination power of 73% accuracy. So, in this paper, main aim is to increase accuracy in predicting the popularity of online news. Thus, Neural Network will be implemented to acquire better results.
Nepali Text Document Classification Using Deep Neural Network
Tribhuvan University Journal
An automated text classification is a well-studied problem in text mining which generally demands the automatic assignment of a label or class to a particular text documents on the basis of its content. To design a computer program that learns the model form training data to assign the specific label to unseen text document, many researchers has applied deep learning technologies. For Nepali language, this is first attempt to use deep learning especially Recurrent Neural Network (RNN) and compare its performance to traditional Multilayer Neural Network (MNN). In this study, the Nepali texts were collected from online News portals and their pre-processing and vectorization was done. Finally deep learning classification framework was designed and experimented for ten experiments: five for Recurrent Neural Network and five for Multilayer Neural Network. On comparing the result of the MNN and RNN, it can be concluded that RNN outperformed the MNN as the highest accuracy achieved by MNN...