Jose Ranilla - Academia.edu

Uploads

Papers by Jose Ranilla

Introducing a Family of Linear Measures for Feature Selection in Text Categorization

IEEE Transactions on Knowledge and Data Engineering, 2005

Text Categorization, which consists of automatically assigning documents to a set of categories, usually involves the management of a huge number of features. Most of them are irrelevant, and others introduce noise that could mislead the classifiers. Thus, feature reduction is often performed in order to increase the efficiency and effectiveness of the classification. In this paper, we propose to select relevant features by means of a family of linear filtering measures which are simpler than the usual measures applied for this purpose. We carry out experiments over two different corpora and find that the proposed measures perform better than the existing ones.

Improving Performance of Text Categorization by Combining Filtering and Support Vector Machines

Journal of The American Society for Information Science and Technology, 2004

Text Categorization is the process of assigning documents to a set of previously fixed categories. Much research is under way with the goal of automating this time-consuming task. Several different algorithms have been applied, and Support Vector Machines (SVM) have shown very good results. In this report, we try to show that filtering the words used by SVM before classification can improve the overall performance. This hypothesis is systematically tested with three different measures of word relevance, on two different corpora (one of them considered in three different splits), and with both local and global vocabularies. The results show that filtering significantly improves the recall of the method, which in turn significantly improves the overall performance.
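The filtering step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' method: it scores words with information gain (a standard relevance measure, used here only as a stand-in), drops the lowest-scoring fraction of the vocabulary, and leaves the SVM training step out. The function names, the `filtering_level` parameter, and the toy documents are all assumptions made for the example.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a label distribution given as counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Information gain of the presence/absence of `term` for the labels.

    `docs` is a list of token sets; `labels` holds one category per document.
    """
    base = entropy(list(Counter(labels).values()))
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    conditional = 0.0
    for part in (with_term, without_term):
        if part:  # skip an empty split to avoid dividing by zero
            weight = len(part) / len(docs)
            conditional += weight * entropy(list(Counter(part).values()))
    return base - conditional


def filter_vocabulary(docs, labels, filtering_level):
    """Return the terms kept after dropping the lowest-scoring fraction.

    `filtering_level` is the fraction of the vocabulary to remove (0.0 to 1.0).
    """
    vocab = sorted({t for d in docs for t in d})
    ranked = sorted(
        vocab, key=lambda t: information_gain(docs, labels, t), reverse=True
    )
    keep = max(1, round(len(ranked) * (1.0 - filtering_level)))
    return set(ranked[:keep])


# Hypothetical toy corpus: each document is a set of tokens.
docs = [
    {"goal", "match", "team"},
    {"team", "score", "the"},
    {"market", "stock", "the"},
    {"stock", "bank", "price"},
]
labels = ["sport", "sport", "finance", "finance"]

kept = filter_vocabulary(docs, labels, filtering_level=0.75)
print(sorted(kept))
```

In a full pipeline, only the kept terms would be passed on as features to the classifier; the word "the", which occurs equally in both categories, scores zero and is filtered out.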

Scoring and Selecting Terms for Text Categorization

[Figure residue: two-panel plots, (a) and (b), of Precision (%) and Recall (%) against Filtering level (%) for the measures Lir, ILir, IG, and TF-IDF.]

Jose Ranilla

Banking Administration Technician program, begun in March 2012 and ongoing. I have the knowledge that my academic standing requires of me.
