Jose Ranilla - Academia.edu
Papers by Jose Ranilla
IEEE Transactions on Knowledge and Data Engineering, 2005
Text Categorization, which consists of automatically assigning documents to a set of categories, usually involves the management of a huge number of features. Most of them are irrelevant and others introduce noise which could mislead the classifiers. Thus, feature reduction is often performed in order to increase the efficiency and effectiveness of the classification. In this paper, we propose to select relevant features by means of a family of linear filtering measures which are simpler than the usual measures applied for this purpose. We carry out experiments over two different corpora and find that the proposed measures perform better than the existing ones.
Journal of The American Society for Information Science and Technology, 2004
Text Categorization is the process of assigning documents to a set of previously fixed categories. Much research aims to automate this time-consuming task. Several different algorithms have been applied, and Support Vector Machines (SVM) have shown very good results. In this report, we try to show that filtering the words used by SVM before classification can improve overall performance. This hypothesis is systematically tested with three different measures of word relevance, on two different corpora (one of them considered in three different splits), and with both local and global vocabularies. The results show that filtering significantly improves the recall of the method, which in turn significantly improves overall performance.
[Figure: panels (a) and (b) plotting Precision (%) and Recall (%) against Filtering level (%) for the measures Lir, ILir, IG, and TF-IDF]
Técnica de Administración Bancaria (Banking Administration Technician), started in March 2012 and ongoing to date. I have the knowledge that my academic standing requires of me.