Multi-topic Text Categorization Based on Ranking Approach

doi:https://doi.org/10.15514/syrcose-2007-1-8

Open Access Article

Valentina Glazkova, Mikhail Petrovskiy. 2007-01-01. Proceedings of the Spring/Summer young researchers' colloquium on software engineering

Abstract

This paper addresses the multi-topic (multi-label) text classification problem. We propose two methods for the reduction from ranking to the multi-label case. Existing multi-label classification methods based on reduction from the ranking problem define a complex classification (threshold) function on the input feature space. Our approach instead constructs a simple (linear) multi-label classification function that takes the output of the ranking method (the class relevance space) as its input. In the first method, we estimate a linear threshold function defined on the class relevance space. In the second method, we directly find a linear operator that maps class ranks into the set of values of binary multi-label decision functions. The developed methods are less computationally expensive than existing methods, and at the same time they demonstrate similar and in some cases significantly better accuracy. This has been demonstrated experimentally on
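The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the linear functions are estimated, so ordinary least squares is assumed here, and the per-document target threshold (the midpoint between the lowest-scored relevant class and the highest-scored irrelevant class) is likewise an assumption. All function names and the toy relevance scores are hypothetical.

```python
import numpy as np

def add_bias(scores):
    # Append a constant column so the linear functions can have an offset.
    return np.hstack([scores, np.ones((scores.shape[0], 1))])

def target_thresholds(scores, labels):
    # Assumed heuristic: for each document, take the midpoint between the
    # highest irrelevant score and the lowest relevant score.
    t = np.empty(scores.shape[0])
    for i in range(scores.shape[0]):
        rel = scores[i, labels[i] == 1]
        irr = scores[i, labels[i] == 0]
        lo = irr.max() if irr.size else scores[i].min() - 1.0
        hi = rel.min() if rel.size else scores[i].max() + 1.0
        t[i] = 0.5 * (lo + hi)
    return t

def fit_linear_threshold(scores, labels):
    # Idea 1: a linear threshold function defined on the class relevance space.
    w, *_ = np.linalg.lstsq(add_bias(scores),
                            target_thresholds(scores, labels), rcond=None)
    return w

def predict_with_threshold(scores, w):
    # A class is assigned when its relevance exceeds the predicted threshold.
    t = add_bias(scores) @ w
    return (scores > t[:, None]).astype(int)

def fit_linear_operator(scores, labels):
    # Idea 2: a linear operator mapping relevance scores directly to
    # per-class decision values, fitted against +1 / -1 targets.
    W, *_ = np.linalg.lstsq(add_bias(scores), 2.0 * labels - 1.0, rcond=None)
    return W

def predict_with_operator(scores, W):
    # The sign of each decision value gives the binary label.
    return (add_bias(scores) @ W > 0).astype(int)

# Toy relevance scores from some ranker (4 documents, 3 classes); made-up data.
scores = np.array([[0.9, 0.2, 0.1],
                   [0.1, 0.8, 0.7],
                   [0.3, 0.4, 0.9],
                   [0.6, 0.5, 0.1]])
labels = np.array([[1, 0, 0],
                   [0, 1, 1],
                   [0, 0, 1],
                   [1, 1, 0]])

pred_threshold = predict_with_threshold(scores, fit_linear_threshold(scores, labels))
pred_operator = predict_with_operator(scores, fit_linear_operator(scores, labels))
```

In both variants the learned function operates only on the ranker's output, whose dimension equals the number of classes, rather than on the original text features. This is the source of the lower computational cost that the abstract claims.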

Keywords

Computer science; Categorization; Ranking (information retrieval); Text categorization; Natural language processing; Artificial intelligence; Information retrieval
