
An Efficient Algorithm for Large-Scale Text Categorization

Changrui Yu, Yan Luo, 2008-01-01


Abstract: Text categorization is a core technique in the field of knowledge mining. Most current categorization methods are based on the vector space model (VSM), and among them kNN is widely used. However, most of these methods are computationally expensive and can hardly be used to classify large-scale samples. Moreover, the classifier must be rebuilt whenever training samples are added or deleted, which makes these methods scale poorly. This paper proposes a new categorization method, called MDER, based on Mutual Dependence and Equivalent Radius. MDER can classify large-scale samples and has good scalability. A series of experiments on classifying Chinese texts leads to the conclusion that MDER outperforms kNN and the CCC method, and that it can be used online to classify large-scale samples while maintaining higher precision and recall.
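To make the cost argument concrete, the following is a minimal sketch of the kNN baseline over a vector space model, not of MDER itself, whose details the abstract does not give. The function names, the toy training set, whitespace tokenization, raw term-frequency weights, and cosine similarity are all assumptions chosen for illustration; Chinese text would additionally need a word segmenter. The point to notice is that every query is compared against every training sample, so classification time grows with the size of the training set.

```python
import math
from collections import Counter


def tf_vector(text):
    """Map a whitespace-tokenized text to a sparse term-frequency vector."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_classify(query, training, k=3):
    """Label a query by majority vote among its k most similar training texts.

    Every training sample is scored for every query, which is the
    per-query cost that grows with the training set.
    """
    q = tf_vector(query)
    scored = sorted(
        ((cosine(q, tf_vector(text)), label) for text, label in training),
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus, for illustration only.
TRAINING = [
    ("stock market prices fall", "finance"),
    ("bank raises interest rates", "finance"),
    ("market investors buy stock", "finance"),
    ("team wins football match", "sports"),
    ("player scores goal in match", "sports"),
    ("coach praises team player", "sports"),
]

print(knn_classify("stock prices and interest rates", TRAINING))
print(knn_classify("football team player", TRAINING))
```

In this sketch, adding or removing a training sample only changes the list, but practical kNN systems usually maintain an index or precomputed weights over the corpus, which is where the rebuilding cost mentioned in the abstract would arise.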


Keywords

Categorization, Computer science, Scalability, Artificial intelligence, Text categorization, Data mining, Classifier (UML), Computation
