A protein classification engine based on stochastic finite state automata
TL;DRAbstract
Accurate protein classification is one of the major challenges in modern bioinformatics. Motifs that exist in the protein chain can make such a classification possible. A plethora of algorithms to address this problem have been proposed by both the artificial intelligence and the pattern recognition communities. In this paper, a data mining methodology for classification rules induction in proposed. Initially, expert – based protein families are processed to create a new hybrid set of families. Then, a prefix tree acceptor is created from the motifs in the protein chains, and subsequently transformed into a stochastic finite state automaton using the ALERGIA algorithm. Finally, an algorithm is presented for the extraction of classification rules from the automaton.
Chat with Paper
AI Agents for this Paper
Accurate protein classification is one of the major challenges in modern bioinformatics. Motifs that exist in the protein chain can make such a classification possible. A plethora of algorithms to address this problem have been proposed by both the artificial intelligence and the pattern recognition communities. In this paper, a data mining methodology for classification rules induction in proposed. Initially, expert – based protein families are processed to create a new hybrid set of families. Then, a prefix tree acceptor is created from the motifs in the protein chains, and subsequently transformed into a stochastic finite state automaton using the ALERGIA algorithm. Finally, an algorithm is presented for the extraction of classification rules from the automaton.
Keywords
Chat
Click to start Chat