Mining Cell Cycle Literature Using Support Vector Machines
Theodoros G. Soldatos1 and Georgios A. Pavlopoulos2
1Life Biosystems GmbH, Belfortstr. 2, 69115, Heidelberg, Germany
2ESAT-SCD / IBBT-K.U.Leuven Future Health Department, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, Box 2446, 3001, Leuven, Belgium
Abstract. While the biomedical literature is growing rapidly, text classification remains a challenge for researchers, curators and librarians. In this work, we use the Caipirini (http://caipirini.org) service to explore a literature corpus related to the G1, S, G2 and M phases of the human cell cycle. We use Support Vector Machines (SVMs) and a well-studied dataset to compare each cell cycle phase against all the others, in order to find abstracts that are related to one specific phase at a time. We then measure the performance of the results using the standard accuracy, precision and recall metrics. We find differences between the results for the four phases and compare them with previous findings from related work. We conclude that the results concur, which helps to interpret the observed classification performance.
Keywords: supervised machine learning, biomedical literature, cell cycle, support vector machines
LNAI 7297, p. 278 ff.
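The one-phase-against-all-others evaluation described in the abstract can be illustrated with a minimal sketch. It is not the Caipirini implementation and trains no SVM: the `gold` and `predicted` lists are invented toy labels standing in for curated annotations and classifier output, and the helper names `one_vs_rest` and `evaluate` are hypothetical. The sketch only shows how a four-class labelling is reduced to a binary problem per phase and how accuracy, precision and recall follow from the resulting counts.

```python
PHASES = ("G1", "S", "G2", "M")


def one_vs_rest(labels, phase):
    """Turn phase labels into binary labels: True for `phase`, False for all others."""
    return [label == phase for label in labels]


def evaluate(y_true, y_pred):
    """Return (accuracy, precision, recall) for two equal-length lists of booleans."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a phase is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy data (assumption): one phase label per abstract.
gold = ["G1", "S", "G2", "M", "G1", "S"]
predicted = ["G1", "G1", "G2", "M", "S", "S"]

for phase in PHASES:
    acc, prec, rec = evaluate(one_vs_rest(gold, phase), one_vs_rest(predicted, phase))
    print(f"{phase}: accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

Reporting the three metrics per phase, rather than a single pooled score, is what makes differences between the phases visible.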