Towards Better Prioritization of Epigenetically Modified DNA Regions

Ernesto Iacucci (1,2), Dusan Popovic (1,2), Georgios A. Pavlopoulos (1,2), Léon-Charles Tranchevent (1,2), Marijke Bauters (3,2), Bart De Moor (1,2), and Yves Moreau (1,2)

(1) ESAT-SCD, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, Box 2446, 3001, Leuven, Belgium
(2) IBBT-K.U.Leuven Future Health Department, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, Box 2446, 3001, Leuven, Belgium
(3) Department of Human Genetics, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, Box 2446, 3001, Leuven, Belgium

Abstract. Epigenetic modifications of the genome can cause profound changes in the phenotype of an organism. Experimental methods allow us to detect regions of the DNA that have been epigenetically modified; these regions are said to be enriched in a queried state versus a control. Detecting the enriched regions is not simple: making sense of the data involves multiple analytical steps and often results in false calls. In this study, we analyze the utility of using additional features of the data (such as the transcription start site (TSS) and the histone coverage), alongside the enrichment score, to detect enrichment. We train a decision tree ensemble using these three features and review how well they identify regions that are truly enriched (as validated by q-PCR). We find that the enrichment score derived directly from ChIP-chip experiment data is less informative than the histone coverage.

Keywords: ChIP-chip, data integration, protein-DNA, machine learning, decision trees

LNAI 7297, p. 270 ff.