A New Strategy of Exploring Metabolomics Data Using Monte Carlo Tree
Abstract: Data from high-throughput metabolomics experiments are becoming larger and increasingly complex, which poses a number of challenges to existing statistical modeling. There is therefore a need for statistically efficient approaches to mining the metabolite information contained in the metabolomics data under investigation. In this work, we provide a new strategy based on Monte Carlo cross-validation coupled with the classification tree algorithm, termed the MCTree approach. The MCTree approach inherently provides a feasible way to uncover the predictive structure of metabolomics data by establishing many cross-predictive models. The sample proximity matrix obtained from these models can offer insight into the structure of the metabolomics data. At the same time, informative metabolites or potential biomarkers can be discovered by means of variable importance ranking in the MCTree approach. Finally, two real metabolomics datasets are used to demonstrate the performance of the proposed approach.
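The general idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the tree settings, the split ratio, or the exact definitions of proximity and importance, so everything below is an assumption made for illustration. The function names (`grow_tree`, `find_leaf`, `mctree`), the Gini splitting criterion, the 70/30 random split, and the depth limit are hypothetical choices. Proximity is assumed here to be the fraction of splits in which two held-out samples fall into the same terminal node, and importance is assumed to be the Gini decrease summed over all trees.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def grow_tree(X, y, rows, depth, importance, max_depth=3):
    """Grow a small CART-style tree on the samples indexed by `rows`.

    `importance[j]` accumulates the weighted Gini decrease of feature j.
    """
    labels = [y[i] for i in rows]
    node_impurity = gini(labels)
    best = None
    if depth < max_depth and node_impurity > 0.0:
        for j in range(len(X[0])):
            values = sorted({X[i][j] for i in rows})
            for lo, hi in zip(values, values[1:]):
                t = (lo + hi) / 2.0  # candidate threshold between neighbours
                left = [i for i in rows if X[i][j] <= t]
                right = [i for i in rows if X[i][j] > t]
                child = (len(left) * gini([y[i] for i in left])
                         + len(right) * gini([y[i] for i in right])) / len(rows)
                gain = node_impurity - child
                if best is None or gain > best[0]:
                    best = (gain, j, t, left, right)
    if best is None or best[0] <= 0.0:
        # No useful split: make a leaf predicting the majority class.
        return {"leaf": True, "label": Counter(labels).most_common(1)[0][0]}
    gain, j, t, left, right = best
    importance[j] += gain * len(rows)
    return {"leaf": False, "feature": j, "threshold": t,
            "left": grow_tree(X, y, left, depth + 1, importance, max_depth),
            "right": grow_tree(X, y, right, depth + 1, importance, max_depth)}


def find_leaf(node, x):
    """Return the terminal node that sample x falls into."""
    while not node["leaf"]:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node


def mctree(X, y, n_splits=100, train_fraction=0.7, max_depth=3, seed=0):
    """Monte Carlo cross-validation with one classification tree per split.

    Returns (proximity matrix, normalised importances, held-out accuracy).
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    together = [[0] * n for _ in range(n)]  # held out together, same leaf
    paired = [[0] * n for _ in range(n)]    # held out together at all
    importance = [0.0] * p
    correct = total = 0
    for _ in range(n_splits):
        rows = list(range(n))
        rng.shuffle(rows)                   # random train/test partition
        cut = int(train_fraction * n)
        train, test = rows[:cut], rows[cut:]
        tree = grow_tree(X, y, train, 0, importance, max_depth)
        leaves = {i: find_leaf(tree, X[i]) for i in test}
        for i in test:
            correct += leaves[i]["label"] == y[i]
            total += 1
            for j in test:
                paired[i][j] += 1
                together[i][j] += leaves[i] is leaves[j]
    proximity = [[together[i][j] / paired[i][j] if paired[i][j] else 0.0
                  for j in range(n)] for i in range(n)]
    scale = sum(importance) or 1.0
    return proximity, [v / scale for v in importance], correct / total
```

Here `X` is a list of samples (each a list of metabolite intensities) and `y` the class labels. Calling `proximity, importance, accuracy = mctree(X, y)` returns a sample-by-sample proximity matrix that could be passed to clustering or multidimensional scaling, plus an importance vector whose largest entries point to candidate informative metabolites. Restricting proximity to held-out samples is one possible design choice; it keeps the measure tied to prediction rather than to fit on the training data.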