Stochastic Search by Gibbs Sampling for Big Data Analytics
Posted: 2018-12-05

Title: Stochastic Search by Gibbs Sampling for Big Data Analytics

Speaker: Dr. Guoqi Qian, Associate Professor of Statistics

School of Mathematics and Statistics, The University of Melbourne

Date: 2018/12/24 AM 10:30-11:30

Venue: Academic Lecture Hall 105, 览秀楼 (Lanxiu Building)

Abstract:

Most algorithms for statistical machine learning and big data analytics are deterministic and enumerative. They can be computationally intractable even for mining a dataset containing just a few hundred features or predictors in the context of variable selection or model selection.

Over the last 20 years we have been developing a Gibbs-sampling-induced stochastic search methodology to randomly sample candidate models or subsets of predictor variables, and then perform model/variable selection based on the generated sample.

Since the generated sample is only a tiny fraction of the candidate model space, which is often of exponential order, the stochastic search is computationally scalable to big data analytics. Also, the generated sample typically constitutes an ergodic Markov chain; hence the consistency of model/variable selection is guaranteed with probability 1.

In this talk, we will briefly review the Gibbs-sampling-induced stochastic search methodology. Time permitting, we will then present a few of its applications in market basket analysis, genome-wide association studies, proteomics information retrieval, tropical cyclone genesis studies and wild animal population abundance studies.
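
For readers unfamiliar with the idea, the following is a minimal Python sketch of a Gibbs-sampling-style search over subsets of predictors. It is an illustration only, not the speaker's algorithm. The function name `gibbs_model_search`, the scoring function `toy_score`, and the made-up "true" subset `TRUE_MODEL` are all hypothetical. The sketch assumes each candidate model is encoded as a 0/1 inclusion vector and is sampled with probability proportional to exp(-score/2), where the score could in practice be a model selection criterion computed from data (for example an information criterion); here it is a toy function so the example stays self-contained.

```python
import math
import random
from collections import Counter


def gibbs_model_search(score, p, n_sweeps=200, seed=0):
    """Sample inclusion vectors gamma in {0,1}^p with P(gamma) proportional
    to exp(-score(gamma) / 2), updating one coordinate at a time.

    `score` is a user-supplied function (hypothetical here); lower is better.
    Returns a Counter of how often each model was visited after a full sweep.
    """
    rng = random.Random(seed)
    gamma = [rng.randint(0, 1) for _ in range(p)]
    visits = Counter()
    for _ in range(n_sweeps):
        for j in range(p):
            # Score the model with predictor j included and excluded,
            # holding all other coordinates fixed.
            gamma[j] = 1
            s_in = score(tuple(gamma))
            gamma[j] = 0
            s_out = score(tuple(gamma))
            # Full conditional: P(gamma_j = 1 | rest) = 1 / (1 + exp((s_in - s_out) / 2)).
            # The exponent is capped to avoid floating-point overflow.
            diff = min((s_in - s_out) / 2.0, 700.0)
            p_in = 1.0 / (1.0 + math.exp(diff))
            gamma[j] = 1 if rng.random() < p_in else 0
        visits[tuple(gamma)] += 1
    return visits


# Toy setting (assumption): 6 candidate predictors, of which the 1st and 3rd matter.
TRUE_MODEL = (1, 0, 1, 0, 0, 0)


def toy_score(gamma):
    # Hypothetical score: penalise each disagreement with TRUE_MODEL.
    return 10.0 * sum(g != t for g, t in zip(gamma, TRUE_MODEL))


visits = gibbs_model_search(toy_score, p=len(TRUE_MODEL))
best_model, count = visits.most_common(1)[0]
print(best_model, count)
```

Selection is based on the generated sample: the most frequently visited model is reported, and only the models actually visited are ever scored, which is typically far fewer than the 2^p candidates.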