Support points – a new way to reduce big and high-dimensional data

Event Date Tuesday, December 11, 2018 - 11:00 to 12:00

Speaker Simon Mak, Postdoctoral Fellow, H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology

Speaker's Page Simon Mak

Event Type Statistics Seminar

Location Room 4192, Earth Sciences Building (2207 Main Mall)

This talk presents a new method for reducing big and high-dimensional data to a smaller dataset, called support points (SPs). In an era where data is plentiful but downstream analysis is often expensive, SPs can be used to tackle many big data challenges in statistics, engineering and machine learning. SPs have two key advantages over existing methods. First, SPs provide optimal and model-free reduction of big data for a broad range of downstream analyses. Second, SPs can be efficiently computed via parallelized difference-of-convex optimization; this allows millions of data points to be reduced to a representative dataset in mere seconds. SPs also enjoy appealing theoretical guarantees, including distributional convergence and improved reduction over random sampling and clustering-based methods. The effectiveness of SPs is then demonstrated in two real-world applications: the first reduces long Markov chain Monte Carlo (MCMC) chains for rocket engine design, and the second performs data reduction in computationally intensive predictive modeling.
