Clustering and modelling of phase variation for functional data

Tuesday, February 26, 2019 - 11:00 to 12:00
Eric Fu, UBC Statistics PhD Student
Statistics Seminar
Room 4192, Earth Sciences Building (2207 Main Mall)

Our work is motivated by an analysis of elephant seal dive profiles, which we treat as functional data: depth as a function of time, recorded almost continuously by sensors attached to the animal. The objective is to group profiles by shape to better understand the corresponding behavioural states of the seals. Most existing approaches apply multivariate clustering methods to ad hoc summaries of the dive profile. Instead, we view each profile as arising from a function that is a deformation of a base shape. The deformation is regarded as phase variation and is represented by a latent warping function with a finite mixture distribution.

We first propose a curve registration model that explicitly models the amplitude and phase variation of functional data, with phase variation represented by smooth time transformations called warping functions. Inference is conducted via the stochastic approximation expectation-maximization (SAEM) algorithm. Our simulation study shows that the SAEM algorithm is computationally more stable and efficient than existing approaches in the literature for inference in this class of curve registration models with flexible warping.
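To make the distinction between amplitude and phase variation concrete, here is a minimal Python sketch. It is not the model presented in the talk: the base shape `base_shape`, the one-parameter warping family `warp` (h(t) = t^gamma on [0, 1]) and the function `simulate_curve` are all hypothetical choices made for illustration only. The sketch shows that changing the warping parameter moves the location of the peak in time while leaving its height unchanged, whereas the amplitude parameter scales the height.

```python
import numpy as np

def base_shape(t):
    """Hypothetical base shape: a single smooth bump on [0, 1]."""
    return np.sin(np.pi * t)

def warp(t, gamma):
    """Hypothetical warping function h(t) = t**gamma.

    For gamma > 0 it is increasing on [0, 1] with h(0) = 0 and h(1) = 1,
    so it only re-times the curve.
    """
    return t ** gamma

def mean_curve(t, amplitude, gamma):
    """Noise-free curve: amplitude * base_shape(warp(t))."""
    return amplitude * base_shape(warp(t, gamma))

def simulate_curve(t, amplitude, gamma, noise_sd, rng):
    """Noisy observation of the warped, scaled base shape."""
    noise = rng.normal(0.0, noise_sd, size=t.shape)
    return mean_curve(t, amplitude, gamma) + noise

t = np.linspace(0.0, 1.0, 101)

# Same amplitude, different phase: the peak moves in time.
early = mean_curve(t, amplitude=1.0, gamma=0.5)
late = mean_curve(t, amplitude=1.0, gamma=2.0)
peak_early = t[np.argmax(early)]
peak_late = t[np.argmax(late)]

# Same phase, different amplitude: the peak height changes.
tall = mean_curve(t, amplitude=2.0, gamma=0.5)

# A noisy profile, as a sensor might record it (noise level is assumed).
rng = np.random.default_rng(0)
noisy = simulate_curve(t, amplitude=1.0, gamma=0.5, noise_sd=0.05, rng=rng)
```

Under these assumptions, `early` and `late` have the same maximum height but peak at different times, which is the kind of difference a registration model attributes to the warping function rather than to the base shape.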

We then propose two clustering approaches for functional data based on our curve registration model: 1) a simultaneous approach that smooths the noisy raw profiles and estimates the base shape, the warping functions and the cluster membership via SAEM algorithms; and 2) a two-step approach that applies clustering algorithms to the estimated warping functions. In contrast to generic clustering algorithms in the literature, our methods treat the clustering structure as heterogeneity in phase variation. The proposed methods are applied to elephant seal dive profiles and to human growth curves. By focusing the clustering effort on phase variation, we obtain more intuitive clusters.
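The second step of the two-step approach can be sketched as follows, assuming the warping functions have already been estimated. Everything here is an illustrative assumption rather than the speaker's method: the warping functions are simulated from a hypothetical family h(t) = t^gamma, and a plain k-means (the helper `kmeans` below) stands in for whichever clustering algorithm is actually used.

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, k):
        # Squared distance of each row to its nearest chosen centre.
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Stand-in for step one: "estimated" warping functions evaluated on a grid.
# Two groups of hypothetical warping parameters, i.e. two phase patterns.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 51)
gammas = np.concatenate([rng.uniform(0.4, 0.6, 20), rng.uniform(1.8, 2.2, 20)])
warps = t[None, :] ** gammas[:, None]   # one row per curve

# Step two: cluster the curves by their warping functions only.
labels, centers = kmeans(warps, k=2)
```

Because only the rows of `warps` enter the clustering, curves are grouped by how they are re-timed, not by their amplitude, which mirrors the idea of treating cluster structure as heterogeneity in phase variation.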