
Two MSc student presentations (Zefan Liu & Tom Tang)

Thursday, April 17, 2025 - 11:00 to 12:00
Zefan (Steve) Liu, UBC Statistics M.Sc. student & Tom Tang, UBC Statistics M.Sc. student
ESB 4192 / Zoom

To join this seminar virtually: Please request Zoom connection details from ea [at] stat.ubc.ca

Presentation 1

Time: 11:00 am - 11:30 am

Speaker: Zefan (Steve) Liu, UBC Statistics MSc student

Title: Modelling peaks over thresholds in panel data: a grouped panel generalized Pareto regression model

Abstract: Extreme Value Theory (EVT) provides probabilistic tools for understanding the behaviour of extreme events, making it widely applicable across various fields. When modelling the marginal distributions of extremes in panel data, one may wish to balance the flexibility needed to capture heterogeneity among margins against the efficiency of estimation, by combining regression techniques with an assumed latent group structure among subjects. This group structure, which facilitates information pooling, may not be known a priori and needs to be estimated from data; the estimated groups may then lend themselves to physical interpretation. One existing approach to this modelling idea builds on the Block Maxima (BM) method in EVT, which can result in a loss of valuable information. Moreover, like the classic k-means clustering method, the current algorithm for estimating the group structure is prone to converging to locally optimal solutions. We extend the current approach to a new framework called the grouped panel generalized Pareto regression model, which uses the Peaks Over Threshold (POT) method to model excesses over high thresholds, thereby exploiting extreme event information more fully. To account for the conditional dependence structure within clusters of excesses, we introduce a dependence-window-based sandwich estimator for standard error estimation. Taking advantage of the POT method, we develop a new grouping algorithm inspired by hierarchical clustering, which relies on a pre-determined linkage and stopping rule. This algorithm estimates the latent number of groups, the group structure and the associated parameters simultaneously, and under reasonable conditions it demonstrates improved performance in identifying the globally optimal structure and in balancing the goodness of fit across subjects.
The finite-sample performance of our methodology is carefully evaluated through simulation studies. An application to river flow data from 31 hydrological stations in the Upper Danube river basin illustrates the real-world applicability of our modelling strategy: estimation efficiency is notably improved and physically interpretable group structures are identified.
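For readers unfamiliar with the POT method mentioned in the abstract, the basic single-series idea can be sketched as follows. This is only a minimal illustration, not the speaker's grouped panel regression model: the function name `fit_pot`, the 95% quantile threshold, and the simulated exponential data are all assumptions made here for demonstration.

```python
import numpy as np
from scipy.stats import genpareto


def fit_pot(series, quantile=0.95):
    """Fit a generalized Pareto distribution to excesses over a high threshold.

    The threshold is taken as an empirical quantile of the series
    (an illustrative choice, not one prescribed by the abstract).
    """
    series = np.asarray(series, dtype=float)
    threshold = np.quantile(series, quantile)
    # Excesses: amounts by which observations exceed the threshold.
    excesses = series[series > threshold] - threshold
    # floc=0 fixes the location at zero because excesses start at zero.
    shape, _, scale = genpareto.fit(excesses, floc=0)
    return threshold, shape, scale, excesses.size


# Simulated stand-in for one subject's series (hypothetical data).
rng = np.random.default_rng(0)
flows = rng.standard_exponential(5000)

threshold, shape, scale, n_exc = fit_pot(flows)
print(f"threshold={threshold:.3f}, shape={shape:.3f}, "
      f"scale={scale:.3f}, excesses={n_exc}")
```

Every observation above the threshold contributes to the fit, which is the sense in which POT uses more of the extreme observations than keeping a single maximum per block.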

Presentation 2

Time: 11:30 am - 12:00 pm

Speaker: Tom Tang, UBC Statistics MSc student

Title: The challenges of non-identifiability and a penalized maximum likelihood estimator for the beta mixture model

Abstract: This thesis explores statistical inference for finite mixture models, with a particular focus on beta mixture models, which are widely used in biostatistics, bioinformatics, and computer science. It addresses significant issues such as unbounded likelihood and non-identifiability, which can complicate parameter estimation. To overcome the obstacle caused by the unbounded likelihood, we propose a penalized maximum likelihood estimation approach that adds a penalty term to the log-likelihood function, which leads to stable parameter estimation. Additionally, we derive a closed-form expression for testing non-identifiability in beta mixture models. The effectiveness of our penalized approach is evaluated through simulation studies and compared with alternative approaches, such as the method of moments. Practical applicability is demonstrated through applications to DNA methylation analysis and local false discovery rate estimation. Finally, we suggest several directions for future research.
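The unbounded likelihood mentioned in the abstract can be seen in a small numerical sketch. The toy data, the fixed Beta(2, 5) component, the equal weights, and the helper `mixture_loglik` are assumptions chosen here for illustration. The abstract does not state the form of the penalty term, so none is shown.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import beta


def mixture_loglik(x, weights, shapes):
    """Log-likelihood of a finite beta mixture.

    `shapes` is a list of (a, b) pairs, one per component. The sum over
    components is done on the log scale for numerical stability.
    """
    x = np.asarray(x, dtype=float)
    log_terms = np.array([
        np.log(w) + beta.logpdf(x, a, b)
        for w, (a, b) in zip(weights, shapes)
    ])
    return float(logsumexp(log_terms, axis=0).sum())


# Toy data (hypothetical) and the observation the second component collapses onto.
x = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
x0 = x[1]

# The second component has mean x0 and concentration s; as s grows it
# becomes a spike at x0 and the log-likelihood keeps increasing.
concentrations = [1e2, 1e4, 1e6]
logliks = [
    mixture_loglik(x, [0.5, 0.5], [(2.0, 5.0), (s * x0, s * (1 - x0))])
    for s in concentrations
]

for s, ll in zip(concentrations, logliks):
    print(f"s={s:.0e}: log-likelihood={ll:.3f}")
```

Because the log-likelihood grows without limit as one component concentrates on a single observation, an unpenalized maximizer does not exist, which is the difficulty a penalty term is meant to remove.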