Jiahua Chen Suggested Stat548 Papers for 2020.


I remain committed to allowing students to choose any of my recent publications, but do consult me first. Most of my papers contain some demanding technical developments. If technical work is to your taste, I am pleased to hear it, and you are not expected to work through every detail: demonstrating your skill on one particular technical issue in the paper, plus showing your grasp of the big picture, suffices. You are also welcome to choose a paper of a less technical nature. If you take this approach, get a sense of what problems the paper solves. Demonstrate your understanding and critically examine the various mini-claims of the paper by repeating some simulations and by subjecting the proposed methods to different, reasonable, but likely less favorable situations. Handle the technical parts at the level of building blocks rather than specific details.

Plan to finish the paper within 1.5 months. If there seems to be much more material than can be handled in 1.5 months, let us discuss which parts of the paper can be skipped to save time. Take charge of what you wish to cover on top of our general agreement.

Here is my Google Scholar page, where you can find most of my publications.
https://scholar.google.com/citations?hl=en&user=XVr8B-AAAAAJ&view_op=list_works



Specific recommendations



Tuning the EM-test for finite mixture models
Canadian Journal of Statistics, 2011, 389-404.

There has been rapid progress in developing effective and easy-to-use tests of the order of a finite mixture model. The EM-test is the latest to join the ranks. It has a relatively simple limiting distribution and enjoys broad applicability. Based on asymptotic theory, the $p$-value of the EM-test is approximated via its limiting distribution. The built-in tuning parameter has an important influence on the approximation precision. Thus, choosing an appropriate value for this parameter is important for fully realizing the advantages of the EM-test. In this paper, we develop a novel computer-experiment approach to address this issue. Through designed experiments, we derive a number of empirical formulas for the tuning parameter. Extensive validation simulation shows that these formulas work well in terms of providing accurate type I errors.

A student may provide a theoretical derivation for the EM-test in a simplified situation and in an intuitive fashion. After this, demonstrate a concrete understanding of the computer-experiment idea in this paper. The final step is to repeat some of the computer experiments. I give the highest mark to those who can think of meaningfully different situations where the same idea may be useful.
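To make the "repeat some of the computer experiments" step concrete, here is a minimal sketch of how one might estimate the type I error of a test whose $p$-value comes from a limiting distribution. It does not implement the EM-test. As a stand-in (an assumption made purely for illustration), it uses a one-sample t statistic referred to its normal limit, so that the gap between nominal and actual level at small sample sizes is visible. The function names are hypothetical.

```python
import math
import random


def normal_two_sided_pvalue(z):
    """Two-sided p-value from the standard normal limiting distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def t_statistic(sample):
    """Stand-in statistic (NOT the EM-test): one-sample t for H0: mean = 0."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return mean / math.sqrt(var / n)


def estimate_type1_error(n, reps=4000, alpha=0.05, seed=0):
    """Monte Carlo rejection rate under H0 when the limiting law gives the p-value."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # data generated under H0
        if normal_two_sided_pvalue(t_statistic(sample)) < alpha:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    # A tiny "design": vary the sample size and record the simulated level.
    for n in (5, 20, 200):
        print(n, estimate_type1_error(n))
```

In an actual replication one would replace the stand-in statistic with the EM-test, add the tuning parameter as a second design factor, and compare the simulated levels with the nominal one across the design points.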

Homogeneity testing under finite location-scale mixtures

https://onlinelibrary.wiley.com/doi/abs/10.1002/cjs.11557

The testing problem for the order of finite mixture models has a long history and remains an active research topic. Since Ghosh and Sen (1985) revealed the hard-to-manage asymptotic properties of the likelihood ratio test, many successful alternative approaches have been developed. The most successful attempts include the modified likelihood ratio test and the EM-test, which lead to neat solutions for finite mixtures of univariate normal distributions, finite mixtures of single-parameter distributions, and several mixture-like models. The problem remains challenging, and there is still no generic solution for location-scale mixtures. In this paper, we provide an EM-test solution for homogeneity for finite mixtures of location-scale family distributions. This EM-test has nonstandard limiting distributions, but we are able to find the critical values numerically. We use computer experiments to obtain appropriate values for the tuning parameters. A simulation study shows that the fine-tuned EM-test has close to nominal type I errors and very good power properties. Two application examples are included to demonstrate the performance of the EM-test.

Even though we have a general expression for the limiting distribution of the test statistic, it is not very useful in practice. In the end, we rely on simulation to find the quantiles.
Anyone who chooses this paper may wish to go over the theory. If successful, nothing additional will be required. Just address one specific technical point carefully in the report, and provide general comments on the rest of the technical material.
Those who choose to examine the applied aspect of the paper should find a location-scale family that is not included in the paper, apply the EM-test to this family, and use simulation to assess its type I error and power properties.
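The simulation workflow described above (simulate the null distribution to get a critical value, then estimate type I error and power) can be sketched as follows. This is only an outline under stated assumptions: the Laplace family is a placeholder (check whether the paper already covers it), the absolute sample skewness is a stand-in statistic and not the EM-test, and all function names are hypothetical. The stand-in is invariant to location and scale changes, which is why simulating from the standard member of the family is enough to calibrate it.

```python
import math
import random


def laplace(rng, loc=0.0, scale=1.0):
    """Draw from Laplace(loc, scale) as a difference of two unit exponentials."""
    return loc + scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def abs_skewness(sample):
    """Stand-in statistic (NOT the EM-test), invariant to location and scale."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    return abs(m3) / m2 ** 1.5


def null_sample(rng, n):
    """Homogeneous model: a single Laplace(0, 1) component."""
    return [laplace(rng) for _ in range(n)]


def mixture_sample(rng, n, weight=0.1, shift=10.0):
    """Alternative: (1 - weight) Laplace(0, 1) + weight Laplace(shift, 1)."""
    return [laplace(rng, shift if rng.random() < weight else 0.0) for _ in range(n)]


def simulated_critical_value(rng, n, reps, alpha):
    """Upper-alpha quantile of the statistic, simulated under the null."""
    stats = sorted(abs_skewness(null_sample(rng, n)) for _ in range(reps))
    return stats[min(reps - 1, math.ceil((1.0 - alpha) * reps))]


def rejection_rate(rng, sampler, n, reps, critical_value):
    """Fraction of simulated samples whose statistic exceeds the critical value."""
    hits = sum(abs_skewness(sampler(rng, n)) > critical_value for _ in range(reps))
    return hits / reps


def run_study(n=100, reps=1000, alpha=0.05, seed=1):
    rng = random.Random(seed)
    crit = simulated_critical_value(rng, n, 2 * reps, alpha)
    type1 = rejection_rate(rng, null_sample, n, reps, crit)  # fresh null samples
    power = rejection_rate(rng, mixture_sample, n, reps, crit)
    return crit, type1, power


if __name__ == "__main__":
    print(run_study())
```

For the report, the stand-in statistic would be replaced by the EM-test for the chosen family, and the alternative would be varied over mixing proportions and component separations to trace out a power curve.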

Permutation tests under a rotating sampling plan with clustered data
https://arxiv.org/pdf/2004.13892.pdf

Consider a population consisting of clusters of sampling units, evolving temporally, spatially, or according to other dynamics. We wish to monitor the evolution of its means, medians, or other parameters. For administrative convenience and informativeness, clustered data are often collected via a rotating plan. Under rotating plans, the observations in the same clusters are correlated, and observations on the same unit collected on different occasions are also correlated. Ignoring this correlation structure may lead to invalid inference procedures. Accommodating cluster structure in parametric models is difficult or will have a high level of misspecification risk. In this paper, we explore exchangeability in clustered data collected via a rotating sampling plan to develop a permutation scheme for testing various hypotheses of interest. We also introduce a semiparametric density ratio model to facilitate the multiple population structure in rotating sampling plans. The combination ensures the validity of the inference methods while extracting maximum information from the sampling plan. A simulation study indicates that the proposed tests firmly control the type I error whether or not the data are clustered. The use of the density ratio model improves the power of the tests.

This is the latest development on tests for long-term monitoring. The paper is of an applied nature. The method works well enough for the standard situation presented in the paper, but a lot can be improved or extended. To work on this paper, understand the idea, double-check the accuracy of the test, and look at potential variations.
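As a starting point for understanding the idea, the following is a minimal sketch of a cluster-level permutation test: whole clusters, not individual units, are shuffled between two groups, so within-cluster correlation is preserved under permutation. It is a simplified illustration and not the paper's scheme. It assumes two independent groups of clusters, ignores units observed on several occasions, and does not use the density ratio model. All names are hypothetical.

```python
import random


def mean_difference(clusters_a, clusters_b):
    """Difference of unit-level means between two groups of clusters."""
    units_a = [x for cluster in clusters_a for x in cluster]
    units_b = [x for cluster in clusters_b for x in cluster]
    return sum(units_a) / len(units_a) - sum(units_b) / len(units_b)


def cluster_permutation_pvalue(clusters_a, clusters_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value obtained by reassigning whole clusters."""
    rng = random.Random(seed)
    observed = abs(mean_difference(clusters_a, clusters_b))
    pooled = list(clusters_a) + list(clusters_b)
    k = len(clusters_a)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # permute cluster labels, keeping each cluster intact
        if abs(mean_difference(pooled[:k], pooled[k:])) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


def simulate_clusters(rng, n_clusters, cluster_size, mean, cluster_sd=1.0, unit_sd=1.0):
    """Clustered data: a shared random cluster effect plus unit-level noise."""
    clusters = []
    for _ in range(n_clusters):
        effect = rng.gauss(0.0, cluster_sd)
        clusters.append(
            [mean + effect + rng.gauss(0.0, unit_sd) for _ in range(cluster_size)]
        )
    return clusters


if __name__ == "__main__":
    rng = random.Random(42)
    group_a = simulate_clusters(rng, n_clusters=10, cluster_size=4, mean=0.0)
    group_b = simulate_clusters(rng, n_clusters=10, cluster_size=4, mean=0.0)
    print(cluster_permutation_pvalue(group_a, group_b))
```

To double-check the accuracy of a test in this spirit, one could wrap the last three lines in a loop over many simulated data sets and record how often the p-value falls below the nominal level, with and without a cluster effect.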