News & Events

Assessing performance of classifiers by cross-validation based on binary data

Thursday, August 25, 2016 - 16:00
Yichen Zhao, UBC Statistics Master's student
Statistics Seminar
Room 4192, Earth Sciences Building (2207 Main Mall)

In statistical applications, we are often asked to construct a classifier from a random sample drawn from a specific population. Once the classifier is built, we may use it to categorize new individuals from that population. How accurately it categorizes new individuals depends on the precision of the classifier we built. Yet the sample is generally noisy, and unless the sample size is very large, the classifier's performance on new individuals is far from certain. In the data analysis stage, we usually look for the classifier with the highest success rate on the given sample. Because the classifier was chosen to fit that sample, its apparent success rate generally overestimates its precision when it is applied to new individuals from the population. To overcome this issue, cross-validation is often recommended for assessing the performance of a classifier. In this project, we use simulation studies to investigate whether cross-validation indeed accurately estimates the performance of classifiers in various situations.
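
The contrast between the apparent success rate and a cross-validated estimate can be illustrated with a small simulation. The following is a minimal sketch and not the speaker's actual study design: the single-feature classifier, the pure-noise binary data, and all names and parameter values (`best_feature_rule`, `simulate`, `n`, `p`, `k`, `reps`) are assumptions made for illustration only.

```python
import numpy as np

def best_feature_rule(X, y):
    """Pick the binary feature (or its negation) that agrees with y most often."""
    agree = (X == y[:, None]).mean(axis=0)   # agreement rate of each feature with y
    flip = agree < 0.5                        # negate features that mostly disagree
    rate = np.where(flip, 1.0 - agree, agree)
    j = int(np.argmax(rate))
    return j, bool(flip[j])

def predict(X, rule):
    j, flip = rule
    return 1 - X[:, j] if flip else X[:, j]

def apparent_and_cv_rate(X, y, k, rng):
    """Return (apparent success rate, k-fold cross-validated success rate)."""
    # Apparent rate: the rule is selected and evaluated on the same sample.
    rule = best_feature_rule(X, y)
    apparent = float((predict(X, rule) == y).mean())

    # Cross-validation: the rule is re-selected on each training part
    # and evaluated only on the held-out fold.
    folds = np.array_split(rng.permutation(len(y)), k)
    correct = 0
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        fold_rule = best_feature_rule(X[train], y[train])
        correct += int((predict(X[test_idx], fold_rule) == y[test_idx]).sum())
    return apparent, correct / len(y)

def simulate(n=40, p=50, k=5, reps=200, seed=0):
    """Average both estimates over repeated pure-noise samples (assumed setup)."""
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(reps):
        X = rng.integers(0, 2, size=(n, p))   # binary features
        y = rng.integers(0, 2, size=n)        # labels independent of the features
        results.append(apparent_and_cv_rate(X, y, k, rng))
    return np.mean(results, axis=0)

apparent_mean, cv_mean = simulate()
print(f"mean apparent rate: {apparent_mean:.3f}")
print(f"mean cross-validated rate: {cv_mean:.3f}")
```

In this assumed setup the labels are unrelated to the features, so any rule has a true success rate of 0.5 on new individuals. The apparent rate cannot fall below 0.5 by construction and is typically well above it, while the cross-validated estimate should stay much closer to 0.5.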