Informed Survival in Stat 678: Survival Analysis, CTL Mini-grant Spring 2016

Instructor: Juanjuan Fan, Mathematics and Statistics

Summary: Working under the premise that student learning improves when examples and rubrics guide their work, Professor Juanjuan Fan posted recordings of her in-class software demonstrations and lectures on Blackboard for students to review before undertaking their final project of data analysis and report writing in her graduate-level Stat 678 Survival Analysis course. In addition to the recordings, she posted exemplary published papers before taking students to the computer lab to work on specific aspects of their projects. Comparing her students’ performance in this course to that in a previous course graded with the same rubric showed improved scores in data analysis, suggesting that students benefitted directly from this new access to exemplars and recordings.

Final report

What I Did

To train students in the use of the statistical software R, data analysis, and report writing, I recorded software demonstrations and posted them on the Blackboard course site. Even though I have always done software demonstrations in class, students often have difficulty mastering the coding skills involved by watching me do it just once. With the recordings available throughout the semester, students could review them after class and watch them again if they needed a refresher when working on the final project. I also made available the lecture recordings I had from a previous offering of the class.

I assigned students a final project of data analysis and report writing. The project was both group and individual: the students were asked to form groups of two or three and find a data set meeting certain requirements. After the data set was approved, each group worked on its own data set and gave a group presentation in class. However, each student had to write up a report individually. This way, students got to learn from one another and also had the chance to improve their oral and written communication skills.

I brought students to the computer lab to work on specific aspects of their project: exploratory data analysis using tables and figures; presenting model building results in tables; model building and diagnostics using residual analysis. Each time before the lab, I would post published papers with specific elements that I thought were nicely done so that students could follow the examples for their project.

How It Went

I compared the results of students’ data analysis from this class to those of Stat 680B, Biostatistical Methods, in Spring 2015. In Stat 680B, the method used was logistic regression. In Stat 678, the method used was Cox proportional hazards regression. Both deal with multiple regression problems but with different types of outcome variables (binary in logistic regression vs. censored survival time in Cox regression). The student reports from both classes were graded using the same rubrics. In general, Stat 678 students did better than Stat 680B students, which I attribute to the increased emphasis on data analysis and report writing made possible by the mini-grant. I chose not to perform any statistical inference because of the small sample sizes (23 students from Stat 678 and 36 students from Stat 680B) and the obvious confounding by time and cohort (the reports were graded at different times).
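For readers less familiar with the two methods, the contrast can be sketched in standard textbook notation. The symbols below are illustrative and are not taken from the course materials.

```latex
\begin{align*}
\text{Logistic regression:}\quad
  & \log\frac{P(Y = 1 \mid x)}{1 - P(Y = 1 \mid x)}
    = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p \\
\text{Cox regression:}\quad
  & h(t \mid x) = h_0(t)\,\exp(\beta_1 x_1 + \cdots + \beta_p x_p)
\end{align*}
```

Here Y is the binary outcome, h(t | x) is the hazard at time t for a subject with covariates x, and h_0(t) is a baseline hazard left unspecified. In both models the covariates enter through a linear predictor, which is why the model-building and reporting tasks in the two classes are comparable.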

I also compared written exam performance between this offering of Stat 678 and a previous offering (Spring 2012). The Spring 2012 offering was the most similar to the Spring 2016 offering in that both classes had final projects in lieu of final exams and the two midterms were given at about the same time, with the first midterm given mid-semester and the second towards the end of the semester. The exams in the two offerings were at similar difficulty levels. The results were mixed. On the first midterm, the average score was 78.4% for the Spring 2012 offering and 83.3% for the Spring 2016 offering. On the second midterm, the average score was 88.2% for the Spring 2012 offering and 85.5% for the Spring 2016 offering. Neither comparison reached statistical significance at the 0.05 level. In a sense, finding no difference between the two Stat 678 offerings is a positive for this project, given that a few class meetings were taken away from the standard lectures and spent in the lab doing data analysis. Therefore, with better student performance in the final data analysis report and no difference in written exams focused on student understanding of the underlying statistical theory and principles, I consider the innovations introduced in this project a success.
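The report does not say which test was used for the midterm comparisons. As one possible approach for two small classes, the sketch below runs a two-sided permutation test on the difference in mean scores. The function name `permutation_test` and the score lists are hypothetical placeholders, not the actual class data.

```python
import random


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in mean scores.

    Returns (observed difference in means, estimated p-value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_resamples):
        # Reassign students to the two "classes" at random.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        # Small tolerance guards against floating-point ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Hypothetical midterm scores (percent), NOT the real Stat 678 data.
scores_2012 = [72, 85, 78, 90, 66, 81, 77, 79]
scores_2016 = [88, 79, 84, 91, 75, 83, 86, 80]

diff, p = permutation_test(scores_2016, scores_2012)
print(f"difference in means: {diff:.1f} points, p = {p:.3f}")
```

A p-value above 0.05 from a test like this would correspond to the "not statistically significant" conclusion above. With classes this small, such a result says little about whether a real difference exists.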

What I Learned

Students learn when they have clear, good examples to model their work on and know how they will be graded (rubrics). In the future, I may have students hand in a portion of their report, for example the portion on exploratory data analysis, and have them peer grade it and give each other feedback. Good data analysis practices can be taught!