Develop learning materials for an intro to data science course
In this project, you will do one or more of the following for the course STAT310 Fundamental Concepts of Statistics: (1) develop lab materials for teaching statistical concepts by analyzing real data sets in the R language, (2) develop "statistics nuggets", short 15-minute activities illustrating how statistics and data science are used in real-world applications, or (3) explore a new data set for the class project.
About the course
STAT310 Fundamental Concepts of Statistics is typically one of the first statistics courses students take to learn about the fundamental principles of statistical science and how to use data to investigate a question of interest. Students learn how to formulate an inquiry as a model or a hypothesis, explore a data set through visualization and descriptive statistics, make inferences using relevant data and appropriate statistical methods, and communicate their observations and findings. The course uses the textbook Statistical Inference via Data Science: A ModernDive into R and the Tidyverse, and uses the R language to demonstrate statistical concepts including point estimates, sampling variation, confidence intervals, hypothesis testing, and linear regression.
Lab materials
The lab materials are R Markdown documents that include both examples and exercises to demonstrate the statistical concepts and methods introduced in class. For example, in the linear regression module, we use data from the study "It doesn't hurt to ask: question-asking increases liking" to illustrate linear models with two explanatory variables, one categorical and one continuous. In the lab, students are asked to explore the data, fit models, and interpret the results.
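As a rough illustration of what such a lab exercise might look like, here is a minimal sketch of fitting a linear model with one continuous and one categorical explanatory variable. It uses R's built-in mtcars data purely as a stand-in (it is not the data from the question-asking study), and the names cars, transmission, fit_parallel, and fit_interaction are made up for this example.

```r
# Stand-in data: the built-in mtcars data set, NOT the question-asking study data.
# Recode the 0/1 transmission indicator `am` as a labeled categorical variable.
cars <- transform(
  mtcars,
  transmission = factor(am, labels = c("automatic", "manual"))
)

# Parallel slopes model: a single slope for weight (continuous),
# with a separate intercept for each transmission type (categorical).
fit_parallel <- lm(mpg ~ wt + transmission, data = cars)

# Interaction model: the slope for weight may differ by transmission type.
fit_interaction <- lm(mpg ~ wt * transmission, data = cars)

# Inspect the estimated coefficients of each model.
print(coef(fit_parallel))
print(coef(fit_interaction))
```

In a lab, students might be asked to interpret each coefficient in context and to discuss which of the two models describes the data better.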
Statistics nuggets
These are 15-minute activities that expose students to the variety of ways data science and statistics are used in the world, and that prompt awareness of and reflection on aspects of data science that are not typically discussed in an intro class. Activities include reading a piece of news (e.g., gender bias in classification systems) or exploring a data visualization (e.g., A night under the stars: a look at overnight stays at US national parks), and then discussing questions about what students see and learn from the materials. The materials are typically timely, short, and engaging.
Course project
In the final project, students analyze real-world data to answer questions that interest them. Currently, students can analyze the US COVID-19 data or the US open policing data. Finding and exploring other relevant and interesting data sets is another possible focus for this project.
What I expect from you
This project will take 4 hours per week for 8 weeks (32 hours in total). Throughout the project, I expect you to turn in an R Markdown file each week summarizing your work (e.g., a news article or data set you have found, or an exploratory analysis you have completed). At the end of the eight weeks, I expect you to turn in a final writeup of the course materials you designed, e.g., a number of statistics nuggets, two sets of lab materials, and/or a summary of a new data set you have found.
What you can expect from me
I will guide you through your project and provide suggestions and feedback on the materials you develop. I respect, value, and encourage your perspective, and will acknowledge your work when using your materials. I will aim to meet with you once a week.
You will strengthen your skills and receive mentorship in (1) working with real data, (2) statistical analysis, (3) data science communication, and (4) teaching. Your work will be used in future iterations of the class and will benefit UMass students in the coming years!