Reproducibility and open scientific practices are increasingly demanded of scientists and researchers, yet training on how to apply these practices in data analysis is still limited and has not kept up with demand. This course is aimed at early-career researchers conducting quantitative analyses (ranging from lab-based research to epidemiology). By the end of the course, students will have:
- An understanding of why an open and reproducible data workflow is important.
- Practical experience in setting up and carrying out an open and reproducible data analysis workflow.
- The knowledge needed to continue learning methods and applications in this field.
Students will develop proficiency in the R statistical computing language and improve their data and code literacy. Throughout the course we will focus on a general quantitative analytical workflow, using R and other modern tools. The course will place particular emphasis on research in diabetes and metabolism: it will be taught by instructors working in this field and will use relevant examples where possible. It will not teach statistical techniques, as these topics are already covered in university curriculums.
Prerequisites and installation instructions
No experience in data analysis or programming is assumed or required. However, there are a few prerequisites to complete before attending the workshop:
- Install the latest version of R
- Install the latest version of RStudio
- Install the packages listed in the Course Materials
- Install Git
- Read or scan through Chapter 1 of the online book “R for Data Science”
- Read and abide by the Code of Conduct
Instructors and helpers
- Lead instructor and organizer: