R for Data Science


INF2167H

Data science is a fast-growing field, and new tools and techniques are designed every day to perform data analysis quickly and robustly. This course (INF2167, R for Data Science) covers the fundamentals of data science using the R language and environment for statistical computing and graphics. R is widely used by information students and data scientists from various disciplines. The course teaches data science in an accessible way and is designed for students from the social sciences and from non-programming backgrounds. Its focus is not on learning a new programming language but on giving students the skills to approach research questions that involve the analysis of social science data. We will learn skills in data collection, storage, cleaning, transformation, and visualization, along with various techniques of data analysis. Most importantly, we will learn how those skills are applied in research involving the social world. We will apply those techniques to structured tabular data, networked data, and unstructured text data by experimenting with real datasets, including online data. This course will provide students with a skill in high demand in the information and data science job markets.

Note: Formerly a special topics course. Effective fall 2020, it is a regular course.

Syllabi: