Teaching – Intro to Data Analysis

In both Fall and Spring, I teach a full course titled “Intro to Data Analysis” at UC Santa Cruz, Silicon Valley.

Students use R (or Python) to complete a large data analysis project, including a write-up with findings, insights and visuals. All tools used are open source. This is an introductory course, and the topics covered include:

  • Approaches to data analysis: Templates, write-ups and illustrative examples
  • Overview of tools for data analysis: R, RStudio (IDE) and Python
  • Obtaining data: Finding data sets and Web scraping
  • Data manipulation techniques: Data quality, reshaping data, joining data sets and data aggregation
  • Plotting and visualization: Exploration and presentation
  • Exploratory data analysis: Visual inspection, descriptive analytics, insights
  • Estimation techniques: Multiple approaches based on assumptions, sampling basics
  • Regression models: Simple and multiple linear regression, ANOVA, comparing samples
  • Analysis report write-up and presentation, including graphs
  • Simulation techniques: Fitting distributions, simulating stochastic processes
  • Forecasting methods and applications: Smoothing, moving averages, time series, ARIMA

R is used for examples in the class.


