Listed in: Mathematics and Statistics, as STAT-240
Amy S. Wagaman (Sections 01 and 02)
Making sense of a complex, high-dimensional data set is not an easy task. The choice of analysis ultimately depends on the research question(s) being asked. This course will explore how to visualize and extract meaning from large data sets through a variety of analytical methods. Methods covered include principal components analysis and selected statistical and machine learning techniques, both supervised (e.g., classification trees and random forests) and unsupervised (e.g., clustering). Additional methods covered may include factor analysis, dimension reduction methods, or network analysis, at the instructor's discretion. This course will feature hands-on data analysis with statistical software, emphasizing application over theory.
The course is expected to include small group work, interactive labs, peer interactions such as peer review and short presentations, and a personal project, all intended to foster students' engagement with the course and with each other.
Requisite: STAT 111 or 135. Limited to 24 students. Fall semester. Professor Wagaman.
If Overenrolled: In the Fall, priority goes to rising sophomores and Statistics majors. In the Spring, priority goes to sophomores and Statistics majors.