10 Dimensionality Reduction

Preparation

Ch 8

Material

Session material

Online Resources

  • Steve Brunton has made a whole lecture series about the SVD. The full series is overkill, but consider watching the Overview and the videos about PCA.

  • For a very appealing and visual explanation of SVD, you should take a look at Visual Kernel's video on the topic.

Useful resources on t-SNE (not curriculum related, but it's useful to know about):

I've also added the original research papers leading up to t-SNE (not part of syllabus, just there for reference)

If you want a really in-depth introduction to t-SNE, look here

If you missed the session on linear algebra, I recommend checking out some of the resources mentioned above.

Session Description

This lecture covers unsupervised machine learning algorithms. We discuss how these can be used for dimensionality reduction.
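Since the resources above approach PCA through the SVD, here is a minimal sketch of that connection in NumPy. It is an illustration rather than the course's reference implementation: the function name `pca_svd` and the synthetic data are assumptions made for this example.

```python
import numpy as np


def pca_svd(X, n_components=2):
    """Project X (n_samples x n_features) onto its top principal components."""
    # Center each feature; PCA is defined on mean-centered data.
    X_centered = X - X.mean(axis=0)
    # Thin SVD: X_centered = U @ diag(S) @ Vt.
    # The rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Fraction of the total variance captured by each kept component.
    explained_variance_ratio = (S**2 / np.sum(S**2))[:n_components]
    # Coordinates of every sample in the reduced space.
    scores = X_centered @ components.T
    return scores, components, explained_variance_ratio


# Synthetic data (an assumption for this sketch): 200 samples, 5 features,
# with feature 1 made strongly correlated with feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 2 * X[:, 0] + 0.1 * X[:, 1]

scores, components, ratio = pca_svd(X, n_components=2)
print(scores.shape)  # (200, 2)
print(ratio)         # share of variance explained by each component
```

Because two of the five features are nearly redundant, the first component should capture a noticeably larger share of the variance than the second.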

Key Concepts

  • Principal component analysis (PCA)
  • t-distributed stochastic neighbor embedding (t-SNE)

Learning Objectives

  • Use principal component analysis (PCA) to reduce the dimensions of your dataset
  • Describe how PCA can be used for clustering analyses
  • Create 2-dimensional clustering plots in Python using PCA and t-SNE

Note that t-SNE is not curriculum related, but it is a very useful tool for visualizing high-dimensional data. We will not ask you about it in the exam.
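As a rough sketch of the last learning objective, the snippet below computes 2-dimensional PCA and t-SNE embeddings with scikit-learn and provides a helper to plot them side by side. The choice of the digits dataset, the 500-sample subset, the perplexity of 30, and the helper names `embed_2d` and `plot_embeddings` are all assumptions made for illustration, not part of the course material.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler


def embed_2d(X, random_state=0):
    """Return 2-D PCA and t-SNE embeddings of X (hypothetical helper)."""
    # Standardize features so no single feature dominates the variance.
    X_scaled = StandardScaler().fit_transform(X)
    X_pca = PCA(n_components=2, random_state=random_state).fit_transform(X_scaled)
    # perplexity=30 is an assumed setting; it must be below the sample count.
    tsne = TSNE(n_components=2, perplexity=30, init="pca",
                random_state=random_state)
    X_tsne = tsne.fit_transform(X_scaled)
    return X_pca, X_tsne


def plot_embeddings(X_pca, X_tsne, labels):
    """Scatter both embeddings side by side, colored by label."""
    # Imported here so the embedding step runs even without matplotlib.
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    for ax, emb, title in zip(axes, (X_pca, X_tsne), ("PCA", "t-SNE")):
        ax.scatter(emb[:, 0], emb[:, 1], c=labels, cmap="tab10", s=8)
        ax.set_title(title)
        ax.set_xlabel("dimension 1")
        ax.set_ylabel("dimension 2")
    fig.tight_layout()
    return fig


# Example data (an assumption): the first 500 handwritten-digit images.
digits = load_digits()
X, y = digits.data[:500], digits.target[:500]
X_pca, X_tsne = embed_2d(X)
```

To draw the figure, call `fig = plot_embeddings(X_pca, X_tsne, y)` and then `fig.savefig("embeddings.png")` or `matplotlib.pyplot.show()`. The labels are used only for coloring; neither method sees them when computing the embedding.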