01 Introduction

Material

Chapter 1 and the "kNN" PDF

kNN (PDF)

Lesson 1 (PDF)

Titanic (CSV)

Titanic (Jupyter Notebook)

Session Description

We will discuss what machine learning is and the types of problems it is used to solve. We will also introduce the first algorithm of the course: k-Nearest Neighbors (kNN).

Key Concepts

  • Distance metrics in kNN
  • The parameter “k” and its effect on model performance
  • Differences between low-bias and low-variance models (the bias-variance trade-off)
  • Data normalization for distance-based methods
  • Evaluating model outputs (accuracy, confusion matrix, etc.)
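
To make the first few concepts concrete, here is a minimal sketch of kNN in plain Python. It is not taken from the course material: the helper names (`euclidean`, `manhattan`, `fit_min_max`, `apply_min_max`, `knn_predict`) and the toy age/income data are hypothetical and chosen only for illustration. It shows two distance metrics, a majority vote over the k nearest points, and why min-max normalization matters when features are on very different scales.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def fit_min_max(rows):
    """Return the per-feature minimum and maximum of the training rows."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def apply_min_max(row, lows, highs):
    """Rescale one row using the training minimum and maximum."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)]


def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Predict the majority label among the k training points nearest to query."""
    ranked = sorted(zip(train_X, train_y), key=lambda p: distance(p[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: [age in years, income], with made-up labels.
raw_X = [[25, 50000], [30, 52000], [60, 50500], [65, 51500]]
y = ["young", "young", "old", "old"]
raw_query = [62, 52000]

# Without normalization the income column dominates the distance.
unscaled = knn_predict(raw_X, y, raw_query, k=1)

# With min-max normalization both features contribute comparably.
lows, highs = fit_min_max(raw_X)
scaled_X = [apply_min_max(row, lows, highs) for row in raw_X]
scaled_query = apply_min_max(raw_query, lows, highs)
scaled = knn_predict(scaled_X, y, scaled_query, k=1)

print("unscaled prediction:", unscaled)
print("scaled prediction:  ", scaled)
```

On this toy data the unscaled and scaled predictions differ for k=1, because a difference of a few hundred in income outweighs a difference of decades in age until both features are rescaled. Passing `distance=manhattan` swaps the metric without changing anything else.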

Learning Objectives

  • Explain what is meant by the term Machine Learning (ML)
  • Explain what is meant by supervised vs. unsupervised learning
  • Explain the overall difference between classification and regression
  • Describe the "train-test" methodology
  • Train a k-Nearest Neighbors (kNN) algorithm on a dataset in sklearn
  • Explain the principles behind the kNN algorithm
  • Explain what is meant by the "hyperparameters" of an algorithm
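
As a rough preview of the train-test methodology and of what a hyperparameter is, the sketch below trains kNN in sklearn for a few values of k. It is an illustration under stated assumptions, not the course notebook: it uses the Iris dataset bundled with sklearn rather than the Titanic CSV, and the split size, random seed, and candidate values of k are arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: Iris stands in for the course dataset.
X, y = load_iris(return_X_y=True)

# Train-test methodology: hold out part of the data for evaluation only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

results = {}
for k in (1, 5, 15):  # k is a hyperparameter: set by us, not learned from data
    # The scaler is fitted on the training data only, then applied to the test data.
    model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[k] = accuracy_score(y_test, y_pred)
    print(f"k={k}: test accuracy = {results[k]:.3f}")

# Confusion matrix for the last model (rows: true class, columns: predicted class).
cm = confusion_matrix(y_test, y_pred)
print(cm)
```

In practice, k would be chosen using a separate validation set or cross-validation rather than the test set, so that the test score remains an honest estimate of performance on unseen data.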