Key Concepts

Review core concepts you need to learn to master this subject

K-Nearest Neighbors Underfitting and Overfitting

The value of k in the KNN algorithm is related to the error rate of the model. A small value of k could lead to overfitting as well as a big value of k can lead to underfitting. Overfitting imply that the model is well on the training data but has poor performance when new data is coming. Underfitting refers to a model that is not good on the training data and also cannot be generalized to predict new data.

Distance Formula
Lesson 1 of 2
  1. 1
    In this lesson, you will learn three different ways to define the distance between two points: 1. Euclidean Distance 2. Manhattan Distance 3. Hamming Distance Before diving into the distance form…
  2. 2
    Euclidean Distance is the most commonly used distance formula. To find the Euclidean distance between two points, we first calculate the squared distance between each dimension. If we add up al…
  3. 3
    Manhattan Distance is extremely similar to Euclidean distance. Rather than summing the squared difference between each dimension, we instead sum the absolute value of the difference between eac…
  4. 4
    Hamming Distance is another slightly different variation on the distance formula. Instead of finding the difference of each dimension, Hamming distance only cares about whether the dimensions a…
  5. 5
    Now that you’ve written these three distance formulas yourself, let’s look at how to use them using Python’s SciPy library: - Euclidean Distance .euclidean() - Manhattan Distance .cityblock() - …

What you'll create

Portfolio projects that showcase your new skills

Pro Logo

How you'll master it

Stress-test your knowledge with quizzes that help commit syntax to memory

Pro Logo