Key Concepts

Review core concepts you need to learn to master this subject

K-Nearest Neighbors

The K-Nearest Neighbors algorithm is a supervised machine learning algorithm for labeling an unknown data point given existing labeled data.

The nearness of points is typically determined by using distance algorithms such as the Euclidean distance formula based on parameters of the data. The algorithm will classify a point based on the labels of the K nearest neighbor points, where the value of K can be specified.

Distance Formula
Lesson 1 of 2
  1. 1
    In this lesson, you will learn three different ways to define the distance between two points: 1. Euclidean Distance 2. Manhattan Distance 3. Hamming Distance Before diving into the distance form…
  2. 2
    Euclidean Distance is the most commonly used distance formula. To find the Euclidean distance between two points, we first calculate the squared distance between each dimension. If we add up al…
  3. 3
    Manhattan Distance is extremely similar to Euclidean distance. Rather than summing the squared difference between each dimension, we instead sum the absolute value of the difference between eac…
  4. 4
    Hamming Distance is another slightly different variation on the distance formula. Instead of finding the difference of each dimension, Hamming distance only cares about whether the dimensions a…
  5. 5
    Now that you’ve written these three distance formulas yourself, let’s look at how to use them using Python’s SciPy library: - Euclidean Distance .euclidean() - Manhattan Distance .cityblock() - …

What you'll create

Portfolio projects that showcase your new skills

Pro Logo

How you'll master it

Stress-test your knowledge with quizzes that help commit syntax to memory

Pro Logo