## Key Concepts

Review core concepts you need to learn to master this subject

### Mean of a Dataset

The mean, or average, of a dataset is calculated by adding all the values in the dataset and then dividing by the number of values in the set.

For example, for the dataset [1,2,3], the mean is (1 + 2 + 3) / 3 = 2.
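The same two-step arithmetic can be written out in R. This is a minimal sketch that uses only the base functions `sum()` and `length()`; the variable names `values`, `total`, and `manual_mean` are chosen here for illustration and do not come from the lesson.

```r
# Hypothetical vector name; holds the example dataset [1, 2, 3]
values <- c(1, 2, 3)

# Step 1: add all the values in the dataset
total <- sum(values)

# Step 2: divide by the number of values in the dataset
manual_mean <- total / length(values)

print(manual_mean)
```

Writing the sum and the division as separate steps makes the grouping explicit: the whole sum is divided by the count, not just the last term.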

### The mean() Function

In R, the mean of a vector is calculated using the mean() function. The function accepts a vector as input, and returns the average as a numeric.

The code below is used to create a vector and calculate its mean:

a <- c(3,4,5,6)

mean(a)

This code outputs the average value of the vector c(3,4,5,6):

4.5

### Median of a Dataset

The median of a dataset is the value that falls in the middle once the dataset is ordered from smallest to largest. If there is an even number of values in a dataset, there are two middle values; this section refers to both of them as medians, and a single median is conventionally reported as their average.

Say we have a dataset with the following ten numbers:

24, 16, 30, 10, 12, 28, 38, 2, 4, 36

We can order this dataset from smallest to largest:

2, 4, 10, 12, 16, 24, 28, 30, 36, 38

The medians of this dataset are 16 and 24, because they are the fifth- and sixth-positioned observations in this dataset. In other words, there are four observations to the left of 16, and four observations to the right of 24.

If we added another value (say, 28) near the middle of this dataset:

2, 4, 10, 12, 16, 24, 28, 28, 30, 36, 38

The new median is equal to 24, because there are 5 values smaller than it, and 5 values larger than it.
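The manual procedure above (sort, then pick the middle position or positions) can be sketched in R. The helper name `middle_values` is hypothetical and introduced only for this illustration; it returns one value for an odd-length dataset and two values for an even-length one, matching the way this section counts medians.

```r
# Hypothetical helper: returns the middle value(s) of a dataset
middle_values <- function(x) {
  ordered <- sort(x)        # order from smallest to largest
  n <- length(ordered)
  if (n %% 2 == 1) {
    # Odd number of values: a single middle position
    ordered[(n + 1) / 2]
  } else {
    # Even number of values: the two middle positions
    ordered[c(n / 2, n / 2 + 1)]
  }
}

# The ten-value dataset from the example
ten_values <- c(24, 16, 30, 10, 12, 28, 38, 2, 4, 36)

print(middle_values(ten_values))         # fifth and sixth values: 16 24
print(middle_values(c(ten_values, 28)))  # sixth of eleven values: 24
```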

### The median() Function in R

In R, the median of a vector is calculated using the median() function. The function accepts a vector as input. If there is an odd number of values in the vector, the function returns the middle value. If there is an even number of values in the vector, the function returns the average of the two middle values.

##### Odd:
b <- c(3,4,5,6,12)
median(b)

The code above outputs 5 as the median, because it is the middle number in the five-value vector.

##### Even:
a <- c(3,4,5,12)
median(a)

The code above outputs 4.5, because it takes the average of the two middle values, 4 and 5.
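As a further check, median() can be applied to the ten-value dataset from the Median of a Dataset section. The sketch below assumes nothing beyond base R; the variable names are chosen here for illustration. Because the dataset has an even number of values, the result is the average of the two middle values, 16 and 24.

```r
# The unsorted ten-value dataset; median() does not need sorted input
ten_values <- c(24, 16, 30, 10, 12, 28, 38, 2, 4, 36)

even_median <- median(ten_values)
print(even_median)   # (16 + 24) / 2 = 20

# Adding one more value (28) makes the count odd,
# so median() returns the single middle value
odd_median <- median(c(ten_values, 28))
print(odd_median)    # 24
```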

1. Finding the center of a dataset is one of the most common ways to summarize statistical findings. Often, people communicate the center of data using words like, on average, usually, or often….
2. The mean, often referred to as the average, is a way to measure the center of a dataset. The average of a set is calculated using a two-step process: 1. Add all of the observations in your da…
3. While you’ve shown that you can calculate the average yourself, it becomes time-consuming as the size of your dataset increases: imagine adding all of the numbers in a dataset with 10,000 ob…
4. In this lesson, you learned how to calculate the average of a dataset using the formula: $\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n}$ and the R function: mean(my_data). Circling back to the ori…

1. In this lesson, you will learn how to find the median of a dataset, a common measure of a dataset’s center. Each of the next three exercises will cover the following topics: - Manually fin…
2. The formal definition for the median of a dataset is: *The value that, assuming the dataset is ordered from smallest to largest, falls in the middle. If there are an even number of values in a dat…
3. Finding the median of a dataset becomes increasingly time-consuming as the size of your dataset increases: imagine finding the median of an unsorted dataset with 10,000 observations. The R …
4. In this lesson, you learned how to find the median of a dataset in two steps: 1. Sort the dataset 2. Identify the one or two numbers that fall in the middle of the sorted dataset You also learned …

1. In this lesson, you will learn how to find the mode of a dataset. Each of the next three exercises will cover the following: - Manually finding the mode of a dataset - Using R’s functions to find…
2. The formal definition for the mode of a dataset is: *The most frequently occurring observation in the dataset. A dataset can have multiple modes if there is more than one value with the same maxim…
3. Finding the mode of a dataset becomes increasingly time-consuming as the size of your dataset increases: imagine finding the mode of a dataset with 10,000 observations. The R package DescTo…
4. In this lesson, you learned how to find the mode of a dataset in two steps: 1. Find the frequency of every unique number in the dataset 2. Determine which number has the highest frequency You also…
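
The two steps for finding the mode (count the frequency of every unique value, then keep the value or values with the highest frequency) can be sketched in base R with `table()`. This is an illustration only: it is not the package function the lesson refers to, and the example data and variable names are hypothetical.

```r
# Hypothetical example data (not from the lesson); 4 and 6 each appear twice
observations <- c(2, 4, 4, 6, 6, 8)

# Step 1: find the frequency of every unique value
frequencies <- table(observations)

# Step 2: keep the value(s) whose frequency equals the highest frequency
is_most_frequent <- as.vector(frequencies) == max(frequencies)
modes <- as.numeric(names(frequencies)[is_most_frequent])

print(modes)   # both 4 and 6 are modes
```

Comparing every count against the maximum, rather than taking only the first maximum, keeps all tied values, which matches the definition's note that a dataset can have multiple modes.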
