Suppose you want to know the average height of an oak tree in your local park. On Monday, you measure `10` trees and get an average height of `32` ft. On Tuesday, you measure `12` different trees and reach an average height of `35` ft. On Wednesday, you measure the remaining `11` trees in the park, whose average height is `31` ft. The average height for all `33` trees in your local park is `32.8` ft.
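The overall figure is not the plain average of the three daily means, because each day covered a different number of trees. A minimal sketch of the arithmetic, using only the counts and means given above (the variable names are illustrative, not part of the lesson's workspace):

```python
# Sample sizes and sample means from the three days of measurements
sizes = [10, 12, 11]
means = [32, 35, 31]

total_trees = sum(sizes)

# Weight each sample mean by the number of trees it covers
population_mean = sum(n * m for n, m in zip(sizes, means)) / total_trees

# Unweighted average of the three means, for comparison
simple_average = sum(means) / len(means)

print(total_trees)                 # 33
print(round(population_mean, 1))   # 32.8
print(round(simple_average, 1))    # 32.7
```

The weighted result matches the `32.8` ft. quoted above. The unweighted average lands slightly lower because it gives Tuesday's larger sample no extra weight.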

The collections of individual height measurements taken on Monday, Tuesday, and Wednesday are each called samples. A *sample* is a subset of the entire population (all the oak trees in the park). The mean of each sample is a *sample mean*, and it is an estimate of the *population mean*.

Note: the sample means (`32` ft., `35` ft., and `31` ft.) were all close to the population mean (`32.8` ft.), but were all slightly different from the population mean and from each other.

For a population, the mean is a constant value no matter how many times it’s recalculated. For a sample, the mean depends on exactly which members of the population were selected, so it varies from one sample to the next. From a sample mean, we can estimate the mean of the population as a whole. There are three main reasons we might use sampling:

- data on the entire population is not available
- data on the entire population is available, but it is so large that it is unfeasible to analyze
- meaningful answers to questions can be found faster with sampling
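To see the sample-to-sample variation described above, here is a rough sketch using NumPy. The standard deviation of `10`, the sample size of `30`, and the seed are assumptions made for illustration; the lesson's workspace may generate its data differently.

```python
import numpy as np

# Seeded generator so the run is repeatable (the seed value is arbitrary)
rng = np.random.default_rng(42)

# Hypothetical population: 300 values drawn from a normal distribution
# centered on 65. The standard deviation of 10 is an assumption.
population = rng.normal(loc=65, scale=10, size=300)
population_mean = population.mean()

# Draw 5 samples of 30 members each, without replacement within a sample.
# The sample size of 30 is also an assumption.
sample_means = [
    rng.choice(population, size=30, replace=False).mean()
    for _ in range(5)
]

print(f"population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"sample_{i} mean: {m:.2f}")
```

Recomputing `population_mean` always gives the same value, while each sample mean depends on which 30 members happened to be drawn.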

### Instructions

**1.**

In the workspace, we’ve generated a random population of size `300` that follows a normal distribution with a mean of `65`. Update the value of `population_mean` to store the `mean()` of `population`. Does it closely match your expectation?

**2.**

Let’s look at how the means of different samples can vary within the same population.

The code in the notebook generates 5 random samples from `population`. `sample_1` is displayed and `sample_1_mean` has been calculated.

Replace the `"Not calculated"` strings with calculations of the means for `sample_2`, `sample_3`, `sample_4`, and `sample_5`.

Look at the population mean and the sample means. Are they all the same? All different? Why?
