## Key Concepts

Review core concepts you need to learn to master this subject

### NumPy’s Mean and Axis

We will use the following 2-dimensional array for this example:

```py
import numpy as np

ring_toss = np.array([[1, 0, 0],
                      [0, 0, 1],
                      [1, 0, 1]])
```

The code below will calculate the average of each row.

```py
np.mean(ring_toss, axis=1)
# Output: array([ 0.33333333,  0.33333333,  0.66666667])
```

In a two-dimensional array, you may want the mean of each row or of each column rather than the mean of the whole array. NumPy's `np.mean()` function handles this through its `axis` parameter. Setting `axis=1` averages across each row and returns one value per row. Setting `axis=0` averages down each column and returns one value per column.
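As a minimal sketch of the difference between the two axes, the snippet below reuses the `ring_toss` array from the example and computes the overall mean, the row means, and the column means. The variable names `overall_mean`, `row_means`, and `column_means` are illustrative only and do not come from the lesson.

```python
import numpy as np

# Same ring_toss array as in the example: each entry is 0 or 1.
ring_toss = np.array([[1, 0, 0],
                      [0, 0, 1],
                      [1, 0, 1]])

# With no axis argument, np.mean averages all nine values together.
overall_mean = np.mean(ring_toss)

# axis=1 collapses the columns, leaving one mean per row.
row_means = np.mean(ring_toss, axis=1)

# axis=0 collapses the rows, leaving one mean per column.
column_means = np.mean(ring_toss, axis=0)

print(overall_mean)
print(row_means)
print(column_means)
```

The row means match the output shown in the example, while the column means come out as two thirds, zero, and two thirds, because the middle column contains only zeros.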

Introduction to Statistics with NumPy
Lesson 1 of 2
1. You’re a citizen scientist who has started collecting data about rising water in the river next to where you live. For months, you painstakingly measure the water levels and enter your findings int…
2. The first statistical concept we’ll explore is mean, also commonly referred to as an average. The mean is a useful measurement to get the center of a dataset. NumPy has a built-in function to cal…
3. We can also use np.mean to calculate the percent of array elements that have a certain property. As we know, a logical operator will evaluate each item in an array to see if it matches the specif…
4. If we have a two-dimensional array, np.mean can calculate the means of the larger array as well as the interior values. Let’s imagine a game of ring toss at a carnival. In this game, you have thr…
5. As we can see, the mean is a helpful way to quickly understand different parts of our data. However, the mean is highly influenced by the specific values in our data set. What happens when one of t…
6. One way to quickly identify outliers is by sorting our data. Once our data is sorted, we can quickly glance at the beginning or end of an array to see if some values lie far beyond the expected ran…
7. Another key metric that we can use in data analysis is the median. The median is the middle value of a dataset that’s been ordered in terms of magnitude (from lowest to highest). Let’s look at …
8. In a dataset, the median value can provide an important comparison to the mean. Unlike a mean, the median is not affected by outliers. This becomes important in skewed datasets, datasets whose va…
9. As we know, the median is the middle of a dataset: it is the number for which 50% of the samples are below, and 50% of the samples are above. But what if we wanted to find a point at which 40% of t…
10. Some percentiles have specific names: - The 25th percentile is called the first quartile - The 50th percentile is called the median - The 75th percentile is called the *third quarti…
11. While the mean and median can tell us about the center of our data, they do not reflect the range of the data. That’s where standard deviation comes in. Similar to the interquartile range, the …
12. As we saw in the last exercise, knowing the standard deviation of a dataset can help us understand how spread out our dataset is. We can find the standard deviation of a dataset using the NumPy f…
13. Let’s review! In this lesson, you learned how to use NumPy to analyze single-variable datasets. Here’s what we covered: - Using the np.sort method to locate outliers. - Calculating central positio…
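
The topics in this lesson outline can be tried together in one short sketch. The example below is an illustration under stated assumptions, not the lesson's own code: the `water_levels` readings are made-up values, and every variable name is hypothetical. It uses `np.mean` on a boolean array, `np.sort`, `np.median`, `np.percentile`, and `np.std`.

```python
import numpy as np

# Hypothetical water-level readings (made-up values for illustration only).
# One reading, 9.8, is deliberately far from the rest to act as an outlier.
water_levels = np.array([4.2, 3.9, 4.5, 4.1, 9.8, 4.0, 4.3])

# A comparison yields a boolean array; its mean is the fraction of True values.
fraction_above_4 = np.mean(water_levels > 4.0)

# Sorting puts extreme values at the ends, where they are easy to spot.
sorted_levels = np.sort(water_levels)

# The mean is pulled toward the outlier; the median is not.
mean_level = np.mean(water_levels)
median_level = np.median(water_levels)

# The 25th and 75th percentiles are the first and third quartiles.
first_quartile = np.percentile(water_levels, 25)
third_quartile = np.percentile(water_levels, 75)
interquartile_range = third_quartile - first_quartile

# Standard deviation summarizes how spread out the readings are.
spread = np.std(water_levels)

print(sorted_levels)
print(mean_level, median_level)
print(first_quartile, third_quartile, interquartile_range)
print(fraction_above_4, spread)
```

With these made-up readings the mean lands above the median, which shows how a single large outlier shifts the mean while leaving the median in place.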
