Key Concepts

Review core concepts you need to learn to master this subject

Pandas DataFrame Aggregate Function

df.columnName.mean()     # Average of all values in column
df.columnName.std()      # Standard deviation of column (sample standard deviation by default)
df.columnName.median()   # Median value of column
df.columnName.max()      # Maximum value in column
df.columnName.min()      # Minimum value in column
df.columnName.count()    # Number of non-null values in column
df.columnName.nunique()  # Number of unique values in column
df.columnName.unique()   # Array of unique values in column

Pandas’ aggregate statistics functions reduce a column of a DataFrame to a single summary value. For example, df.columnName.mean() computes the mean of the column columnName of the DataFrame df. The code block above shows the most common aggregate statistics functions applied to the column columnName of df.
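As a small, runnable illustration of the calls listed above, the sketch below builds a throwaway DataFrame and applies each aggregate to one column. The column name price and its values are made up for this example and do not come from the lesson.

```python
import pandas as pd

# Hypothetical sample data; "price" is an illustrative column name.
df = pd.DataFrame({"price": [10.0, 20.0, 20.0, 30.0]})

mean_price = df.price.mean()        # average of the four values
std_price = df.price.std()          # sample standard deviation (ddof=1)
median_price = df.price.median()    # middle value
max_price = df.price.max()          # largest value
min_price = df.price.min()          # smallest value
n_values = df.price.count()         # number of non-null values
n_unique = df.price.nunique()       # number of distinct values
unique_values = df.price.unique()   # NumPy array of distinct values

print(mean_price, median_price, n_values, n_unique)
```

Each call except unique() returns a single number, which is what makes these functions aggregates.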

Pandas’ Groupby

df.groupby('Name').Grade.mean()  # Mean of Grade for each distinct value of Name

In a pandas DataFrame, aggregate statistics functions can be applied to groups of rows by using the groupby() method. In the example, the code groups all rows that share the same value in Name and reduces the Grade values of each group to their mean. Instead of mean(), any aggregate statistics function, such as median() or max(), can be used. Note that this pattern involves at least two columns: one to group by (Name) and one to aggregate (Grade).
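A minimal sketch of the grouping described above, assuming a small grade book with the columns Name, Assignment, and Grade. The students and grades are invented for illustration.

```python
import pandas as pd

# Hypothetical grade book; all rows are made up for this example.
df = pd.DataFrame(
    [
        ["Amy", "Assignment 1", 75],
        ["Amy", "Assignment 2", 35],
        ["Bob", "Assignment 1", 99],
        ["Bob", "Assignment 2", 35],
    ],
    columns=["Name", "Assignment", "Grade"],
)

# Group rows by Name, then reduce each group's Grade values to their mean.
mean_grades = df.groupby("Name").Grade.mean()
print(mean_grades)
```

The result is a Series indexed by Name, with one mean grade per student. Swapping mean() for median() or max() changes only the statistic, not the grouping.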

Aggregates in Pandas
Lesson 1 of 1
  1. 1

    In this lesson, you will learn about aggregates in Pandas. An aggregate statistic is a way of creating a single number that describes a group of numbers. Common aggregate statistics include mean,…

  2. 2

    In the previous lesson, you learned how to perform operations on each value in a column using […] . In this exercise, you will learn how to combine all of the values from a column for a sing…

  3. 3

    When we have a bunch of data, we often want to calculate aggregate statistics (mean, standard deviation, median, percentiles, etc.) over certain subsets of the data. Suppose we have a grade book w…

  4. 4

    After using […] , we often need to clean our resulting data. As we saw in the previous exercise, the […] function creates a new Series, not a DataFrame. For our ShoeFly.com example, the in…

  5. 5

    Sometimes, the operation that you want to perform is more complicated than […] or […] . In those cases, you can use the […] method and lambda functions, just like we did for individual …

  6. 6

    Sometimes, we want to group by more than one column. We can easily do this by passing a list of column names into the […] method. Imagine that we run a chain of stores and have data about the…

  7. 7

    When we perform a […] across multiple columns, we often want to change how our data is stored. For instance, recall the example where we are running a chain of stores and have data about the n…

  8. 8

    This lesson introduced you to aggregates in Pandas. You learned: How to perform aggregate statistics over individual rows with the same value using […] . How to rearrange a DataFrame into…
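The outline above elides the method names it refers to. The sketch below assumes they are groupby(), reset_index(), apply(), and pivot(), which is a guess based on the surrounding descriptions rather than something the outline states. The store data and the column names location, day, and sales are hypothetical.

```python
import pandas as pd

# Hypothetical sales records for a chain of stores.
df = pd.DataFrame({
    "location": ["West", "West", "East", "East", "West", "East"],
    "day": ["Mon", "Tue", "Mon", "Tue", "Mon", "Tue"],
    "sales": [10, 20, 30, 40, 50, 60],
})

# Grouping one column yields a Series indexed by the group key;
# reset_index() turns it back into a DataFrame with ordinary columns.
totals = df.groupby("location").sales.sum().reset_index()

# A custom aggregate via apply() and a lambda: the spread of sales per location.
sales_range = (
    df.groupby("location").sales
    .apply(lambda s: s.max() - s.min())
    .reset_index()
)

# Grouping by more than one column: pass a list of column names.
by_loc_day = df.groupby(["location", "day"]).sales.sum().reset_index()

# Rearranging the grouped result so each day becomes its own column.
pivoted = by_loc_day.pivot(
    index="location", columns="day", values="sales"
).reset_index()

print(totals)
print(sales_range)
print(pivoted)
```

Calling reset_index() after each step is a convenience choice here: it keeps the group keys as regular columns, so later steps can refer to them by name.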
