Skip to Content
Learn
Aggregates in R
Calculating Aggregate Functions II

Sometimes, we want to group by more than one column. We can do this by passing multiple column names as arguments to the `group_by` function.

Imagine that we run a chain of stores and have data about the number of sales at different locations on different days:

location date day_of_week total_sales
West Village February 1 W 400
West Village February 2 Th 450
Chelsea February 1 W 375
Chelsea February 2 Th 390

We suspect that sales are different at different locations on different days of the week. In order to test this hypothesis, we could calculate the average sales for each store on each day of the week across multiple months. The code would look like this:

``````df %>%
group_by(location,day_of_week) %>%
summarize(mean_total_sales = mean(total_sales))``````

And the results might look something like this:

location day_of_week mean_total_sales
Chelsea M 402.50
Chelsea Tu 422.75
Chelsea W 452.00
West Village M 390
West Village Tu 400

Instructions

1.

At ShoeFly.com, our Purchasing team thinks that certain `shoe_type`/`shoe_color` combinations are particularly popular this year (for example, blue ballet flats are all the rage in Paris).

Find the total number of shoes of each `shoe_type`/`shoe_color` combination purchased using `group_by`, `summarize()` and `n()`. Save your result to the variable `shoe_counts`, and view it.

2.

The Marketing team wants to better understand the different price levels of the kinds of shoes that have been sold on the website, in particular looking at `shoe_type`/`shoe_material` combinations.

Find the mean price of each `shoe_type`/`shoe_material` combination purchased using `group_by`, `summarize()` and `mean()`. Save your result to the variable `shoe_prices`, and view it.

Don’t forget to include `na.rm = TRUE` as an argument in the summary function that you call!

Take this course for free

Or sign up using:

Already have an account?