Intermediate Data Visualization in R
Statistical Summaries

By default, bar plots created with geom_bar() show the count of observations for each value. We can also display other summary statistics, such as the mean of a variable within each group.

Let’s say we want to see how much an animal sleeps on average by the kind of food it eats, based on the msleep dataset. The code below does just that! Recall that passing stat = "identity" to a geom_bar() layer tells ggplot2 to display values as is, rather than count the number of occurrences. We can similarly use stat = "summary", which tells ggplot2 to summarize values according to a provided function. We can specify fun = "mean" to summarize our y axis variable by calculating mean values for each value in our x axis variable.

# Load the packages that provide %>%, filter(), and ggplot()
library(dplyr)
library(ggplot2)

# Filter our data to include only hours spent asleep, omitting NA values
msleep_means_df <- msleep %>%
  filter(status == "asleep") %>%
  na.omit()

# Construct a bar plot calculating and displaying means
msleep_meanbar <- ggplot(msleep_means_df, aes(x = diet, y = hours)) +
  labs(title = "Mean Hours Asleep by Diet") +
  geom_bar(stat = "summary", fun = "mean")
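To see what stat = "summary" is doing, it can help to compare it with computing the means by hand and plotting them with stat = "identity". The sketch below is an illustration rather than the lesson's own code. It assumes the msleep dataset that ships with ggplot2, where diet is stored in the vore column and sleep time in sleep_total, so its column names differ from the reshaped data used above. It also assumes ggplot2 3.3.0 or later, where the summary function is passed through the fun argument. The names sleep_df, summary_bar, sleep_means, and identity_bar are made up for this example.

```r
library(ggplot2)  # also provides the built-in msleep dataset

# Assumption: built-in msleep, where diet is `vore` and sleep is `sleep_total`
sleep_df <- msleep[!is.na(msleep$vore), c("vore", "sleep_total")]

# Option 1: let ggplot2 compute the mean of y for each value of x
summary_bar <- ggplot(sleep_df, aes(x = vore, y = sleep_total)) +
  labs(title = "Mean Hours Asleep by Diet") +
  geom_bar(stat = "summary", fun = "mean")

# Option 2: compute the means ourselves, then display the values as is
sleep_means <- aggregate(sleep_total ~ vore, data = sleep_df, FUN = mean)
identity_bar <- ggplot(sleep_means, aes(x = vore, y = sleep_total)) +
  labs(title = "Mean Hours Asleep by Diet") +
  geom_bar(stat = "identity")

# layer_data() returns the values ggplot2 computed for each bar
layer_data(summary_bar)[, c("x", "y")]
```

Both plots should show the same bar heights. The first keeps the raw data in the plot and leaves the aggregation to ggplot2, while the second makes the aggregated table available for inspection before plotting.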

Here’s how this plot looks. In the msleep dataset, insectivores (animals that eat insects) sleep for fifteen hours a day on average, which is far more than animals with other diets!

Bar Plot: Mean Hours Asleep by Diet



We now want to calculate mean graduation rates for all students across all schools by year in our graduation_df dataset. We’ve filtered graduation_df to only retain rows where the Demographic column equals Total Cohort and the Status column equals Graduated.

Run the head() function on our new graduation_means_df data frame to examine it.


Create a bar plot using graduation_means_df named graduation_meanbar. Map Year to the x axis and Pct to the y axis. Use stat = "summary" and fun = "mean" to calculate and display mean values on the y axis.

Print the plot to see the change in mean graduation rates over time!
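The same pattern can be tried out on a tiny stand-in data frame before running it on the real data. This is only a sketch: toy_graduation_df and its values are invented for illustration and are not taken from graduation_df; only the column names Year and Pct come from the exercise. It again assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Hypothetical stand-in for graduation_means_df; the values are invented
toy_graduation_df <- data.frame(
  Year = c(2010, 2010, 2011, 2011),
  Pct  = c(60, 70, 66, 74)
)

# One bar per Year, with height equal to the mean of Pct for that Year
toy_meanbar <- ggplot(toy_graduation_df, aes(x = Year, y = Pct)) +
  geom_bar(stat = "summary", fun = "mean")

# Inspect the summarized values behind the bars, then print the plot
layer_data(toy_meanbar)[, c("x", "y")]
toy_meanbar
```

Checking the summarized values with layer_data() is a quick way to confirm that each bar represents a mean rather than a count or a sum.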
