Key Concepts

Review core concepts you need to learn to master this subject

Histograms In R

library(ggplot2)

# Creates a histogram of the Ozone feature from the dataset airquality.
# In this case, each bin will have a width of 10.
airquality_histogram_binwidth <- ggplot(airquality, aes(x = Ozone)) +
  geom_histogram(binwidth = 10)

In R, the geom_histogram() function from the ggplot2 package adds a histogram layer to a plot. The binwidth argument sets the width of each bin, in the units of the x variable.

If neither binwidth nor bins is supplied, geom_histogram() uses 30 equal-width bins by default and prints a message suggesting that a better value be chosen. Setting binwidth explicitly is recommended so that the bin size suits the range and granularity of the data.

Histograms are used to visualize the distribution of a continuous variable.
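
The difference between the default binning and an explicit choice can be sketched as follows. This is a minimal illustration, not part of the original lesson: the object names (ozone, hist_default, hist_binwidth, hist_bins) and the value bins = 15 are assumptions chosen for the example.

```r
library(ggplot2)

# airquality ships with base R. Ozone contains missing values,
# so drop them up front rather than relying on ggplot2 to remove them.
ozone <- subset(airquality, !is.na(Ozone))

# Default binning: 30 equal-width bins.
# ggplot2 prints a message suggesting a better value when this is drawn.
hist_default <- ggplot(ozone, aes(x = Ozone)) +
  geom_histogram()

# Explicit bin width: each bin spans 10 units of Ozone.
hist_binwidth <- ggplot(ozone, aes(x = Ozone)) +
  geom_histogram(binwidth = 10)

# Alternative: fix the number of bins instead of their width
# (15 is an arbitrary value for illustration).
hist_bins <- ggplot(ozone, aes(x = Ozone)) +
  geom_histogram(bins = 15)
```

Printing any of these objects (for example, typing hist_binwidth at the console) draws the plot. Comparing the three side by side is a quick way to see how much the apparent shape of a distribution depends on the bin size.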

Intermediate Data Visualization in R
Lesson 1 of 1
  1. In this lesson, we’ll explore a variety of different visualizations in R’s ggplot2 package. We’ll also go over different ways we can customize our plots to better …
  2. Histograms let us visualize the distribution of a continuous variable, in contrast to bar plots which show counts and other values for discrete and categorical variables. Histograms divide values o…
  3. Heatmaps let us visualize frequencies along two variables. A heatmap looks like a scatterplot, but uses color-coded squares rather than individual points to indicate how many cases occurred at the …
  4. Box plots, also known as box-and-whisker plots, show the distribution of data by quartiles. Box plots are useful in showing how much a variable varies across values of another variable – are most …
  5. Many times we are interested in seeing percentages within our data or how different values add up. We can do this using a stacked bar plot. Let’s turn to the msleep dataset included in ggplot2 de…
  6. Instead of stacking our fill variable in our bar graphs, we can also represent values of the variable side by side. This is known as a clustered bar plot. The plot below visualizes our msleep dat…
  7. By default, bar plots using geom_bar() show the count of observations for each value. We can also show other types of data, such as calculating and showing the mean instead. Let’s say we want to s…
  8. Often, we’ll want to show not only the mean of a value but also its standard error. This tells us how much variation there is around the mean – are most values close to the averages shown, or is t…
  9. Frequently, we’ll want to customize our axes to represent our data more clearly. For discrete variables, such as categories on the x axis of a bar plot, we may want to specify a particular order th…
  10. Similarly to discrete variables, we can add a scale_x_continuous() layer to customize continuous variables on our x axis, or a scale_y_continuous() layer to customize continuous variables on our y …
  11. Facets allow us to visualize multiple discrete variables in one plot, showing each value of the facet variable in a different section. The plot below shows our familiar hours slept by diet plot,…
  12. You’ve completed the Data Visualization in R lesson! You now know how to choose and implement different kinds of geoms in ggplot2, how to customize your plot axes, and how to visualize additional v…
