boxplot() in R: How to Make Boxplots (With Example)

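A boxplot, also known as a box-and-whisker plot, is a graphical representation of the distribution of a dataset. The box spans the first to third quartiles, the line inside it marks the median, and the whiskers extend to the most extreme points that lie within 1.5 times the interquartile range of the box (R's default). Points beyond the whiskers are drawn individually as potential outliers. In R, you can create boxplots using the boxplot() function. Here’s how to make boxplots with an example: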

Example: Creating boxplots using the boxplot() function

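# Sample data: heights of individuals (in inches) by gender
# Fix the random seed so the simulated heights are reproducible (the seed value is arbitrary)
set.seed(42)

heights <- data.frame(
  Gender = rep(c("Male", "Female"), each = 50),
  Height = c(rnorm(50, mean = 70, sd = 3), rnorm(50, mean = 65, sd = 2))
)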

# Create a basic boxplot

boxplot(heights$Height, main = "Height Distribution", ylab = "Height (inches)")

# Create a boxplot by gender

boxplot(Height ~ Gender, data = heights, main = "Height Distribution by Gender", ylab = "Height (inches)")

In the example above:

We create a sample dataset heights containing the heights of individuals by gender.
We create a basic boxplot of all heights together using boxplot(heights$Height, …), where main specifies the plot title and ylab specifies the y-axis label.
We create a more informative boxplot by gender using boxplot(Height ~ Gender, data = heights, …). This groups the data by the “Gender” column and draws a separate box for each gender. Height ~ Gender is formula notation meaning “Height grouped by the levels of Gender”. The groups appear in factor-level order, which is alphabetical here (Female, then Male).
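
To see the numbers behind the grouped plot, one option is to ask boxplot() for its summary statistics instead of a drawing. The sketch below rebuilds the same sample data (the seed value 42 and the variable name bp are illustrative choices, not part of the original example) and inspects the list that boxplot() returns when called with plot = FALSE.

```r
# Recreate the sample data; the seed value is an arbitrary choice for reproducibility
set.seed(42)
heights <- data.frame(
  Gender = rep(c("Male", "Female"), each = 50),
  Height = c(rnorm(50, mean = 70, sd = 3), rnorm(50, mean = 65, sd = 2))
)

# plot = FALSE computes the statistics without drawing anything
bp <- boxplot(Height ~ Gender, data = heights, plot = FALSE)

bp$names  # group labels, in factor-level (alphabetical) order
bp$n      # number of observations in each group
bp$stats  # one column per group: lower whisker, lower hinge, median,
          # upper hinge, upper whisker
bp$out    # values that would be drawn as individual outlier points (may be empty)
bp$group  # index of the group each outlier belongs to
```

Each column of bp$stats corresponds to one box, so comparing the third row across columns compares the group medians directly.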

You can customize a boxplot further with additional arguments: col and border set the fill and outline colors, names relabels the groups, horizontal = TRUE flips the orientation, and notch = TRUE adds notches around the medians. Boxplots provide a concise way to visualize the distribution of data and identify potential outliers, making them valuable tools in exploratory data analysis.
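
As a rough illustration of such customization, the sketch below redraws the grouped boxplot with a few extra arguments. The specific colors, the seed value, and the choice to enable notches are arbitrary assumptions made for the example; any valid R color names would work the same way.

```r
# Recreate the sample data; the seed value is an arbitrary choice for reproducibility
set.seed(42)
heights <- data.frame(
  Gender = rep(c("Male", "Female"), each = 50),
  Height = c(rnorm(50, mean = 70, sd = 3), rnorm(50, mean = 65, sd = 2))
)

boxplot(
  Height ~ Gender,
  data   = heights,
  main   = "Height Distribution by Gender",
  xlab   = "Gender",
  ylab   = "Height (inches)",
  col    = c("lightpink", "lightblue"),  # one fill color per group, in level order
  border = "gray30",                     # outline color for boxes and whiskers
  notch  = TRUE,                         # notches give a rough visual comparison of medians
  pch    = 19                            # solid circles for outlier points
)
```

Adding horizontal = TRUE to the same call would draw the boxes sideways, which can be easier to read when group labels are long.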