# R Scatterplots

Scatter plots are used to compare two variables. Such a comparison is useful when we need to see how strongly one variable is related to another. In a scatterplot, the data is represented as a collection of points, and each point gives the values of the two variables for one observation. One variable is placed on the horizontal axis and the other on the vertical axis. In R, there are two common ways of creating a scatterplot: using the plot() function and using the functions of the ggplot2 package.

The basic syntax for creating a scatterplot with the plot() function is:

`plot(x, y, main, xlab, ylab, xlim, ylim, axes)`

Here,

| S.No | Parameters | Description |
|---|---|---|
| 1. | x | The data whose values are the horizontal coordinates. |
| 2. | y | The data whose values are the vertical coordinates. |
| 3. | main | The title of the graph. |
| 4. | xlab | The label on the horizontal axis. |
| 5. | ylab | The label on the vertical axis. |
| 6. | xlim | The limits of the x values used for plotting. |
| 7. | ylim | The limits of the y values used for plotting. |
| 8. | axes | Indicates whether both axes should be drawn on the plot. |

Let’s see an example to understand how we can construct a scatterplot using the plot() function. In our example, we will use “mtcars”, a predefined dataset that ships with R.

### Example

```r
# Fetching two columns from mtcars
data <- mtcars[, c("wt", "mpg")]

# Giving a name to the chart file.
png(file = "scatterplot.png")

# Plotting the chart for cars with weight between 2.5 and 5
# and mileage between 15 and 30.
plot(x = data$wt, y = data$mpg,
     xlab = "Weight", ylab = "Mileage",
     xlim = c(2.5, 5), ylim = c(15, 30),
     main = "Weight vs Mileage")

# Saving the file.
dev.off()
```

**Output:**
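The example above leaves the axes parameter at its default. As a minimal sketch of what that parameter does, the snippet below draws the same points with `axes = FALSE` and then adds the axes by hand with the base R functions `axis()` and `box()`. The variable names `cars`, `in_window`, and `n_visible` are only illustrative, and the tick positions are an arbitrary choice.

```r
# Select the same two columns used in the example above.
cars <- mtcars[, c("wt", "mpg")]

# xlim and ylim only set the visible window; they do not filter the data.
# Count how many cars fall inside that window (illustrative check).
in_window <- cars$wt >= 2.5 & cars$wt <= 5 &
  cars$mpg >= 15 & cars$mpg <= 30
n_visible <- sum(in_window)

# Draw the points without the default axes and frame.
plot(x = cars$wt, y = cars$mpg,
     xlab = "Weight", ylab = "Mileage",
     xlim = c(2.5, 5), ylim = c(15, 30),
     main = "Weight vs Mileage", axes = FALSE)

# Add the axes and the surrounding box manually.
axis(side = 1, at = seq(2.5, 5, by = 0.5))
axis(side = 2, at = seq(15, 30, by = 5))
box()
```

Drawing the axes separately is mainly useful when the default tick positions or labels need to be customised.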

## Scatterplot using ggplot2

In R, another way to create a scatterplot is with the ggplot2 package.

The ggplot2 package provides the ggplot() and geom_point() functions for creating a scatterplot. The ggplot() function takes the data and the aesthetic mapping as input: the first argument is a data frame, and the second is a call to the aes() function, in which we specify the variables for the x-axis and y-axis. The geom_point() function then adds the points as a layer.
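To make the division of work between these functions concrete, here is a small sketch that builds the plot in two steps. The object names `base_plot` and `scatter` are only illustrative.

```r
library(ggplot2)

# ggplot() receives the data frame and the aesthetic mapping.
# On its own it has no layers, so it would draw an empty panel.
base_plot <- ggplot(mtcars, aes(x = drat, y = mpg))

# Adding geom_point() attaches a layer that draws one point per row.
scatter <- base_plot + geom_point()

# The plot is rendered only when the object is printed.
print(scatter)
```

Because the plot is an ordinary R object, it can be stored in a variable and extended later with further layers, labels, or themes.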

Let’s see how the ggplot2 package is used with the help of an example based on the familiar dataset “mtcars”.

### Example

```r
# Loading ggplot2 package
library(ggplot2)

# Giving a name to the chart file.
png(file = "scatterplot_ggplot.png")

# Plotting the chart using ggplot() and geom_point() functions.
ggplot(mtcars, aes(x = drat, y = mpg)) +
  geom_point()

# Saving the file.
dev.off()
```

**Output:**
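As an alternative to opening a device with png() and closing it with dev.off(), ggplot2 also provides the ggsave() function. The sketch below is one possible way to use it; the file name, the temporary directory, and the size and resolution values are assumptions chosen for illustration.

```r
library(ggplot2)

# Store the plot in a variable instead of printing it directly.
scatter <- ggplot(mtcars, aes(x = drat, y = mpg)) +
  geom_point()

# Hypothetical output path in the session's temporary directory.
out_file <- file.path(tempdir(), "scatterplot_ggsave.png")

# ggsave() picks the image format from the file extension.
# Width and height are in inches by default.
ggsave(filename = out_file, plot = scatter,
       width = 6, height = 4, dpi = 150)
```

Passing the plot explicitly through the `plot` argument avoids relying on whichever plot was displayed last.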

We can also add more features to make scatterplots more informative and attractive. Below are some examples in which different options are added.

### Example 1: Scatterplot with groups

```r
# Loading ggplot2 package
library(ggplot2)

# Giving a name to the chart file.
png(file = "scatterplot1.png")

# Plotting the chart using ggplot() and geom_point() functions.
# The aes() function inside geom_point() colors the points by gear group.
ggplot(mtcars, aes(x = drat, y = mpg)) +
  geom_point(aes(color = factor(gear)))

# Saving the file.
dev.off()
```

**Output:**
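The call to factor() in Example 1 matters because gear is stored as a number. The following sketch, with illustrative object names, compares the two mappings: a numeric variable gets a continuous color gradient, while a factor gets one distinct color per level.

```r
library(ggplot2)

# Converting gear to a factor yields one level per distinct value.
gear_groups <- factor(mtcars$gear)
levels(gear_groups)

# Numeric mapping: ggplot2 uses a continuous color gradient.
continuous_colour <- ggplot(mtcars, aes(x = drat, y = mpg)) +
  geom_point(aes(color = gear))

# Factor mapping: ggplot2 uses a discrete palette, one color per group.
discrete_colour <- ggplot(mtcars, aes(x = drat, y = mpg)) +
  geom_point(aes(color = factor(gear)))

print(discrete_colour)
```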

### Example 2: Changes in axis

```r
# Loading ggplot2 package
library(ggplot2)

# Giving a name to the chart file.
png(file = "scatterplot2.png")

# Plotting the chart using ggplot() and geom_point() functions.
# Both variables are log-transformed inside aes(), which changes the axis values.
# The aes() function inside geom_point() colors the points by gear group.
ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear)))

# Saving the file.
dev.off()
```

**Output:**
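Example 2 transforms the data itself with log(). A related approach, sketched here as an alternative rather than as the method used above, is to leave the data untouched and transform the axes with scale_x_log10() and scale_y_log10(). Note that these use base-10 logarithms, whereas log() is the natural logarithm; the shape of the point cloud is the same, but the tick labels show the original units. The name `log_axes` is illustrative.

```r
library(ggplot2)

# Map the raw variables and let the scales apply the log transform.
log_axes <- ggplot(mtcars, aes(x = mpg, y = drat)) +
  geom_point(aes(color = factor(gear))) +
  scale_x_log10() +
  scale_y_log10()

print(log_axes)
```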

### Example 3: Scatterplot with fitted values

```r
# Loading ggplot2 package
library(ggplot2)

# Giving a name to the chart file.
png(file = "scatterplot3.png")

# Creating scatterplot with fitted values.
# stat_smooth() with method = "lm" adds a linear regression line;
# se = FALSE hides the standard-error band around it.
# linewidth needs ggplot2 >= 3.4.0; older versions use size = 1 instead.
ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear))) +
  stat_smooth(method = "lm", col = "#C42126", se = FALSE, linewidth = 1)

# Saving the file.
dev.off()
```

**Output:**
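The red line in Example 3 is a least-squares fit of the y variable on the x variable. As a rough sketch, the same kind of model can be fitted directly with base R's lm() to inspect its coefficients; the name `fit` is illustrative.

```r
# Fit the same log-log relationship that the smoothing line shows.
fit <- lm(log(drat) ~ log(mpg), data = mtcars)

# Intercept and slope of the fitted line.
coef(fit)

# Share of the variance in log(drat) explained by log(mpg).
summary(fit)$r.squared
```

The slope from coef() corresponds to the steepness of the line drawn in the plot.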

## Adding information to the graph

### Example 4: Adding title

```r
# Loading ggplot2 package
library(ggplot2)

# Giving a name to the chart file.
png(file = "scatterplot4.png")

# Creating scatterplot with fitted values.
# stat_smooth() with method = "lm" adds a linear regression line;
# se = FALSE hides the standard-error band around it.
# linewidth needs ggplot2 >= 3.4.0; older versions use size = 1 instead.
new_graph <- ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear))) +
  stat_smooth(method = "lm", col = "#C42126", se = FALSE, linewidth = 1)

# Adding a title
new_graph +
  labs(title = "Scatterplot with more information")

# Saving the file.
dev.off()
```

**Output:**

### Example 5: Adding title with dynamic name

```r
# Loading ggplot2 package
library(ggplot2)

# Giving a name to the chart file.
png(file = "scatterplot5.png")

# Creating scatterplot with fitted values.
# stat_smooth() with method = "lm" adds a linear regression line;
# se = FALSE hides the standard-error band around it.
# linewidth needs ggplot2 >= 3.4.0; older versions use size = 1 instead.
new_graph <- ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear))) +
  stat_smooth(method = "lm", col = "#C42126", se = FALSE, linewidth = 1)

# Finding mean of mpg
mean_mpg <- mean(mtcars$mpg)

# Adding title with dynamic name
new_graph +
  labs(title = paste("Adding additional information. Average mpg is", mean_mpg))

# Saving the file.
dev.off()
```

**Output:**
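With paste(), the mean in Example 5 is inserted into the title with all of its decimal places. One possible refinement, shown here as a sketch with the illustrative name `title_text`, is to format the number with sprintf() before passing it to labs().

```r
library(ggplot2)

# Compute the mean and format it with two decimal places.
mean_mpg <- mean(mtcars$mpg)
title_text <- sprintf("Average mpg is %.2f", mean_mpg)
title_text  # "Average mpg is 20.09"

# Use the formatted string as the plot title.
titled_plot <- ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear))) +
  labs(title = title_text)

print(titled_plot)
```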

### Example 6: Adding a sub-title

```r
# Loading ggplot2 package
library(ggplot2)

# Giving a name to the chart file.
png(file = "scatterplot6.png")

# Creating scatterplot with fitted values.
# stat_smooth() with method = "lm" adds a linear regression line;
# se = FALSE hides the standard-error band around it.
# linewidth needs ggplot2 >= 3.4.0; older versions use size = 1 instead.
new_graph <- ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear))) +
  stat_smooth(method = "lm", col = "#C42126", se = FALSE, linewidth = 1)

# Adding a title, a subtitle, and a caption
new_graph +
  labs(
    title = "Relation between miles per gallon and drat",
    subtitle = "Relationship broken down by gear class",
    caption = "Author's own computation"
  )

# Saving the file.
dev.off()
```

**Output:**

### Example 7: Changing name of x-axis and y-axis

```r
# Loading ggplot2 package
library(ggplot2)

# Giving a name to the chart file.
png(file = "scatterplot7.png")

# Creating scatterplot with fitted values.
# stat_smooth() with method = "lm" adds a linear regression line;
# se = FALSE hides the standard-error band around it.
# linewidth needs ggplot2 >= 3.4.0; older versions use size = 1 instead.
new_graph <- ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear))) +
  stat_smooth(method = "lm", col = "#C42126", se = FALSE, linewidth = 1)

# Renaming the axes and the legend; x holds log(mpg) and y holds log(drat).
new_graph +
  labs(
    x = "Miles per gallon",
    y = "Rear axle ratio (drat)",
    color = "Gear",
    title = "Relation between miles per gallon and drat",
    subtitle = "Relationship broken down by gear class",
    caption = "Author's own computation"
  )

# Saving the file.
dev.off()
```

**Output:**

### Example 8: Adding theme

```r
# Loading ggplot2 package
library(ggplot2)

# Giving a name to the chart file.
png(file = "scatterplot8.png")

# Creating scatterplot with fitted values.
# stat_smooth() with method = "lm" adds a linear regression line;
# se = FALSE hides the standard-error band around it.
# linewidth needs ggplot2 >= 3.4.0; older versions use size = 1 instead.
new_graph <- ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear))) +
  stat_smooth(method = "lm", col = "#C42126", se = FALSE, linewidth = 1)

# Applying the dark theme and labelling the plot;
# x holds log(mpg) and y holds log(drat).
new_graph +
  theme_dark() +
  labs(
    x = "Miles per gallon, in log",
    y = "Rear axle ratio (drat), in log",
    color = "Gear",
    title = "Relation between miles per gallon and drat",
    subtitle = "Relationship broken down by gear class",
    caption = "Author's own computation"
  )

# Saving the file.
dev.off()
```

**Output:**
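theme_dark() is only one of the complete themes that ship with ggplot2. The sketch below applies a few others to the same plot for comparison and then adjusts a single element with theme(); the list name `themed` and the choice of themes are illustrative.

```r
library(ggplot2)

# Base plot shared by all themed variants.
base_plot <- ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear)))

# A few of the built-in complete themes.
themed <- list(
  minimal = base_plot + theme_minimal(),
  bw      = base_plot + theme_bw(),
  classic = base_plot + theme_classic()
)

# Individual elements can be adjusted on top of a complete theme,
# here moving the legend below the panel.
print(themed$minimal + theme(legend.position = "bottom"))
```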