R Scatterplots

In this guide, we will discuss R Scatterplots.

Scatterplots are used to examine the relationship between two variables, for example to see how much one variable changes as another changes. In a scatterplot, the data is represented as a collection of points, and each point gives the values of the two variables for one observation. One variable is plotted on the horizontal axis and the other on the vertical axis. In R, there are two common ways of creating a scatterplot: using the base plot() function, and using functions from the ggplot2 package.

The plot() function has the following syntax for creating a scatterplot:

plot(x, y, main, xlab, ylab, xlim, ylim, axes)  

Here,

| S.No | Parameter | Description |
|------|-----------|-------------|
| 1. | x | The dataset whose values are the horizontal coordinates. |
| 2. | y | The dataset whose values are the vertical coordinates. |
| 3. | main | The title of the graph. |
| 4. | xlab | The label on the horizontal axis. |
| 5. | ylab | The label on the vertical axis. |
| 6. | xlim | The limits of the x values used for plotting. |
| 7. | ylim | The limits of the y values used for plotting. |
| 8. | axes | Indicates whether both axes should be drawn on the plot. |

Let’s see an example to understand how we can construct a scatterplot using the plot function. In our example, we will use the dataset “mtcars”, which is the predefined dataset available in the R environment.

Example

# Fetching two columns from mtcars
data <- mtcars[, c("wt", "mpg")]
# Giving a name to the chart file.
png(file = "scatterplot.png")
# Plotting the chart with the weight axis limited to 2.5 to 5 and the mileage axis limited to 15 to 30.
plot(x = data$wt, y = data$mpg, xlab = "Weight", ylab = "Mileage", xlim = c(2.5, 5), ylim = c(15, 30), main = "Weight vs Mileage")
# Saving the file.  
dev.off()  

Output

[Figure: scatterplot of mileage against weight, saved as scatterplot.png]
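The axes parameter from the table is not used in the example above. As a minimal sketch, assuming we want to draw the axes ourselves, the plot can be created with axes = FALSE and the axes added afterwards with the base R functions axis() and box(). The tick positions and the file name scatterplot_axes.png are illustrative choices, not part of the original example.

```r
# Select the same two columns as in the example above
data <- mtcars[, c("wt", "mpg")]

png(file = "scatterplot_axes.png")  # hypothetical file name

# axes = FALSE suppresses both axes and the surrounding box
plot(x = data$wt, y = data$mpg,
     xlab = "Weight", ylab = "Mileage",
     xlim = c(2.5, 5), ylim = c(15, 30),
     main = "Weight vs Mileage",
     axes = FALSE)

# Draw the axes manually with custom tick positions (illustrative values)
axis(side = 1, at = seq(2.5, 5, by = 0.5))  # horizontal axis
axis(side = 2, at = seq(15, 30, by = 5))    # vertical axis
box()                                       # frame around the plot region

dev.off()
```

Leaving axes at its default value of TRUE gives the automatically chosen ticks seen in the first example.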

Scatterplot using ggplot2

Another way of creating a scatterplot in R is with the ggplot2 package.

The ggplot2 package provides the ggplot() and geom_point() functions for creating a scatterplot. The ggplot() function takes the data frame as its first argument and an aes() mapping as its second, in which we specify the variables for the x-axis and y-axis. The geom_point() function then draws one point for each row of the data.

Let’s start understanding how the ggplot2 package is used with the help of an example where we have used the familiar dataset “mtcars”.

Example

#Loading ggplot2 package  
library(ggplot2)  
# Giving a name to the chart file.  
png(file = "scatterplot_ggplot.png")  
# Plotting the chart using ggplot() and geom_point() functions.  
ggplot(mtcars, aes(x = drat, y = mpg)) + geom_point()
# Saving the file.  
dev.off()  

Output

[Figure: scatterplot of mpg against drat, saved as scatterplot_ggplot.png]
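As an alternative to the png() and dev.off() pair, ggplot2 also provides ggsave(), which writes a plot object to a file. The following is a minimal sketch, assuming the plot is stored in a variable first; the variable name p, the file name, and the dimensions are illustrative choices.

```r
library(ggplot2)

# Store the plot in a variable instead of printing it directly
p <- ggplot(mtcars, aes(x = drat, y = mpg)) +
  geom_point()

# ggsave() picks the file type from the extension;
# width and height are in inches by default
ggsave("scatterplot_ggsave.png", plot = p, width = 6, height = 4)
```

With ggsave() there is no graphics device to open or close by hand, which can be convenient when saving several plots in one script.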

We can also add more features to make scatterplots more informative and attractive. Below are some examples in which different parameters are added.

Example 1: Scatterplot with groups

#Loading ggplot2 package  
library(ggplot2)  
# Giving a name to the chart file.  
png(file = "scatterplot1.png")  
# Plotting the chart using ggplot() and geom_point() functions.  
# The aes() call inside geom_point() maps the point color to the gear group.
ggplot(mtcars, aes(x = drat, y = mpg)) +
  geom_point(aes(color = factor(gear)))
# Saving the file.  
dev.off()  

Output:

[Figure: scatterplot of mpg against drat with points colored by gear, saved as scatterplot1.png]

Example 2: Changes in axis

#Loading ggplot2 package  
library(ggplot2)  
# Giving a name to the chart file.  
png(file = "scatterplot2.png")  
# Plotting the chart using ggplot() and geom_point() functions.
# Both variables are log-transformed inside aes(), which changes the values shown on the axes.
# The aes() call inside geom_point() maps the point color to the gear group.
ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear)))
# Saving the file.  
dev.off()  

Output

[Figure: scatterplot of log(drat) against log(mpg) with points colored by gear, saved as scatterplot2.png]
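Example 2 transforms the data inside aes(), so the axes show logged values. A possible alternative, sketched here under that assumption, is to leave the variables untransformed and change the axis scales with ggplot2's scale_x_log10() and scale_y_log10(), so that the tick labels stay in the original units. These scales use base-10 logarithms whereas log() is the natural logarithm, which rescales the axes but does not change the shape of the point cloud. The variable name p and the file name are illustrative.

```r
library(ggplot2)

# Untransformed variables in aes(); the log transform is applied by the scales
p <- ggplot(mtcars, aes(x = mpg, y = drat)) +
  geom_point(aes(color = factor(gear))) +
  scale_x_log10() +
  scale_y_log10()

png(file = "scatterplot_log_scale.png")  # hypothetical file name
print(p)  # explicit print() also works when the script is run via source()
dev.off()
```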

Example 3: Scatterplot with fitted values

#Loading ggplot2 package  
library(ggplot2)  
# Giving a name to the chart file.  
png(file = "scatterplot3.png")  
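# Creating scatterplot with fitted values.
# An additional function, stat_smooth(), is used to add a linear regression line.
# In ggplot2 3.4.0 and later, line width is set with linewidth = 1; size = 1 still works but gives a deprecation warning.
ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear))) +
  stat_smooth(method = "lm", col = "#C42126", se = FALSE, size = 1)
# In the above example, "lm" selects linear regression, and se = FALSE hides the standard-error band around the line.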
# Saving the file.  
dev.off()  

Output:

[Figure: the log-log scatterplot with a fitted regression line, saved as scatterplot3.png]
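stat_smooth(method = "lm") fits a linear model behind the scenes but only draws the resulting line. To see the numbers behind that line, the same regression can be fitted directly with the base R lm() function. This is a small sketch; the variable name fit is an illustrative choice.

```r
# Fit the same model that stat_smooth(method = "lm") draws:
# log(drat) as a linear function of log(mpg)
fit <- lm(log(drat) ~ log(mpg), data = mtcars)

# Intercept and slope of the fitted line
coef(fit)

# Fitted values, one per car; these lie on the regression line
head(fitted(fit))
```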

Adding information to the graph

Example 4: Adding title

#Loading ggplot2 package  
library(ggplot2)  
# Giving a name to the chart file.  
png(file = "scatterplot4.png")  
# Creating scatterplot with fitted values.
# An additional function, stat_smooth(), is used to add a linear regression line.
new_graph <- ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear))) +
  stat_smooth(method = "lm", col = "#C42126", se = FALSE, size = 1)
# In the above example, "lm" selects linear regression, and se = FALSE hides the standard-error band.
# Adding a title with labs().
new_graph +
  labs(
    title = "Scatterplot with more information"
  )
# Saving the file.  
dev.off()  

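Output:

[Figure: the scatterplot with fitted line and the title "Scatterplot with more information", saved as scatterplot4.png]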

Example 5: Adding title with dynamic name

#Loading ggplot2 package  
library(ggplot2)  
# Giving a name to the chart file.  
png(file = "scatterplot5.png")  
# Creating scatterplot with fitted values.
# An additional function, stat_smooth(), is used to add a linear regression line.
new_graph <- ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear))) +
  stat_smooth(method = "lm", col = "#C42126", se = FALSE, size = 1)
# In the above example, "lm" selects linear regression, and se = FALSE hides the standard-error band.
# Finding mean of mpg
mean_mpg <- mean(mtcars$mpg)
# Adding title with dynamic name
new_graph +
  labs(
    title = paste("Adding additional information. Average mpg is", mean_mpg)
  )
# Saving the file.  
dev.off()  

Output:

[Figure: the scatterplot with fitted line and a title that includes the average mpg, saved as scatterplot5.png]
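Because paste() inserts the mean with all of its decimal places, the dynamic title can become long. One possible refinement, shown here as a sketch, is to round the value before building the string. The variable name title_text is an illustrative choice.

```r
# Mean of mpg across all cars in mtcars
mean_mpg <- mean(mtcars$mpg)

# Round to two decimal places before pasting it into the title text
title_text <- paste("Average mpg is", round(mean_mpg, 2))
title_text
# "Average mpg is 20.09"
```

The resulting string can be passed to labs(title = title_text) in the same way as in Example 5.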

Example 6: Adding a sub-title

#Loading ggplot2 package  
library(ggplot2)  
# Giving a name to the chart file.  
png(file = "scatterplot6.png")  
# Creating scatterplot with fitted values.
# An additional function, stat_smooth(), is used to add a linear regression line.
new_graph <- ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear))) +
  stat_smooth(method = "lm", col = "#C42126", se = FALSE, size = 1)
# In the above example, "lm" selects linear regression, and se = FALSE hides the standard-error band.
# Adding a title, a subtitle, and a caption
new_graph +
  labs(
    title = "Relation between miles per gallon and drat",
    subtitle = "Relationship broken down by gear class",
    caption = "Author's own computation"
  )
# Saving the file.  
dev.off()  

Output:

[Figure: the scatterplot with fitted line, title, subtitle, and caption, saved as scatterplot6.png]

Example 7: Changing name of x-axis and y-axis

#Loading ggplot2 package  
library(ggplot2)
# Giving a name to the chart file.  
png(file = "scatterplot7.png")  
# Creating scatterplot with fitted values.
# An additional function, stat_smooth(), is used to add a linear regression line.
new_graph <- ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear))) +
  stat_smooth(method = "lm", col = "#C42126", se = FALSE, size = 1)
# In the above example, "lm" selects linear regression, and se = FALSE hides the standard-error band.
# Changing the names of the x-axis, y-axis, and legend
# (x is log(mpg) and y is log(drat), so the labels follow that order)
new_graph +
  labs(
    x = "Miles per gallon, in log",
    y = "Rear axle ratio (drat), in log",
    color = "Gear",
    title = "Relation between miles per gallon and drat",
    subtitle = "Relationship broken down by gear class",
    caption = "Author's own computation"
  )
# Saving the file.  
dev.off()  

Output:

[Figure: the scatterplot with renamed x-axis, y-axis, and legend, saved as scatterplot7.png]

Example 8: Adding theme

#Loading ggplot2 package  
library(ggplot2)
# Giving a name to the chart file.  
png(file = "scatterplot8.png")  
# Creating scatterplot with fitted values.
# An additional function, stat_smooth(), is used to add a linear regression line.
new_graph <- ggplot(mtcars, aes(x = log(mpg), y = log(drat))) +
  geom_point(aes(color = factor(gear))) +
  stat_smooth(method = "lm", col = "#C42126", se = FALSE, size = 1)
# In the above example, "lm" selects linear regression, and se = FALSE hides the standard-error band.
# Adding a dark theme along with the labels
# (x is log(mpg) and y is log(drat), so the labels follow that order)
new_graph +
  theme_dark() +
  labs(
    x = "Miles per gallon, in log",
    y = "Rear axle ratio (drat), in log",
    color = "Gear",
    title = "Relation between miles per gallon and drat",
    subtitle = "Relationship broken down by gear class",
    caption = "Author's own computation"
  )
# Saving the file.  
dev.off()  

Output:

[Figure: the labeled scatterplot drawn with the dark theme, saved as scatterplot8.png]
