Scatter plots with a fixed response variable

4

Suppose I am interested in the iris dataset, which ships with R:

head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

I would like to fix one column of this dataset as my response variable and draw scatter plots of that column against each of the other columns in iris. For example, if I fix Petal.Length, I would like to see the following scatter plots, made with the ggplot2 package:

  • Petal.Length and Sepal.Length

  • Petal.Length and Sepal.Width

  • Petal.Length and Petal.Width

There is no need to distinguish between the different Species. I know how to do this manually, as follows:

library(ggplot2)
library(gridExtra)

g1 <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point()
g2 <- ggplot(iris, aes(x = Sepal.Width, y = Petal.Length)) +
  geom_point()
g3 <- ggplot(iris, aes(x = Petal.Width, y = Petal.Length)) +
  geom_point()

grid.arrange(g1, g2, g3, ncol=3)


However, I would like an automated way to do this, especially for datasets with more than 3 predictor variables.

How should I proceed?

2 answers

3

My approach was to take the variable names and pass them to ggplot as strings, indexing the data frame with double brackets [[.

colunas <- names(iris)
resposta <- colunas[1] # choose the response variable (here Sepal.Length, the first column)
colunas <- colunas[-c(1,5)] # drop the response and the species

graficos <- lapply(colunas, function(explicativa, df, resposta) {
  ggplot(df, aes(x = df[[explicativa]] , y = df[[resposta]])) +
    geom_point()
}, df = iris, resposta = resposta)

grid.arrange(grobs = graficos, ncol = length(graficos))
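A variation on the same idea, offered as a sketch rather than as part of the original answer: ggplot2 (version 3.0.0 or later) provides the .data pronoun, which lets aes() look up a column by its name as a string without referring to the data frame from inside aes(). The helper name scatter_against and its arguments are hypothetical, chosen only for illustration, and the sketch assumes every numeric column other than the response should be plotted.

```r
library(ggplot2)
library(gridExtra)

# Hypothetical helper: one scatter plot per numeric predictor
# against a fixed response column.
scatter_against <- function(df, response) {
  # Keep numeric columns only (this drops Species in iris),
  # then remove the response itself from the predictors.
  numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  predictors <- setdiff(numeric_cols, response)

  lapply(predictors, function(predictor) {
    ggplot(df, aes(x = .data[[predictor]], y = .data[[response]])) +
      geom_point() +
      labs(x = predictor, y = response)  # set axis labels explicitly
  })
}

plots <- scatter_against(iris, "Petal.Length")
grid.arrange(grobs = plots, ncol = length(plots))
```

Setting the labels with labs() keeps each panel identifiable by its column name, and selecting predictors by type avoids hard-coding column positions such as -c(1,5).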

Edited

Another possible solution is to build the code as a string and pass it to parse() and then eval(). It is important that the string be passed to parse() through the named argument text, because the first positional argument of parse() is file. Thus:

graficos <- lapply(colunas, function(explicativa, df, resposta) {
  codigo <- sprintf("ggplot(df, aes(x = %s, y = %s)) + geom_point()",
                    explicativa, resposta)    
  eval(parse(text = codigo))
}, df = iris, resposta = resposta)
  • Simple and elegant answer. Excellent.

3


One solution is to first reshape your data.frame into long format and then pass it to ggplot2 directly:

# load packages
library(reshape2)
library(ggplot2)

# reshape the data to long format
iris_long <- melt(iris, id.vars = c("Petal.Length", "Species"))

# plot with ggplot2
ggplot(iris_long, aes(y = Petal.Length, x = value)) + 
  geom_point() + facet_wrap(~variable)


By default, facet_wrap uses the same scales for all facets, but you can change this as you like. For example, facets with free scales:

ggplot(iris_long, aes(y = Petal.Length, x = value)) + 
  geom_point() + facet_wrap(~variable, scales = "free")
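Since reshape2 is retired and its authors recommend tidyr instead, the same reshaping can be sketched with tidyr::pivot_longer(). This is an alternative under that assumption, not part of the original answer; the column names variable and value are chosen here to mirror the defaults of melt().

```r
library(tidyr)
library(ggplot2)

# Reshape to long format: every column except the response
# and Species becomes a (variable, value) pair.
iris_long <- pivot_longer(
  iris,
  cols = -c(Petal.Length, Species),
  names_to = "variable",
  values_to = "value"
)

# One facet per predictor; only the x scale is freed, because
# the y axis is the same response variable in every facet.
ggplot(iris_long, aes(x = value, y = Petal.Length)) +
  geom_point() +
  facet_wrap(~variable, scales = "free_x")
```

Note that pivot_longer() stores variable as a character column, so the facets appear in alphabetical order rather than in the original column order.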

