Plot multiple columns at the same time

I need to plot all columns of a table relative to a specific column.

Dataset:

df <- read.table(
text = "c1 c2 c3  x
        2  4   5  0
        3  5   2  0
        6  7   8  0
        1  2   5  1
        2  5   6  1
        3  3   3  1", header = TRUE)

I need to plot the values of columns c1, c2 and c3 against the values of column x, so that I can see how they correlate with the values 0 and 1.

At the moment I'm trying the following, but the result is not what I need:

library(tidyverse)
df %>% gather("id", "value", 1:3) %>%
  ggplot(., aes(x, id, color = x))+
  geom_point()

1 answer

The key here is to understand the output of the gather function. Let's look at it in detail below.

df <- read.table(
    text = "c1 c2 c3  x
        2  4   5  0
        3  5   2  0
        6  7   8  0
        1  2   5  1
        2  5   6  1
        3  3   3  1", header = TRUE)

library(tidyverse)
df %>% gather("id", "value", 1:3)
#>    x id value
#> 1  0 c1     2
#> 2  0 c1     3
#> 3  0 c1     6
#> 4  1 c1     1
#> 5  1 c1     2
#> 6  1 c1     3
#> 7  0 c2     4
#> 8  0 c2     5
#> 9  0 c2     7
#> 10 1 c2     2
#> 11 1 c2     5
#> 12 1 c2     3
#> 13 0 c3     5
#> 14 0 c3     2
#> 15 0 c3     8
#> 16 1 c3     5
#> 17 1 c3     6
#> 18 1 c3     3
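
As a side note, the tidyr documentation marks gather as superseded by pivot_longer. The sketch below is only an illustration of how the same long format could be produced with it; the object name long is my own choice and not part of the original code. The rows come out in a different order than with gather (each original row is expanded in turn), which makes no difference for the plots.

```r
library(tidyr)

df <- read.table(
    text = "c1 c2 c3  x
        2  4   5  0
        3  5   2  0
        6  7   8  0
        1  2   5  1
        2  5   6  1
        3  3   3  1", header = TRUE)

# Reshape c1, c2 and c3 into two columns:
# "id" holds the original column name, "value" holds the measurement.
# `long` is a hypothetical name used only for this illustration.
long <- pivot_longer(df, cols = c1:c3, names_to = "id", values_to = "value")

print(long)
```

The result has the same three columns (x, id and value) and 18 rows, so it can be piped into the ggplot calls used in this answer in place of the gather step.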

Note that there are three columns. We will plot x against value, where value holds the values of c1, c2 and c3 from the original data set and id identifies which original column each value came from in the long format. This gives the following chart:

df %>% gather("id", "value", 1:3) %>%
    ggplot(., aes(x = x, y = value, colour = id)) +
    geom_point()

Note that this chart does not reveal much about the correlation of c1, c2 and c3 with x, because x takes only two values. One way to address this is to fit a curve for each value of id. In this case, I opted for a simple linear regression, and the visualization improved a little.

df %>% gather("id", "value", 1:3) %>%
    ggplot(., aes(x = x, y = value, colour = id)) +
    geom_point() +
    geom_smooth(method = "lm", se = FALSE)
#> `geom_smooth()` using formula 'y ~ x'
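
If numbers are wanted alongside the fitted lines, the sketch below shows one possible way to get them in base R. It is an addition for illustration, not part of the original approach, and the names value_cols, cors, group_means, slope_c1 and mean_diff_c1 are my own. Because x takes only the values 0 and 1, the slope of each fitted line is simply the difference between the mean of the column at x = 1 and its mean at x = 0; the example checks this for c1.

```r
df <- read.table(
    text = "c1 c2 c3  x
        2  4   5  0
        3  5   2  0
        6  7   8  0
        1  2   5  1
        2  5   6  1
        3  3   3  1", header = TRUE)

# Hypothetical helper names, used only in this illustration.
value_cols <- c("c1", "c2", "c3")

# Pearson correlation of each column with the 0/1 variable x.
cors <- sapply(df[value_cols], function(col) cor(col, df$x))
print(cors)

# Mean of each column within each level of x.
group_means <- aggregate(cbind(c1, c2, c3) ~ x, data = df, FUN = mean)
print(group_means)

# For a 0/1 predictor, the regression slope equals the difference in group means.
slope_c1 <- coef(lm(c1 ~ x, data = df))[["x"]]
mean_diff_c1 <- mean(df$c1[df$x == 1]) - mean(df$c1[df$x == 0])
print(c(slope = slope_c1, mean_difference = mean_diff_c1))
```

With only three observations per group, these values are descriptive at best and should not be read as evidence of a real relationship.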

Another way would be to split id into panels, to at least separate the groups of points better and make trends easier to see.


df %>% gather("id", "value", 1:3) %>%
    ggplot(., aes(x = x, y = value)) +
    geom_point() +
    facet_wrap(~ id)

Created on 2021-07-28 by the reprex package (v2.0.0)
