How to transform a species presence/absence spreadsheet into columns

Viewed 649 times

0

I have a basic question about spreadsheets, but it is holding up my work.

I have a presence/absence table with 714 sites and 4406 species. I need to transform it into a table with 3 columns: one with the site name, one with the species name (repeating all 4406 species for each site), and one with 0 (species absent from the site) or 1 (species present at the site).

How can I do that?

  • 2

    Welcome to Stack Overflow! Unfortunately, this question cannot be reproduced by anyone trying to answer it. Please take a look at this link to see how to ask a reproducible question in R, so that people who wish to help can do so in the best possible way.

  • The melt (reshape2 package) and gather (tidyr package) functions do what you need. If you update your question with a sample of your data, SO users can respond more easily and in more detail. You can run this line: dput(seus.dados[1:20,1:20]) and copy and paste the console output. That way any user will have a sample of the first 20 rows and 20 columns of your data frame.

1 answer

2

See if that’s what you need. Since you didn’t post a sample of your data, here’s a fictitious presence and absence matrix to use as an example:

matriz.pa <- data.frame(
  Sitio = LETTERS[1:4], 
  A_arturica = sample(c(0,1), 4, replace=TRUE),
  A_beliniae = sample(c(0,1), 4, replace=TRUE),
  B_carmensis = sample(c(0,1), 4, replace=TRUE) )

> matriz.pa
  Sitio A_arturica A_beliniae B_carmensis
1     A          1          1           1
2     B          0          1           1
3     C          1          1           0
4     D          0          1           0

Using the melt function (reshape or reshape2 package):

> reshape::melt(matriz.pa, variable_name = 'Especie')
Using Sitio as id variables
   Sitio     Especie value
1      A  A_arturica     1
2      B  A_arturica     0
3      C  A_arturica     1
4      D  A_arturica     0
5      A  A_beliniae     1
6      B  A_beliniae     1
7      C  A_beliniae     1
8      D  A_beliniae     1
9      A B_carmensis     1
10     B B_carmensis     1
11     C B_carmensis     0
12     D B_carmensis     0

The reshape2 package contains enhanced versions of the functions in the reshape package, but what you need is simple. You can also use the gather function from the tidyr package, which does the same thing (with fewer options, although again, what you need is simple).
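As a rough sketch of the tidyr route mentioned above: the example rebuilds the fictitious matrix with fixed values (so the result does not depend on sample()), and the names longo, longo2 and Presenca are only illustrative, not part of the original data. It assumes tidyr 1.0.0 or later is installed, since pivot_longer() is the newer interface that supersedes gather().

```r
library(tidyr)

# Same layout as the fictitious matrix above, with fixed values for reproducibility
matriz.pa <- data.frame(
  Sitio = LETTERS[1:4],
  A_arturica = c(1, 0, 1, 0),
  A_beliniae = c(1, 1, 1, 1),
  B_carmensis = c(1, 1, 0, 0)
)

# gather(): key names the new species column, value names the new 0/1 column;
# -Sitio excludes the site column, so it is kept as the identifier
longo <- gather(matriz.pa, key = "Especie", value = "Presenca", -Sitio)

# pivot_longer(): newer tidyr interface for the same reshaping
longo2 <- pivot_longer(matriz.pa, cols = -Sitio,
                       names_to = "Especie", values_to = "Presenca")

dim(longo)   # 4 sites x 3 species = 12 rows, 3 columns
```

Both calls return one row per site and species combination. The row order differs: gather() stacks the data species by species, while pivot_longer() goes site by site. With the real table, the same calls should give 714 x 4406 = 3,145,884 rows.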

  • To avoid the message Using Sitio as id variables, you can pass the argument id.vars = "Sitio".

  • Yes! The id.vars option is also important if you have more than one non-value column, although the melt function generally does a good job of detecting the id and data columns.
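To illustrate the id.vars suggestion from the comments, here is a minimal sketch using reshape2 rather than reshape. Note that reshape2 spells the naming arguments variable.name and value.name, whereas the reshape package uses variable_name. The data values are fixed for reproducibility, and longo and Presenca are illustrative names only.

```r
library(reshape2)

# Fictitious presence/absence matrix with fixed values
matriz.pa <- data.frame(
  Sitio = LETTERS[1:4],
  A_arturica = c(1, 0, 1, 0),
  A_beliniae = c(1, 1, 1, 1),
  B_carmensis = c(1, 1, 0, 0)
)

# Declaring the id column explicitly means melt() does not have to guess it,
# so the "Using Sitio as id variables" message is not printed
longo <- melt(matriz.pa, id.vars = "Sitio",
              variable.name = "Especie", value.name = "Presenca")

head(longo)
```

If the real table had further non-value columns (for example coordinates), they could be listed in the same vector, such as id.vars = c("Sitio", "Lat", "Long"), where Lat and Long are hypothetical column names.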
