Extract attributes from a dataframe using the points of a second dataframe

Greetings.

I’m trying to look up biomes in a data frame (biomes_df) that has columns for temperature (mat) and precipitation (map). I want to use these values to find out which biome each temperature/precipitation point in another data frame (sites) falls into. Note that the rows of biomes_df trace the extent of each biome, so the values form a continuum rather than a set of exact values to match.

I thought about using the match and sp::over functions, but I wasn’t very successful. Any suggestions would be appreciated, because I don’t have much reference material for solving this.

biomes_df <- data.frame(
  mat = c(
    29.339, 13.971, 15.371, 17.510, 24.131, 27.074, 28.915, 29.201, 29.339,
    13.971, -9.706, -7.572, 4.491, 17.510, 15.371, 13.971, 17.510, 4.491,
    -7.572, -9.706, -6.687, -0.949, 3.098, 7.147, 10.165, 13.918, 18.626,
    18.176, 17.510, 18.626, 13.918, 10.165,  7.147, 3.098, -0.949, 1.039,
    1.998, 2.444, 3.118, 4.446, 7.758, 12.614, 18.720, 18.637, 18.626, -0.949,
    -6.687, -4.395, -4.098, -1.592, 0.914, 4.155, 3.118, 2.444, 1.998, 1.039,
    -0.949, 18.720,  12.614, 7.758, 4.446, 3.118, 4.155, 15.716, 20.136,
    19.392, 18.720, 18.720, 19.392, 20.136, 22.278, 23.756, 24.199, 24.714,
    25.667, 26.105, 27.414, 27.772, 25.709, 21.736, 18.720, 17.510, 18.176,
    18.626, 18.637, 18.720, 21.736, 25.709, 27.772, 28.418, 28.915, 27.074,
    24.131, 17.510, -6.687, -8.896, -9.706, -13.382, -15.366, -15.217, -8.373,
    -4.098, -1.592, -4.098, -4.395, -6.687
  ),
  map = c(
    21.3, 23.0, 174.6, 535.1, 702.9, 847.9, 992.4, 532.1, 21.3, 23.0, 7.3,
    87.2, 314.6, 535.1, 174.6, 23.0, 535.1, 314.6, 87.2, 7.3, 202.6, 391.7,
    529.9, 783.1, 956.9, 1116.5, 1269.3, 794.3, 535.1, 1269.3, 1116.5, 956.9,
    783.1, 529.9, 391.7, 514.8, 673.4, 968.5, 1630.6, 1839.7, 2028.0, 2224.0,
    2355.7, 1837.6, 1269.3, 391.7, 202.6, 922.9, 1074.1, 1405.9, 1744.9,
    2012.3, 1630.6, 968.5, 673.4, 514.8, 391.7, 2355.7, 2224.0, 2028.0,
    1839.7, 1630.6, 2012.3, 2930.1, 3377.7, 2917.0, 2355.7, 2355.7, 2917.0,
    3377.7, 3896.5, 4343.1, 4415.2, 4429.8, 4279.0, 4113.7, 3344.4, 2790.6,
    2574.0, 2414.3, 2355.7, 535.1, 794.3, 1269.3, 1837.6, 2355.7, 2414.3,
    2574.0, 2790.6, 1920.3, 992.4, 847.9, 702.9, 535.1, 202.6, 50.8, 7.3,
    34.8, 98.8, 170.8, 533.0, 1074.1, 1405.9, 1074.1, 922.9, 202.6
  ),
  biome = c(
    rep('Subtropical desert', 9), rep('Temperate grassland/desert', 7),
    rep('Woodland/shrubland', 13), rep('Temperate forest', 16),
    rep('Boreal forest', 12), rep('Temperate rain forest', 10),
    rep('Tropical rain forest', 14), rep('Tropical seasonal forest/savanna', 13),
    rep('Tundra', 12)
  )
)

sites <- data.frame(site = c("a", "b"), temp = c(-1.2, 27),
                    prec = c(144.6, 207))
  • Hello, could you post what you tried and the starting tables? You posted only the final data frame, and without the initial data or the steps it is difficult to give a specific answer.

  • Hello @Jorgemendes. None of my attempts produced any output, so I did not post them. But I have updated the question with a reproducible data frame as well.

  • Are map and mat in biomes_df precipitation and temperature, respectively?

  • That’s right! Each biome has more than one value because its rows form a polygon within which the biome can occur (I tried converting them to a SpatialPolygonsDataFrame to use the over function, but I was unsuccessful).

1 answer

If I understand correctly, I think this is what you want: use sp to build a polygon for each biome and check which polygon each point falls in.

I used the vignette "Map overlay and Spatial Aggregation in sp" as a guide. It’s not optimized because I don’t know the package well, but it seems to work.

library(sp)

biomes_df <- data.frame(
  mat = c(
    29.339, 13.971, 15.371, 17.510, 24.131, 27.074, 28.915, 29.201, 29.339,
    13.971, -9.706, -7.572, 4.491, 17.510, 15.371, 13.971, 17.510, 4.491,
    -7.572, -9.706, -6.687, -0.949, 3.098, 7.147, 10.165, 13.918, 18.626,
    18.176, 17.510, 18.626, 13.918, 10.165,  7.147, 3.098, -0.949, 1.039,
    1.998, 2.444, 3.118, 4.446, 7.758, 12.614, 18.720, 18.637, 18.626, -0.949,
    -6.687, -4.395, -4.098, -1.592, 0.914, 4.155, 3.118, 2.444, 1.998, 1.039,
    -0.949, 18.720,  12.614, 7.758, 4.446, 3.118, 4.155, 15.716, 20.136,
    19.392, 18.720, 18.720, 19.392, 20.136, 22.278, 23.756, 24.199, 24.714,
    25.667, 26.105, 27.414, 27.772, 25.709, 21.736, 18.720, 17.510, 18.176,
    18.626, 18.637, 18.720, 21.736, 25.709, 27.772, 28.418, 28.915, 27.074,
    24.131, 17.510, -6.687, -8.896, -9.706, -13.382, -15.366, -15.217, -8.373,
    -4.098, -1.592, -4.098, -4.395, -6.687
  ),
  map = c(
    21.3, 23.0, 174.6, 535.1, 702.9, 847.9, 992.4, 532.1, 21.3, 23.0, 7.3,
    87.2, 314.6, 535.1, 174.6, 23.0, 535.1, 314.6, 87.2, 7.3, 202.6, 391.7,
    529.9, 783.1, 956.9, 1116.5, 1269.3, 794.3, 535.1, 1269.3, 1116.5, 956.9,
    783.1, 529.9, 391.7, 514.8, 673.4, 968.5, 1630.6, 1839.7, 2028.0, 2224.0,
    2355.7, 1837.6, 1269.3, 391.7, 202.6, 922.9, 1074.1, 1405.9, 1744.9,
    2012.3, 1630.6, 968.5, 673.4, 514.8, 391.7, 2355.7, 2224.0, 2028.0,
    1839.7, 1630.6, 2012.3, 2930.1, 3377.7, 2917.0, 2355.7, 2355.7, 2917.0,
    3377.7, 3896.5, 4343.1, 4415.2, 4429.8, 4279.0, 4113.7, 3344.4, 2790.6,
    2574.0, 2414.3, 2355.7, 535.1, 794.3, 1269.3, 1837.6, 2355.7, 2414.3,
    2574.0, 2790.6, 1920.3, 992.4, 847.9, 702.9, 535.1, 202.6, 50.8, 7.3,
    34.8, 98.8, 170.8, 533.0, 1074.1, 1405.9, 1074.1, 922.9, 202.6
  ),
  biome = c(
    rep('Subtropical desert', 9), rep('Temperate grassland/desert', 7),
    rep('Woodland/shrubland', 13), rep('Temperate forest', 16),
    rep('Boreal forest', 12), rep('Temperate rain forest', 10),
    rep('Tropical rain forest', 14), rep('Tropical seasonal forest/savanna', 13),
    rep('Tundra', 12)
  )
)

sites <- data.frame(site = c("a", "b"), temp = c(-1.2, 27),
                    prec = c(144.6, 207))

# One data frame of polygon vertices per biome
# (split() orders the list alphabetically by biome name)
biomes_list <- split(biomes_df[, c("mat", "map")], biomes_df$biome)

# Build one polygon per biome, for all nine biomes rather than only the first five
biomes_poly <- SpatialPolygons(lapply(names(biomes_list), function(id) {
  Polygons(list(Polygon(as.matrix(biomes_list[[id]]))), ID = id)
}))

# Site coordinates in the same order as the polygon vertices: temperature, precipitation
sites_points <- sites[, c("temp", "prec")]
rownames(sites_points) <- as.character(sites$site)
sites_points <- SpatialPoints(sites_points)

# Index of the polygon that contains each site (NA if the site falls in none)
sites_categories <- over(sites_points, biomes_poly)

sites["categories"] <- names(biomes_list)[sites_categories]

sites
#>   site temp  prec                 categories
#> 1    a -1.2 144.6 Temperate grassland/desert
#> 2    b 27.0 207.0         Subtropical desert

Created on 2020-07-02 by the reprex package (v0.3.0)
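
For readers who want to see what the polygon lookup does without installing sp, here is a minimal base R sketch of the same idea using the even-odd (ray casting) rule. This is not how sp implements over; it only illustrates the technique. The helper names point_in_ring and classify_site are hypothetical, and only two of the nine biomes are copied from the question to keep it short. It assumes the vertices of each biome are listed in ring order, as they are in biomes_df.

```r
# Two of the nine biomes, copied from biomes_df in the question.
rings_df <- data.frame(
  mat = c(29.339, 13.971, 15.371, 17.510, 24.131, 27.074, 28.915, 29.201, 29.339,
          13.971, -9.706, -7.572, 4.491, 17.510, 15.371, 13.971),
  map = c(21.3, 23.0, 174.6, 535.1, 702.9, 847.9, 992.4, 532.1, 21.3,
          23.0, 7.3, 87.2, 314.6, 535.1, 174.6, 23.0),
  biome = c(rep("Subtropical desert", 9), rep("Temperate grassland/desert", 7))
)
sites <- data.frame(site = c("a", "b"), temp = c(-1.2, 27),
                    prec = c(144.6, 207))

# One data frame of vertices per biome
rings <- split(rings_df[, c("mat", "map")], rings_df$biome)

# Hypothetical helper: even-odd (ray casting) point-in-polygon test.
# px, py: the point; vx, vy: polygon vertices in ring order.
point_in_ring <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    # Does the edge from vertex j to vertex i straddle the line y = py?
    if ((vy[i] > py) != (vy[j] > py)) {
      # x coordinate at which the edge crosses that line
      x_cross <- vx[j] + (py - vy[j]) * (vx[i] - vx[j]) / (vy[i] - vy[j])
      # Count only crossings to the right of the point
      if (px < x_cross) inside <- !inside
    }
    j <- i
  }
  inside
}

# Hypothetical helper: name of the first biome whose ring contains the point,
# or NA if no ring contains it.
classify_site <- function(temp, prec, rings) {
  hit <- vapply(rings, function(r) point_in_ring(temp, prec, r$mat, r$map),
                logical(1))
  if (any(hit)) names(rings)[which(hit)[1]] else NA_character_
}

sites$biome <- unname(mapply(classify_site, sites$temp, sites$prec,
                             MoreArgs = list(rings = rings)))
print(sites)
```

For the two sites in the question this should give the same biomes as the sp output above. Points that lie exactly on a shared edge are not handled specially here, so for real work a tested package such as sp is the safer choice.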

  • It was exactly what I needed. I had come across this vignette, but I don’t have enough experience to adapt it to my case. Thank you.

  • Glad it helped! If you can, please accept the answer here too :)
