I’m having a problem iterating over a dataset.
The main idea was to split the main dataset into several, one per team, using a for loop, and then use a second for loop to go through the rows of each subset and create a column with the goals that team scored. I tried the following, which did not work:
library(readr)
library(magrittr)
library(dplyr)
library(lubridate)
library(zoo)
fut_uk <- read.csv("https://raw.githubusercontent.com/jokecamp/FootballData/master/football-data.co.uk/england/2020-2021%20(Until%20Jan%2022)/Premier.csv") %>%
  select(Data = Date, HomeTeam, AwayTeam, FTR, FTHG, FTAG, FTR, HTHG, HTAG, HTR, HS, AS, HST, AST, HC, AC) %>%
  mutate(Data = as.Date(Data, format = "%d/%m/%y"))
for (time in unique(fut_uk$HomeTeam)) {
  assign(time, fut_uk %>%
    filter(HomeTeam == time | AwayTeam == time) %>%
    mutate(Gols = FTHG * NA))
  for (i in seq_along(time[,2])) {
    if (time[i,2] == time) {
      time[i,16] <- time[i, 5]
    } else {
      time[i,16] <- time[i, 6]
    }
  }
}
This returns the following error:
Error in time[, 2] : incorrect number of dimensions
Does anyone know how to resolve this? Thank you very much!
The for loop steps the variable time through the given vector. On each iteration the variable is reassigned and holds a single one-dimensional item (one team name). So there is no way to use seq_along on, or index a second dimension with, time[,2]. That is the error. – Daniel Ikenaga
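A minimal sketch of the point above, assuming a toy vector of team names in place of unique(fut_uk$HomeTeam) (the vector teams and the object result are made up for illustration). It shows that the loop variable has no dimensions, so two-subscript indexing fails:

```r
teams <- c("Fulham", "Arsenal")  # toy stand-in for unique(fut_uk$HomeTeam)

for (time in teams) {
  # Inside the loop, time is one team name: a length-1 character vector
  print(class(time))
  print(dim(time))  # NULL: a plain vector has no dimensions
}

# Indexing with two subscripts therefore fails; tryCatch captures the error
result <- tryCatch(time[, 2], error = function(e) e)
print(inherits(result, "error"))
```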
To complement @Danielikenaga’s comment: for evaluates the iterator variable only once, at the beginning of each iteration, so you cannot change its value within the loop. It is also quite confusing to reuse the same name for the result of the pipe in assign. Choose another name and the loop should run. – Rui Barradas
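One way this suggestion could look, as a sketch rather than the commenter's own code. It assumes a small made-up data frame in place of the downloaded fut_uk, uses base R instead of the dplyr pipe, and introduces the hypothetical names games and team_games so that the loop variable time stays a plain team name:

```r
# Toy stand-in for fut_uk (values are invented for illustration)
fut_uk <- data.frame(
  HomeTeam = c("Fulham", "Arsenal", "Leeds"),
  AwayTeam = c("Arsenal", "Leeds", "Fulham"),
  FTHG = c(0L, 2L, 4L),
  FTAG = c(3L, 1L, 3L),
  stringsAsFactors = FALSE
)

team_games <- list()  # hypothetical container: one data frame per team

for (time in unique(fut_uk$HomeTeam)) {
  # Store the subset under a different name than the loop variable
  games <- fut_uk[fut_uk$HomeTeam == time | fut_uk$AwayTeam == time, ]
  games$Gols <- NA_real_

  for (i in seq_len(nrow(games))) {
    if (games$HomeTeam[i] == time) {
      games$Gols[i] <- games$FTHG[i]  # team played at home
    } else {
      games$Gols[i] <- games$FTAG[i]  # team played away
    }
  }

  team_games[[time]] <- games
}
```

Collecting the results in a named list instead of calling assign keeps them in one place, so a single team can be looked up as team_games[["Fulham"]].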
Moreover, FTHG * NA always gives NA. Why multiply? Gols = NA_real_ is easier and gives the same result. – Rui Barradas
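A quick check of this remark, assuming a made-up integer vector in place of the FTHG column (the names a and b are illustrative only):

```r
FTHG <- c(2L, 0L, 3L)  # toy stand-in for the goals column

a <- FTHG * NA                     # multiplying by NA yields NA everywhere
b <- rep(NA_real_, length(FTHG))   # what Gols = NA_real_ recycles to in mutate()

print(all(is.na(a)))
print(all(is.na(b)))
```

Both vectors are entirely NA. The one small difference is the storage type: with an integer FTHG the product stays integer, while NA_real_ is a double, which normally does not matter once real values are assigned.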
I don’t understand what you mean by "create a column with the goals the team made". That column already exists in the data: it is the column FTHG. And to separate by team, just do time_list <- split(fut_uk, fut_uk$HomeTeam). – Rui Barradas
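A sketch of the split approach, again on a made-up data frame rather than the real download. Note that splitting on HomeTeam keeps only each team's home matches, where FTHG is indeed that team's goals. If away matches are also wanted, as in the question's filter, one possible variant (an assumption about the intent, not something stated in the comment) is to build the subsets with lapply and fill Gols with a vectorized ifelse. The names home_list and teams are hypothetical:

```r
# Toy stand-in for fut_uk (values are invented for illustration)
fut_uk <- data.frame(
  HomeTeam = c("Fulham", "Arsenal", "Leeds"),
  AwayTeam = c("Arsenal", "Leeds", "Fulham"),
  FTHG = c(0L, 2L, 4L),
  FTAG = c(3L, 1L, 3L),
  stringsAsFactors = FALSE
)

# The comment's suggestion: one data frame per home team
home_list <- split(fut_uk, fut_uk$HomeTeam)

# Variant covering home and away matches, without an explicit row loop
teams <- union(fut_uk$HomeTeam, fut_uk$AwayTeam)
time_list <- lapply(setNames(teams, teams), function(tm) {
  games <- fut_uk[fut_uk$HomeTeam == tm | fut_uk$AwayTeam == tm, ]
  # Home goals when the team is at home, away goals otherwise
  games$Gols <- ifelse(games$HomeTeam == tm, games$FTHG, games$FTAG)
  games
})
```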