I have a dataset in which my response variable is the date and the explanatory variable is the flow of a water source in a city.
I fitted a time series model to try to understand how reliable my data are; however, the AIC measure is not that effective.
The idea was to split my dataset into training and test sets and forecast the held-out observations, which would give me more confidence in that reliability. My data are:
Date Fonte Férrea
jan/18 160,11
fev/18 NA
mar/18 150,88
abr/18 NA
mai/18 127,52
jun/18 171,25
jul/18 111,24
ago/18 111,26
set/18 109,79
out/18 295,12
nov/18 361
dez/18 365
jan/19 118,29
fev/19 112,18
mar/19 204,4
abr/19 109,95
mai/19 122,93
jun/19 130,43
jul/19 80,33
ago/19 96,52
set/19 83,46
out/19 101,71
nov/19 58,63
dez/19 119,67
jan/20 136,61
The question is: how do I split these data into training and test sets?
The idea was to put the last 4 observations, from out/19 to jan/20, in the test set and the rest in the training set. However, I do not know how to specify those last 4 observations in the R function.
The R function I am using to generate the training and test sets is:
treino=window(basededados,end=)
teste=window(basededados,start=,end=)
Is the response variable really the date? Is your goal to predict a future date as a function of the flow? Also, take a look at this link (especially the use of the dput function) and see how to ask a reproducible question in R. That way, people who wish to help you will be able to do so in the best possible way.
– Marcus Nunes
The response variable is the date. My goal is to forecast the data. To do this, I am splitting my dataset into training and test sets. I just could not work out how to pass a start and end date to the window function when the dates are given as month/year.
– Letícia Marrara
Are you sure about this? Please explain what you mean by "to forecast the data". Is it to predict future dates, or to predict the flow value on future dates? If it is the second case, the response variable is the flow rate. Also, share the data as described in the link I posted above, to make it easier for us to help you.
– Marcus Nunes
Forecasting the data, in this case, is only meant to give a reliable indication of how well the model predicts new data; it is more a matter of reliability. My response variable is the date, because I want to know how the flow is explained over time. The difficulty here is how to split the dataset I am using into training and test sets, because my dates are in month/year form. I did not understand how to pass them to the window function so that it reads them correctly.
– Letícia Marrara
i <- seq_len(nrow(dados) - 4); train <- dados[i, ]; test <- dados[-i, ]
– Rui Barradas
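For the window approach asked about in the question, here is a minimal sketch. It assumes the flow values are stored in a monthly ts object starting in January 2018; the name vazao is made up for illustration, while basededados, treino and teste follow the question. For a monthly series, window() accepts start and end as c(year, month), so month/year dates such as out/19 become c(2019, 10).

```r
# Monthly flow values copied from the question (jan/18 to jan/20).
# Decimal commas are written as decimal points; NA marks the missing months.
# "vazao" is a hypothetical name used only for this sketch.
vazao <- c(160.11, NA, 150.88, NA, 127.52, 171.25, 111.24, 111.26, 109.79,
           295.12, 361, 365, 118.29, 112.18, 204.4, 109.95, 122.93, 130.43,
           80.33, 96.52, 83.46, 101.71, 58.63, 119.67, 136.61)

# Assumption: the series is monthly and starts in January 2018.
basededados <- ts(vazao, start = c(2018, 1), frequency = 12)

# window() takes c(year, month) for start and end on a monthly ts.
treino <- window(basededados, end = c(2019, 9))     # jan/18 to set/19
teste  <- window(basededados, start = c(2019, 10))  # out/19 to jan/20

length(treino)  # 21
length(teste)   # 4
```

Leaving end out of the second call keeps everything up to the last observation, so the test set is exactly the final 4 months. The two NA values stay in the training set, so the model fitted afterwards needs to be able to handle missing observations.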