I am studying ways to forecast a quantity that depends on other variables.
For this study I am using Seatbelts as my dataset, a time series that ships with R. It is a monthly historical series of road casualties, including fatal car accidents, from 1969 to 1984.
The dataset has eight variables, and my goal is to use the variable DriversKilled (drivers killed) to build a model that forecasts the number of drivers killed over the next 5 years.
colnames(Seatbelts)
[1] "DriversKilled" "drivers" "front" "rear" "kms"
[6] "PetrolPrice" "VanKilled" "law"
Using the linear regression function tslm with the terms trend and season (seasonality), I was able to produce the forecast successfully with the forecast function.
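library(forecast)  # provides tslm() and forecast()
# Drivers killed per month, split into training (1975-1979) and test (1980-1984) sets
mortos = window(Seatbelts[, "DriversKilled"], start = c(1975, 1), end = c(1984, 12))
treino = window(mortos, start = c(1975, 1), end = c(1979, 12))
teste = window(mortos, start = c(1980, 1), end = c(1984, 12))
# Linear regression on trend and seasonality, then a 60-month forecast
modelo_1 = tslm(treino ~ trend + season)
Prev1 = forecast(modelo_1, h = 60)
# Observed series with the forecast overlaid in red
plot(mortos)
lines(Prev1$mean, col = "red")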
As seen in the image above, the red line is the forecast from the linear regression. I want to improve it by taking other variables into account, such as drivers. I am confident this variable influences the number of deaths, and that including it in my model would improve the accuracy of my forecast.
That is where my problem begins: I have been trying to add other variables to my linear regression model, but I cannot get the forecast function to recognize them. How can I do this? Is it possible?
How can I make a forecast using a multivariate or multiple linear regression model in R?
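For reference, here is a minimal sketch of what I understand the multiple-regression version would look like. It assumes the forecast package is installed, and that the future values of drivers can be passed to forecast through its newdata argument. The names dados, modelo_2, novos and Prev2 are only illustrative, and the future values are borrowed from the test window, which would not be available in a real forecast.

```r
library(forecast)

# Same period and split as above, but keeping every column of Seatbelts
# so the regressors stay aligned with the response.
dados  <- window(Seatbelts, start = c(1975, 1), end = c(1984, 12))
treino <- window(dados, start = c(1975, 1), end = c(1979, 12))
teste  <- window(dados, start = c(1980, 1), end = c(1984, 12))

# Extra regressors are named in the formula and looked up in `data`;
# trend and season are still generated by tslm() itself.
modelo_2 <- tslm(DriversKilled ~ trend + season + drivers, data = treino)

# forecast() needs one future value of each extra regressor per forecast step.
# ASSUMPTION: the observed test-period values stand in for those future values.
novos <- data.frame(drivers = as.numeric(teste[, "drivers"]))
Prev2 <- forecast(modelo_2, newdata = novos)  # horizon = nrow(novos) = 60

# Observed series with the new forecast overlaid in blue
plot(dados[, "DriversKilled"])
lines(Prev2$mean, col = "blue")

# Error measures on the training and test sets
accuracy(Prev2, teste[, "DriversKilled"])
```

With newdata supplied, the horizon comes from the number of rows in the data frame, so h is not needed. In a genuine out-of-sample forecast the future values of drivers are unknown, so they would have to be forecast separately or set as scenario values first.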
Thank you for the answer, I will test it.
– Izak Mandrak