Connecting the points to the regression line

7

Suppose the following data frame:

Income <- structure(list(X = 1:30, Education = c(10, 10.4013377926421, 
10.8428093645485, 11.2441471571906, 11.6454849498328, 12.0869565217391, 
12.4882943143813, 12.8896321070234, 13.2909698996656, 13.7324414715719, 
14.133779264214, 14.5351170568562, 14.9765886287625, 15.3779264214047, 
15.7792642140468, 16.2207357859532, 16.6220735785953, 17.0234113712375, 
17.4648829431438, 17.866220735786, 18.2675585284281, 18.7090301003344, 
19.1103678929766, 19.5117056856187, 19.9130434782609, 20.3545150501672, 
20.7558528428094, 21.1571906354515, 21.5986622073579, 22), Income = c(26.6588387834389, 
27.3064353457772, 22.1324101716143, 21.1698405046065, 15.1926335164307, 
26.3989510407284, 17.435306578572, 25.5078852305278, 36.884594694235, 
39.666108747637, 34.3962805641312, 41.4979935356871, 44.9815748660704, 
47.039595257834, 48.2525782901863, 57.0342513373801, 51.4909192102538, 
61.3366205527288, 57.581988179306, 68.5537140185881, 64.310925303692, 
68.9590086393083, 74.6146392793647, 71.8671953042483, 76.098135379724, 
75.77521802986, 72.4860553152424, 77.3550205741877, 72.1187904524136, 
80.2605705009016)), .Names = c("X", "Education", "Income"), class = "data.frame", row.names = c(NA, 
-30L))

To plot the data with a fitted LOESS line in ggplot2, the following command is sufficient:

ggplot(Income, aes(Education, Income)) + geom_point(color="red") + geom_smooth(se=FALSE)

However, how can I connect the points to the regression line to illustrate the error term (as shown in the figure below)?

[Figure: scatter plot with each point joined to the fitted line by a segment]

Based on my question on Stack Overflow in English (SOEN).

2 answers

5


You can also use the base R plot() function:

mod <- loess(Income ~ Education, data = Income)
Income <- transform(Income, Fitted = fitted(mod))

plot(Income ~ Education, data = Income, type = "p", col = "red",
    cex = 1.25)
lines(Fitted ~ Education, data = Income, col = "blue")
with(Income, segments(Education, Income, Education, Fitted))
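Each segment drawn above runs from the observed value to the fitted value, so its signed length is the residual of that point. A minimal sketch of that check follows. It uses a small simulated data frame (`dat`, hypothetical, not the data from the question) so that it runs on its own; with the real data you would use `Income` instead.

```r
# Stand-in data (hypothetical): 30 points with a roughly linear trend plus noise.
set.seed(1)
dat <- data.frame(Education = seq(10, 22, length.out = 30))
dat$Income <- 5 * dat$Education - 30 + rnorm(30, sd = 5)

# Same steps as in the answer: fit the LOESS model and store the fitted values.
mod <- loess(Income ~ Education, data = dat)
dat$Fitted <- as.numeric(fitted(mod))

# Signed length of each segment: observed minus fitted.
dat$Residual <- dat$Income - dat$Fitted

# Compare with what residuals() reports for the same model.
same <- all.equal(dat$Residual, as.numeric(residuals(mod)))
print(same)
```

If the comparison holds, the segments are exactly the error terms the question asks to illustrate.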


3

In ggplot2 you can use geom_segment() to draw lines between the points and the values predicted by the model. But first you need to fit the model outside ggplot2 to obtain the fitted values.

Fitting the model and adding a column to the data frame:

library(ggplot2)

mod <- loess(Income ~ Education, data = Income)
Income <- transform(Income, Fitted = fitted(mod))

Adding the segments to the plot:

ggplot(Income, aes(Education, Income)) + 
  geom_point(color="red") + 
  geom_smooth(se=FALSE, method = "loess") +
  geom_segment(aes(x = Education, y = Income,
                   xend = Education, yend = Fitted))
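If a straight regression line is wanted instead of the LOESS curve, the same pattern should carry over. The sketch below is an assumption on my part, not part of the original question: it swaps loess() for lm() and passes method = "lm" to geom_smooth(). The data frame `dat` is simulated (hypothetical) so that the example is self-contained.

```r
library(ggplot2)

# Stand-in data (hypothetical), not the data from the question.
set.seed(1)
dat <- data.frame(Education = seq(10, 22, length.out = 30))
dat$Income <- 5 * dat$Education - 30 + rnorm(30, sd = 5)

# Fit the linear model outside ggplot2 and keep the fitted values.
mod_lm <- lm(Income ~ Education, data = dat)
dat$Fitted <- as.numeric(fitted(mod_lm))

# x and y are inherited from the global aes(); only the segment ends are added.
# Segments are drawn first so the points and the line sit on top of them.
p <- ggplot(dat, aes(Education, Income)) +
  geom_segment(aes(xend = Education, yend = Fitted), colour = "grey50") +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  geom_point(colour = "red")
print(p)
```

Because the fitted values and the smoother come from the same kind of model, the segments should end on the drawn line.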

