Obtaining the discriminant function constant - linear discriminant analysis [R]

Viewed 711 times

4

How can I obtain the constant of the discriminant function (or the constants, in a multiple discriminant analysis)? The dput output of the data follows, to help with answering.

structure(list(REAÇÃO = structure(c(0, 1, 0, 0, 1, 0, 1, 1, 
0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 
1, 0, 1, 1, 0, 1, 1), format.spss = "F11.0"), IDADE = structure(c(22, 
38, 36, 58, 37, 31, 32, 54, 60, 34, 45, 27, 30, 20, 30, 30, 22, 
26, 19, 18, 22, 23, 24, 50, 20, 47, 34, 31, 43, 35, 23, 34, 51, 
63, 22, 29), format.spss = "F11.0"), ESCOLARIDADE = structure(c(6, 
12, 12, 8, 12, 12, 10, 12, 8, 12, 12, 12, 8, 4, 8, 8, 12, 8, 
9, 4, 12, 6, 12, 12, 12, 12, 12, 12, 12, 8, 8, 12, 16, 12, 12, 
12), format.spss = "F11.0"), SEXO = structure(c(1, 1, 0, 0, 1, 
0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 
0, 1, 0, 1, 0, 0, 0, 1, 1, 1), format.spss = "F11.0")), .Names = c("REAÇÃO", 
"IDADE", "ESCOLARIDADE", "SEXO"), row.names = c(NA, -36L), class = "data.frame")

Here REAÇÃO is the dependent variable and the others are the independent variables.

Since this is a simple discriminant analysis (two groups), the model has a single discriminant function and therefore a single constant. For reference, this constant has the value -4.438. Ideally, I would like to obtain it with a simple function call.

  • 1

    Hello Giovani, posting only the analysis output does not help much in understanding what you need; the question becomes abstract. Try to be more specific, with a sample of your data and the analysis you performed. Use the command dput(dados) to capture a reproducible sample of your data and edit the question.

  • Hello, Fernandes. This post helps explain my problem (getting the constants in a discriminant analysis). However, the answerer did not provide the answer as a script. See: https://stats.stackexchange.com/questions/166942/why-are-discriminant-analysis-results-in-r-lda-and-spss-different-constant-t

  • So, as the answerer there mentioned, the MASS package does not report this constant, but he suggests a way to get it through a mathematical formula; you just need to understand the concept and implement it in R.

  • @Fernandes, I edited the question. If you can help me, I would be grateful.

  • Unfortunately I won’t be able to help you, at least not now, but take a look at the PROC DISCRIM procedure in SAS; the concept for obtaining the constant is the same, and you can use the PDF as an aid.

2 answers

4


Searching SO, I ended up finding this topic, in which the constant is calculated from the mathematical formula. The code below returns the value 4.437946, which differs from the value you gave only in sign.

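library(MASS)
# dados is the data frame from the dput in the question
fit <- lda(REAÇÃO ~ ., data = dados)
fit # show results
plot(fit)
# prior-weighted mean of the predictors across the groups
groupmean <- fit$prior %*% fit$means
# projection of that mean onto the discriminant coefficients
constant <- groupmean %*% fit$scaling
constant # print the value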

I may be mistaken, but judging by the structure you provided and the answers in the topic I mentioned, you obtained that constant with SPSS, and the reference level it uses for the variable REAÇÃO is the reverse of the one R uses. If you want the values to match:

# make the response a factor and use its second level as the reference
dados$REAÇÃO <- as.factor(dados$REAÇÃO)
dados2 <- within(dados, REAÇÃO <- relevel(REAÇÃO, ref = 2))
fit2 <- lda(REAÇÃO ~ ., data = dados2)
fit2 # show results
plot(fit2)
groupmean2 <- fit2$prior %*% fit2$means
constant2 <- groupmean2 %*% fit2$scaling
constant2 # print the value

  • 2

    I didn’t even notice that it was you who wrote the Cross Validated answer I based mine on. From the difference in sign (the reference level of the response), I concluded that in SPSS everything would be reversed, as in the second part of my example. As for whether the sign will always be reversed, I can’t say, since I am not an SPSS user. I suggest you test with other datasets and check, although even then you cannot be certain.

  • According to the discriminant function formula, the centroid for group 0 is: z=(-4.438)+(-1.9832)+(4.6226)+(0.6609)=-1.1377. This is based on the book I have. If I invert the reference category (following the second part of your example), the centroid value is completely different (it gives -7.7383).
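
    For reference, the centroid calculation in the comment above has the usual form of a discriminant score evaluated at a group's means. A sketch, with symbols of my own choosing (c for the constant, b_j for the unstandardized coefficients, \bar{x}_{0j} for the mean of predictor j in group 0); the three product terms are taken as given from the comment, and only the sum is checked:

```latex
\bar{z}_0 = c + \sum_{j=1}^{3} b_j \,\bar{x}_{0j}
          = (-4.438) + (-1.9832) + 4.6226 + 0.6609
          = -1.1377
```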

2

After performing the discriminant analysis with the package MASS, you can get the constant. The expression is:

groupmean <- model$prior %*% model$means
constant <- groupmean %*% model$scaling
constant

Here model is the fitted discriminant model. For example:

model<-lda(y~x1+x2+xn,data=mydata)
model

Just pay attention to the sign of the constant.
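
On the sign, here is a minimal sketch. It assumes (from my reading of predict.lda in MASS, not from anything confirmed in this thread) that the scores are computed by centring the predictors on the prior-weighted mean before multiplying by the coefficients. Under that assumption the constant would be minus the product groupmean %*% scaling, which may be where the sign difference against the -4.438 in the question comes from. The data are the question's, with the response renamed to the ASCII name REACAO (my change) so the snippet runs in any locale; centre, constant, scores_manual and scores_predict are illustrative names.

```r
library(MASS)

# Data from the question; REACAO is an ASCII rename of the response (assumption
# made only for portability, the values are unchanged)
dados <- data.frame(
  REACAO = c(0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1,
             1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1),
  IDADE = c(22, 38, 36, 58, 37, 31, 32, 54, 60, 34, 45, 27, 30, 20, 30, 30, 22, 26,
            19, 18, 22, 23, 24, 50, 20, 47, 34, 31, 43, 35, 23, 34, 51, 63, 22, 29),
  ESCOLARIDADE = c(6, 12, 12, 8, 12, 12, 10, 12, 8, 12, 12, 12, 8, 4, 8, 8, 12, 8,
                   9, 4, 12, 6, 12, 12, 12, 12, 12, 12, 12, 8, 8, 12, 16, 12, 12, 12),
  SEXO = c(1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1,
           0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1)
)

fit <- lda(REACAO ~ ., data = dados)

# Prior-weighted mean of each predictor across the two groups
centre <- colSums(fit$prior * fit$means)

# Candidate constant: minus the projection of that mean on the coefficients
constant <- -sum(centre * fit$scaling[, 1])
constant

# Compare constant + X %*% coefficients with the scores predict() returns
X <- as.matrix(dados[, rownames(fit$scaling)])
scores_manual <- constant + drop(X %*% fit$scaling[, 1])
scores_predict <- drop(predict(fit)$x)
all.equal(unname(scores_manual), unname(scores_predict))
```

If the last line prints TRUE, then constant plus the coefficient-weighted predictors reproduces the scores from predict(fit), so this is the constant the fitted function effectively uses.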

Any other (useful) answer that complements this one will receive the bounty.
