2
I am trying to compare two distributions, but when I apply Ks.test for both, I only get the value of’D' and p-value coincidentally gives the same value for both, '< 2.2e-16'. I had the idea of removing the zero values to see the result, and Ks.test presented all the values properly. But unfortunately, for this analysis, I have to leave also the values equal to zero.
Has anyone ever had this problem? Or any idea how to proceed? I need to have some value for p-value, in order to accept or reject the null hypothesis.
My data is extensive, so I had not put here. Below:
d<-c(4.1,3.7,11.1,15.0,5.1,12.3,0.1,0.2,0.0,0.4,0.0,23.2,0.0,0.0,13.2,0.0,0.0,0.0,0.0,18.6,3.3,0.2,4.2,0.1,0.0,0.7,11.6,1.0,28.9,0.0,0.0,0.0,2.3,10.5,9.7,1.7,0.0,0.5,0.0,1.9,16.7,26.4,9.2,1.2,1.4,9.0,35.3,8.6,0.6,0.0,0.0,0.1,0.5,2.9,27.2,0.0,0.0,0.0,0.0,15.4,0.0,0.0,5.3,1.3,2.1,0.3,22.1,0.0,0.0,5.7,4.2,68.5,1.7,8.7,0.0,9.6,0.0,15.6,0.0,1.9,14.8,0.1,2.4,0.0,0.0,1.1,22.0,1.8,39.4,0.0,0.1,29.5,14.0,0.0,4.5,0.0,37.2,0.0,0.0,21.6,0.0,21.6,1.3,24.5,1.9,1.8,14.1,12.1,0.0,0.1,0.0,0.0,0.2,15.4,1.2,0.4,0.0,0.0,0.0,0.0,0.1,18.9,0.2,0.7,0.8,0.6,17.2,0.0,0.0,0.1,0.1,0.0,0.0,0.1,0.0,0.7,21.2,35.7,0.0,0.0,.8,1.7,10.4,0.0,4.9,0.0,0.9,0.6,6.2,2.2,0.0,0.7,7.6,0.1,1.8,29.4,5.4,0.0,0.0,0.0,0.1,34.4,0.6,11.2,0.0,0.6,1.7,0.3,0.0,8.4,2.6,0.2,27.6,2.6,0.4,0.0,18.5,0.0,25.5,0.9,0.0,0.0,0.2,0.1,0.1,0.0,1.1,0.0,0.0,0.0,0.0,0.1,0.3,0.0,0.0,1.1,0.0,0.9,0.8,1.2,2.6,0.0,6.6,0.0,0.8,15.1,2.6,2.1,4.0,2.2,0.0,15.5,15.0,0.1,1.9,12.8,31.6,0.0,0.0,0.0,25.9,0.0,0.0,1.3,0.0,0.3,0.0,0.0,0.1,0.0,0.1,10.9,1.3,0.0,0.0,1.8,4.4,0.0,2.1,20.2,0.0,12.5,0.1,0.0,0.7,0.0,4.0,46.8,27.1,0.0,0.0,0.0,16.9,0.0,23.7,29.8,0.0,0.0,5.5,0.0,23.8,0.0,0.1,4.4,0.1,43.2,15.4,9.5,0.9,0.0,1.2,7.0,15.9,0.0,9.9,3.5,12.0,0.0,0.5,0.0,0.1,1.1,2.6,0.1,0.0,0.0,0.0,0.0,1.4,18.4,4.5,5.2,4.1,4.3,0.0,3.5,0.0,0.0,0.2,0.0,0.0,2.2,0.0,0.7,0.0,0.0,0.0,14.5,3.1,0.0,0.0,0.1,5.7,0.5,0.1,0.2,0.0,0.0,6.8,0.0,0.2,18.3,0.0,0.2,0.0,0.0,2.5,40.9,4.4,0.0,0.0,0.8,1.0,4.5,0.1,0.0,0.0,0.0,0.0,0.0,0.3,0.4,11.9,0.0,0.0,0.6,12.2,0.0,0.0,0.3,9.3,9.3,1.6,6.1,0.0,19.0,0.0,0.0,0.0,1.4,0.0,0.1,0.0,8.2,5.3,0.0,0.0,3.4,0.0,0.0,0.0,24.1,0.2,15.7,0.0,0.0,12.1,4.1,5.8,13.2,1.0,64.2,0.0,0.5,10.6,0.0,7.0,4.3,0.0,0.0,16.7,29.8,49.3,57.8,4.3,1.2,0.0,0.0,0.0,0.0,6.8,10.6,3.7,2.2,0.0,0.1,5.1,0.0,0.0,1.0,4.3,0.0,43.5,5.6,0.0,7.7,0.0,0.0,18.7,0.3,0.2,0.4,0.0,0.0,23.0,0.0,0.0,0.2,9.5,0.0,5.1,6.4,0.0,28.0,0.0,0.0,3.2,0.0,0.5,1.2,2.3,42.3,0.0,0.0,1.8,0.0,0.2,5.8,30.8,3.1,2.7)
The line of reasoning was as follows:
n<-length(d[!is.na(d)])
media<-mean(d)
desvio<-sd(d)
vetor<- as.vector(d[!is.na(d)])
variancia<-var(vetor)*(n-1)/n
alfa<-(media)^2/(variancia)
beta<-(variancia)/(media)
ks.test(vetor,"pgamma",shape=alfa, scale=beta)
D = 0.3792, p-value < 2.2e-16
alternative hypothesis: two-sided
Compared to a normal:
ks.test(vetor,"pnorm",mean=media, sd=desvio)
D = 0.3002, p-value < 2.2e-16
alternative hypothesis: two-sided
I tested it because I wanted to compare it with the two distributions, Gamma and Normal. So that in the end I could compare the two p-value values and see which one fits best with my data. But the two continue to appear p-value as: < 2.2e-16
welcome to Sopt. Enjoy doing the Tour to better understand how the site works.
– João Martins
Iara, the Kolmogorov-Smirnov test (
ks.test
) compares whether a sample follows a given continuous probability distribution or whether two samples follow the same continuous distribution. It is a non-parametric test that, in order to reject or reject the null hypothesis, we have to compare the value of the statistic D with the critical values of a table that depend on the sample size and the level of significance (both not informed). Although your doubt doesn’t seem to be about languageR
, if you provide us with a piece of your data, we may be able to help you in a better way.– Rafael Cunha
The results you obtained are not equal, which are both smaller than
2.2e-16
. To see why R looks like this, see the help pagehelp(".Machine")
and print the value of.Machine$double.eps
.– Rui Barradas