Probabilities with the R pnorm function

Statistics and probability are not my area. I have a question: I think I'm doing this right, but the result looks strange. This is my series:

y=structure(c(-0.276926746036887, 5.1002303288006, -4.45902094037891, 
-3.65240790618631, -0.554369416754141, 0.554369416754141, -2.25772100857938, 
0.375468281204183, -2.05228159135609, 3.29532926592406, -1.32662877100271, 
0.988930229485896, 7.07229348341512, -1.28656226502041, 4.2549381884453, 
3.30389447729236, 7.00232043259232, 4.7748331221784, -0.27085058060754, 
2.30352233190845, -3.22100247146746, -4.8743614652375, 2.06839209389353, 
1.89690621078412, 3.19172094544754, 2.16583767731922, -0.756218240014694, 
0.19541235523014, 3.14867268321485, -0.144151053557329, 1.01160305514936, 
1.5230454220633, 1.03840992420788, 2.71968062765501, 0.861247787660713, 
6.23477534727392, 0.5745094967259, 1.93701233271648, -4.92253766463621, 
-2.73591768775421, 4.3088310528527, 0.851783504402864, 1.72544081552388, 
-0.246815828758246, 0.166106780480357, 2.17708551212288, 0.789826760142481, 
3.52664861321393, -6.1785670816505, 3.6211660479106, 2.78553748264525, 
0.89906385457475, 2.04066324333774, 3.76850202784195, -0.0156889406364658, 
5.78634965275749, 1.26423314049121, 1.78013275057423, 3.31172073861811, 
2.4038546444218, 4.66258664624009, -13.5843132751602, -0.403750718769838, 
-3.56233196110176, -1.81026841297745, -7.46631529569132, 1.42758499302842, 
-1.24995503682966, -2.24626121938432, -3.31120681505209, 2.99957221381565, 
2.30969024677898, -2.37907446465362, -1.88443844926011, -5.12583959903528, 
5.20685002435163, -3.97367473360913, 2.67851563087302, 0.235381859142514, 
2.80824193248285, 2.89610807313333, -1.9729824987724, -3.3615385580259, 
-2.10190280870005, -2.09136151458093, -3.87094266631358, -3.2751255099671, 
1.8850792597277, 1.02942443142527, 1.36158956227371, -2.59327811511363, 
0.491897726102963, -5.78257606259328, -5.87069602591836, -3.25452200548395, 
6.39760964001113, -0.247985247978266, -6.1760320126711, 0.423565923494829, 
2.17040312533964, 7.31569072572449, -0.0871727871011863, 1.621556300047, 
-0.504340241336876, -4.50860952107559, -4.4400301974362, 2.23109247118944, 
3.28749943941721, 0.344162253040281, 3.43025943921054, -0.226072675706557, 
7.21093155995787, 1.44163507753278, -7.28040722370891, 5.74286395911455, 
-2.60392371230649, 2.31880108190266, 0.508242311922885, -2.82704339382555, 
-3.94015554915489, -0.582767936456074, 2.46930245946095, 0.553845041192957, 
-2.26755057988116, -4.00552887824704, -6.38198320512942, -1.70521028736478, 
0.982781138056799, -3.69451268069177, 0.246585308476566, 0.55308591968849, 
-1.8746351610273, 2.84001033287316, 8.83299994192822, 1.49942937302144, 
1.43144209241926, 4.82628292850097, -4.02801052546, 0.267481044687123, 
-4.19615590666126, 0.543398882231305, -1.31021176393017, -5.53706210900828, 
4.21482279511858, 1.86102992601902, 1.12800218208342, -2.20317694783819, 
-2.99809865598509, -3.9927697470259, -0.862197910255613, -3.54699081750229, 
11.1743775531587, 12.9770616075187, 3.18619731487439, 0.0317784590121939, 
1.83499639512377, 4.40415488243423, -5.97321376720503, -3.93178975523039, 
0.57591946095546, 4.42156241654894, 0.537527630812085, -0.121844321824688, 
-0.61744241735453, 0.523729365789133, 0.200706255816196, 0.519958804533455, 
-1.54404705357017, 0.86950573958392, -0.0327307895505558, -1.58683494663899, 
-0.013607190665299, -2.37306791586077, 0.440301351691902, 0.344705879693602, 
-2.62103933785511, -3.61112663327796, 4.24778609707985, 0.0313351928111716, 
-0.952260731392018, -0.158275463946334, -2.18588813267492, 0.613340421034958, 
1.41402133928985, -0.250985009120286, -0.470304013118084, 2.95224973756179, 
-1.78771862889961, -0.0758449638961822, 3.24567999848993, -1.51097540057064, 
2.86670167933881, -1.4720908190912, 0.379216944245919, 1.39885736740112, 
-2.91513301487977, -0.218721729003968, -0.174077974918679, -0.358434494209331, 
-0.262627361006695, -3.91537200150442, -3.27447875910531, -1.78701946030079, 
6.70983840143415, -1.70370125158495, -0.983681568317685, 1.51971910738222, 
-0.98330489803079, -1.75605251878265, 1.58576167333393, 1.1341537292256, 
0.533282885011355, -4.06735941384394, -0.58755861516564, 2.74957162619566, 
0.622289734338011, -0.780114417335526, -1.87837155096527, 0.226072054152004, 
2.5368406372972, -2.34466303952749, 2.09320149165367, -2.48295466772607, 
-1.39236880883822, 1.54309735856075, 1.50481181337672, -0.864243789878039, 
1.0338958791585, 2.64716085607346, -0.702852086409766, 0.341588718140706, 
0.404527399709431, 1.60509437116675, -2.76525626967899, 0.749200216946883, 
-2.27946320594848, 0.19400561795847, 3.02847564751852, -1.3395241719926, 
-0.197225968421838, 2.51515239525749, -0.895573323697879, 2.29956699052384, 
3.87915217893021, -1.07749609076405, 0.977159881194761, 3.17346515886144, 
-1.85765670111033, 1.92585418625068, 2.05443904262965, -5.10898848846342, 
2.31349795923873, 1.23878810661255, 1.69870352568162, -0.814981856773006, 
0.919933007395513, 0.693588874876505, -1.23512568593857, -1.79467292197339, 
-1.23956764741749, 1.07546772749542, 1.93739237643256, -1.99990828004241, 
2.95188431410198, -0.0635606707981018, -0.852783793313855, -2.18690287354734, 
-0.410120764875582, -4.15224123307232, -2.46163901733218, 0.957897478032321, 
-1.43716915209405, 0.446453789583057, 0.494251639553211, -3.42523616802606, 
-2.17046135806602, 4.41921336813345, -0.335763862240462, -1.32083839660705, 
-2.42762992607925, -0.82634215069915, 2.69239819821484, 1.98972704914046, 
-5.23074389240228, -2.00100680366893, -1.53953028309399, -3.55446542103525, 
-2.06954910327771, -1.95713201456629, 0.946560098399563, 3.63170421214811, 
-3.24970892660951, 1.11918342199998, -0.314074601207448, 0.840158487264464, 
-0.326895061687471, -1.40089184945676, -3.89052244917483, -0.454972014583788, 
1.63056386213887, -2.09531475147559, 1.89341455741064, -1.23150989131111, 
4.77196538893804, 1.66515332371626, 1.89683518961253, -2.35330107339518, 
1.76926250072161, 0.040693200374009, 2.10220842655976, 0.771271929555351, 
-3.61206762105296, 1.58934142527959, 0.836620342328287, -4.64587274736759, 
-2.77107914576235, 1.29533561334032, -1.04463858207546, -1.79675468861598, 
1.63218869418859, -1.94848787942505, -3.14046238852407, 0.544018624569886, 
-0.127392297256357, -0.00980613276329034, -0.365473065125499, 
-1.58733491562901, 1.01879263874898, -1.33529297306464, -1.48767751576085, 
0.980797357366159, -1.10305672543544, -1.88529499316701, 1.01498530290741, 
3.58364274627491, -0.262274385874206, 0.157049880696913, 0.0456770639576165, 
0.039702233772132, 0.271537324819782, -0.567698382468329, 0.502358961664562, 
8.51230456682989, 2.14508593839724, 9.65767019796469, 5.18629487924251, 
4.8662255853938, 1.41817523625373, 0.955339050012599, -0.167170359769503, 
-3.43010456552938, -8.79805101759148, -1.79194933762561, -1.57363442942458, 
2.41735538894081, 1.91796218652891, -2.92469310269456), .Tsp = c(1980.08333333333, 
2009.75, 12), class = "ts")

I want to calculate probabilities using the cumulative distribution function of the normal distribution:

Prob(X < first quartile)

pnorm(quantile(y,.25), mean = mean(y), sd = sd(y), lower.tail=TRUE) 

Prob(X > third quartile)

1-pnorm(quantile(y,.75), mean = mean(y), sd = sd(y), lower.tail=TRUE) 

Shouldn't these two probabilities be the same, since this is a normal distribution?

Am I doing this properly?

1 answer

Yes, you’re doing everything right.

In theory, yes, these two results should be the same: if your data come from a normal distribution, the two probabilities should be equal.

(More generally, this result holds for any probability distribution that is symmetric about its mean.)
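The theoretical claim can be checked directly with `qnorm` and `pnorm`. This is only a minimal sketch: the values of `mu` and `sigma` are arbitrary choices for illustration and have nothing to do with the series in the question.

```r
# Theoretical check: for a normal distribution, the probability below the
# first quartile equals the probability above the third quartile.
mu    <- 2   # arbitrary mean (assumption, for illustration only)
sigma <- 3   # arbitrary standard deviation (assumption, for illustration only)

# True quartiles of N(mu, sigma^2)
q25 <- qnorm(0.25, mean = mu, sd = sigma)
q75 <- qnorm(0.75, mean = mu, sd = sigma)

# Lower-tail probability below q25, upper-tail probability above q75
p_lower <- pnorm(q25, mean = mu, sd = sigma)
p_upper <- pnorm(q75, mean = mu, sd = sigma, lower.tail = FALSE)

print(c(p_lower = p_lower, p_upper = p_upper))
```

Both values are 0.25 here because the true quartiles are used. With sample quartiles, as in the question, the equality only holds approximately.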

I do not know where your data come from; I imagine they are simulated or come from some real time series. Either way, there is no guarantee that the two values will be exactly equal. The sample quartiles, mean, and standard deviation are only estimates of the true ones, so the most we can expect is that both probabilities come out close to 0.25. In your case, the values were

> pnorm(quantile(y,.25), mean = mean(y), sd = sd(y), lower.tail=TRUE)
      25% 
0.2668025 
> 1-pnorm(quantile(y,.75), mean = mean(y), sd = sd(y), lower.tail=TRUE)
      75% 
0.2836341 

which are quite reasonable approximations to the theoretical value of 0.25.
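To get a feel for how much these two probabilities can differ by chance alone, one can repeat the calculation on simulated normal samples. The sketch below is hedged: the sample size `n`, the number of replications `reps`, and the seed are arbitrary assumptions, and the data are standard normal draws, not the series from the question.

```r
# Simulation sketch: how far apart are the two tail probabilities when the
# data really are normal? (n, reps and the seed are illustrative assumptions.)
set.seed(123)
n    <- 300    # hypothetical sample size
reps <- 1000   # hypothetical number of simulated samples

sim <- replicate(reps, {
  x <- rnorm(n)
  c(lower = unname(pnorm(quantile(x, 0.25), mean = mean(x), sd = sd(x))),
    upper = unname(pnorm(quantile(x, 0.75), mean = mean(x), sd = sd(x),
                         lower.tail = FALSE)))
})

# sim is a 2 x reps matrix with rows "lower" and "upper"
print(rowMeans(sim))                                # both averages should be near 0.25
print(summary(abs(sim["lower", ] - sim["upper", ])))  # typical size of the gap
```

If the gap observed in real data is much larger than the gaps seen in such a simulation, that would suggest looking at whether the data are actually close to normal.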

Take another example. Suppose I generate a random sample of size 100 from a normal distribution with mean 0 and standard deviation 1. I'll call this sample x and estimate its mean:

x <- rnorm(100, mean=0, sd=1)
mean(x)
[1] 0.0005606774

No matter how many times you repeat this experiment, the estimated mean will essentially never be exactly zero. However, the approximation we get is fairly good. In fact, it's so good that if I run a hypothesis test of whether the true mean is zero, I get

t.test(x)

    One Sample t-test

data:  x
t = 0.00624, df = 99, p-value = 0.995
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -0.1777242  0.1788455
sample estimates:
   mean of x 
0.0005606774 

Note that the p-value is very high, so we cannot reject H_0. That is, the sample is consistent with a true mean of zero, even though the sample mean itself is not exactly zero.
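As a rough illustration of this point, the experiment can be repeated many times. In the sketch below the number of replications `reps` and the seed are arbitrary assumptions. It checks whether any sample mean is exactly zero and how often `t.test` rejects H_0 at the 5% level when the true mean really is zero.

```r
# Repeat the rnorm/t.test experiment many times
# (reps and the seed are illustrative assumptions).
set.seed(42)
reps <- 2000

res <- replicate(reps, {
  x <- rnorm(100, mean = 0, sd = 1)
  c(mean = mean(x), p = t.test(x)$p.value)
})

# Is any of the sample means exactly zero?
any_exact_zero <- any(res["mean", ] == 0)

# Share of samples where H_0 (true mean = 0) is rejected at the 5% level;
# since H_0 is true here, this should be close to 0.05.
reject_rate <- mean(res["p", ] < 0.05)

print(any_exact_zero)
print(reject_rate)
```

The sample means scatter around zero without hitting it exactly, and the test rejects only in a small fraction of the samples, which is what one expects when the null hypothesis is true.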

In short, do not worry. Your logic is correct and this small difference is expected. Because of random fluctuation in the numbers, whether they come from a simulation or from a real process, the sample is not perfectly symmetric about its mean.

  • Thank you very much! I'm also seeing a difference in the first decimal place, though: one of my probabilities comes out as 0.2 and the other as 0.3. Still, I'm relieved to know that this is right, and I understood your explanation. Thank you again.
