If members of the sample are thought to have a "significantly greater" proportion with the property than the control population, we need to test the "one-sided" significance of p with respect to P. To do this we check whether p > P + 1.64*sqrt(P*(1-P)/n) to be 95% confident that p is significantly greater than P, or whether p > P + 2.32*sqrt(P*(1-P)/n) to be 99% confident that it is significantly greater.
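The check above can be sketched in a few lines of Python. This is only an illustration of the inequality as stated in the text; the function names `upper_threshold` and `significantly_greater` are made up for the example, and the multipliers 1.64 and 2.32 are the ones quoted above (the usual tabulated values are closer to 1.645 and 2.326).

```python
import math

# One-sided multipliers as quoted in the text (assumed here, not re-derived).
Z_95 = 1.64
Z_99 = 2.32

def upper_threshold(P, n, z):
    """Return P + z*sqrt(P*(1-P)/n), the proportion the sample must exceed."""
    return P + z * math.sqrt(P * (1.0 - P) / n)

def significantly_greater(p, P, n, z=Z_95):
    """True if the sample proportion p exceeds the one-sided threshold for P and n."""
    return p > upper_threshold(P, n, z)

# Hypothetical figures: control proportion 0.0021, sample of 1000 people.
print(significantly_greater(0.004, 0.0021, 1000))        # 4 cases: below the 95% threshold
print(significantly_greater(0.005, 0.0021, 1000))        # 5 cases: above the 95% threshold
print(significantly_greater(0.005, 0.0021, 1000, Z_99))  # 5 cases: below the 99% threshold
```

Passing the multiplier as an argument keeps the 95% and 99% cases in one function rather than duplicating the formula.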
One consequence of this procedure is that p can appear to be "quite a bit larger" than P and still not be evidence that it is significantly larger.
In many epidemiological studies all that can be said, if such a test fails, is that "there seems to be an increased risk of X, but it does not appear to be unambiguously related to [factors that characterise sample]". That is, pure chance can't be ruled out as being completely responsible for what appears to be a larger proportion of those in the sample having the property than of members of the control population.
In published studies (e.g. for radiation effects) we are typically dealing with P of the order of 1/1000 to 10/1000 and n in the range 1000 to 100000. Let's make a table showing the significance thresholds for samples with different P and n.
The table below shows a number of values for the proportion P of a population with a property, and a number of different sample sizes n to be compared with the control population. In each case the table shows the expected number of sample members with the property, x = n*P, and the upper limits on that number at 95% and 99% confidence, x95 and x99 (n times the corresponding thresholds on p given above). If the sample has been "exposed" to some effect, then the number in the sample with the property must exceed x95 and/or x99 to show statistical evidence that the sample has a "higher" proportion with the property than the control population.
sample size (n) | population proportion (P) | expected number in sample (x) | upper bound on expected number at 95% confidence (x95) | x95/x | upper bound on expected number at 99% confidence (x99) | x99/x |
---|---|---|---|---|---|---|
1000 | 0.0001 | 0.1 | 0.618588 | 6.18588 | 0.833612 | 8.33612 |
1000 | 0.0011 | 1.1 | 2.8191 | 2.56282 | 3.5319 | 3.21082 |
1000 | 0.0021 | 2.1 | 4.47409 | 2.13052 | 5.45847 | 2.59927 |
1000 | 0.0031 | 3.1 | 5.98304 | 1.93001 | 7.17845 | 2.31563 |
1000 | 0.0041 | 4.1 | 7.41393 | 1.80828 | 8.788 | 2.14342 |
1000 | 0.0051 | 5.1 | 8.79419 | 1.72435 | 10.3259 | 2.02469 |
1000 | 0.0061 | 6.1 | 10.1381 | 1.66199 | 11.8125 | 1.93647 |
1000 | 0.0071 | 7.1 | 11.4544 | 1.61329 | 13.2598 | 1.86758 |
1000 | 0.0081 | 8.1 | 12.7486 | 1.5739 | 14.676 | 1.81186 |
1000 | 0.0091 | 9.1 | 14.0247 | 1.54118 | 16.0666 | 1.76557 |
10000 | 0.0001 | 1 | 2.63992 | 2.63992 | 3.31988 | 3.31988 |
10000 | 0.0011 | 11 | 16.4363 | 1.49421 | 18.6903 | 1.69912 |
10000 | 0.0021 | 21 | 28.5075 | 1.3575 | 31.6204 | 1.50573 |
10000 | 0.0031 | 31 | 40.117 | 1.2941 | 43.8972 | 1.41604 |
10000 | 0.0041 | 41 | 51.4796 | 1.2556 | 55.8248 | 1.36158 |
10000 | 0.0051 | 51 | 62.682 | 1.22906 | 67.5258 | 1.32404 |
10000 | 0.0061 | 61 | 73.7697 | 1.20934 | 79.0644 | 1.29614 |
10000 | 0.0071 | 71 | 84.7697 | 1.19394 | 90.4791 | 1.27435 |
10000 | 0.0081 | 81 | 95.7001 | 1.18148 | 101.795 | 1.25673 |
10000 | 0.0091 | 91 | 106.573 | 1.17113 | 113.03 | 1.24209 |
100000 | 0.0001 | 10 | 15.1859 | 1.51859 | 17.3361 | 1.73361 |
100000 | 0.0011 | 110 | 127.191 | 1.15628 | 134.319 | 1.22108 |
100000 | 0.0021 | 210 | 233.741 | 1.11305 | 243.585 | 1.15993 |
100000 | 0.0031 | 310 | 338.83 | 1.093 | 350.784 | 1.13156 |
100000 | 0.0041 | 410 | 443.139 | 1.08083 | 456.88 | 1.11434 |
100000 | 0.0051 | 510 | 546.942 | 1.07244 | 562.259 | 1.10247 |
100000 | 0.0061 | 610 | 650.381 | 1.0662 | 667.125 | 1.09365 |
100000 | 0.0071 | 710 | 753.544 | 1.06133 | 771.598 | 1.08676 |
100000 | 0.0081 | 810 | 856.486 | 1.05739 | 875.76 | 1.08119 |
100000 | 0.0091 | 910 | 959.247 | 1.05412 | 979.666 | 1.07656 |
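As a rough sketch of how rows like these can be generated, the following Python computes x, x95 and x99 from n and P using the thresholds given earlier, scaled by n. The function name `expected_and_bounds` and the output formatting are assumptions made for this example; the multipliers 1.64 and 2.32 are the ones used in the text.

```python
import math

def expected_and_bounds(n, P, z95=1.64, z99=2.32):
    """Return (x, x95, x99) for sample size n and population proportion P."""
    x = n * P                             # expected number with the property
    sd = math.sqrt(n * P * (1.0 - P))     # n * sqrt(P*(1-P)/n)
    return x, x + z95 * sd, x + z99 * sd

# Same grid as the table: n = 1000, 10000, 100000 and P = 0.0001, 0.0011, ..., 0.0091.
for n in (1000, 10000, 100000):
    for k in range(10):
        P = 0.0001 + 0.001 * k
        x, x95, x99 = expected_and_bounds(n, P)
        print(f"{n} | {P:.4f} | {x:g} | {x95:.6g} | {x95 / x:.6g} | {x99:.6g} | {x99 / x:.6g} |")
```

The ratios x95/x and x99/x fall as n*P grows, which is the pattern the table is meant to bring out.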
The table shows us that in a not atypical case, where a property occurs rarely and the sample of "exposed" members of the population is only moderately large (in the 1000s, say), even an apparently observed "effect" involving a factor of 2 difference may not be "statistically significant". In such cases we say "there is a statistically weak relationship between whatever characterises the sample and exhibiting the property". Such is the case at present [c1996] for ELF exposure and genetic effects of ionising radiation exposure.
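To make the "factor of 2" point concrete, here is a small sketch that turns the x95 bound into the smallest whole number of cases that would pass the test. The helper `min_significant_count` is hypothetical, and the calculation relies on the same approximation and multipliers as the table.

```python
import math

def min_significant_count(n, P, z=1.64):
    """Smallest whole number of cases exceeding n*P + z*sqrt(n*P*(1-P))."""
    bound = n * P + z * math.sqrt(n * P * (1.0 - P))
    return math.floor(bound) + 1

n, P = 1000, 0.0021           # one row of the table: expected number x = 2.1
expected = n * P
needed = min_significant_count(n, P)
print(expected, needed, needed / expected)
```

For this row, doubling the expected 2.1 cases gives 4.2, which is still under the x95 bound of about 4.47; at least 5 cases would have to be observed, roughly 2.4 times the expected number.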
To bring this down to earth, consider a recent US finding about the effectiveness of AZT in preventing HIV+ mothers from passing the disease to their offspring. It has been reported in some places that the number of HIV+ babies born in the US is around 2000 per annum. It is also known that the majority of babies born to HIV+ mothers will not be HIV+, regardless of whether their mothers take AZT or not. A number of studies have shown that babies born to HIV+ mothers given the drug during their final months have a "reduced risk" of being born HIV+, by as much as 2/3 in one study (i.e. babies born to HIV+ mothers given AZT were found to be HIV+ in 1/3 the numbers of babies born to HIV+ mothers not given AZT).