A note on statistical significance

Summary: even a factor of 2 may not be "statistically significant"


To establish the "statistical significance" of the proportion p of a sample having a property, compared with the proportion P having the property in an overall or control population, we need an approximate "standard deviation of the sample proportion". This is given by sqrt(P*(1-P)/n), where n is the sample size. The control population is assumed to be very large relative to n.
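As a rough illustration, the formula can be evaluated directly. The function name and the example values of P and n are chosen here for illustration only and are not taken from any particular study.

```python
import math

def sample_proportion_sd(P, n):
    """Approximate standard deviation of a sample proportion.

    P is the proportion in the (large) control population,
    n is the sample size. Uses sqrt(P*(1-P)/n) from the text.
    """
    return math.sqrt(P * (1.0 - P) / n)

# Illustrative values: a property held by 1 in 1000, sample of 10000.
sd = sample_proportion_sd(0.001, 10000)
print(sd)
```

For these values the result is a little over 0.0003, i.e. about a third of P itself.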

If the sample is thought to have a "significantly greater" proportion with the property than the control population, we need a "one-sided" test of the sample proportion p against P. To do this we check whether p > P + 1.64*sqrt(P*(1-P)/n) for significance at the 95% confidence level, or p > P + 2.32*sqrt(P*(1-P)/n) for significance at the 99% level. (1.64 and 2.32 are approximate one-sided critical values of the normal distribution.)
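The check can be sketched in a few lines. This is a minimal sketch of the normal-approximation test described here, using the same approximate multipliers 1.64 and 2.32; the function name and the example numbers are assumptions made for illustration.

```python
import math

Z95 = 1.64  # approximate one-sided 95% multiplier, as used in the text
Z99 = 2.32  # approximate one-sided 99% multiplier, as used in the text

def is_significantly_greater(p, P, n, z=Z95):
    """Return True if sample proportion p exceeds P + z*sqrt(P*(1-P)/n)."""
    threshold = P + z * math.sqrt(P * (1.0 - P) / n)
    return p > threshold

# Illustrative values: P = 1/1000, sample of 10000.
# 15 cases observed (p = 0.0015) against 10 expected.
print(is_significantly_greater(0.0015, 0.001, 10000, Z95))
# 20 cases observed (p = 0.002) against 10 expected.
print(is_significantly_greater(0.002, 0.001, 10000, Z99))
```

In this example a 50% excess over the expected count does not pass the 95% test, while a doubling passes even the 99% test.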

One consequence of this procedure is that p can appear to be "quite a bit larger" than P and still not be evidence that it is significantly larger.

In many epidemiological studies all that can be said, if such a test fails, is that "there seems to be an increased risk of X, but it does not appear to be unambiguously related to [factors that characterise sample]". That is, pure chance can't be ruled out as being completely responsible for what appears to be a larger proportion of those in the sample having the property than of members of the control population.

In published studies (e.g. for radiation effects) we are typically dealing with P of the order of 1/1000 to 10/1000 and n in the range 1000 to 100000. Let's make a table showing the significance thresholds for different P and n.

The table below covers a number of values for the proportion P of a population with a property, and a number of different sample sizes n to be compared with the control population. In each case the table shows the expected number x of sample members with the property (x = n*P), and the upper bounds x95 and x99 on that number at 95% and 99% confidence. If the sample has been "exposed" to some effect, then the number in the sample with the property must exceed x95 and/or x99 to show statistical evidence that the sample has a "higher" proportion with the property than the control population.

n = sample size; P = pop'n prob; x = expected number in sample;
x95, x99 = upper bound on expected number at 95% and 99% confidence

     n       P     x       x95    x95/x       x99    x99/x
  1000  0.0001   0.1  0.618588  6.18588  0.833612  8.33612
  1000  0.0011   1.1    2.8191  2.56282    3.5319  3.21082
  1000  0.0021   2.1   4.47409  2.13052   5.45847  2.59927
  1000  0.0031   3.1   5.98304  1.93001   7.17845  2.31563
  1000  0.0041   4.1   7.41393  1.80828     8.788  2.14342
  1000  0.0051   5.1   8.79419  1.72435   10.3259  2.02469
  1000  0.0061   6.1   10.1381  1.66199   11.8125  1.93647
  1000  0.0071   7.1   11.4544  1.61329   13.2598  1.86758
  1000  0.0081   8.1   12.7486   1.5739    14.676  1.81186
  1000  0.0091   9.1   14.0247  1.54118   16.0666  1.76557
 10000  0.0001     1   2.63992  2.63992   3.31988  3.31988
 10000  0.0011    11   16.4363  1.49421   18.6903  1.69912
 10000  0.0021    21   28.5075   1.3575   31.6204  1.50573
 10000  0.0031    31    40.117   1.2941   43.8972  1.41604
 10000  0.0041    41   51.4796   1.2556   55.8248  1.36158
 10000  0.0051    51    62.682  1.22906   67.5258  1.32404
 10000  0.0061    61   73.7697  1.20934   79.0644  1.29614
 10000  0.0071    71   84.7697  1.19394   90.4791  1.27435
 10000  0.0081    81   95.7001  1.18148   101.795  1.25673
 10000  0.0091    91   106.573  1.17113    113.03  1.24209
100000  0.0001    10   15.1859  1.51859   17.3361  1.73361
100000  0.0011   110   127.191  1.15628   134.319  1.22108
100000  0.0021   210   233.741  1.11305   243.585  1.15993
100000  0.0031   310    338.83    1.093   350.784  1.13156
100000  0.0041   410   443.139  1.08083    456.88  1.11434
100000  0.0051   510   546.942  1.07244   562.259  1.10247
100000  0.0061   610   650.381   1.0662   667.125  1.09365
100000  0.0071   710   753.544  1.06133   771.598  1.08676
100000  0.0081   810   856.486  1.05739    875.76  1.08119
100000  0.0091   910   959.247  1.05412   979.666  1.07656
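The table values can be regenerated with a short script. This is a sketch under the assumption that the bounds are x + 1.64*sqrt(n*P*(1-P)) and x + 2.32*sqrt(n*P*(1-P)), which is the count form of the proportion test described earlier; the function and variable names are illustrative.

```python
import math

def upper_bounds(n, P, z95=1.64, z99=2.32):
    """Return (x, x95, x99): expected count and its 95%/99% upper bounds."""
    x = n * P                            # expected number with the property
    sd = math.sqrt(n * P * (1.0 - P))    # approximate sd of the count
    return x, x + z95 * sd, x + z99 * sd

rows = []
for n in (1000, 10000, 100000):
    for k in range(10):
        P = 0.0001 + 0.001 * k           # 0.0001, 0.0011, ..., 0.0091
        x, x95, x99 = upper_bounds(n, P)
        rows.append((n, P, x, x95, x95 / x, x99, x99 / x))

for n, P, x, x95, r95, x99, r99 in rows:
    print(f"{n:>6d}  {P:.4f}  {x:6.1f}  {x95:9.4f}  {r95:8.5f}  {x99:9.4f}  {r99:8.5f}")
```

The ratios x95/x and x99/x are the factors by which the observed count must exceed the expected count before the excess counts as significant.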

The table shows us that in a not-atypical case, where a property occurs rarely and the sample of "exposed" members of the population is only moderately large (in the 1000s, say), even an apparently observed "effect" involving a factor of 2 difference may not be "statistically significant". In such cases we say "there is a statistically weak relationship between whatever characterises the sample and exhibiting the property". Such is the case at present [c1996] for ELF exposure and for the genetic effects of ionising radiation exposure.
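To make the factor-of-2 point concrete, here is a small check for one row of the table (n = 1000, P = 0.0021). The observed count of twice the expected number is a hypothetical figure chosen for illustration, not data from any study.

```python
import math

def upper_bound(n, P, z):
    """Upper bound on the expected count: n*P + z*sqrt(n*P*(1-P))."""
    return n * P + z * math.sqrt(n * P * (1.0 - P))

n, P = 1000, 0.0021
expected = n * P           # 2.1 cases expected by chance
observed = 2 * expected    # hypothetical sample showing twice the expected count

x95 = upper_bound(n, P, 1.64)   # about 4.47
x99 = upper_bound(n, P, 2.32)   # about 5.46

significant_95 = observed > x95
significant_99 = observed > x99
print(expected, observed, x95, x99, significant_95, significant_99)
```

Here a doubling (4.2 cases against 2.1 expected) falls below both bounds, so it fails the test at both the 95% and the 99% level.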

To bring this down to earth, consider a recent US finding about the effectiveness of AZT in preventing HIV+ mothers passing the disease to their offspring. It has been reported in some places that the number of HIV+ babies born in the US is around 2000 per year. It is also known that the majority of babies born to HIV+ mothers will not be HIV+, regardless of whether their mothers take AZT or not. A number of studies have shown that babies born to HIV+ mothers given the drug during their final months of pregnancy have a "reduced risk" of being born HIV+, by as much as 2/3 in one study (i.e. babies born to HIV+ mothers given AZT were found to be HIV+ in 1/3 the numbers of babies born to HIV+ mothers not given AZT).

Questions

  1. Considering the size of the samples and the proportions concerned (i.e. only 2000 HIV+ babies are thought to be born in the entire US per year, whether or not their mothers received AZT and whether or not they were part of the study), is the above finding likely to be "statistically significant"?

  2. Regardless of any "statistical" significance, is the finding important?


Kym Horsell /
Kym@KymHorsell.COM
