Statistical significance

'''Statistical significance''' is a concept from [[statistics]] referring, essentially, to the [[likelihood]] that an observed result occurred merely "[[by chance]]".
 
==Surprise!==
 
Statistical significance is related to the everyday notion of "surprise". As an illustration, consider repeatedly tossing a coin. A coin is said to be "fair" if it is just as likely to come up "heads" as "tails". If one tosses a given coin, say, 20 times and sees it come up heads 10 times, then it is reasonable to assume the coin is fair, since the proportion of "heads" is one-half. If, however, the coin comes up heads all 20 times, it is reasonable to conclude that it is ''not'' a fair coin. This is intuitively obvious, but what is the conclusion based on? What if you got, say, 7 heads, or 15? What do you conclude? After all, it's actually likely that the number of heads obtained in 20 tosses would be more or less than 10, rather than ''exactly'' 10, because of the [[randomness]] involved!
 
In fact, the impression of whether the coin is fair or not is based on how ''surprising'' the results are. The farther the observed number of heads gets from 10 (half of the 20 tosses), the more surprised you would be. Somewhere between 10 and 20 heads, for example, there must be a certain point at which the assumption of fairness becomes untenable.
 
More formally, one can calculate the [[probability]] of seeing different numbers of heads in 20 tosses of a ''fair'' coin and use these probabilities to draw a conclusion about ''our'' coin. The less likely our results would be if the coin were fair, the less plausible the assumption of fairness becomes. That is, the less likely a fair coin would be to do what our coin did, the more surprising our results are (assuming our coin was fair), and so the more reasonable it is to conclude that our coin is not fair.
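
These probabilities can be computed directly from the binomial distribution. Below is a minimal Python sketch; the function name and the two-sided "at least as extreme" convention are illustrative choices, not something specified by the article.

```python
from math import comb

def two_sided_p_value(heads, tosses=20):
    """Probability that a fair coin, tossed `tosses` times, gives a
    result at least as far from tosses/2 as the observed `heads`."""
    deviation = abs(heads - tosses / 2)
    return sum(
        comb(tosses, k) * 0.5**tosses        # P(exactly k heads)
        for k in range(tosses + 1)
        if abs(k - tosses / 2) >= deviation  # at least as extreme as observed
    )

print(two_sided_p_value(10))  # 1.0: ten heads is not surprising at all
print(two_sided_p_value(15))  # about 0.041: fairly surprising
print(two_sided_p_value(20))  # about 0.0000019: astonishing for a fair coin
```

The 7-heads case mentioned above gives a probability of about 0.26, which is why seven heads alone would not justify calling the coin unfair.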
 
==A more real-world example==
 
To see how these ideas are used in real-world studies, and to tie this idea of "surprise" to statistical significance, consider an [[experiment]] designed to test whether a new drug to treat a disease is more effective than the old, standard treatment. The degree of improvement the new drug provides in our study is said to be '''statistically significant''' if it is so large that it is unlikely to have occurred by chance alone.
 
To be more specific, if one sees any improvement using the new drug over what is expected (or observed) using the standard treatment, there are two possible explanations:
# There is actually no overall benefit to the new drug (i.e., if given to the entire population of people having this disease), and the (sample) results simply occurred "by chance" because of individual differences in how people respond to medical treatments (i.e., we just "happened" to get a sample of individuals that responded better than the average member of the population).
# There ''is'' a benefit to the new drug over the old (in the population), and the (sample) results are simply reflecting this fact.
 
The larger the amount of improvement actually observed in the sample (or the larger the [[sample size]] used in the experiment), the less convincing explanation #1 becomes, and the more convincing explanation #2 becomes. Statistics gives a way of calculating the ''probability'' that explanation #1 could be true (it is technically a [[conditional probability]], ''assuming'' there is no overall benefit, and is called a ''[[wikipedia:p-value|p-value]]''). The smaller this probability is, the more likely explanation #2 is the correct one.
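
This conditional probability can also be estimated by simulation. The sketch below uses a permutation test on made-up recovery data (all numbers are hypothetical): if explanation #1 were true, the "new drug" and "standard treatment" labels would be interchangeable, so we shuffle them many times and count how often chance alone produces a difference at least as large as the one observed.

```python
import random

def permutation_p_value(treated, control, trials=10_000, seed=0):
    """Fraction of random relabelings whose difference in recovery
    rates is at least as large as the observed difference."""
    observed = sum(treated) / len(treated) - sum(control) / len(control)
    pooled = treated + control
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)                    # relabel patients at random
        fake_treated = pooled[:len(treated)]
        fake_control = pooled[len(treated):]
        diff = (sum(fake_treated) / len(fake_treated)
                - sum(fake_control) / len(fake_control))
        if diff >= observed:
            hits += 1
    return hits / trials

# Hypothetical outcomes: 1 = recovered, 0 = did not recover.
new_drug = [1] * 40 + [0] * 10   # 80% recovery with the new drug
standard = [1] * 30 + [0] * 20   # 60% recovery with the standard treatment
print(permutation_p_value(new_drug, standard))  # small: chance rarely does this
```

With these invented numbers the estimated p-value comes out well under 0.05, so explanation #1 would look untenable.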
 
Note, by the way, that just because an "effect" (the benefit of using the new drug over the old, in this case) is large enough to be statistically significant, that does not mean it is actually large in an absolute, real-world sense. For example, the new drug might give only a small benefit (one that is nonetheless significant because of the sample size used in the experiment) that is outweighed by considerations of cost or side effects.
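
The coin example makes this concrete. Using a normal approximation to the binomial (via Python's `statistics.NormalDist`; the sample sizes are illustrative), the same small effect, heads coming up 51% of the time, is invisible in a small sample yet overwhelmingly significant in a huge one:

```python
from statistics import NormalDist

def approx_p_value(heads, tosses):
    """Two-sided normal approximation to the fair-coin p-value."""
    mean = tosses / 2
    sd = (tosses * 0.25) ** 0.5
    z = abs(heads - mean) / sd
    return 2 * (1 - NormalDist().cdf(z))

print(approx_p_value(102, 200))         # about 0.78: no evidence of unfairness
print(approx_p_value(51_000, 100_000))  # on the order of 1e-10: significant
```

The effect is the same 1% tilt in both cases; only the sample size changed, which is exactly why a statistically significant result can still be practically negligible.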
 
In any case, accepting explanation #2 as correct, even though one has not actually tested the drug on every member of the population, makes the conclusion an '''[[inference]]''' and not a logical [[deduction]]. The decision could be wrong, even if the results are highly (statistically) significant. Explanation #1 could, in fact, be the truth. This is why any conclusion based on the statistical analysis of data is fundamentally subject to error (the kind of error being discussed here is usually called a "[[wikipedia:type I error|type I error]]" or "alpha error"). By carefully controlling for other sources of error (experimenter bias, data collection errors, etc.), and performing a valid statistical analysis, one can quantify the ''probability'' of making such an error — something that [[ad hoc]] (and most other [[unscientific]]) explanations cannot hope to achieve.
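
This error probability can itself be checked by simulation. In the sketch below (the 5% threshold and the 20-toss setup are illustrative), thousands of genuinely fair coins are tested and we count how often each is wrongly declared unfair; the observed rate sits slightly below the nominal 5% because the binomial distribution is discrete.

```python
import random
from math import comb

def p_value(heads, tosses=20):
    """Two-sided p-value under the fair-coin assumption."""
    deviation = abs(heads - tosses / 2)
    return sum(comb(tosses, k) * 0.5**tosses
               for k in range(tosses + 1)
               if abs(k - tosses / 2) >= deviation)

rng = random.Random(42)
experiments = 10_000
false_alarms = sum(
    p_value(sum(rng.random() < 0.5 for _ in range(20))) <= 0.05
    for _ in range(experiments)
)
print(false_alarms / experiments)  # close to 0.04: the type I error rate
```

Every one of these "significant" results is a false alarm by construction, which is the sense in which the conclusion is an inference that can fail at a known rate.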
 
{{Science}}
 
[[Category:Science]]
[[Category:Philosophy]]

Revision as of 04:59, 22 April 2011
