Statistics

From Iron Chariots Wiki
(Difference between revisions)
(expand yet again; try to tie to apologetics a bit)
(Statistical significance: wording)
 
(8 intermediate revisions by 3 users not shown)
Line 1: Line 1:
Along with [[probability]], upon which it is mostly based, '''statistics''' is a mathematical discipline which provides techniques for drawing conclusions from observed data. It is heavily relied upon in most scientific fields (experiments and observational studies), as well as in business and industry (marketing, operations management, economic analysis, quality control) and government (public polling, policy analysis).
+
{{wikipedia|color=#FDF7DE;}}
 +
Along with [[probability]], upon which it is mostly based, '''statistics''' is a mathematical discipline which provides techniques for drawing conclusions from observed data. It is heavily relied upon in most [[scientific]] fields (experiments and observational studies), as well as in business and industry (marketing, operations management, economic analysis, quality control) and government (public polling, policy analysis).
  
 
Statistics can be divided into two large subfields, ''descriptive statistics'' and ''inferential statistics'':
 
; Descriptive statistics : Numerical and graphical summaries of data — i.e., "charts and graphs". These are more in the public eye (think ''USA Today''), but they can be used by unscrupulous sorts to mislead ("lies, damned lies, and statistics").
+
; Descriptive statistics : Numerical and graphical summaries of data — i.e., "charts and graphs". These are more in the public eye (think ''[[Wikipedia:USA Today|USA Today]]''), but they can be used by unscrupulous sorts to mislead ("lies, damned lies, and statistics").
 
; Inferential statistics : Probability-based analysis used to [[infer]] something about a larger, mostly unobserved, ''population'' based on what is seen in a ''sample'' from that population. While the results of statistical analyses are often reported in the mainstream media (medical studies and the like), the details of the statistics behind them are usually left out and can often only be found in academic journals.
 
  
==Significance and inference==
+
==Statistical significance==
  
A central idea in inferential statistics that is widely misunderstood is '''[[statistical significance]]'''. In an [[experiment]] to test, say, whether a new drug to treat a disease is more effective than the old, standard treatment, the degree of improvement the new drug provides is said to be ''statistically significant'' if it is so large as to be unlikely to have occurred by chance alone.
+
A central idea in inferential statistics that is widely misunderstood is '''[[statistical significance]]'''. In an [[experiment]] to test, say, whether a new drug to treat a disease is more effective than the old, standard treatment, the degree of improvement the new drug provides is said to be ''statistically significant'' if it is so large that it would be unlikely to occur in a similarly sized sample if there were in fact no overall benefit of the drug when given to the entire population of such patients. See the [[statistical significance]] article for more information.
  
{{Science}}

To be more specific, if one sees any improvement using the new drug over what is expected (or observed) using the standard treatment, there are two possible explanations:
# There is actually no overall benefit to the new drug (i.e., if given to the entire population of people having this disease), and the (sample) results simply occurred "by chance" because of individual differences in how people respond to medical treatments (i.e., we just "happened" to get a sample of individuals that responded better than the average member of the population).
# There ''is'' a benefit to the new drug over the old (in the population), and the (sample) results are simply reflecting this fact.

The larger the improvement actually observed in the sample (or, for a given improvement, the larger the [[sample size]] used in the experiment), the less convincing explanation #1 becomes, and the more convincing explanation #2 becomes. Statistics gives a way of calculating the ''probability'' of seeing an improvement at least as large as the one observed if explanation #1 were true. This is technically a [[conditional probability]], computed ''assuming'' there is no overall benefit, and is called a ''p-value''. It is not the probability that explanation #1 is true. The smaller the p-value, the harder the results are to reconcile with explanation #1, and the stronger the evidence for explanation #2.
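As a rough illustration of how such a p-value can be computed, here is a minimal sketch in Python. It assumes a hypothetical trial in which each patient either improves or does not, and it uses an exact one-sided test based on the hypergeometric distribution (the idea behind Fisher's exact test). The function name and the patient counts are invented for this example and are not from any real study.

```python
from math import comb

def one_sided_p_value(improved_new, n_new, improved_old, n_old):
    """Probability of at least `improved_new` improvers in the new-drug group,
    computed assuming the drug makes no difference (explanation #1).

    If the drug does nothing, the group labels are arbitrary, so the number of
    improvers that land in the new-drug group is hypergeometric.
    """
    total = n_new + n_old
    total_improved = improved_new + improved_old
    ways_to_pick_group = comb(total, n_new)
    largest_possible = min(n_new, total_improved)
    favourable = sum(
        comb(total_improved, k) * comb(total - total_improved, n_new - k)
        for k in range(improved_new, largest_possible + 1)
    )
    return favourable / ways_to_pick_group

# Hypothetical results: 18 of 25 improved on the new drug, 11 of 25 on the old.
p = one_sided_p_value(18, 25, 11, 25)
print(f"p-value assuming no benefit: {p:.4f}")
```

The number printed is the chance of a result at least this favourable to the new drug when explanation #1 holds. It says nothing directly about how probable explanation #1 itself is.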
 
Note that an effect (here, the benefit of using the new drug over the old) that is large enough to be statistically significant is not necessarily large in an absolute, real-world sense. The actual benefit of the new drug might not be worth the ''cost'' of the drug, whether in money or in side-effects.
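The gap between statistical and practical significance can be seen in a small sketch. It assumes made-up trial counts and uses a two-proportion z-test with a normal approximation, which is one common choice and not something the article prescribes. The same half-point improvement (50.5% versus 50.0%) is tested at two very different sample sizes.

```python
from math import erfc, sqrt

def one_sided_z_p_value(improved_new, n_new, improved_old, n_old):
    """Approximate one-sided p-value for 'the new drug is better'
    using a two-proportion z-test (normal approximation)."""
    rate_new = improved_new / n_new
    rate_old = improved_old / n_old
    pooled = (improved_new + improved_old) / (n_new + n_old)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_new + 1 / n_old))
    if std_err == 0:
        return 1.0  # no variation in the data, so no evidence either way
    z = (rate_new - rate_old) / std_err
    return 0.5 * erfc(z / sqrt(2))  # upper-tail area of the standard normal

# Hypothetical data: identical 0.5-point improvement, different sample sizes.
p_small_trial = one_sided_z_p_value(505, 1_000, 500, 1_000)
p_huge_trial = one_sided_z_p_value(505_000, 1_000_000, 500_000, 1_000_000)
print(f"1,000 per group:     p = {p_small_trial:.3f}")
print(f"1,000,000 per group: p = {p_huge_trial:.2e}")
```

With a thousand patients per group the difference is nowhere near significant, while with a million per group it is overwhelmingly so, even though the benefit to any individual patient is equally small in both cases.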
 
In any case, accepting explanation #2 as correct, even though one has not actually tested the drug on every member of the population, makes the conclusion an '''[[inference]]''' and not a logical [[deduction]]. The decision could be wrong, even if the results are highly (statistically) significant: explanation #1 could, in fact, be the truth. This is why any conclusion based on the statistical analysis of data is fundamentally subject to error (the kind of error being discussed here is usually called a "type I error" or "alpha error"). By carefully controlling for other sources of error (experimenter bias, data collection errors, etc.) and performing a valid statistical analysis, one can quantify the ''probability'' of making such an error when there is in fact no benefit, something that [[ad hoc]] and most other [[unscientific]] explanations cannot hope to achieve.
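One way to see what "quantifying the probability of a type I error" means is to simulate many experiments in which the new drug truly has no benefit and count how often the result is nonetheless declared significant. The sketch below is illustrative only: the group size, improvement rate, 0.05 threshold, and the choice of a two-proportion z-test are all assumptions made for the example.

```python
import random
from math import erfc, sqrt

def one_sided_z_p_value(improved_new, n_new, improved_old, n_old):
    """Approximate one-sided p-value from a two-proportion z-test."""
    pooled = (improved_new + improved_old) / (n_new + n_old)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_new + 1 / n_old))
    if std_err == 0:
        return 1.0  # no variation in the data, so no evidence either way
    z = (improved_new / n_new - improved_old / n_old) / std_err
    return 0.5 * erfc(z / sqrt(2))

def false_positive_rate(n_per_group=50, true_rate=0.5, alpha=0.05,
                        trials=2000, seed=0):
    """Fraction of simulated no-benefit experiments that come out 'significant'."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    false_positives = 0
    for _ in range(trials):
        # Both groups share the same improvement rate: explanation #1 is true.
        improved_new = sum(rng.random() < true_rate for _ in range(n_per_group))
        improved_old = sum(rng.random() < true_rate for _ in range(n_per_group))
        p = one_sided_z_p_value(improved_new, n_per_group, improved_old, n_per_group)
        if p < alpha:
            false_positives += 1
    return false_positives / trials

rate = false_positive_rate()
print(f"No-benefit experiments declared significant: {rate:.3f}")
```

The printed fraction should land in the neighbourhood of the chosen threshold, though not exactly on it, since the counts are discrete and the test is an approximation. That threshold is the error rate the experimenter has agreed in advance to tolerate.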
  
 
[[Category:Science]]
 

Latest revision as of 04:01, 22 April 2011


Along with probability, upon which it is mostly based, statistics is a mathematical discipline which provides techniques for drawing conclusions from observed data. It is heavily relied upon in most scientific fields (experiments and observational studies), as well as in business and industry (marketing, operations management, economic analysis, quality control) and government (public polling, policy analysis).

Statistics can be divided into two large subfields, descriptive statistics and inferential statistics:

Descriptive statistics 
Numerical and graphical summaries of data — i.e., "charts and graphs". These are more in the public eye (think USA Today), but they can be used by unscrupulous sorts to mislead ("lies, damned lies, and statistics").
Inferential statistics 
Probability-based analysis used to infer something about a larger, mostly unobserved, population based on what is seen in a sample from that population. While the results of statistical analyses are often reported in the mainstream media (medical studies and the like), the details of the statistics behind them are usually left out and can often only be found in academic journals.

Statistical significance

A central idea in inferential statistics that is widely misunderstood is statistical significance. In an experiment to test, say, whether a new drug to treat a disease is more effective than the old, standard treatment, the degree of improvement the new drug provides is said to be statistically significant if it is so large that it would be unlikely to occur in a similarly sized sample if there were in fact no overall benefit of the drug when given to the entire population of such patients. See the statistical significance article for more information.

