## Testing hypotheses suggested by the data - Wikipedia, the free encyclopedia

--------------------

** Testing hypotheses suggested by the data **

Not to be confused with Post hoc analysis.

In statistics, patterns suggested by a given dataset, when tested with the
same dataset that suggested them, are likely to be accepted even when they
are not true. This is because circular reasoning (double dipping) would be
involved: something seems true in the limited data set; therefore we
hypothesize that it is true in general; therefore we (wrongly) test it on
the same limited data set, which seems to confirm that it is true.
Generating hypotheses based on data already observed, in the absence of
testing them on new data, is referred to as *post hoc theorizing* (from
Latin /post hoc/, "after this").

The correct procedure is to test any hypothesis on a data set that was not
used to generate the hypothesis.
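
The contrast between the two procedures can be illustrated with a small
simulation. The sketch below is only an illustration under stated
assumptions, not a method taken from this article: every variable is pure
noise (so every null hypothesis "mean = 0" is true), the test is a two-sided
z-test with known unit variance, and the names `z_stat`, `most_extreme` and
`false_positive_rates` are hypothetical helpers. The hypothesis is "suggested
by the data" by picking the variable whose sample mean is farthest from
zero; it is then tested once on the same data and once on new data.

```python
import math
import random
import statistics

Z_CRIT = 1.96  # two-sided 5% critical value for a standard normal statistic


def z_stat(sample):
    """z statistic for H0: mean = 0, assuming known unit variance."""
    return statistics.fmean(sample) * math.sqrt(len(sample))


def most_extreme(columns):
    """Index of the column whose sample mean is farthest from zero."""
    return max(range(len(columns)),
               key=lambda j: abs(statistics.fmean(columns[j])))


def false_positive_rates(trials=1000, n_vars=20, n_obs=40, seed=0):
    """Estimate how often a true null is rejected under each procedure."""
    rng = random.Random(seed)
    same_hits = fresh_hits = 0
    for _ in range(trials):
        # Pure noise: no variable has a real effect.
        explore = [[rng.gauss(0, 1) for _ in range(n_obs)]
                   for _ in range(n_vars)]
        confirm = [[rng.gauss(0, 1) for _ in range(n_obs)]
                   for _ in range(n_vars)]
        j = most_extreme(explore)  # hypothesis suggested by the data
        # Double dipping: test on the data that suggested the hypothesis.
        same_hits += abs(z_stat(explore[j])) > Z_CRIT
        # Correct procedure: test the same hypothesis on new data.
        fresh_hits += abs(z_stat(confirm[j])) > Z_CRIT
    return same_hits / trials, fresh_hits / trials


if __name__ == "__main__":
    same, fresh = false_positive_rates()
    print(f"same-data false positive rate:  {same:.2f}")
    print(f"fresh-data false positive rate: {fresh:.2f}")
```

With twenty independent noise variables, the chance that at least one of
them clears the 5% threshold is 1 - 0.95^20, or about 0.64, so the same-data
rate should come out near that value, while the fresh-data rate should stay
near the nominal 0.05. The exact figures depend on the seed and the number
of trials.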

*Contents*

· 1 Example of fallacious acceptance of a hypothesis
· 2 The general problem
· 3 Correct procedures
· 5 Notes and references

*Example of fallacious acceptance of a hypothesis*

Suppose fifty different researchers, unaware of each other's work, run
clinical trials to test whether Vitamin X is efficacious in treating
cancer. Forty-nine of them find no significant differences between
measurements done on patients who have taken Vitamin X and those who have
taken a placebo. The fiftieth study finds a big difference, but the
difference is of a size that one would expect to see in about one of every
fifty studies even if Vitamin X has

Source: en.wikipedia.org/wiki/Testing_hypotheses_suggested_by_the_data