Strength of evidence

An assessment of how well data collected to investigate an assertion or question support the assertion or a conclusion to the question. The assertion or question usually involves a comparison of a numerical variable between two categories of a category variable (i.e., it proposes that there is a link between the numerical variable and the category variable).

Data are collected to investigate the assertion or question. An estimate of a population parameter is calculated from the data. This observed estimate is often a difference between means but could be a difference between two proportions or a slope of a fitted regression line.

Could an estimate as big as the observed estimate be produced just by chance?

To answer this question, the effect of sampling variation alone on the estimate needs to be considered under the assumption that there is no link between the two variables. The values of the numerical variable obtained from the data collection are randomly allocated to the two categories of the category variable, and an estimate is calculated from the re-allocated data. This ‘resampling using randomisation’ process is repeated many times to form a distribution of estimates under sampling variation alone. If random allocation alone could easily produce an estimate as big as the observed estimate then the data cannot be interpreted as support for the existence of a link between the two variables.

By comparing the observed estimate with the distribution of estimates, an assessment can be made of the strength of evidence the data provide for the assertion or for a conclusion to the question. This assessment is made by looking at the percentage of estimates under sampling variation alone that are at least as far from zero as the observed estimate.
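The randomisation process and the percentage calculation described above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `randomisation_tail_percentage`, the parameters `n_resamples` and `seed`, and the data values are all made up for the example, and the estimate is assumed to be a difference between two group means.

```python
import random
from statistics import mean


def randomisation_tail_percentage(group_a, group_b, n_resamples=10_000, seed=0):
    """Return the observed difference in means and the percentage of
    re-randomised differences at least as far from zero as it is.

    Hypothetical helper for illustration; the estimate is assumed to be
    a difference between two group means.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = mean(group_a) - mean(group_b)

    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    count = 0
    for _ in range(n_resamples):
        # Randomly allocate the pooled values to the two categories,
        # keeping the original group sizes.
        rng.shuffle(pooled)
        estimate = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # Small tolerance guards against floating-point rounding on ties.
        if abs(estimate) >= abs(observed) - 1e-12:
            count += 1

    return observed, 100 * count / n_resamples


# Made-up data: a numerical variable measured in two categories.
group_a = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
group_b = [10.2, 11.7, 12.4, 10.9, 11.3, 12.0]

observed, pct = randomisation_tail_percentage(group_a, group_b)
print(f"Observed difference in means: {observed:.2f}")
print(f"Percentage of estimates at least as far from zero: {pct:.2f}%")
```

Comparing absolute values counts estimates in both tails, which matches the "at least as far from zero" wording. A different estimate, such as a difference between two proportions, would need a different calculation inside the loop.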

If less than about 0.1% of the estimates produced by random allocation alone are at least as far from zero as the observed estimate then the data provide very strong evidence of a link between the two variables.

If about 1% of the estimates produced by random allocation alone are at least as far from zero as the observed estimate then the data provide strong evidence of a link between the two variables.

If about 5% of the estimates produced by random allocation alone are at least as far from zero as the observed estimate then the data provide some evidence of a link between the two variables.

If about 10% of the estimates produced by random allocation alone are at least as far from zero as the observed estimate then the data provide weak evidence of a link between the two variables.

If more than about 12% of the estimates produced by random allocation alone are at least as far from zero as the observed estimate then the data provide no evidence of a link between the two variables.
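The guidelines above can be expressed as a small lookup, sketched here in Python. The guidelines use approximate values ("about 1%", "about 5%"), so the hard cut-offs in this sketch are a simplifying assumption made only for illustration, as is the function name `describe_evidence`. In particular, percentages between the listed values are assigned to the next weaker description, and the range from 10% to 12% is treated as weak evidence.

```python
def describe_evidence(tail_percentage):
    """Map a tail percentage to a strength-of-evidence description.

    Hypothetical helper: the exact cut-offs are assumptions, because the
    guidelines give approximate values rather than sharp boundaries.
    """
    if tail_percentage < 0.1:
        return "very strong evidence"
    if tail_percentage <= 1:
        return "strong evidence"
    if tail_percentage <= 5:
        return "some evidence"
    if tail_percentage <= 12:
        return "weak evidence"
    return "no evidence"


for pct in (0.05, 1, 5, 10, 20):
    print(f"{pct}%: {describe_evidence(pct)} of a link")
```

In practice a percentage close to a boundary calls for judgement rather than a mechanical label.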

Two examples of resampling using randomisation are provided in the description of randomisation. Each example includes a conclusion about the strength of evidence the data provide for a link between the two variables.

See: randomisation, resampling

Curriculum achievement objectives reference

Statistical investigation: Level 8