Data condition for HT about proportions (NecessaryForHTProportions). Is z-Test or \(\chi^2\)-test valid?

class cerebunit.statistics.data_conditions.forHTproportions.NecessaryForHTProportions

Checks the situations (sample size requirements) under which hypothesis testing about proportions is valid, i.e., whether the z-test (standard z-score) is valid, and likewise whether hypothesis testing about proportions by the \(\chi^2\)-test is valid.

1. For z-test

Situation-1

With respect to distributions, hypothesis testing about proportions is valid if the experiment is binomial.

The following rules of thumb help check whether an experiment is binomial:

  • it is repeated a fixed number of times
  • trials are independent
  • trial outcomes are either success or failure
  • probability of success is the same for all trials.

Situation-2

Hypothesis testing about proportions is also valid when both quantities \(np_0\) and \(n(1 - p_0)\) are at least \(5^{\dagger}\). Here \(n\) is the sample size and \(p_0\) is the null value. Some consider \(10^{\ddagger}\) (instead of 5) as the lower bound.

  • Ott, R.L. (1998). An Introduction to Statistical Methods and Data Analysis. (p.370) \(^{\dagger}\)
  • Utts, J.M., Heckard, R.F. (2010). Mind on Statistics. (p.465) \(^{\ddagger}\)

2. For \(\chi^2\) -test

Situation-1

Same as Situation-1 for the z-test above.

Situation-2

The guidelines for large sample are

  • expected values in all the cells of the two-way contingency table should be greater than 1
  • number of cells with expected values greater than 5 should be at least \(80\%\) of the total number of cells

Note that the \(\chi^2\)-test may still be performed when the above large-sample guidelines are violated, but the results may not be valid. \(^{\dagger\dagger}\)

  • Utts, J.M., Heckard, R.F. (2010). Mind on Statistics. (p.588) \(^{\dagger\dagger}\)
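
The guidelines above are stated in terms of expected values, not observed counts. As a minimal sketch, assuming the usual independence model for a two-way table (expected count = row total × column total / grand total), an expected values table could be derived from observed counts as follows. The helper name `expected_from_observed` and the counts are hypothetical and are not part of cerebunit.

```python
import numpy as np

def expected_from_observed(observed):
    """Expected counts under independence (hypothetical helper, not cerebunit API)."""
    observed = np.asarray(observed, dtype=float)
    row_totals = observed.sum(axis=1, keepdims=True)  # shape (rows, 1)
    col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, columns)
    # Outer product of the margins divided by the grand total.
    return row_totals @ col_totals / observed.sum()

# Made-up 2 x 3 table of observed counts, for illustration only.
observed = np.array([[20, 30, 50],
                     [30, 20, 50]])
expected = expected_from_observed(observed)
print(expected)
```

The resulting array has the same shape as the observed table and is the kind of input the \(\chi^2\) sample size check operates on.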

Implementation

Method name Arguments
ask_for_ztest() n, p0, lb (optional)
ask_for_chi2test() expected values of the two-way table
static ask_for_chi2test(expected_values)

This function checks whether the sample size requirement is satisfied for running hypothesis testing for proportions using the \(\chi^2\)-test.

The argument is the expected values table of the 2 x K two-way contingency table.

Definition Meaning
\(n\) total number of cells in a two-way contingency table
\(n_1\) number of expectation values < 1
\(n_5\) number of expectation values > 5
\(lb_{80\%}\) \(80\%\) of the cells

Algorithm that checks whether the sample size requirement for the \(\chi^2\)-test is satisfied.

Given: ex; expected values table
Get \(n \leftarrow ex.size\)
Get \(n_1 \leftarrow ex[ex<1].size\)
Get \(n_5 \leftarrow ex[ex>5].size\)
Compute \(lb_{80\%} \leftarrow 0.8 \times n\)
if \(n_1 = 0 \land n_5 \geq lb_{80\%}\)
“sample size requirement is satisfied”
else
“sample size requirement is not satisfied”

Note:

  • boolean return (True or False)
  • True if sample size requirement is satisfied
  • False if sample size requirement is not satisfied
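
The \(\chi^2\) check can be sketched in NumPy as below. This is an illustrative reading of the documented guidelines, not the library's source: the function name `chi2_sample_size_ok` is hypothetical, and the first condition is taken to mean "no expected value is below 1", since a count of cells cannot be negative.

```python
import numpy as np

def chi2_sample_size_ok(expected_values):
    """Hypothetical sketch of the large-sample guidelines for the chi-square test."""
    ex = np.asarray(expected_values, dtype=float)
    n = ex.size               # total number of cells
    n1 = ex[ex < 1].size      # cells with expected value below 1
    n5 = ex[ex > 5].size      # cells with expected value above 5
    lb80 = 0.8 * n            # 80% of the cells
    # Assumption: the requirement holds when no cell is below 1
    # and at least 80% of the cells exceed 5.
    return bool(n1 == 0 and n5 >= lb80)

ok_table = [[25, 25, 50], [25, 25, 50]]      # every cell exceeds 5
sparse_table = [[0.5, 10, 10], [10, 10, 10]]  # one cell is below 1
print(chi2_sample_size_ok(ok_table), chi2_sample_size_ok(sparse_table))
```
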
static ask_for_ztest(n, p0, lb=5)

This function checks whether the sample size requirement is satisfied for running hypothesis testing for proportions using the z-test.

Arguments Meaning
first, n sample size
second, p0 null value
third, lb lower bound (optional; default 5)

Algorithm that checks whether the sample size requirement for the z-test is satisfied.

Given: \(n, p_0, lb\)
Compute result1 \(\leftarrow\) \(np_0\)
Compute result2 \(\leftarrow\) \(n(1-p_0)\)
if result1 \(\geq lb\) \(\land\) result2 \(\geq lb\)
“sample size requirement is satisfied”
else
“sample size requirement is not satisfied”

Note:

  • boolean return (True or False)
  • True if sample size requirement is satisfied
  • False if sample size requirement is not satisfied
  • for two-sample tests, pass \(n_i\) and \(p_i\) in place of \(n\) and \(p_0\), respectively, for the \(i^{th}\) sample.
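
The z-test check reduces to two comparisons against the lower bound. A minimal sketch in plain Python, under the assumption that both \(np_0\) and \(n(1-p_0)\) must individually reach `lb`; the function name `ztest_sample_size_ok` and the sample values are hypothetical and only mirror the documented arguments.

```python
def ztest_sample_size_ok(n, p0, lb=5):
    """Hypothetical sketch: both n*p0 and n*(1 - p0) must be at least lb."""
    result1 = n * p0
    result2 = n * (1 - p0)
    return result1 >= lb and result2 >= lb

# One-sample checks with made-up values.
large_enough = ztest_sample_size_ok(100, 0.1)       # 10 and 90
too_small = ztest_sample_size_ok(40, 0.1)           # n*p0 is about 4
stricter = ztest_sample_size_ok(200, 0.1, lb=10)    # 20 and 180

# Two-sample case: check each sample with its own n_i and p_i.
samples = [(120, 0.25), (30, 0.1)]
both_ok = all(ztest_sample_size_ok(n_i, p_i) for n_i, p_i in samples)
print(large_enough, too_small, stricter, both_ok)
```

Passing `lb=10` applies the stricter bound that some authors prefer over 5.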