Data condition for HT about proportions (NecessaryForHTProportions). Is z-Test or \(\chi^2\)-test valid?
class cerebunit.statistics.data_conditions.forHTproportions.NecessaryForHTProportions
Checks for situations (sample size requirements) in which hypothesis testing about proportions is valid, i.e., is the z-test (standard z-score) valid? It also covers hypothesis testing about proportions by the \(\chi^2\)-test.
1. For z-test
Situation-1
With respect to the distribution, hypothesis testing about proportions is valid if there is
- a random sample (from the population)
- data from a binomial experiment with independent trials.
Below are some rule-of-thumb guidelines to check if an experiment is binomial:
- it is repeated a fixed number of times
- trials are independent
- trial outcomes are either success or failure
- probability of success is the same for all trials.
Situation-2
Hypothesis testing about proportions is also valid when both quantities \(np_0\) and \(n(1 - p_0)\) are at least \(5^{\dagger}\). Here \(n\) is the sample size and \(p_0\) is the null value. Some consider \(10^{\ddagger}\) (instead of 5) as the lower bound.
- Ott, R.L. (1998). An Introduction to Statistical Methods and Data Analysis. (p.370) \(^{\dagger}\)
- Utts, J.M., Heckard, R.F. (2010). Mind on Statistics. (p.465) \(^{\ddagger}\)
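As a worked illustration with made-up numbers (they are not taken from the cited texts), suppose \(n = 40\) and \(p_0 = 0.2\):

\[
np_0 = 40 \times 0.2 = 8, \qquad n(1 - p_0) = 40 \times 0.8 = 32.
\]

Both quantities are at least 5, so the condition holds under the lower bound of 5. Under the stricter lower bound of 10 it fails, because \(np_0 = 8 < 10\).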
2. For \(\chi^2\) -test
Situation-1
Ditto; the same as Situation-1 for the z-test.
Situation-2
The guidelines for a large sample are
- expected values in all the cells of the two-way contingency table should be greater than 1
- the number of cells with expected values greater than 5 should be at least \(80\%\) of the total number of cells
Note that the \(\chi^2\)-test may be performed even if the above large-sample guidelines are violated, but the results may not be valid. \(^{\dagger\dagger}\)
- Utts, J.M., Heckard, R.F. (2010). Mind on Statistics. (p.588) \(^{\dagger\dagger}\)
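The guidelines above are stated in terms of expected values, which this page does not define. A minimal sketch, assuming the usual independence formula \(E_{ij} = (\text{row}_i\ \text{total} \times \text{column}_j\ \text{total}) / \text{grand total}\), might look like the following. The function name `expected_counts` and the observed table are hypothetical and are not part of cerebunit.

```python
def expected_counts(observed):
    """Expected cell counts under independence for a two-way table.

    `observed` is a list of rows, each a list of counts.
    Hypothetical helper for illustration; not part of cerebunit.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # E_ij = (row i total * column j total) / grand total
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Made-up 2 x 3 table of observed counts (illustration only).
observed = [[12, 5, 3],
            [8, 15, 7]]
expected = expected_counts(observed)
for row in expected:
    print(row)
```

For this made-up table, no expected value is below 1 and five of the six expected values exceed 5 (about 83% of the cells), so both guidelines would hold.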
Implementation
| Method name | Arguments |
|---|---|
| ask_for_ztest() | n, p0, lb (optional) |
| ask_for_chi2test() | expected values two-way table |
static ask_for_chi2test(expected_values)
This function checks if the sample size requirement for running hypothesis testing for proportions using the \(\chi^2\)-test is satisfied.
The argument is the expected values table of the 2 x K two-way contingency table.
| Definition | Meaning |
|---|---|
| \(n\) | total number of cells in a two-way contingency table |
| \(n_1\) | number of expectation values < 1 |
| \(n_5\) | number of expectation values > 5 |
| \(lb_{80\%}\) | \(80\%\) of the cells |

Algorithm that asks if the sample size requirement for the \(\chi^2\)-test is satisfied.
Given: ex, the expected values table
Get \(n \leftarrow ex.size\)
Get \(n_1 \leftarrow ex[ex<1].size\)
Get \(n_5 \leftarrow ex[ex>5].size\)
Compute \(lb_{80\%} \leftarrow 0.8 \times n\)
if \(n_1 = 0 \land n_5 \geq lb_{80\%}\)
“sample size requirement is satisfied”
else
“sample size requirement is not satisfied”
Note:
- boolean return (True or False)
- True if sample size requirement is satisfied
- False if sample size requirement is not satisfied
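The check described above can be sketched in plain Python. This is a stand-in written for illustration, not the cerebunit source: the name `chi2_sample_size_ok` is hypothetical, nested lists replace the array argument, and it assumes the first guideline means that no expected value may be below 1.

```python
def chi2_sample_size_ok(expected_values):
    """Return True if the large-sample guidelines hold.

    `expected_values` is a list of rows of expected counts.
    Hypothetical stand-in for illustration; not the cerebunit code.
    """
    cells = [value for row in expected_values for value in row]
    n = len(cells)                                  # total number of cells
    n_1 = sum(1 for value in cells if value < 1)    # cells with expected value < 1
    n_5 = sum(1 for value in cells if value > 5)    # cells with expected value > 5
    lb_80 = 0.8 * n                                 # 80% of the cells
    # Assumption: the requirement is "no cell below 1" and
    # "at least 80% of cells above 5".
    return n_1 == 0 and n_5 >= lb_80


# Made-up expected tables (illustration only).
print(chi2_sample_size_ok([[8.0, 8.0, 4.0], [12.0, 12.0, 6.0]]))  # True: 5 of 6 cells > 5
print(chi2_sample_size_ok([[0.6, 9.4], [5.4, 84.6]]))             # False: one cell < 1
print(chi2_sample_size_ok([[3.0, 7.0], [6.0, 14.0]]))             # False: only 3 of 4 cells > 5
```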
static ask_for_ztest(n, p0, lb=5)
This function checks if the sample size requirement for running hypothesis testing for proportions using the z-test is satisfied.
| Arguments | Meaning |
|---|---|
| first, n | sample size |
| second, p0 | null value |
| third, lb | lower bound (default 5) |

Algorithm that asks if the sample size requirement for the z-test is satisfied.
Given: \(n, p_0, lb\)
Compute result1 \(\leftarrow\) \(np_0\)
Compute result2 \(\leftarrow\) \(n(1-p_0)\)
if result1 \(\geq lb\) \(\land\) result2 \(\geq lb\)
“sample size requirement is satisfied”
else
“sample size requirement is not satisfied”
Note:
- boolean return (True or False)
- True if sample size requirement is satisfied
- False if sample size requirement is not satisfied
- for two-sample tests, pass \(n_i\) and \(p_i\) in place of \(n\) and \(p_0\), respectively, for the \(i^{th}\) sample.
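A minimal sketch of this check, including the two-sample case from the last note, is given below. The function `ztest_sample_size_ok` is a hypothetical stand-in that follows the algorithm as described here; it is not the cerebunit source, and all sample sizes and proportions are made up.

```python
def ztest_sample_size_ok(n, p0, lb=5):
    """Return True if both n*p0 and n*(1 - p0) are at least `lb`.

    Hypothetical stand-in for illustration; not the cerebunit code.
    """
    result1 = n * p0
    result2 = n * (1 - p0)
    return result1 >= lb and result2 >= lb


# One sample: n = 50, p0 = 0.15 gives roughly 7.5 and 42.5.
print(ztest_sample_size_ok(50, 0.15))          # True with the default lower bound of 5
print(ztest_sample_size_ok(50, 0.15, lb=10))   # False with the stricter lower bound of 10

# Two samples: apply the same check to each (n_i, p_i) pair.
samples = [(40, 0.3), (25, 0.1)]
print(all(ztest_sample_size_ok(n_i, p_i) for n_i, p_i in samples))  # False: 25 * 0.1 = 2.5 < 5
```

Requiring every sample to pass, as the `all(...)` call does, is a design choice of this sketch; the page only says to pass \(n_i\) and \(p_i\) for each sample.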