Data condition for HT about means (NecessaryForHTMeans). Is t-Test valid?

class cerebunit.statistics.data_conditions.forHTmeans.NecessaryForHTMeans

Checks for the situations in which hypothesis testing about means is valid, i.e., whether the t-test (or the standard z-score) is valid.

Situation-1

For large samples of randomly collected individuals, the sample mean is approximately normally distributed (by the central limit theorem) even when the population of the measurements of interest is not exactly normal. Thus the condition for hypothesis testing about means is satisfied.
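
The large-sample case can be illustrated with a small simulation. The sketch below is not part of cerebunit; the function name `skew_of_sample_means`, the exponential population, and the sample sizes are all assumptions chosen for illustration. It draws repeated random samples from a strongly right-skewed population and measures how skewed the resulting sample means are.

```python
import numpy as np
from scipy.stats import skew


def skew_of_sample_means(sample_size, n_samples=5000, seed=0):
    """Skewness of the means of repeated samples from an exponential population.

    Hypothetical helper for illustration only (not a cerebunit function).
    """
    rng = np.random.default_rng(seed)
    # Each row is one random sample drawn from a right-skewed population.
    samples = rng.exponential(scale=1.0, size=(n_samples, sample_size))
    # One mean per sample; the skewness of these means shows how far
    # the sampling distribution of the mean is from symmetric.
    return float(skew(samples.mean(axis=1)))


small_n = skew_of_sample_means(sample_size=2)
large_n = skew_of_sample_means(sample_size=100)
print(f"skewness of sample means, n=2:   {small_n:.2f}")
print(f"skewness of sample means, n=100: {large_n:.2f}")
```

The skewness of the sample means should be noticeably smaller for the larger sample size, which is the behaviour the large-sample situation relies on.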

Situation-2

Hypothesis testing about means is also valid when there is no evidence of extreme outliers or of a skewed population shape. This is usually the case when the population of the measurements is approximately normal.

Implementation

Method name                  Arguments
ask()                        question, experimental_data
check_normal_population()    data
check_skew_population()      data
classmethod ask(question, experimental_data)

Depending on the question asked (normal? or skew?) this function checks if the distribution of the raw data is normal or skewed.

Algorithm that asks if the distribution of the experimental data is normal.

question = normal?

Given: experimental_data
get sample_size \(\leftarrow\) number of experimental data elements

Algorithm that asks if the distribution is skewed (data may be symmetric without being bell-shaped).

question = skew?

Given: experimental_data

Note:

  • the return value is boolean (True or False)
  • for question = normal? it is True if the data is normal
  • for question = skew? it is True if the data is skewed
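
The question-based dispatch can be sketched as a standalone function. This is a minimal sketch, not the cerebunit source: the function body, the `alpha` and `beta` keyword arguments, the use of the absolute skewness, and the `ValueError` for an unrecognised question are assumptions. It also reads a p-value below `alpha` as evidence against normality, because the null hypothesis of `scipy.stats.normaltest` is that the sample comes from a normal distribution.

```python
import numpy as np
from scipy.stats import norm, normaltest, skew


def ask(question, experimental_data, alpha=0.001, beta=0.001):
    """Hypothetical stand-in for NecessaryForHTMeans.ask; returns a bool."""
    if question == "normal?":
        # normaltest returns (statistic, p-value).
        _, p_value = normaltest(experimental_data)
        return bool(p_value >= alpha)
    if question == "skew?":
        # Assumption: skew in either direction counts as skewed.
        return bool(abs(skew(experimental_data)) > beta)
    raise ValueError(f"unknown question: {question!r}")


# Deterministic test data built from evenly spaced quantiles.
q = (np.arange(1000) + 0.5) / 1000
bell_data = norm.ppf(q)        # symmetric, bell-shaped
tail_data = -np.log(1.0 - q)   # exponential quantiles, long right tail

print(ask("normal?", bell_data), ask("skew?", bell_data))
print(ask("normal?", tail_data), ask("skew?", tail_data))
```
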
static check_normal_population(data)

Tests if sample is from a normal distribution.

Algorithm to check if population is normal

Given: data
Parameter: \(\alpha = 0.001\)
Compute: (statistic, p) \(\leftarrow\) normaltest(data)
if p < \(\alpha\)
    “data is not normal”
else
    “data is normal”

Note:

  • \(\alpha\) is an arbitrarily small value, here taken as equal to 0.001
  • scipy.stats.normaltest is based on D’Agostino & Pearson’s omnibus test of normality.
  • “data is normal” => True and “data is not normal” => False
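
A minimal sketch of a normality check built on `scipy.stats.normaltest` follows. The helper name `looks_normal` and the quantile-based test data are assumptions for illustration, not cerebunit code. The null hypothesis of `normaltest` is that the sample comes from a normal distribution, so the sketch treats a p-value below \(\alpha\) as a rejection of normality.

```python
import numpy as np
from scipy.stats import norm, normaltest

ALPHA = 0.001  # the arbitrarily small significance level used on this page


def looks_normal(data, alpha=ALPHA):
    """Hypothetical helper: True when normaltest does not reject normality."""
    # normaltest returns (statistic, p-value); only the p-value is needed here.
    _, p_value = normaltest(data)
    return bool(p_value >= alpha)


# Evenly spaced quantiles give deterministic samples with a known shape.
quantiles = (np.arange(1000) + 0.5) / 1000
bell_shaped = norm.ppf(quantiles)         # follows a standard normal shape
right_skewed = -np.log(1.0 - quantiles)   # follows an exponential shape

print(looks_normal(bell_shaped))
print(looks_normal(right_skewed))
```

`normaltest` combines a skewness test and a kurtosis test, so it needs a reasonable number of observations; SciPy requires at least 8 and warns that the kurtosis test is only valid for 20 or more.
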
static check_skew_population(data)

Tests if the data is skewed (i.e., not symmetric).

Algorithm to check if population is skewed

Given: data
Parameter: \(\beta = 0.001\)
Compute: s \(\leftarrow\) skew(data)
if |s| > \(\beta\)
    “data is skewed”
else
    “data is not skewed”

Note:

  • \(\beta\) is an arbitrarily small value, here taken as equal to 0.001
  • scipy.stats.skew follows Zwillinger, D. and Kokoska, S. (2000), CRC Standard Probability and Statistics Tables and Formulae, Chapman & Hall: New York, Section 2.2.24.1, ISBN 9780849300264, and computes the Fisher-Pearson coefficient of skewness.
  • by default scipy.stats.skew is computed with bias = True, i.e., without bias correction. With bias = False the computed value is the bias-corrected (adjusted) Fisher-Pearson standardized moment coefficient.
  • “data is skewed” => True and “data is not skewed” => False
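
A minimal sketch of a skewness check built on `scipy.stats.skew` follows. The helper name `looks_skewed`, the small example data sets, and the use of the absolute value (so that left-skewed data is also flagged) are assumptions for illustration, not cerebunit code.

```python
from scipy.stats import skew

BETA = 0.001  # the arbitrarily small threshold used on this page


def looks_skewed(data, beta=BETA, bias=True):
    """Hypothetical helper: True when the sample skewness exceeds beta in size."""
    # bias=True is the SciPy default; bias=False applies the bias correction.
    s = skew(data, bias=bias)
    # Assumption: a negative (left) skew counts as skewed too.
    return bool(abs(s) > beta)


symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
right_tail = [1.0, 1.0, 1.0, 2.0, 10.0]
left_tail = [-x for x in right_tail]

print(looks_skewed(symmetric))
print(looks_skewed(right_tail))
print(looks_skewed(left_tail))

# The bias-corrected coefficient is larger in magnitude for a small sample.
print(skew(right_tail, bias=True), skew(right_tail, bias=False))
```
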

A rough road-map is shown below.

[Figure: Conditions for HT about means]