Hypothesis testing about medians (HtestAboutMedians)

class cerebunit.statistics.hypothesis_testings.aboutmedians.HtestAboutMedians(observation, prediction, test={'name': 'sign_test', 'side': 'not_equal', 'z_statistic': 0.0})

Hypothesis Testing (significance testing) about medians.

This is a nonparametric test: it does not assume a specific type of distribution, and is hence robust (valid over a broad range of circumstances) and resistant (to the influence of outliers).

1. Verify necessary data conditions.

Statistic Interpretation
data experiment/observed data array \(^{\dagger}\)
\(^{\dagger}\) Here:
  • \(\overrightarrow{x} =\) experimental data for one sample testing
  • \(\overrightarrow{x} =\) (experimental - prediction) data for paired data testing
  • thus \(\eta =\) median of \(\overrightarrow{x}\)
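The two cases above can be sketched in plain Python. This is only an illustration of how \(\overrightarrow{x}\) and \(\eta\) relate; the helper name data_vector is hypothetical and not part of cerebunit, and plain lists are assumed in place of whatever array type the library expects.

```python
from statistics import median

def data_vector(experiment, prediction=None):
    """Return the data vector x used for the test (hypothetical helper).

    One sample: x is the experimental data itself.
    Paired data: x is the element-wise difference (experimental - prediction).
    """
    if prediction is None:
        return list(experiment)
    if len(experiment) != len(prediction):
        raise ValueError("paired data must have equal lengths")
    return [e - p for e, p in zip(experiment, prediction)]

# One-sample case: eta is the median of the experimental data.
x_single = data_vector([2.0, 3.5, 1.0, 4.0, 2.5])
eta_single = median(x_single)

# Paired case: eta is the median of the differences.
x_paired = data_vector([2.0, 3.5, 1.0], [1.5, 3.0, 2.0])
eta_paired = median(x_paired)
```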

2. Defining null and alternate hypotheses.

Statistic Interpretation
sample statistic, \(\eta\) experiment/observed median \(^{\dagger}\)
null value/population parameter, \(\eta_0\) prediction (specified value) \(^{\dagger}\)
null hypothesis, \(H_0\) \(\eta = \eta_0\)
alternate hypothesis, \(H_a\) \(\eta \neq \eta_0\), \(\eta < \eta_0\), or \(\eta > \eta_0\)

Depending on whether testing is for a single sample or for paired data \(^{\dagger}\),

Statistic single sample paired data
\(\eta\) experiment/observed median median of (experiment - prediction)
\(\eta_0\) model prediction 0
Two-sided hypothesis (default)
\(H_0\): \(\eta = \eta_0\) and \(H_a\): \(\eta \neq \eta_0\)
One-sided hypothesis (left-sided)
\(H_0\): \(\eta = \eta_0\) and \(H_a\): \(\eta < \eta_0\)
One-sided hypothesis (right-sided)
\(H_0\): \(\eta = \eta_0\) and \(H_a\): \(\eta > \eta_0\)

3. Assuming H0 is true, find p-value.

If the data is skewed, the nonparametric z-score is computed using the sign test.

Statistic Interpretation
\(s_{+}\) number of values in sample \(> \eta_0\)
\(s_{-}\) number of values in sample \(< \eta_0\)
\(n_U = s_{+} + s_{-}\) number of values in sample \(\neq \eta_0\)
z_statistic, z z = \(\frac{s_{+} - \frac{n_U}{2}}{\sqrt{\frac{n_U}{4}}}\)
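The quantities in the table above can be computed as in the following minimal sketch. The function name sign_test_z is hypothetical and this is not the library's own implementation; it simply follows the formula as stated.

```python
import math

def sign_test_z(data, eta0):
    """Return (s_plus, s_minus, n_u, z) for the sign test (illustrative sketch)."""
    s_plus = sum(1 for x in data if x > eta0)   # values above the null value
    s_minus = sum(1 for x in data if x < eta0)  # values below the null value
    n_u = s_plus + s_minus                      # values not equal to the null value
    if n_u == 0:
        raise ValueError("every value equals the null value; z is undefined")
    z = (s_plus - n_u / 2) / math.sqrt(n_u / 4)
    return s_plus, s_minus, n_u, z

# Ten values with null value 3: seven above, two below, one equal (dropped).
s_plus, s_minus, n_u, z = sign_test_z([1, 2, 3, 5, 6, 7, 8, 9, 10, 11], 3)
```

For this sample, z = (7 - 4.5) / sqrt(9/4) = 2.5 / 1.5.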

If the data is not skewed, the nonparametric z-score is computed using the signed-rank test (the Wilcoxon signed-rank test, not the Wilcoxon rank-sum test).

Statistic Interpretation
\(\overrightarrow{x}\) data \(^{\dagger}\)
\(|x_i-\eta_0|\) absolute difference between data values and null value
\(T\) ranks of the computed differences (excluding differences equal to 0)
\(T^+\) sum of the ranks of values \(> \eta_0\); Wilcoxon signed-rank statistic
\(n_U\) number of values in data not equal to \(\eta_0\)
z_statistic, z z = \(\frac{T^+ - [n_U(n_U+1)/4]}{\sqrt{n_U(n_U+1)(2n_U+1)/24}}\)
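A minimal sketch of the signed-rank computation follows. The function name signed_rank_z is hypothetical, and giving tied absolute differences the average of their ranks is an assumption: the text above does not say how ties are handled.

```python
import math

def signed_rank_z(data, eta0):
    """Return (t_plus, n_u, z) for the Wilcoxon signed-rank test (sketch)."""
    diffs = [x - eta0 for x in data if x != eta0]  # drop zero differences
    n_u = len(diffs)
    if n_u == 0:
        raise ValueError("every value equals the null value; z is undefined")
    # Rank the absolute differences; ties get the average rank (assumption).
    order = sorted(range(n_u), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n_u
    i = 0
    while i < n_u:
        j = i
        while j + 1 < n_u and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    # T+ sums the ranks belonging to values above the null value.
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n_u * (n_u + 1) / 4
    sd = math.sqrt(n_u * (n_u + 1) * (2 * n_u + 1) / 24)
    return t_plus, n_u, (t_plus - mean) / sd

# Differences from 5 are 2, -3, 4, 5 (the value 5 is dropped); ranks 1, 2, 3, 4.
t_plus, n_u, z = signed_rank_z([7, 2, 9, 10, 5], 5)
```

Here T+ = 1 + 3 + 4 = 8 and n_U = 4, so z = (8 - 5) / sqrt(7.5).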

The z-score is then looked up in a table for the standard normal curve, which returns the corresponding p-value.
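In code, the table lookup can be replaced by the standard normal cumulative distribution function. The sketch below uses Python's statistics.NormalDist; the function name p_value is hypothetical, and the mapping from side to tail is an assumption based on the hypotheses listed in step 2, not a copy of the class's _compute_pvalue().

```python
from statistics import NormalDist

def p_value(z, side="not_equal"):
    """Return the p-value for a z-score under the standard normal curve (sketch)."""
    cdf = NormalDist().cdf  # standard normal: mean 0, standard deviation 1
    if side == "less_than":      # left-sided alternative
        return cdf(z)
    if side == "greater_than":   # right-sided alternative
        return 1.0 - cdf(z)
    if side == "not_equal":      # two-sided alternative
        return 2.0 * (1.0 - cdf(abs(z)))
    raise ValueError("side must be 'not_equal', 'less_than' or 'greater_than'")

p_two_sided = p_value(1.96)
p_left = p_value(-1.0, side="less_than")
```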

4. Report and answer the question: based on the p-value, is the result (assuming H0 is true) statistically significant?

The class does not provide the answer; it is left to the person viewing the reported result. The reports are obtained from the attributes .statistics and .outcome, the latter being assigned to the score's .description. This is illustrated below.

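# z_statistic is assumed to be a z-score (float) computed beforehand
test = {"name": "sign_test", "z_statistic": z_statistic,
        "side": "less_than"}  # the default side is "not_equal"
ht = HtestAboutMedians(observation, prediction, test=test)
score.description = ht.outcome
score.statistics = ht.statistics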

Arguments

Argument Representation Value type
first experiment/observation dictionary that must have the keys "median", "sample_size", "raw_data"
second model prediction float or Quantity array
third (keyword) about test dictionary with the keys "name": string ("sign_test", "signed_rank_test"); "z_statistic": float; "side": string ("not_equal", "less_than", "greater_than"); and any additional names that are specific to the test
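As an illustration of the shapes described above, the arguments might be built as follows. The numbers are made up, and a plain Python list stands in for "raw_data" only for the sake of the example; the library may expect a Quantity or NumPy array instead.

```python
from statistics import median

# Hypothetical experimental data, used only to show the required keys.
raw_data = [2.0, 3.5, 1.0, 4.0, 2.5]

# First argument: the observation dictionary with its three required keys.
observation = {
    "median": median(raw_data),
    "sample_size": len(raw_data),
    "raw_data": raw_data,
}

# Second argument: the model prediction, here a plain float.
prediction = 3.0

# Third (keyword) argument: the test description; these are the default values
# shown in the class signature.
test = {"name": "sign_test", "side": "not_equal", "z_statistic": 0.0}
```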

The constructor generates .statistics and .outcome (the latter is then assigned to description within the validation test class where this hypothesis test class is used).

static alternate_hypothesis(side, symbol_null_value, symbol_sample_statistic)

Returns the statement for the alternate hypothesis, Ha.

get_below_equal_above(data)

Sets the values of the attributes .below, .equal, and .above relative to the null value, \(\eta_0\) = .specified_value.

static null_hypothesis(symbol_null_value, symbol_sample_statistic)

Returns the statement for the null hypothesis, H0.

test_outcome()

Puts together the returned values of null_hypothesis(), alternate_hypothesis(), and _compute_pvalue(). Then returns the string value for .outcome.