Compute test-statistic for Fisher Exact Test on proportions as the categorical variable (FisherExactScore)

class cerebunit.statistics.stat_scores.fisherExactScore.FisherExactScore(*args, **kwargs)

Compute fisher-statistic for Fisher Exact Test of proportions for 2 x 2 tables. This test should be called when sample size conditions for z-test (for proportions) and chi2-test (for proportions) are violated.

Consider a generic 2 x 2 table

Possibilities for categorical    Possibilities for categorical variable, B
variable, A                      Yes          No
-----------------------------    ---------    ---------
a1                               b1           b2
a2                               b3           b4

Then, depending on the question of interest, any one of the cell values (b1 to b4) serves as the corresponding test statistic (i.e., score). The p-value is computed from the hypergeometric distribution.
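As a worked form of the statement above, and assuming the row and column totals of the table are held fixed (the usual setting for the Fisher Exact Test), the hypergeometric probability of observing a particular value in cell b1 can be written as follows. The symbol \(n\) is introduced here for the grand total and does not appear in the original text.

\[
P(b_1) = \frac{\binom{b_1 + b_2}{b_1}\,\binom{b_3 + b_4}{b_3}}{\binom{n}{b_1 + b_3}},
\qquad n = b_1 + b_2 + b_3 + b_4
\]

A one-sided p-value is then the sum of these probabilities over all values of b1 that are at least as extreme as the observed one in the direction of interest.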

This class uses scipy.stats.fisher_exact, which returns two values: the ratio of the odds (odds ratio) and the p-value.

Let us take the example

Groups            people getting cold after
                  Yes    No
--------------    ---    ---
echinacea herb    1      9
placebo           4      6

Then the ratio of the odds is \(\frac{1 \times 6}{4 \times 9}\). For this scenario, if the question is whether use of the herb prevented colds, then the test statistic is the number of colds acquired in the echinacea group.
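The example can be checked directly with scipy.stats.fisher_exact. The following is a minimal sketch, not the class's own code: it assumes rows are ordered echinacea then placebo, columns Yes then No, and it picks the "less" alternative purely for illustration. It also recomputes the p-value from scipy.stats.hypergeom to show where the number comes from.

```python
from scipy.stats import fisher_exact, hypergeom

# Rows: echinacea herb, placebo. Columns: cold (Yes), no cold (No).
table = [[1, 9],
         [4, 6]]

# fisher_exact returns the odds ratio and the p-value.
# alternative="less" is an illustrative choice for this sketch.
odds_ratio, p_less = fisher_exact(table, alternative="less")

# Same one-sided p-value from the hypergeometric distribution:
# population of 20 people, 5 colds in total, 10 people in the echinacea group,
# probability of 1 or fewer colds falling in the echinacea group.
p_hyper = hypergeom.cdf(1, 20, 5, 10)

print("odds ratio:", odds_ratio)        # (1 * 6) / (9 * 4)
print("p-value (less):", p_less)
print("hypergeom check:", p_hyper)
```

The test statistic in this scenario is simply the top-left cell, 1, the number of colds in the echinacea group.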

Use Case:

x = FisherExactScore.compute( observation, prediction, "greater_than" )
score = FisherExactScore(x)

The question being asked is: what is the probability (p-value) that only the score value or fewer would be in the model group just by chance? If the null hypothesis is that this is not by chance but highly probable, then the test is right-sided (greater).

Note: As part of the SciUnit framework, this custom TScore should have the following methods:

  • compute() (class method)
  • sort_key() (property)
  • __str__()
classmethod compute(observation, prediction, sidedness)
Argument           Value type
---------------    -----------------------------------------------------
first argument     dictionary; observation/experimental data must have
                   keys “sample_size” and “success_numbers”
second argument    dictionary; model prediction must also have keys
                   “sample_size” and “success_numbers”
third argument     string; “not_equal”, “greater_than”, “less_than”

Note:

  • unlike most scores in CerebUnit this one takes in a third argument for side of hypothesis testing; this is because of scipy.stats.fisher_exact.
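To make the argument shapes concrete, here is a rough sketch of how two such dictionaries could be turned into a 2 x 2 table and passed to scipy.stats.fisher_exact. It is not CerebUnit's implementation: the helper name fisher_from_dicts, the row order (prediction first, observation second), and the mapping from the documented sidedness strings to SciPy's alternative names are all assumptions made for illustration.

```python
from scipy.stats import fisher_exact

# Assumed mapping from the documented sidedness strings to SciPy's names.
SIDEDNESS_TO_ALTERNATIVE = {
    "not_equal": "two-sided",
    "greater_than": "greater",
    "less_than": "less",
}


def fisher_from_dicts(observation, prediction, sidedness):
    """Hypothetical helper: build a 2 x 2 table from the two dictionaries."""

    def row(data):
        # [successes, failures] for one group
        successes = data["success_numbers"]
        return [successes, data["sample_size"] - successes]

    # Assumption: the model (prediction) group is the first row.
    table = [row(prediction), row(observation)]
    odds_ratio, p_value = fisher_exact(
        table, alternative=SIDEDNESS_TO_ALTERNATIVE[sidedness]
    )
    return {
        "table": table,
        "statistic": table[0][0],  # successes in the model group
        "odds_ratio": odds_ratio,
        "p_value": p_value,
    }


# Echinacea example: placebo treated as observation, echinacea as prediction.
observation = {"sample_size": 10, "success_numbers": 4}
prediction = {"sample_size": 10, "success_numbers": 1}
result = fisher_from_dicts(observation, prediction, "less_than")
print(result)
```

Keeping the sidedness strings in a single lookup table means an unsupported value fails immediately with a KeyError rather than silently running a two-sided test.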