Compute chi^2-statistic for chi^2 goodness-of-fit test on proportions of categories of a categorical variable (`Chi2GOFScore`)¶

class cerebunit.statistics.stat_scores.chi2GOFScore.Chi2GOFScore(*args, **kwargs)¶

Compute the \(\chi^2\)-statistic for the chi-squared goodness-of-fit test of proportions.

One may think of this as a one-way contingency table.

sample size \(n\)	\(k\) categories of a categorical variable of interest
categories	\(x_1\)	\(x_2\)	\(\ldots\)	\(x_k\)
observations	\(O_1\)	\(O_2\)	\(\ldots\)	\(O_k\)
probabilities	\(p_1\)	\(p_2\)	\(\ldots\)	\(p_k\)
expected	\(np_1\)	\(np_2\)	\(\ldots\)	\(np_k\)

Note that the probabilities of the \(k\) categories satisfy \(\sum_{\forall i} p_i = 1\). The expected count for each category is either given directly or derived from its probability as \(np_i\), so that \(\sum_{\forall i} np_i = n\).
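The relation between probabilities and expected counts can be checked numerically. The following is a minimal sketch with made-up values for \(n\) and \(p_i\); the variable names are illustrative and not part of cerebunit.

```python
import numpy as np

n = 100                              # sample size (illustrative value)
p = np.array([0.5, 0.3, 0.2])        # hypothesized probabilities for k = 3 categories

# The probabilities of all categories must sum to 1.
assert np.isclose(p.sum(), 1.0)

expected = n * p                     # E_i = n * p_i
total = expected.sum()               # equals n, up to floating-point rounding
```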

Definitions	Interpretation
\(n\)	sample size; total number of experiments done
\(k\)	number of categories of the categorical variable
\(O_i\)	observed count (frequency) for \(i^{th}\) category
\(p_i\)	probability for \(i^{th}\) category such that \(\sum_{\forall i} p_i = 1\)
\(E_i\)	expected count for \(i^{th}\) category such that \(E_i = n p_i\)
test-statistic	\(\chi^2 = \sum_{\forall i} \frac{(O_i - E_i)^2}{E_i}\)
\(df\)	degrees of freedom, \(df = k-1\)

Compared with a two-way \(\chi^2\) test, the modifications are:

the calculation of expected counts \(E_i = n p_i\)
the degree of freedom \(df = k-1\)

This class uses scipy.stats.chisquare.
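As a rough illustration of what scipy.stats.chisquare returns, the sketch below compares it with the formula \(\chi^2 = \sum_{\forall i} (O_i - E_i)^2 / E_i\) computed by hand. The counts are invented for the example, and how the class passes its arguments to scipy internally is not shown here.

```python
import numpy as np
from scipy.stats import chisquare

observed = np.array([20, 30, 25, 25])                # O_i (illustrative counts)
probabilities = np.array([0.25, 0.25, 0.25, 0.25])   # p_i under the null hypothesis

n = observed.sum()                                   # sample size
expected = n * probabilities                         # E_i = n * p_i

# Test statistic computed directly from its definition.
manual_stat = float(np.sum((observed - expected) ** 2 / expected))

# scipy computes the same statistic; with the default ddof=0 the
# p-value is based on df = k - 1 degrees of freedom.
result = chisquare(f_obs=observed, f_exp=expected)
df = len(observed) - 1
```

Here `result.statistic` agrees with `manual_stat`, and `result.pvalue` is the corresponding p-value for \(df = k-1\).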

Use Case:

x = Chi2GOFScore.compute( observation, prediction )
score = Chi2GOFScore(x)

Note: As part of the SciUnit framework, this custom score class should have the following methods:

compute() (class method)
sort_key() (property)
__str__()

classmethod compute(observation, prediction)¶

Argument	Value type
first argument	dictionary; observation/experimental data must have keys “sample_size” with a number as its value and “observed_freq” whose value is an array
second argument	dictionary; model prediction must have either “probabilities” or “expected” whose value is an array (same length as “observed_freq”)
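The two argument dictionaries might look like the sketch below. The keys come from the table above; the values are invented, and `expected_counts` is a hypothetical helper written only to show how either form of prediction leads to the same expected counts. It is not a cerebunit function.

```python
import numpy as np

def expected_counts(observation, prediction):
    """Hypothetical helper: expected counts from either form of prediction."""
    n = observation["sample_size"]
    if "expected" in prediction:
        expected = np.asarray(prediction["expected"], dtype=float)
    else:
        # Derive E_i = n * p_i from the category probabilities.
        expected = n * np.asarray(prediction["probabilities"], dtype=float)
    if len(expected) != len(observation["observed_freq"]):
        raise ValueError("prediction and observed_freq must have the same length")
    return expected

observation = {"sample_size": 100, "observed_freq": [20, 30, 25, 25]}
prediction_from_p = {"probabilities": [0.25, 0.25, 0.25, 0.25]}
prediction_from_e = {"expected": [25, 25, 25, 25]}

e_from_p = expected_counts(observation, prediction_from_p)
e_from_e = expected_counts(observation, prediction_from_e)
```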

Note:

Chi-squared tests (for goodness-of-fit or contingency tables) are two-sided by nature, so there is no option for one-sidedness.
