statistics - T-test in Python on a frequency table


If I have two lists of numbers x and y, I can run a t-test on them using scipy.stats.ttest_ind(x, y). So far so good. If instead of x and y I have their frequency counts, is there a Pythonic way to run an efficient t-test, or do I have to "manually" calculate the original vectors?

Edit (frequency count): if x = [1,0,3,0,1,3,2], the corresponding frequency count is:

+---+---+
| 0 | 2 |
| 1 | 2 |
| 2 | 1 |
| 3 | 2 |
+---+---+

where the first column is the value and the second is the corresponding count/frequency.
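For reference, a minimal sketch of how such a table can be derived from the raw list, using collections.Counter from the standard library. The name freq_table is only illustrative and does not appear in the original post.

```python
from collections import Counter

x = [1, 0, 3, 0, 1, 3, 2]

# Count how often each value occurs, then sort the (value, count) pairs by value.
freq_table = sorted(Counter(x).items())

for value, count in freq_table:
    print(value, count)
```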

You can use rv_discrete from scipy.stats to generate data according to the distribution given by the frequencies.

Using the example frequency counts you provide in the edit, you can create the random variable like this:

    import scipy.stats as stats

    x = [0, 1, 2, 3]        # distinct values
    freq = [2, 2, 1, 2]     # how often each value occurs
    total = sum(freq)
    p = [i / total for i in freq]  # normalise counts to probabilities
    custm = stats.rv_discrete(name='custm', values=(x, p))

where we take into account that the vector of probabilities p has to sum to 1.

You can then generate data from this distribution easily:

    In [7]: custm.rvs(size=7)
    Out[7]: array([2, 0, 3, 1, 3, 2, 0])
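Note that rvs draws random samples, so the generated data (and any t statistic computed from it) will differ from run to run and will generally not reproduce the original counts. If an exact test on the tabulated data is wanted, here is a rough sketch of two alternatives, not part of the answer above: rebuild the vectors with numpy.repeat and call ttest_ind, or compute the mean, sample standard deviation and size from the table and pass them to scipy.stats.ttest_ind_from_stats. The second frequency table and the helper name summary are made up for illustration.

```python
import numpy as np
from scipy import stats

# First table is the one from the question; the second is hypothetical.
x_vals, x_freq = np.array([0, 1, 2, 3]), np.array([2, 2, 1, 2])
y_vals, y_freq = np.array([0, 1, 2, 3]), np.array([1, 3, 3, 1])

# Option 1: rebuild the original vectors (element order does not affect a t-test).
x = np.repeat(x_vals, x_freq)
y = np.repeat(y_vals, y_freq)
t_expanded, p_expanded = stats.ttest_ind(x, y)

# Option 2: work from summary statistics without materialising the vectors.
def summary(vals, freq):
    """Return (mean, sample std, n) for a value/count table (hypothetical helper)."""
    n = freq.sum()
    mean = (vals * freq).sum() / n
    var = (freq * (vals - mean) ** 2).sum() / (n - 1)  # sample variance, ddof=1
    return mean, np.sqrt(var), n

t_stats, p_stats = stats.ttest_ind_from_stats(*summary(x_vals, x_freq),
                                              *summary(y_vals, y_freq))

print(t_expanded, p_expanded)
print(t_stats, p_stats)
```

Both calls use the default equal-variance (Student) test, so they should agree up to floating-point error. The second option avoids allocating the full vectors, which may matter when the counts are large.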

Hope this helps.

