Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'

Date: 2018-09-21 18:12:20

Tags: python python-3.x statistics statsmodels

When I run my Tukey test, I get this error:

  

Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'

The head of my DataFrame:

    Group    Score
3   A        1.91
4   B        1.7
5   C        1.69
6   D        1.68
7   E        1.49

My Tukey test code:

from statsmodels.stats.multicomp import pairwise_tukeyhsd
from statsmodels.stats.multicomp import MultiComparison

mc = MultiComparison(df['Score'], df['Group'])
result = mc.tukeyhsd()

print(result)
print(mc.groupsunique)


> TypeError                                 Traceback (most recent call last)
> <ipython-input-10-705a07612b72> in <module>()
>       1 mc = MultiComparison(df['Score'], df['Group'])
> ----> 2 result = mc.tukeyhsd()
>       3 
>       4 print(result)
>       5 print(mc.groupsunique)
> 
> /usr/local/lib/python3.6/dist-packages/statsmodels/sandbox/stats/multicomp.py
> in tukeyhsd(self, alpha)
>     964         self.groupstats = GroupsStats(
>     965                             np.column_stack([self.data, self.groupintlab]),
> --> 966                             useranks=False)
>     967 
>     968         gmeans = self.groupstats.groupmean
> 
> /usr/local/lib/python3.6/dist-packages/statsmodels/sandbox/stats/multicomp.py
> in __init__(self, x, useranks, uni, intlab)
>     535 
>     536         #temporary until separated and made all lazy
> --> 537         self.runbasic(useranks=useranks)
>     538 
>     539 
> 
> /usr/local/lib/python3.6/dist-packages/statsmodels/sandbox/stats/multicomp.py
> in runbasic(self, useranks)
>     569         else:
>     570             self.xx = x[:,0]
> --> 571         self.groupsum = groupranksum = np.bincount(self.intlab, weights=self.xx)
>     572         #print('groupranksum', groupranksum, groupranksum.shape, self.groupnobs.shape
>     573         # start at 1 for stats.rankdata :
> 
> TypeError: Cannot cast array data from dtype('O') to dtype('float64')
> according to the rule 'safe'

Does anyone know what this means?

1 Answer:

Answer 0 (score: 1)

Try replacing the line

mc = MultiComparison(df['Score'], df['Group'])

with

mc = MultiComparison(df['Score'].astype('float'), df['Group'])
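For context, the traceback ends in `np.bincount(self.intlab, weights=self.xx)`, which refuses to cast weights of dtype `object` (`dtype('O')`) to `float64`. Below is a minimal sketch of that failure and of the `astype` fix. It assumes the `Score` column was loaded as `object` dtype (for example, numbers stored as strings); the frame is a made-up stand-in built from the head output in the question, and `labels` is a hypothetical substitute for the integer group labels statsmodels computes internally.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: Score held as Python objects (strings), which is an assumption
df = pd.DataFrame({
    "Group": ["A", "B", "C", "D", "E"],
    "Score": pd.Series(["1.91", "1.7", "1.69", "1.68", "1.49"], dtype=object),
})
print(df["Score"].dtype)  # object

# Stand-in for the integer group labels used inside statsmodels
labels = np.arange(len(df))

# Object-dtype weights cannot be safely cast to float64, so bincount raises
try:
    np.bincount(labels, weights=df["Score"].to_numpy())
    raised = False
except TypeError as exc:
    raised = True
    print(exc)

# After converting to float, the same call succeeds
scores = df["Score"].astype("float")
group_sums = np.bincount(labels, weights=scores.to_numpy())
print(group_sums)
```

Checking `df.dtypes` before calling `MultiComparison` is a quick way to see whether `Score` is `object` rather than `float64`.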

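If that fails, some rows probably contain values that cannot be parsed as numbers. In that case you can use the following instead, which converts such values to NaN: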

mc = MultiComparison(pd.to_numeric(df['Score'], errors='coerce'), df['Group'])
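Before coercing, it can help to see which rows are causing the problem. The sketch below is one way to do that, assuming a made-up frame in which one `Score` entry (`"n/a"`) is not numeric; the names `numeric`, `bad_rows`, and `clean` are illustrative only.

```python
import pandas as pd

# Hypothetical data: one Score value cannot be parsed as a number
df = pd.DataFrame({
    "Group": ["A", "B", "C", "D"],
    "Score": pd.Series(["1.91", "1.7", "n/a", "1.68"], dtype=object),
})

# Unparseable values become NaN instead of raising
numeric = pd.to_numeric(df["Score"], errors="coerce")

# Rows that had a value originally but failed the conversion
bad_rows = df[numeric.isna() & df["Score"].notna()]
print(bad_rows)

# One option: keep only the rows whose Score converted cleanly
clean = df.assign(Score=numeric).dropna(subset=["Score"])
print(clean.dtypes)
```

The cleaned columns can then be passed as `MultiComparison(clean['Score'], clean['Group'])`. Whether to drop the offending rows or correct them at the source depends on the data.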