运行Tukey测试时出现一个奇怪的错误。我希望在我尝试了很多之后,有人能够为我提供帮助。这是我的数据框:
Name Score
1 A 2.29
2 B 2.19
这是我的Tukey测试代码:
#TUKEY HSD TEST
tukey = pairwise_tukeyhsd(endog=df['Score'].astype('float'),
groups=df['Name'],
alpha=0.05)
tukey.plot_simultaneous()
plt.vlines(x=49.57,ymin=-0.5,ymax=4.5, color="red")
tukey.summary()
这是错误:
<ipython-input-12-3e12e78a002f> in <module>()
2 tukey = pairwise_tukeyhsd(endog=df['Score'].astype('float'),
3 groups=df['Name'],
----> 4 alpha=0.05)
5
6 tukey.plot_simultaneous()
/usr/local/lib/python3.6/dist-packages/statsmodels/stats/multicomp.py in pairwise_tukeyhsd(endog, groups, alpha)
36 '''
37
---> 38 return MultiComparison(endog, groups).tukeyhsd(alpha=alpha)
/usr/local/lib/python3.6/dist-packages/statsmodels/sandbox/stats/multicomp.py in __init__(self, data, groups, group_order)
794 if group_order is None:
795 self.groupsunique, self.groupintlab = np.unique(groups,
--> 796 return_inverse=True)
797 else:
798 #check if group_order has any names not in groups
/usr/local/lib/python3.6/dist-packages/numpy/lib/arraysetops.py in unique(ar, return_index, return_inverse, return_counts, axis)
221 ar = np.asanyarray(ar)
222 if axis is None:
--> 223 return _unique1d(ar, return_index, return_inverse, return_counts)
224 if not (-ar.ndim <= axis < ar.ndim):
225 raise ValueError('Invalid axis kwarg specified for unique')
/usr/local/lib/python3.6/dist-packages/numpy/lib/arraysetops.py in _unique1d(ar, return_index, return_inverse, return_counts)
278
279 if optional_indices:
--> 280 perm = ar.argsort(kind='mergesort' if return_index else 'quicksort')
281 aux = ar[perm]
282 else:
**TypeError: '<' not supported between instances of 'float' and 'str'**
如何解决此错误?预先感谢!
答案 0 :(得分:1)
您有问题,因为df['Name']
包含浮点数和字符串,并且df['Name']
的类型为pandas.core.series.Series
。从追溯可以看出,这种组合会导致numpy.unique()
出现错误。您可以通过两种方法解决该问题。
tukey = pairwise_tukeyhsd(endog=df['Score'].astype('float'),
groups=list(df['Name']), # list instead of a Series
alpha=0.05)
OR
确保df['Name']
仅包含数字或字符串。