Application error collection

I have a dataset that records the number of defective toys per batch for two types of toys. It looks like this:

import pandas as pd

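# Note: the lists as posted had different lengths (7 types vs. 8 counts);
# the trailing 0 is dropped here so the frame can be built (an assumption).
df = pd.DataFrame({'toy_type': ['A', 'B', 'A', 'A', 'A', 'B', 'B'],
                   'num_of_defective': [3, 5, 6, 4, 1, 2, 1]})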

I need to find out whether batches with fewer than 2 defective toys occur more often for toy A than for toy B.
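To make the comparison concrete, the quantity in question can be written as the share of batches with fewer than 2 defective toys within each toy type. This is only a minimal sketch of that bookkeeping, not a test of significance. The column name `low_defect` is introduced here for illustration, and because the lists as posted differ in length, the sketch assumes only the first seven counts are used.

```python
import pandas as pd

# Assumption: the posted lists differ in length (7 types, 8 counts),
# so only the first seven counts are used here.
df = pd.DataFrame({
    'toy_type': ['A', 'B', 'A', 'A', 'A', 'B', 'B'],
    'num_of_defective': [3, 5, 6, 4, 1, 2, 1],
})

# Flag batches with fewer than 2 defective toys
# ('low_defect' is a hypothetical helper column).
df['low_defect'] = df['num_of_defective'] < 2

# Share of low-defect batches within each toy type.
low_share = df.groupby('toy_type')['low_defect'].mean()

# 2x2 table of batch counts: toy type vs. low-defect flag.
counts = pd.crosstab(df['toy_type'], df['low_defect'])

print(low_share)
print(counts)
```

Framed this way, the comparison is between two proportions of batches rather than between the mean defect counts of the filtered batches.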

I did the following, but I don't know whether it is valid, because the distribution may not be normal:

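from scipy.stats import ttest_ind

alpha = 0.05

# Defect counts for batches with fewer than 2 defective toys, per toy type
a_low = df[(df['toy_type'] == 'A') & (df['num_of_defective'] < 2)]['num_of_defective']
b_low = df[(df['toy_type'] == 'B') & (df['num_of_defective'] < 2)]['num_of_defective']

# Welch's t-test (does not assume equal variances)
ans = ttest_ind(a_low, b_low, equal_var=False)
if ans[1] < alpha:
    print("It's true")
else:
    print("It's false")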

Is it appropriate to use scipy.stats.ttest_ind in this case?

0 answers: