Combining groupby() and qcut() - "could not convert string to float"

Time: 2016-03-29 15:44:02

Tags: python pandas

This is my DataFrame, with about 5000 entries in total:

data.head(3)

    filename    date        var1    var2    age     sex
0   file1.jpg   2012-01-17  132.32  199.17  31.0    2.0
1   file2.jpg   2012-01-17  134.88  196.50  31.0    2.0
2   file3.jpg   2012-01-17  151.19  209.07  31.0    2.0
3   ...

I want to split this dataset into 10 quantile groups based on var1. No problem:

data['var1_groups'] = pd.qcut(data['var1'], 10)
data.head(3)

    filename    date        var1    var2    age     sex   var1_groups
0   file1.jpg   2012-01-17  132.32  199.17  31.0    2.0   (129.488, 133.659]
1   file2.jpg   2012-01-17  134.88  196.50  31.0    2.0   (133.659, 138.176]
2   file3.jpg   2012-01-17  151.19  209.07  31.0    2.0   (148.196, 153.09]
3   ...
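
To make the type of the new column explicit, here is a minimal sketch on synthetic data (the `demo` frame and its randomly generated `var1` values are stand-ins, not the original dataset). In current pandas versions, `pd.qcut` returns a categorical column whose categories are intervals rather than numbers:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (assumption: 500 random values).
rng = np.random.default_rng(0)
demo = pd.DataFrame({"var1": rng.normal(140, 10, 500)})

# Split var1 into 10 quantile bins, as in the question.
demo["var1_groups"] = pd.qcut(demo["var1"], 10)

print(demo["var1_groups"].dtype)           # a categorical dtype
print(demo["var1_groups"].cat.categories)  # the 10 interval bins
print(demo["var1_groups"].value_counts())  # number of rows per bin
```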

Now, within each var1 group, I want to subdivide further into age quantile groups. So I tried this:

data['age_groups'] = data.groupby(['var1_groups'])['age'].transform(lambda x: pd.qcut(x, 3))

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-164-998f598868ed> in <module>()
----> 1 data['age_groups'] = data.groupby(['var1_groups'])['age'].transform(lambda x: pd.qcut(x, 3))

/usr/lib/miniconda3/envs/python3.5/lib/python3.5/site-packages/pandas/core/groupby.py in transform(self, func, *args, **kwargs)
   2762 
   2763             indexer = self._get_index(name)
-> 2764             result[indexer] = res
   2765 
   2766         result = _possibly_downcast_to_dtype(result, dtype)

ValueError: could not convert string to float: '(53, 69]'   

What is happening here? Is pandas trying to cast the resulting categories back to the dtype of the original column, or something else?
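
One possible reading of the traceback, offered as an assumption rather than a confirmed answer: in that pandas version, `transform` appears to write each group's result into an array that has the dtype of the input column (`result[indexer] = res`), so interval labels such as '(53, 69]' cannot be stored in a float array. A minimal sketch of a workaround is to ask `qcut` for integer bin codes via `labels=False`, so that every group returns numbers. The data below is synthetic, and the random `var1` and `age` values are stand-ins for the real dataset:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (assumption: continuous random values,
# so the quantile edges are unique).
rng = np.random.default_rng(0)
n = 600
data = pd.DataFrame({
    "var1": rng.normal(140, 10, n),
    "age": rng.uniform(18, 90, n),
})

# First level: 10 quantile groups on var1, as in the question.
data["var1_groups"] = pd.qcut(data["var1"], 10)

# Second level: age terciles within each var1 group.
# labels=False makes qcut return integer bin codes instead of interval labels,
# so transform only has to handle numeric values.
data["age_groups"] = (
    data.groupby("var1_groups", observed=True)["age"]
        .transform(lambda x: pd.qcut(x, 3, labels=False))
)

print(data.head())
```

The codes 0, 1 and 2 denote the lowest, middle and highest age tercile within each var1 group. With real ages, which contain many repeated values, `qcut` may complain about non-unique bin edges; its `duplicates='drop'` option is one way to handle that, at the cost of fewer bins in some groups.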

0 answers:

No answers