Grouping columns by data type in a pandas Series raises TypeError: data type not understood

Time: 2018-12-08 00:50:55

Tags: python pandas dataframe pandas-groupby

I am grouping the columns by their dtype, as follows:

groups = frame.columns.to_series().groupby(frame.dtypes).groups

I get the error message:

TypeError: data type not understood

What is the correct way to group columns by data type so that this error does not occur?

Edit: sample input

0             0       0  1985     ATL        NL  barkele01  870000         428.0   428.0      1955.0    ...           Leonard Harold   225.0   77.0    R      R  1976-09-14  1987-09-26  barkl001  barkele01       both
1             1       1  1985     ATL        NL  bedrost01  550000         559.0   559.0      1957.0    ...            Stephen Wayne   200.0   75.0    R      R  1981-08-14  1995-08-09  bedrs001  bedrost01       both
2             2       2  1985     ATL        NL  benedbr01  545000         614.0   614.0      1955.0    ...              Bruce Edwin   175.0   73.0    R      R  1978-08-18  1989-09-11  beneb001  benedbr01       both
3             3       3  1985     ATL        NL   campri01  633333           NaN     NaN         NaN    ...                      NaN     NaN    NaN  NaN    NaN         NaN         NaN       NaN        NaN  left_only
4             4       4  1985     ATL        NL  ceronri01  625000        1466.0  1466.0      1954.0    ...             Richard Aldo   192.0   71.0    R      R  1975-08-17  1992-07-10  ceror001  ceronri01       both
5             5       5  1985     ATL        NL  chambch01  800000        1481.0  1481.0      1948.0    ...      Carroll Christopher   195.0   73.0    L      R  1971-05-28  1988-05-08  chamc001  chambch01       both

The expected output would look like:

{float: [columns], int:[columns], string:[columns]}

1 Answer:

Answer 0 (score: 1)

You can use groupby with axis=1:

type_dct = {str(k): list(v) for k, v in df.groupby(df.dtypes, axis=1)}

For your example dataframe, this gives:

{'int64': [0, 1, 2, 3, 7],
 'float64': [8, 9, 10, 14, 15],
 'object': [4, 5, 6, 11, 12, 13, 16, 17, 18, 19, 20, 21, 22]}
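As far as I know, recent pandas releases deprecate `DataFrame.groupby(..., axis=1)` (since 2.1), so the one-liner above may warn or fail there. A minimal sketch of the same grouping without groupby, iterating over `df.dtypes` directly; the helper name `columns_by_dtype` and the small sample frame are made up for illustration and are not the asker's data:

```python
from collections import defaultdict

import pandas as pd


def columns_by_dtype(df):
    """Map each dtype name (as a string) to the list of column labels with that dtype."""
    groups = defaultdict(list)
    # df.dtypes is a Series indexed by column label, with the dtype as value.
    for col, dtype in df.dtypes.items():
        groups[str(dtype)].append(col)
    return dict(groups)


# Hypothetical sample frame, loosely modelled on the question's columns.
df = pd.DataFrame({
    "yearID": [1985, 1985],
    "salary": [870000, 550000],
    "weight": [225.0, 200.0],
    "playerID": ["barkele01", "bedrost01"],
    "_merge": pd.Categorical(["both", "left_only"]),
})

result = columns_by_dtype(df)
print(result)
```

Because the keys are plain strings built with `str(dtype)`, this also copes with non-NumPy dtypes such as `category`, which never have to be compared or hashed as NumPy dtypes.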

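Note that, at the time of writing, pandas has no dedicated string dtype for a Series. String columns have the object dtype, which represents pointers to arbitrary Python objects (including strings). For more details, see Strings in a DataFrame, but dtype is object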
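If the goal is a "string" bucket like the one in the question's expected output, one possible approach is to inspect the values of object columns with `pandas.api.types.infer_dtype`. This is only a sketch: the helper name `label_column` and the sample frame are assumptions for illustration, and the columns are forced to object dtype explicitly so the example does not depend on how a given pandas version infers string columns.

```python
import pandas as pd
from pandas.api.types import infer_dtype


def label_column(col):
    """Return a dtype label, looking inside object columns at the actual values."""
    if col.dtype == object:
        # infer_dtype inspects the values, e.g. "string" when all are str.
        return infer_dtype(col)
    return str(col.dtype)


# Hypothetical sample frame; object dtype is set explicitly for the demo.
df = pd.DataFrame({
    "weight": [225.0, 200.0],
    "playerID": pd.Series(["barkele01", "bedrost01"], dtype=object),
    "mixed": pd.Series(["R", 1], dtype=object),
})

labels = {name: label_column(df[name]) for name in df.columns}
print(labels)
```

Here an object column is only labelled "string" when every value in it is a Python str; a column mixing strings with other objects gets a different label, which shows why object cannot simply be treated as "string".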