How to convert a set of rows into a pandas DataFrame

Time: 2018-01-26 14:12:55

Tags: python pandas dataframe

I have an input dataset in the following format:

[image: input data format]

The input data is generated by the following code snippet.

    for i in range(0, 10):
        my_list = df1.iloc[i].split(",")
        for x in my_list:
            if x in Waterbodies:
                print(i, "Waterbodies")
            if x in Beaches:
                print(i, "Beaches")

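I would like to add these results to a DataFrame in the format below.

[image: output data format]

I have tried several examples, but none of them worked. How should I do this?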

1 Answer:

Answer 0 (score: 1)

You can use:

df.groupby('ID')['Cat'].apply(list)

Output:

ID
0    [Waterbodies, Beaches]
1    [Waterbodies, Beaches]
3                 [Beaches]
7    [Waterbodies, Beaches]
8             [Waterbodies]
Name: Cat, dtype: object
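
The line above assumes a DataFrame `df` that already has `ID` and `Cat` columns. One possible way to get there from the loop in the question is to collect the pairs in a list instead of printing them. This is only a sketch: it assumes `df1` is a Series of comma-separated strings and that `Waterbodies` and `Beaches` are collections of names, and the sample values are made up because the question does not show the real data.

```python
import pandas as pd

# Hypothetical sample data; the real values are not shown in the question.
Waterbodies = {"lake", "river"}
Beaches = {"shore", "coast"}
df1 = pd.Series(["lake,shore", "river,coast", "city", "coast"])

# Map each category label to the collection it is checked against.
categories = {"Waterbodies": Waterbodies, "Beaches": Beaches}

# Collect (row number, category) pairs instead of printing them.
rows = []
for i in range(len(df1)):
    for x in df1.iloc[i].split(","):
        for name, members in categories.items():
            if x in members:
                rows.append((i, name))

# Build the long-format frame, then gather the categories per ID.
df = pd.DataFrame(rows, columns=["ID", "Cat"])
result = df.groupby("ID")["Cat"].apply(list)
print(result)
```

Rows with no match (row 2 in this sample) simply do not appear in the result, which matches the gaps in the output shown above.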

Here is an MCVE:

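import pandas as pd

# One category per entry; the index holds the row id each category was found in
d = pd.Series(['Waterbodies','Beaches','Waterbodies','Beaches','Beaches','Waterbodies','Beaches','Waterbodies'],index=[0,0,1,1,3,7,7,8])

d = d.rename('Cat')

d.index.name = 'id'

# Collect the categories for each id into a list, then move the index into a column
d.groupby('id').apply(list).reset_index()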

Output:

   id                     Cat
0   0  [Waterbodies, Beaches]
1   1  [Waterbodies, Beaches]
2   3               [Beaches]
3   7  [Waterbodies, Beaches]
4   8           [Waterbodies]
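
If the target format needs a plain string per id rather than a Python list, the same grouping can join the values instead. The desired output image is not available here, so this is a guess at one plausible variant, not necessarily what the asker wanted; it reuses the MCVE data unchanged.

```python
import pandas as pd

# Same data as the MCVE above.
d = pd.Series(
    ["Waterbodies", "Beaches", "Waterbodies", "Beaches",
     "Beaches", "Waterbodies", "Beaches", "Waterbodies"],
    index=[0, 0, 1, 1, 3, 7, 7, 8],
    name="Cat",
)
d.index.name = "id"

# Join the categories of each id into one comma-separated string.
joined = d.groupby(level="id").agg(", ".join).reset_index()
print(joined)
```

Strings are easier to write to CSV or display, while lists are easier to process further, so the choice depends on what happens to the column next.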