I have an input dataset in the following format (image: input data format).
The input data is generated by the following code snippet.
for i in range(10):
    # Split the comma-separated entry in row i into individual terms
    my_list = df1.iloc[i].split(",")
    for x in my_list:
        if x in Waterbodies:
            print(i, "Waterbodies")
        if x in Beaches:
            print(i, "Beaches")
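I would like to add these results to a DataFrame, one row per index with its matching categories.
I have tried several examples, but none of them worked. What should I do?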
Answer 0 (score: 1)
You can use:
df.groupby('ID')['Cat'].apply(list)
Output:
ID
0    [Waterbodies, Beaches]
1    [Waterbodies, Beaches]
3                 [Beaches]
7    [Waterbodies, Beaches]
8             [Waterbodies]
Name: Cat, dtype: object
Here is an MCVE:
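import pandas as pd
# Category labels, indexed by the row id in which each was found
d = pd.Series(['Waterbodies','Beaches','Waterbodies','Beaches','Beaches','Waterbodies','Beaches','Waterbodies'],index=[0,0,1,1,3,7,7,8])
d = d.rename('Cat')
d.index.name = 'id'
# Collect the labels for each id into a list, then turn the index into a column
d.groupby('id').apply(list).reset_index()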
Output:
   id                     Cat
0   0  [Waterbodies, Beaches]
1   1  [Waterbodies, Beaches]
2   3               [Beaches]
3   7  [Waterbodies, Beaches]
4   8           [Waterbodies]
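The groupby call above assumes the (index, category) pairs are already in a DataFrame with ID and Cat columns, whereas the loop in the question only prints them. One possible way to bridge the two is to collect the pairs in a list instead of printing. This is a rough sketch, not necessarily what the answer's author had in mind: the contents of df1, Waterbodies and Beaches below are made-up stand-ins, and tag_rows is a hypothetical helper name.

```python
import pandas as pd

# Made-up stand-ins for the question's data (the real values are not shown there)
Waterbodies = {"lake", "river"}
Beaches = {"beach", "shore"}
df1 = pd.Series(["lake,beach", "river,shore", "forest", "shore"])


def tag_rows(series, categories):
    """Return a DataFrame of (ID, Cat) pairs, one per matching term."""
    rows = []
    for i, text in enumerate(series):
        for x in text.split(","):
            # Same membership tests as the original loop, but the pair is
            # stored instead of printed
            for name, members in categories.items():
                if x in members:
                    rows.append((i, name))
    return pd.DataFrame(rows, columns=["ID", "Cat"])


pairs = tag_rows(df1, {"Waterbodies": Waterbodies, "Beaches": Beaches})

# Same grouping step as in the answer: one list of categories per ID
result = pairs.groupby("ID")["Cat"].apply(list).reset_index()
print(result)
```

Rows with no matching term (index 2 in this toy data) simply do not appear in the result, which matches the gaps in the ID column of the output above.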