I have the following in the data frame nbr:
||Postal_Code|Borough|Neighborhood|
|0|M3A|North York|Parkwoods|
|1|M4A|North York|Victoria Village|
|2|M5A|Downtown Toronto|Harbourfront|
|3|M5A|Downtown Toronto|Regent Park|
|4|M6A|North York|Lawrence Heights|
|5|M6A|North York|Lawrence Manor|
|6|M7A|Queen’s Park|Queen’s Park|
I want to run Python code that merges rows 4 and 5 into a single row and returns the result below. (I tried the groupby and agg methods, but they did not work here.)
||Postal_Code|Borough|Neighborhood|
|0|M3A|North York|Parkwoods|
|1|M4A|North York|Victoria Village|
|2|M5A|Downtown Toronto|Harbourfront|
|3|M5A|Downtown Toronto|Regent Park|
|4|M6A|North York|Lawrence Heights , Lawrence Manor|
|5|M7A|Queen’s Park|Queen’s Park|
Here is my code:
import pandas as pd

nbr1.index = pd.RangeIndex(len(nbr1.index))
# More than one neighborhood can exist in one postal code area.
for row_index, row in nbr1.iterrows():
    if row_index + 1 >= len(nbr1):
        break  # the last row has no next row to compare against
    if nbr1.loc[row_index, 'Postal_Code'] == nbr1.loc[row_index + 1, 'Postal_Code']:
        print('inside same Postal code')
        print(nbr1.loc[row_index, 'Postal_Code'])
        print(nbr1.loc[row_index + 1, 'Postal_Code'])
        if nbr1.loc[row_index, 'Borough'] == nbr1.loc[row_index + 1, 'Borough']:
            print('inside same Borough')
            print(nbr1.loc[row_index, 'Borough'])
            print(nbr1.loc[row_index + 1, 'Borough'])
            print(nbr1.loc[row_index, 'Neighborhood'])
            print(nbr1.loc[row_index + 1, 'Neighborhood'])
            print('Adding')
            # Join this row's neighborhood with the next row's; the next row is not dropped here.
            nbr1.loc[row_index, 'Neighborhood'] = '-'.join(
                [nbr1.loc[row_index, 'Neighborhood'], nbr1.loc[row_index + 1, 'Neighborhood']])
Answer 0 (score: 1)
You can use groupby and agg:
df.groupby('Postal_Code').agg({'Borough': 'first',
                               'Neighborhood': ', '.join}).reset_index()
   Postal_Code  Borough           Neighborhood
0  M3A          North York        Parkwoods
1  M4A          North York        Victoria Village
2  M5A          Downtown Toronto  Harbourfront, Regent Park
3  M6A          North York        Lawrence Heights, Lawrence Manor
4  M7A          Queen’s Park      Queen’s Park
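For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same groupby/agg call. The data is copied from the question's table. The variable name `merged` and the `sort=False` argument (used to keep the groups in their original order) are my own additions, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the table in the question.
nbr = pd.DataFrame({
    "Postal_Code": ["M3A", "M4A", "M5A", "M5A", "M6A", "M6A", "M7A"],
    "Borough": ["North York", "North York", "Downtown Toronto",
                "Downtown Toronto", "North York", "North York",
                "Queen’s Park"],
    "Neighborhood": ["Parkwoods", "Victoria Village", "Harbourfront",
                     "Regent Park", "Lawrence Heights", "Lawrence Manor",
                     "Queen’s Park"],
})

# One output row per postal code: keep the first borough seen in each
# group and join all neighborhood names with ", ".
# sort=False (an assumption, not in the answer) keeps first-seen order.
merged = (
    nbr.groupby("Postal_Code", sort=False)
       .agg({"Borough": "first", "Neighborhood": ", ".join})
       .reset_index()
)

print(merged)
```

Note that this collapses the two M5A rows as well, just like the output shown in the answer, whereas the expected table in the question keeps them separate. If only some postal codes should be merged, the grouping key would have to encode that rule.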