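Merging two rows in a DataFrame

Time: 2017-11-22 12:53:57

Tags: python-3.x pandas

I have a DataFrame with a column named Side, whose value is either E or W in the sample below. I would like to merge each such pair of rows into a single row. What should happen: the Parking_Spaces and Total_Vehicle_Count columns must hold the sum of the two rows, the Side column must be dropped, and the number of rows must be half of what it was before.

Is there a simple way to do this?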

Elmntkey    Study_Area  Sub_Area    Side    Unitdesc Parking_Category   Parking_Spaces  Total_Vehicle_Count Dp_Count    Construction    Event Closure   Subarea Label   Peak Hour? (Yes or No)  Day Time stamp                                                      
2014-04-08 08:00:00 24558   12th Ave - Weekday  unknown E   12TH AVE BETWEEN E MARION ST AND E SPRING ST    Paid Parking    8.0 1.0 0   No  No  12th Ave - Weekday  No  Weekday
2014-04-08 08:00:00 24557   12th Ave - Weekday  unknown W   12TH AVE BETWEEN E MARION ST AND E SPRING ST    Paid Parking    11.0    6.0 1   No  No  12th Ave - Weekday  No  Weekday
2014-04-08 09:00:00 24557   12th Ave - Weekday  unknown W   12TH AVE BETWEEN E MARION ST AND E SPRING ST    Paid Parking    11.0    6.0 1   No  No  12th Ave - Weekday  No  Weekday
2014-04-08 09:00:00 24558   12th Ave - Weekday  unknown E   12TH AVE BETWEEN E MARION ST AND E SPRING ST    Paid Parking    8.0 1.0 0   No  No  12th Ave - Weekday  No  Weekday
2014-04-08 10:00:00 24557   12th Ave - Weekday  unknown W   12TH AVE BETWEEN E MARION ST AND E SPRING ST    Paid Parking    11.0    10.0    1   No  No  12th Ave - Weekday  No  Weekday

2 answers:

Answer 0: (score: 1)

This can be done with df.groupby:
df.groupby(['Elmntkey','Study_Area','Sub_Area',' Unitdesc','Dp_Count',' Construction',' Event Closure','Subarea Label','Peak Hour? (Yes or No)','Day Time stamp'])[['Parking_Spaces','Total_Vehicle_Count']].sum().reset_index()

Output

   Elmntkey          Study_Area Sub_Area                                      Unitdesc  Dp_Count  Construction  Event Closure       Subarea Label Peak Hour? (Yes or No) Day Time stamp Parking_Spaces  Total_Vehicle_Count
0     24557  12th Ave - Weekday  unknown  12TH AVE BETWEEN E MARION ST AND E SPRING ST         1            No             No  12th Ave - Weekday                     No        Weekday           33.0                 22.0
1     24558  12th Ave - Weekday  unknown  12TH AVE BETWEEN E MARION ST AND E SPRING ST         0            No             No  12th Ave - Weekday                     No        Weekday           16.0                  2.0
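For a self-contained illustration of the same groupby-and-sum idea, here is a minimal sketch built from four of the question's sample rows. It rests on a few assumptions that are not in the original: the time stamp is held in a regular column named Time_Stamp, the rows are paired on time stamp plus Unitdesc, and only a handful of columns are kept. Elmntkey is dropped together with Side because it differs between the E and W rows.

```python
import pandas as pd

# Four sample rows adapted from the question; only a few columns are kept.
# Holding the time stamp in a column called "Time_Stamp" is an assumption.
segment = "12TH AVE BETWEEN E MARION ST AND E SPRING ST"
rows = [
    ("2014-04-08 08:00:00", 24558, "E", segment, 8.0, 1.0),
    ("2014-04-08 08:00:00", 24557, "W", segment, 11.0, 6.0),
    ("2014-04-08 09:00:00", 24557, "W", segment, 11.0, 6.0),
    ("2014-04-08 09:00:00", 24558, "E", segment, 8.0, 1.0),
]
df = pd.DataFrame(
    rows,
    columns=["Time_Stamp", "Elmntkey", "Side", "Unitdesc",
             "Parking_Spaces", "Total_Vehicle_Count"],
)

# Drop the columns that differ between the two sides, then sum the
# counts for every (time stamp, street segment) pair.
merged = (
    df.drop(columns=["Side", "Elmntkey"])
      .groupby(["Time_Stamp", "Unitdesc"], as_index=False)
      .sum()
)
print(merged)
```

With these four rows, each time stamp ends up with a single row holding the combined counts of both sides.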

Answer 1: (score: 0)

Building on Shijo's answer, I solved the problem with the following code:

# Sum the two count columns for each time stamp and street segment.
# The columns are selected with a list; newer pandas versions no longer accept the bare tuple form.
temp = df['raw'].groupby(['Time_Stamp', 'Unitdesc'], as_index=False)[['Parking_Spaces', 'Total_Vehicle_Count']].sum()

# Set Time_Stamp as the index and sort by it, to match the target dataframe
temp = temp.set_index('Time_Stamp')
temp.sort_index(inplace=True)

# Save the result to the target dataframe
df['droped']['Free_Spots'] = temp['Parking_Spaces']
df['droped']['Used_Spots'] = temp['Total_Vehicle_Count']
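
The two assignments above depend on pandas matching rows by index label when a Series is assigned to a DataFrame column. A minimal sketch of that behaviour follows; the frame named target and all values in it are hypothetical, and the sketch assumes each time stamp appears only once in both frames.

```python
import pandas as pd

# Hypothetical grouped result, indexed by time stamp (one row each).
temp = pd.DataFrame(
    {"Parking_Spaces": [19.0, 18.0], "Total_Vehicle_Count": [7.0, 9.0]},
    index=pd.Index(["2014-04-08 08:00:00", "2014-04-08 09:00:00"],
                   name="Time_Stamp"),
)

# Hypothetical target frame: same index labels, but in a different order.
target = pd.DataFrame(
    {"Day": ["Weekday", "Weekday"]},
    index=pd.Index(["2014-04-08 09:00:00", "2014-04-08 08:00:00"],
                   name="Time_Stamp"),
)

# Column assignment aligns on the index label, not on row position.
target["Free_Spots"] = temp["Parking_Spaces"]
target["Used_Spots"] = temp["Total_Vehicle_Count"]
print(target)
```

If several street segments share a time stamp, the index is no longer unique, and a different key (for example time stamp plus Unitdesc) would probably be needed for the alignment to work.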

Credit goes to Shijo for providing the correct answer.