I have the following DataFrame, which contains information from weather stations:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Code Weather Station': ['1024', '1024', '1024', '2089',
                                            '2089', '2089', '8974'],
                   'Instrumentation': ['Pluviometer-Analog', 'speedometer', 'incidence-sun',
                                       'speedometer', 'Pluviometer', 'speedometer',
                                       'Pluviometer']})
I would like to group the instruments by weather station.
I tried using groupby together with sum(), as follows:
df_New = df.groupby('Code Weather Station', as_index=False)['Instrumentation'].sum()
The grouping works as expected. However, I would like the instrument names to be separated by a space.
print(df_New)
Code Weather Station  Instrumentation
1024                  Pluviometer-Analogspeedometerincidence-sun
2089                  speedometerPluviometerspeedometer
8974                  Pluviometer
I would like the output to be:
Code Weather Station  Instrumentation
1024                  Pluviometer-Analog speedometer incidence-sun
2089                  speedometer Pluviometer speedometer
8974                  Pluviometer
Thanks.
Answer 0 (score: 1)
Oh! Like this, with reset_index():
df.groupby('Code Weather Station')['Instrumentation'].apply(lambda x: ' '.join(x)).reset_index()
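To make the difference from sum() concrete, here is a minimal sketch that runs the same idea against the sample data from the question. The variable name joined is only illustrative, and ' '.join is passed directly in place of the lambda, which should behave the same way. The key point is that sum() concatenates strings with no separator, while ' '.join puts a single space between them.

```python
import pandas as pd

df = pd.DataFrame({
    'Code Weather Station': ['1024', '1024', '1024', '2089', '2089', '2089', '8974'],
    'Instrumentation': ['Pluviometer-Analog', 'speedometer', 'incidence-sun',
                        'speedometer', 'Pluviometer', 'speedometer', 'Pluviometer'],
})

# Join the instrument names of each station with a single space,
# then turn the group key back into a regular column.
joined = (
    df.groupby('Code Weather Station')['Instrumentation']
      .apply(' '.join)
      .reset_index()
)

print(joined)
```

The result has one row per station, with 'Code Weather Station' as an ordinary column rather than the index.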
Answer 1 (score: 0)
You should avoid apply here, because it is inefficient. You could try the following instead:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Code Weather Station': ['1024', '1024', '1024', '2089',
                                            '2089', '2089', '8974'],
                   'Instrumentation': ['Pluviometer-Analog', 'speedometer', 'incidence-sun',
                                       'speedometer', 'Pluviometer', 'speedometer',
                                       'Pluviometer']})
def process(x):
    # x holds the 'Instrumentation' values of one station
    return " ".join(x)
# The (name, function) tuple names the aggregated column 'Instrumentation'
df_new = df.groupby('Code Weather Station').agg({
    'Instrumentation': [('Instrumentation', process)]
})
# agg with a list produces two column levels; drop the outer one
df_new.columns = df_new.columns.droplevel()
df_new
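A shorter variant of the same agg-based idea, offered as a sketch rather than as what this answer's author intended: named aggregation (assuming pandas 0.25 or later) gives the output column its name directly, so no droplevel() step is needed, and as_index=False keeps the station code as a regular column. The name df_named is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Code Weather Station': ['1024', '1024', '1024', '2089', '2089', '2089', '8974'],
    'Instrumentation': ['Pluviometer-Analog', 'speedometer', 'incidence-sun',
                        'speedometer', 'Pluviometer', 'speedometer', 'Pluviometer'],
})

# Named aggregation: output_column=(input_column, function).
# ' '.join receives the values of each group and joins them with a space.
df_named = df.groupby('Code Weather Station', as_index=False).agg(
    Instrumentation=('Instrumentation', ' '.join)
)

print(df_named)
```

Unlike the version above, this leaves 'Code Weather Station' as a column instead of the index, which matches the layout shown in the question.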