How do I add new columns to a dataframe based on the rows of another dataframe?

Asked: 2019-03-22 14:41:22

Tags: python-3.x pandas dataframe row multiple-columns

I have two dataframes:

DF1 (which I have just resampled):

 Mi_pollution.head():



   Sensor_ID     Time_Instant    Measurement
0    10273   2013-11-01 00:00:00    46
1    10273   2013-11-01 01:00:00    51
2    10273   2013-11-01 02:00:00    39
3    10273   2013-11-01 03:00:00    30
4    10273   2013-11-01 04:00:00    37

And DF2:

Pollutants.head():

    Sensor_ID     Sensor_Street_Name    Sensor_Lat  Sensor_Long  Sensor_Type   UOM   Time_Instant
 0  20020   Milano -via Carlo Pascal    45.478452   9.235016     Ammonia       µg/m   YYYY/MM/DD
 1  17127   Milano - viale Marche       45.496067   9.193023     Benzene       µg/m   YYYY/MM/DD HH24:MI
 2  17126   Milano -via Carlo Pascal    45.478452   9.235016     Benzene       µg/m   YYYY/MM/DD HH24:MI
 3  6057    Milano - via Senato         45.470780   9.197180     Benzene       µg/m   YYYY/MM/DD HH24:MI
 4  6062    Milano - P.zza Zavattari    45.476089   9.143509     Benzene       µg/m   YYYY/MM/DD HH24:MI

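What I want to do is create new columns, one per pollutant, add them to DF1, and assign each measurement to the column that matches its sensor. For example: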

 Sensor_ID    Time_Instant      Ammonia    Benzene   Nitrogene  …...
0   20020   2013-12-01 00:00:00   4.8       NaN       NaN
1   20020   2013-12-01 01:00:00   5.3       NaN       NaN
2   20020   2013-12-01 02:00:00   3.0       NaN       NaN
.
.
56  14330   2013-11-01 00:00:00   NaN      6.3        NaN
57  14330   2013-11-01 01:00:00   NaN      5.3        NaN
.
. 

Any suggestions would be greatly appreciated. Thanks, everyone.

1 Answer:

Answer 0: (score: 0)

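Assuming you want to join on Sensor_ID (in the small example you gave, the two dataframes have no Sensor_IDs in common), you can merge the dataframes on Sensor_ID (and possibly on Time_Instant as well?).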
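Since the sample rows share no Sensor_IDs, it may be worth checking which measurements actually find a match before reshaping anything. The following is a minimal sketch with made-up toy data (the frames and values are assumptions, not the asker's real data); it relies on the `indicator=True` option of `DataFrame.merge`, which adds a `_merge` column saying where each row came from.

```python
import pandas as pd

# Toy stand-ins for DF1 and DF2; the values are invented for illustration.
df1 = pd.DataFrame({
    "Sensor_ID": [10273, 10273, 20020],
    "Measurement": [46, 51, 4.8],
})
df2 = pd.DataFrame({
    "Sensor_ID": [20020, 17127],
    "Sensor_Type": ["Ammonia", "Benzene"],
})

# A left merge keeps every measurement; _merge marks rows with no sensor metadata.
checked = df1.merge(df2, on="Sensor_ID", how="left", indicator=True)

# Sensor_IDs present in df1 but missing from df2.
unmatched = checked.loc[checked["_merge"] == "left_only", "Sensor_ID"].unique()
print(unmatched)
```

Rows flagged `left_only` would end up with a missing Sensor_Type, so they would not land in any pollutant column after pivoting.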

Then you can use pivot_table to turn the row values of Sensor_Type into column headers and fill the cells with Measurement.

For example:

    df3 = (
        df1.merge(df2, on='Sensor_ID', how='left')
           # 'Other columns' is a placeholder for any further columns you want to keep
           .pivot_table(index=['Sensor_ID', 'Sensor_Street_Name', 'Other columns'],
                        values='Measurement', columns='Sensor_Type')
           .reset_index()
    )
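To make the merge-then-pivot idea concrete, here is a small self-contained sketch. The two frames are toy stand-ins with invented values, and indexing on Sensor_ID plus Time_Instant is an assumption based on the desired output in the question. Only Sensor_ID and Sensor_Type are taken from the second frame, because both frames have a Time_Instant column and merging them whole would produce suffixed `Time_Instant_x` / `Time_Instant_y` columns.

```python
import pandas as pd

# Toy stand-in for Mi_pollution (DF1): hourly measurements per sensor.
df1 = pd.DataFrame({
    "Sensor_ID": [20020, 20020, 17127, 17127],
    "Time_Instant": pd.to_datetime([
        "2013-12-01 00:00:00", "2013-12-01 01:00:00",
        "2013-11-01 00:00:00", "2013-11-01 01:00:00",
    ]),
    "Measurement": [4.8, 5.3, 6.3, 5.3],
})

# Toy stand-in for Pollutants (DF2): one metadata row per sensor.
df2 = pd.DataFrame({
    "Sensor_ID": [20020, 17127],
    "Sensor_Type": ["Ammonia", "Benzene"],
    "Time_Instant": ["YYYY/MM/DD", "YYYY/MM/DD HH24:MI"],
})

# Attach each sensor's pollutant type to its measurements.
merged = df1.merge(df2[["Sensor_ID", "Sensor_Type"]], on="Sensor_ID", how="left")

# Spread Sensor_Type values into columns, filled with Measurement.
wide = (
    merged.pivot_table(index=["Sensor_ID", "Time_Instant"],
                       columns="Sensor_Type", values="Measurement")
          .reset_index()
)
wide.columns.name = None  # drop the leftover "Sensor_Type" axis label
print(wide)
```

Note that `pivot_table` aggregates with the mean by default, so if a sensor had several measurements for the same Time_Instant they would be averaged; with one reading per sensor and timestamp the values pass through unchanged.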