I have 2 dataframes that I want to concatenate with each other as follows:
df1:
index 394 min FIC-2000 398 min FFC
0 Recycle Gas min 20K20 Compressor min 20k
1 TT date kg/h AT date ..
2 nan 2011-03-02 -20.7 2011-03-02
08:00:00 08:00:00
3 nan 2011-03-02 -27.5 ...
08:00:10
df2:
index Unnamed:0 0 1 .. 394 395 .....
0 Service Prop Prop1 Recycle Gas RecG
The output df3 should look like this:
df3
index Unnamed:0 0 .. 394 395..
0 Service Prop Recycle Gas RecG
1 Recycle Gas min FIC-2000
2 min 20K20
3 TT date kg/h
4 nan 2011-03-02 -20.7
08:00:00
5 nan 2011-03-02 -27.5
08:00:10
I tried this code:
df3 = pd.concat([df1, df2], axis=1)
but this only joins on index 394, and the rest of df1 is appended to the end of the df2 dataframe. Any idea how to do this?
Answer 0: (score: 0)
Just change it to axis=0.
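To see why the axis matters, here is a minimal sketch using two small made-up frames (`a` and `b` are illustrative stand-ins, not the OP's actual data). With axis=1, pandas places the frames side by side and aligns them on the row index; with axis=0, it stacks them and aligns on column names.

```python
import pandas as pd

# Hypothetical frames that share one column label, "394"
a = pd.DataFrame({"394": ["Recycle Gas", "TT"], "FIC-2000": ["min", "kg/h"]})
b = pd.DataFrame({"Unnamed:0": ["Service"], "394": ["Recycle Gas"]})

# axis=1: columns of b are appended to the right of a, rows aligned by index
side_by_side = pd.concat([a, b], axis=1)

# axis=0: rows of a are appended below b, columns aligned by name
stacked = pd.concat([b, a], axis=0)

print(side_by_side.shape)
print(stacked.shape)
```

In the stacked result, the shared "394" column is filled from both frames, and the columns that exist in only one frame are padded with NaN.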
Consider this:
Input:
>>> df
col1 col2 col3
0 1 4 2
1 2 1 5
2 3 6 319
>>> df_1
col4 col5 col6
0 1 4 12
1 32 12 3
2 3 2 319
>>> df_2
col1 col3 col6
0 12 14 2
1 4 132 3
2 23 22 9
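For anyone who wants to reproduce the session, the three input frames shown above could be built roughly like this (the values are copied from the printed output; the construction itself is an assumption, since the answer does not show how the frames were created):

```python
import pandas as pd

# Values taken from the console output above
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 1, 6], "col3": [2, 5, 319]})
df_1 = pd.DataFrame({"col4": [1, 32, 3], "col5": [4, 12, 2], "col6": [12, 3, 319]})
df_2 = pd.DataFrame({"col1": [12, 4, 23], "col3": [14, 132, 22], "col6": [2, 3, 9]})

print(df)
print(df_1)
print(df_2)
```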
Concat with no matching column names:
>>> pd.concat([df, df_1], axis=0)
col1 col2 col3 col4 col5 col6
0 1.0 4.0 2.0 NaN NaN NaN
1 2.0 1.0 5.0 NaN NaN NaN
2 3.0 6.0 319.0 NaN NaN NaN
0 NaN NaN NaN 1.0 4.0 12.0
1 NaN NaN NaN 32.0 12.0 3.0
2 NaN NaN NaN 3.0 2.0 319.0
Concat with some matching column names:
>>> pd.concat([df, df_1, df_2], axis=0)
col1 col2 col3 col4 col5 col6
0 1.0 4.0 2.0 NaN NaN NaN
1 2.0 1.0 5.0 NaN NaN NaN
2 3.0 6.0 319.0 NaN NaN NaN
0 NaN NaN NaN 1.0 4.0 12.0
1 NaN NaN NaN 32.0 12.0 3.0
2 NaN NaN NaN 3.0 2.0 319.0
0 12.0 NaN 14.0 NaN NaN 2.0
1 4.0 NaN 132.0 NaN NaN 3.0
2 23.0 NaN 22.0 NaN NaN 9.0
Concat with some matching column names, filling in the NaNs (analogously, you could fill in None values):
>>> pd.concat([df, df_1, df_2], axis=0).fillna(0) #in case you wish to prettify it, maybe in case of strings do .fillna('')
col1 col2 col3 col4 col5 col6
0 1.0 4.0 2.0 0.0 0.0 0.0
1 2.0 1.0 5.0 0.0 0.0 0.0
2 3.0 6.0 319.0 0.0 0.0 0.0
0 0.0 0.0 0.0 1.0 4.0 12.0
1 0.0 0.0 0.0 32.0 12.0 3.0
2 0.0 0.0 0.0 3.0 2.0 319.0
0 12.0 0.0 14.0 0.0 0.0 2.0
1 4.0 0.0 132.0 0.0 0.0 3.0
2 23.0 0.0 22.0 0.0 0.0 9.0
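Note that the stacked results above repeat the row labels 0, 1, 2 for each input frame, whereas the df3 the OP wants has a running index. One possible way to get that, sketched here with the same example frames, is the `ignore_index=True` argument of `pd.concat`:

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 1, 6], "col3": [2, 5, 319]})
df_1 = pd.DataFrame({"col4": [1, 32, 3], "col5": [4, 12, 2], "col6": [12, 3, 319]})
df_2 = pd.DataFrame({"col1": [12, 4, 23], "col3": [14, 132, 22], "col6": [2, 3, 9]})

# ignore_index=True discards the original row labels and renumbers the rows 0..8
stacked = pd.concat([df, df_1, df_2], axis=0, ignore_index=True).fillna(0)
print(stacked)
```

An alternative with the same effect is to call `.reset_index(drop=True)` on the concatenated frame.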
EDIT: prompted by the conversation with the OP in the comments section below.
So you would do this:
(1) Concatenate the dataframes:
df3=pd.concat([df1,df2], axis=0)
(2) Join another dataframe onto the result:
df5=pd.merge(df3, df4[["FIC", "min"]], on="FIC", how="outer")
(You may want to look at the suffixes parameter if you think it is relevant.) REF: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html
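As a rough illustration of where `suffixes` comes in: when both frames carry a non-key column with the same name (here "min"), the merge has to rename them. The frames below are hypothetical stand-ins for df3 and df4, whose real contents are not shown in the thread, and the suffix strings are arbitrary choices.

```python
import pandas as pd

# Hypothetical stand-ins for df3 and df4; the real data is not shown in the thread
df3 = pd.DataFrame({"FIC": ["FIC-2000", "FIC-2001"], "min": [20, 25]})
df4 = pd.DataFrame({"FIC": ["FIC-2000", "FIC-2002"], "min": [18, 30], "other": [1, 2]})

# Both sides have a "min" column, so suffixes decides how the two are renamed;
# without the argument pandas falls back to "_x" and "_y"
df5 = pd.merge(df3, df4[["FIC", "min"]], on="FIC", how="outer",
               suffixes=("_df3", "_df4"))
print(df5)
```

With how="outer", keys that appear in only one frame are kept, and the missing side is filled with NaN.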