I have two DataFrames. A sample of the first one, "fgbmquotef", is:
DateTimesy VWPfgbmy
59 2014-09-05 06:00:24.033000 127.687514
60 2014-09-05 06:00:24.436000 127.687933
61 2014-09-05 06:00:24.597000 127.687746
62 2014-09-05 06:00:24.891000 127.687752
63 2014-09-05 06:00:25.178000 127.687730
64 2014-09-05 06:00:25.227000 127.687741
65 2014-09-05 06:00:26.035000 127.687651
66 2014-09-05 06:00:26.667000 127.689970
71 2014-09-05 06:00:26.677000 127.692642
72 2014-09-05 06:00:26.681000 127.692571
73 2014-09-05 06:00:26.688000 127.696051
75 2014-09-05 06:00:26.700000 127.696051
76 2014-09-05 06:00:26.702000 127.695850
79 2014-09-05 06:00:27.216000 127.687548
80 2014-09-05 06:00:27.910000 127.687512
81 2014-09-05 06:00:28.208000 127.687524
82 2014-09-05 06:00:28.289000 127.687436
83 2014-09-05 06:00:28.717000 127.687436
85 2014-09-05 06:00:28.998000 127.686910
87 2014-09-05 06:00:29.035000 127.687043
88 2014-09-05 06:00:29.062000 127.687534
89 2014-09-05 06:00:29.099000 127.687059
90 2014-09-05 06:00:29.327000 127.686843
91 2014-09-05 06:00:29.386000 127.686811
92 2014-09-05 06:00:29.505000 127.686984
93 2014-09-05 06:00:29.571000 127.686931
94 2014-09-05 06:00:29.602000 127.686989
96 2014-09-05 06:00:29.958000 127.686771
97 2014-09-05 06:00:29.960000 127.686759
98 2014-09-05 06:00:29.962000 127.686673
and the second one, "df":
DateTimesx DateTimesy
2 2014-09-05 06:00:23.596000 2014-09-05 06:00:24.596000
3 2014-09-05 06:00:23.644000 2014-09-05 06:00:24.644000
4 2014-09-05 06:00:23.694000 2014-09-05 06:00:24.694000
5 2014-09-05 06:00:23.744000 2014-09-05 06:00:24.744000
6 2014-09-05 06:00:23.794000 2014-09-05 06:00:24.794000
7 2014-09-05 06:00:23.844000 2014-09-05 06:00:24.844000
8 2014-09-05 06:00:23.894000 2014-09-05 06:00:24.894000
9 2014-09-05 06:00:24.044000 2014-09-05 06:00:25.044000
10 2014-09-05 06:00:24.294000 2014-09-05 06:00:25.294000
11 2014-09-05 06:00:24.394000 2014-09-05 06:00:25.394000
12 2014-09-05 06:00:24.444000 2014-09-05 06:00:25.444000
13 2014-09-05 06:00:24.544000 2014-09-05 06:00:25.544000
14 2014-09-05 06:00:24.694000 2014-09-05 06:00:25.694000
15 2014-09-05 06:00:24.794000 2014-09-05 06:00:25.794000
16 2014-09-05 06:00:24.844000 2014-09-05 06:00:25.844000
17 2014-09-05 06:00:25.294000 2014-09-05 06:00:26.294000
18 2014-09-05 06:00:25.394000 2014-09-05 06:00:26.394000
19 2014-09-05 06:00:25.694000 2014-09-05 06:00:26.694000
20 2014-09-05 06:00:25.794000 2014-09-05 06:00:26.794000
21 2014-09-05 06:00:26.044000 2014-09-05 06:00:27.044000
22 2014-09-05 06:00:26.294000 2014-09-05 06:00:27.294000
23 2014-09-05 06:00:26.544000 2014-09-05 06:00:27.544000
24 2014-09-05 06:00:26.694000 2014-09-05 06:00:27.694000
25 2014-09-05 06:00:28.344000 2014-09-05 06:00:29.344000
26 2014-09-05 06:00:29.044000 2014-09-05 06:00:30.044000
27 2014-09-05 06:00:29.094000 2014-09-05 06:00:30.094000
28 2014-09-05 06:00:29.144000 2014-09-05 06:00:30.144000
29 2014-09-05 06:00:29.394000 2014-09-05 06:00:30.394000
30 2014-09-05 06:00:29.744000 2014-09-05 06:00:30.744000
31 2014-09-05 06:00:29.894000 2014-09-05 06:00:30.894000
The column df["DateTimesy"] of the second DataFrame "df" was created using:
td = pd.to_timedelta(1, unit="s")
df["DateTimesy"] = df["DateTimesx"] + td
Then I merged them using:
df2 = pd.merge(df, fgbmquotef, on = "DateTimesy", how = "outer")
However, this is the result I get:
DateTimesx DateTimesy VWPfgbmy
0 2014-09-05 06:00:23.596000 2014-09-05 06:00:24.596000 NaN
1 2014-09-05 06:00:23.644000 2014-09-05 06:00:24.644000 NaN
2 2014-09-05 06:00:23.694000 2014-09-05 06:00:24.694000 NaN
3 2014-09-05 06:00:23.744000 2014-09-05 06:00:24.744000 NaN
4 2014-09-05 06:00:23.794000 2014-09-05 06:00:24.794000 NaN
5 2014-09-05 06:00:23.844000 2014-09-05 06:00:24.844000 NaN
6 2014-09-05 06:00:23.894000 2014-09-05 06:00:24.894000 NaN
7 2014-09-05 06:00:24.044000 2014-09-05 06:00:25.044000 NaN
8 2014-09-05 06:00:24.294000 2014-09-05 06:00:25.294000 NaN
9 2014-09-05 06:00:24.394000 2014-09-05 06:00:25.394000 NaN
10 2014-09-05 06:00:24.444000 2014-09-05 06:00:25.444000 NaN
11 2014-09-05 06:00:24.544000 2014-09-05 06:00:25.544000 NaN
12 2014-09-05 06:00:24.694000 2014-09-05 06:00:25.694000 NaN
13 2014-09-05 06:00:24.794000 2014-09-05 06:00:25.794000 NaN
14 2014-09-05 06:00:24.844000 2014-09-05 06:00:25.844000 NaN
15 2014-09-05 06:00:25.294000 2014-09-05 06:00:26.294000 NaN
16 2014-09-05 06:00:25.394000 2014-09-05 06:00:26.394000 NaN
17 2014-09-05 06:00:25.694000 2014-09-05 06:00:26.694000 NaN
18 2014-09-05 06:00:25.794000 2014-09-05 06:00:26.794000 NaN
19 2014-09-05 06:00:26.044000 2014-09-05 06:00:27.044000 NaN
20 2014-09-05 06:00:26.294000 2014-09-05 06:00:27.294000 NaN
21 2014-09-05 06:00:26.544000 2014-09-05 06:00:27.544000 NaN
22 2014-09-05 06:00:26.694000 2014-09-05 06:00:27.694000 NaN
23 2014-09-05 06:00:28.344000 2014-09-05 06:00:29.344000 NaN
24 2014-09-05 06:00:29.044000 2014-09-05 06:00:30.044000 NaN
25 2014-09-05 06:00:29.094000 2014-09-05 06:00:30.094000 NaN
26 2014-09-05 06:00:29.144000 2014-09-05 06:00:30.144000 NaN
27 2014-09-05 06:00:29.394000 2014-09-05 06:00:30.394000 NaN
28 2014-09-05 06:00:29.744000 2014-09-05 06:00:30.744000 NaN
29 2014-09-05 06:00:29.894000 2014-09-05 06:00:30.894000 NaN
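This is wrong, because the "fgbmquotef" entries should be mixed in there as well, not only the "df" entries. Can anyone explain what is happening here and what mistake I made?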
Answer 0 (score: 1)
Maybe:
df2 = pd.merge(df, fgbmquotef, left_on="DateTimesy", right_on="DateTimesy", how="outer")  # although you shouldn't need to do this
Try:
df2 = pd.merge(df.set_index("DateTimesy"), fgbmquotef.set_index("DateTimesy"), left_index=True, right_index=True, how = "outer")
df2 = pd.merge(df.set_index("DateTimesy", drop=False), fgbmquotef.set_index("DateTimesy", drop=False), left_index=True, right_index=True, how = "outer", suffixes = ('_df', '_fgbmquotef'))
Or without suffixes:
df2 = pd.merge(df.set_index("DateTimesy", drop=False), fgbmquotef.set_index("DateTimesy", drop=False), left_index=True, right_index=True, how = "outer")
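To see what an outer merge on this key actually produces, here is a minimal sketch on a tiny made-up sample (the four timestamps and two prices are illustrative, not the full data from the question). An outer merge matches keys exactly, so a row only picks up a value from the other frame when the timestamps coincide to the last digit. Unmatched rows from either side are kept and padded with NaN or NaT. The indicator=True flag adds a "_merge" column showing which frame each row came from.

```python
import pandas as pd

# Illustrative subset only; these values are assumptions for the demo.
fgbmquotef = pd.DataFrame({
    "DateTimesy": pd.to_datetime([
        "2014-09-05 06:00:24.597",
        "2014-09-05 06:00:24.891",
    ]),
    "VWPfgbmy": [127.687746, 127.687752],
})

df = pd.DataFrame({
    "DateTimesx": pd.to_datetime([
        "2014-09-05 06:00:23.596",
        "2014-09-05 06:00:23.891",
    ]),
})
# Same construction as in the question: shift every timestamp by one second.
df["DateTimesy"] = df["DateTimesx"] + pd.to_timedelta(1, unit="s")

# Outer merge keeps the union of keys; "_merge" records each row's origin.
df2 = pd.merge(df, fgbmquotef, on="DateTimesy", how="outer", indicator=True)
print(df2)
```

In this sample only 06:00:24.891 exists in both frames, so one row is marked "both", one "left_only" (NaN in VWPfgbmy) and one "right_only" (NaT in DateTimesx). If the real data has no exactly matching timestamps, every "df" row will show NaN in VWPfgbmy, and the "fgbmquotef" rows will appear as separate rows elsewhere in the result, so it is worth checking the whole frame rather than only the first rows.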
Finally, try the concatenation functions: http://pandas.pydata.org/pandas-docs/stable/merging.html#concatenating-objects
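A rough sketch of the concatenation route, assuming the goal is simply to interleave the rows of both frames on one time axis. pd.concat stacks the frames and aligns columns by name, and the result can then be sorted by "DateTimesy". The sample values are again made up for illustration, and the name "combined" is just a placeholder.

```python
import pandas as pd

# Illustrative subset only; these values are assumptions for the demo.
fgbmquotef = pd.DataFrame({
    "DateTimesy": pd.to_datetime([
        "2014-09-05 06:00:24.597",
        "2014-09-05 06:00:24.891",
    ]),
    "VWPfgbmy": [127.687746, 127.687752],
})

df = pd.DataFrame({
    "DateTimesx": pd.to_datetime([
        "2014-09-05 06:00:23.596",
        "2014-09-05 06:00:23.891",
    ]),
})
df["DateTimesy"] = df["DateTimesx"] + pd.to_timedelta(1, unit="s")

# Stack the rows of both frames; columns missing on one side become NaN/NaT.
combined = (
    pd.concat([df, fgbmquotef], ignore_index=True, sort=False)
    .sort_values("DateTimesy", kind="stable")
    .reset_index(drop=True)
)
print(combined)
```

Unlike a merge, concatenation never combines rows: a timestamp present in both frames stays as two separate rows, one with DateTimesx filled in and one with VWPfgbmy filled in.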