Unfortunately, I have gone through many examples of similar questions without success. I have two dataframes that I need to merge.
df1
. Date HIGH LOW OPEN CLOSE
0 2013-01-04 10734.23 10602.24 10604.50 10688.11
1 2013-01-07 10743.69 10589.70 10743.69 10599.01
2 2013-01-08 10602.12 10463.43 10544.21 10508.06
3 2013-01-09 10620.70 10398.61 10405.67 10578.57
4 2013-01-10 10686.12 10619.65 10635.11 10652.64
5 2013-01-11 10830.43 10748.06 10786.14 10801.57
6 2013-01-15 10952.31 10851.66 10914.65 10879.08
7 2013-01-16 10806.41 10591.30 10806.41 10600.44
df2
. Date sentiment
0 2013-01-01 -0.027282
1 2013-01-02 0.063613
2 2013-01-03 0.091363
3 2013-01-04 0.092818
4 2013-01-05 -0.019002
5 2013-01-06 -0.033752
6 2013-01-07 0.060038
7 2013-01-08 0.081649
8 2013-01-09 -0.031924
9 2013-01-10 0.109111
10 2013-01-11 -0.057070
11 2013-01-12 -0.052431
12 2013-01-13 0.014726
13 2013-01-14 0.047232
14 2013-01-15 0.060790
15 2013-01-16 -0.067828
16 2013-01-17 -0.035174
Code used: merged_left = pd.merge(left=df1, right=df2, how='left', left_on='Date', right_on='Date')
As a result, I lose all of the sentiment data, as shown below:
. Date HIGH LOW OPEN CLOSE sentiment
0 2013-01-04 10734.23 10602.24 10604.50 10688.11 NaN
1 2013-01-07 10743.69 10589.70 10743.69 10599.01 NaN
2 2013-01-08 10602.12 10463.43 10544.21 10508.06 NaN
3 2013-01-09 10620.70 10398.61 10405.67 10578.57 NaN
4 2013-01-10 10686.12 10619.65 10635.11 10652.64 NaN
5 2013-01-11 10830.43 10748.06 10786.14 10801.57 NaN
6 2013-01-15 10952.31 10851.66 10914.65 10879.08 NaN
7 2013-01-16 10806.41 10591.30 10806.41 10600.44 NaN
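df2 is the larger dataframe with 2157 rows, and many of its dates are not in df1 (1447 rows). Those dates are not needed; I only want the sentiment data for the dates that exist in df1, as shown below: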
. Date HIGH LOW OPEN CLOSE sentiment
0 2013-01-04 10734.23 10602.24 10604.50 10688.11 0.092818
1 2013-01-07 10743.69 10589.70 10743.69 10599.01 0.060038
2 2013-01-08 10602.12 10463.43 10544.21 10508.06 0.081649
3 2013-01-09 10620.70 10398.61 10405.67 10578.57 -0.031924
4 2013-01-10 10686.12 10619.65 10635.11 10652.64 0.109111
5 2013-01-11 10830.43 10748.06 10786.14 10801.57 -0.057070
6 2013-01-15 10952.31 10851.66 10914.65 10879.08 0.060790
7 2013-01-16 10806.41 10591.30 10806.41 10600.44 -0.067828
Any help would be greatly appreciated. I have spent the whole weekend on this problem.
Answer 0 (score: 0)
The problem is that both Date columns need to be datetimes. You also need an inner join, which is the default, so how='inner' can be omitted:
df1['Date'] = pd.to_datetime(df1['Date'])
df2['Date'] = pd.to_datetime(df2['Date'])
merged_left = pd.merge(df1, df2, on='Date')
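If you want to check that this really is a dtype problem before changing your own data, here is a minimal sketch on a few rows copied from the question. It assumes both Date columns were loaded as plain strings (the question does not show the dtypes, so that part is a guess), and it uses how='left' with indicator=True only to make visible which df1 rows found a match; with the default inner join, rows without a match would simply be dropped.

```python
import pandas as pd

# A few rows copied from the question; storing Date as strings is an assumption.
df1 = pd.DataFrame({
    "Date": ["2013-01-04", "2013-01-07", "2013-01-08"],
    "CLOSE": [10688.11, 10599.01, 10508.06],
})
df2 = pd.DataFrame({
    "Date": ["2013-01-03", "2013-01-04", "2013-01-05", "2013-01-07", "2013-01-08"],
    "sentiment": [0.091363, 0.092818, -0.019002, 0.060038, 0.081649],
})

# Inspect the key dtypes first; a mismatch here is the usual reason keys fail to match.
print(df1["Date"].dtype, df2["Date"].dtype)

# Convert both key columns to datetimes so the values compare equal.
df1["Date"] = pd.to_datetime(df1["Date"])
df2["Date"] = pd.to_datetime(df2["Date"])

# The left join keeps every df1 row; the _merge column reports 'both' or 'left_only'.
merged = pd.merge(df1, df2, on="Date", how="left", indicator=True)
print(merged)
```

Any row where _merge is 'left_only' is a df1 date with no counterpart in df2, which is worth checking if NaN values still show up after the conversion.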