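I have two pandas DataFrames, df and cleaned_data.
I am trying to compute the percentage price change for each symbol on its corresponding date. (Eventually, I intend to use sentiment analysis to predict the effect of stock news articles on price movements.)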
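Calculation rules:
If the article was published after 4 pm (column after4 == 1), I want the percentage change in price to be:
(adj close(t+1) - adj close(t)) / adj close(t) * 100
If it was published before 4 pm:
(adj close(t) - adj close(t-1)) / adj close(t-1) * 100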
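To make the two rules concrete, here is a small worked example using the NFLX adjusted closes from the sample data in this question. The helper name pct_change_between is made up for illustration, and the sketch assumes that t-1 and t+1 mean the previous and next rows (trading days) in df.

```python
def pct_change_between(new_close, old_close):
    """Percentage change from old_close to new_close."""
    return (new_close - old_close) / old_close * 100


# NFLX adjusted closes copied from the sample df in the question.
close_prev = 379.480011  # 2018-07-17, i.e. t-1
close_t = 375.130005     # 2018-07-18, i.e. t
close_next = 364.230000  # 2018-07-19, i.e. t+1

# after4 == 1: compare the next close with the close on date t.
change_after_4pm = pct_change_between(close_next, close_t)

# after4 == 0: compare the close on date t with the previous close.
change_before_4pm = pct_change_between(close_t, close_prev)

print(round(change_after_4pm, 2), round(change_before_4pm, 2))  # -2.91 -1.15
```

Note that the subtraction has to be grouped before the division; without the parentheses the expression would divide only the second term by the old close.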
Sample data
The cleaned_data DataFrame looks like this:
                  id                                               text        Date  after4
symbols
NFLX     1.01972E+18  Senate wants emergency alerts to go out throug...  2018-07-18       1
NFLX     1.01969E+18  Netflix NFLX just released quarterly 10-Q. Qu...   2018-07-18       1
NFLX     1.01969E+18  RT Excluding FAANG Stocks The S&P Would B...       2018-07-18       1
NFLX     1.01969E+18  RT Here is where FANG was trading the last ti...   2018-07-18       1
NFLX     1.01969E+18  RT Trader Takeaways 7.18.18 by Focus on SPY ...    2018-07-18       1
The df DataFrame looks like this:
         Date        NFLX         MTB        GPS         ES       MOMO  \
2  2018-07-16  400.480011  160.774506  26.513777  55.796215  42.161884
3  2018-07-17  379.480011  161.137909  26.891514  55.635551  43.164043
4  2018-07-18  375.130005  166.856628  27.125355  55.304787  43.335030
5  2018-07-19  364.230000  165.221344  27.620012  55.909626  42.347126

         MAT         HON         ESS  GRPN
2  44.696285  134.337509  224.523422  4.70
3  45.651787  136.255325  221.290680  4.64
4  46.597641  136.860947  220.187729  4.72
5  45.043739  135.383591  221.100479  4.75
I am trying to create a new column in cleaned_data containing the percentage change in price, but I am new to Python and can't seem to find a way to do it.
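One way this could be approached, as a minimal sketch rather than a verified solution: compute both percentage changes for every symbol in df, reshape them to long form, and merge them onto cleaned_data by date and symbol. The sketch makes several assumptions: the two frames below are toy stand-ins built from the sample values (the GRPN row with after4 == 0 is hypothetical, and the id and text columns are left out), Date is a datetime column in both frames, rows of df are consecutive trading days, and the column name price_change_pct is made up.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the two frames (values copied from the sample data).
df = pd.DataFrame({
    "Date": pd.to_datetime(["2018-07-16", "2018-07-17", "2018-07-18", "2018-07-19"]),
    "NFLX": [400.480011, 379.480011, 375.130005, 364.230000],
    "GRPN": [4.70, 4.64, 4.72, 4.75],
})
cleaned_data = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2018-07-18", "2018-07-18"]),
        "after4": [1, 0],  # the after4 == 0 row is hypothetical
    },
    index=pd.Index(["NFLX", "GRPN"], name="symbols"),
)

prices = df.set_index("Date").sort_index()

# (close(t) - close(t-1)) / close(t-1) * 100, for every symbol and date.
before = prices.pct_change(fill_method=None) * 100
# (close(t+1) - close(t)) / close(t) * 100 is the next row's value moved up.
after = before.shift(-1)

# Reshape each wide table to one row per (Date, symbols) pair.
long_before = before.reset_index().melt(
    id_vars="Date", var_name="symbols", value_name="chg_before"
)
long_after = after.reset_index().melt(
    id_vars="Date", var_name="symbols", value_name="chg_after"
)
changes = long_before.merge(long_after, on=["Date", "symbols"])

# Attach both candidates to each article, then pick one based on after4.
out = cleaned_data.reset_index().merge(changes, on=["Date", "symbols"], how="left")
out["price_change_pct"] = np.where(
    out["after4"] == 1, out["chg_after"], out["chg_before"]
)
out = out.drop(columns=["chg_before", "chg_after"]).set_index("symbols")

print(out)
```

With a left merge, articles dated on a day that is missing from df, or on the first or last available day, end up with NaN in the new column rather than being dropped. If Date is stored as strings in either frame, it would need converting with pd.to_datetime first so that the merge keys match.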