Question

df

date                     Score        sent_index
2017-03-02 01:01:04.000    0.038889      na
2017-03-02 01:12:10.726    0.112000      na
2017-03-02 01:33:58.001    -0.134991     na
2017-03-02 01:39:51.000     0            na
2017-03-02 01:39:52.000     -0.9338      0.87(example score from 01:01:04.000 to 01:39:52.000)    
 .
 .                           next hour scores
 .                           and so on up to 2018
 2018-05-24 01:00:00.000

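This table is the head of a pandas DataFrame containing values from 2017 to 2018. I want to use the "Score" column to compute a sentiment score in the next column from the number of positive counts, negative counts and neutral counts. My code is as follows: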

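# d_start and d_end bound one hourly window; the column is named "Score" in df
sent_range = df[d_start:d_end]
pos = (sent_range.Score > 0).sum()     # number of positive scores
neg = (sent_range.Score < 0).sum()     # number of negative scores
other = (sent_range.Score == 0).sum()  # number of neutral (zero) scores
index = (pos - neg) / (pos + neg + other)  # divide by all counts, per the formula below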

(positive counts - negative counts) / all counts (all counts occurring within one hour)

The expected DataFrame should look like this:

date                      Score        sent_index
DELETED                    0.038889      na
DELETED                    0.112000      na
DELETED                    -0.134991     na
DELETED                     0            na
2017-03-02 02:00:00        -0.9338       0.87(example score from 01:01:04.000 to 01:39:52.000)    
 .
 .                           next hour scores
 .                           and so on up to 2018
 2018-05-24 01:00:00.000

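Right now sent_index is empty and I want to fill this column. I also want to merge all the times in the "date" column within each hour into a single timestamp, because all the observations in the Score column within a one-hour period share one index value (0.87 in the example). For example, everything from 2017-03-02 01:00:00 to 2017-03-02 01:59:00 should collapse to the single time 2017-03-02 02:00:00 (the score for that first hour is 0.87). The data continues in this pattern until 2018 (approximately 6000 rows) in these columns: timestamp, polarity and score. I would appreciate any help. Thanks in advance.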
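A minimal sketch of the hourly aggregation described above, assuming the column names "date" and "Score" from the table and the formula (positive counts - negative counts) / all counts. The helper name `sentiment_index` and the variable `hourly` are made up for illustration, and labelling each window by its right edge (so 01:00 to 01:59 becomes 02:00) is one reading of the expected output, not a confirmed requirement.

```python
import pandas as pd

# Sample rows copied from the question; real data would be loaded instead.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2017-03-02 01:01:04.000",
        "2017-03-02 01:12:10.726",
        "2017-03-02 01:33:58.001",
        "2017-03-02 01:39:51.000",
        "2017-03-02 01:39:52.000",
    ]),
    "Score": [0.038889, 0.112000, -0.134991, 0.0, -0.9338],
})


def sentiment_index(scores: pd.Series) -> float:
    """(positive count - negative count) / all counts for one window."""
    if len(scores) == 0:
        # Hours without any observation have no defined index.
        return float("nan")
    pos = (scores > 0).sum()
    neg = (scores < 0).sum()
    return (pos - neg) / len(scores)


# Group rows into one-hour bins [HH:00, HH+1:00) and stamp each bin with
# its right edge, so rows between 01:00 and 01:59 are reported at 02:00.
hourly = (
    df.set_index("date")["Score"]
      .resample("1h", closed="left", label="right")
      .apply(sentiment_index)
      .rename("sent_index")
      .reset_index()
)

print(hourly)
```

For the five sample rows this gives a single row stamped 2017-03-02 02:00:00 with a sent_index of 0.0 (two positive, two negative, five observations in total); the 0.87 shown in the tables is only an example value.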

Calculating a sentiment index over hourly time ranges

0 answers: