I want to aggregate my data into one number per day (i.e. the sum over all minutes of that day).
My data looks like this:
Date negative_sentiment positive_sentiment neutral_sentiment compount_sentiment
2015.03.22.13.00 1.407692 3.655128 54.937179 3.698333
2015.03.22.13.01 1.839572 3.457345 54.702827 2.742424
2015.03.22.13.02 1.852847 3.187877 54.959512 2.649846
2015.03.22.13.03 1.758206 3.444771 54.762926 3.495089
2015.03.22.13.04 1.611731 3.274262 55.114041 2.847284
2015.03.22.13.05 1.833436 3.241374 54.907794 2.881480
and the column dtypes are:
Date datetime64[ns]
negative_sentiment float64
positive_sentiment float64
neutral_sentiment float64
compount_sentiment float64
dtype: object
I have tried many options, but none of them worked:
import pandas as pd
pd.set_option('display.width', 1000)
path_name = "C:/Users/Alex/Desktop/03_2015.csv"
data_sentimental = pd.read_csv(path_name, sep=';', header=None, names = ['Date', 'negative_sentiment', 'positive_sentiment','neutral_sentiment','compount_sentiment'])
# converting column 1 to datetime and assigning it back to column 1
data_sentimental['Date'] = pd.to_datetime(data_sentimental['Date'], format='%Y.%m.%d.%H.%M')
print(data_sentimental.dtypes) #giving us the type of data so we can be sure that we have the good type
data_sentimental = pd.DatetimeIndex(data_sentimental['Date']).normalize()
data_sentimental = data_sentimental.groupby(data_sentimental['Date'].dt.normalize())
But this gives me the following error:
Traceback (most recent call last):
File "C:/Users/Alex/PycharmProjects/master_thesis/result.py", line 19, in <module>
data_sentimental = data_sentimental.groupby(data_sentimental['Date'].dt.normalize())
File "C:\Users\Alex\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\indexes\datetimelike.py", line 267, in __getitem__
raise ValueError
Thanks for your help.
Answer 0 (score: 0)
I found a solution:
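df = data_sentimental
# Make 'Date' the index, then aggregate each calendar day (here: the daily mean)
df = df.reset_index().set_index('Date').resample('1D').mean()
# Drop the leftover 'index' column created by reset_index()
df = df.drop(columns='index')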
Thanks for your help.
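A note on why the original attempt failed, with a sketch of the daily total the question asked for. In the question's code, the line `data_sentimental = pd.DatetimeIndex(data_sentimental['Date']).normalize()` replaces the DataFrame with a `DatetimeIndex`, so the following `data_sentimental['Date']` is no longer a column lookup, which is where the traceback comes from. Also note that `.mean()` returns the daily average, while `.sum()` returns the daily total. The example below is only a minimal sketch: the three sample rows and the names `daily_groupby` and `daily_resample` are made up for illustration and are not the asker's real data.

```python
import pandas as pd

# Made-up sample in the same layout as the question's data
# (illustrative values only, two columns kept for brevity).
data_sentimental = pd.DataFrame({
    "Date": ["2015.03.22.13.00", "2015.03.22.13.01", "2015.03.23.09.00"],
    "negative_sentiment": [1.0, 2.0, 4.0],
    "positive_sentiment": [3.0, 3.0, 5.0],
})
data_sentimental["Date"] = pd.to_datetime(
    data_sentimental["Date"], format="%Y.%m.%d.%H.%M"
)

value_columns = ["negative_sentiment", "positive_sentiment"]

# Option 1: keep the DataFrame and group by the day.
# Only the grouping key is normalized to midnight; the frame is not overwritten.
daily_groupby = data_sentimental.groupby(
    data_sentimental["Date"].dt.normalize()
)[value_columns].sum()

# Option 2: put 'Date' in the index and resample into one bin per calendar day.
daily_resample = data_sentimental.set_index("Date").resample("D").sum()

print(daily_groupby)
print(daily_resample)
```

One difference worth knowing: `resample("D")` also emits rows for days that have no data between the first and last timestamp, whereas `groupby` only returns days that actually occur.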