Summing data by day with pandas in Python

Time: 2017-07-11 14:20:04

Tags: python pandas

I want to aggregate my data into one number per day (i.e. the sum of all the per-minute values for that day).

My data looks like this:

Date              negative_sentiment  positive_sentiment  neutral_sentiment  compount_sentiment
2015.03.22.13.00            1.407692            3.655128          54.937179            3.698333
2015.03.22.13.01            1.839572            3.457345          54.702827            2.742424
2015.03.22.13.02            1.852847            3.187877          54.959512            2.649846
2015.03.22.13.03            1.758206            3.444771          54.762926            3.495089
2015.03.22.13.04            1.611731            3.274262          55.114041            2.847284
2015.03.22.13.05            1.833436            3.241374          54.907794            2.881480

The column dtypes are:

Date                  datetime64[ns]
negative_sentiment           float64
positive_sentiment           float64
neutral_sentiment            float64
compount_sentiment           float64
dtype: object

I have tried many options, but nothing works:

import pandas as pd

pd.set_option('display.width', 1000)

path_name = "C:/Users/Alex/Desktop/03_2015.csv"
data_sentimental = pd.read_csv(path_name, sep=';', header=None, names = ['Date', 'negative_sentiment', 'positive_sentiment','neutral_sentiment','compount_sentiment'])
# converting column 1 to datetime and assigning it back to column 1
data_sentimental['Date'] =  pd.to_datetime(data_sentimental['Date'], format='%Y.%m.%d.%H.%M')
print(data_sentimental.dtypes) #giving us the type of data so we can be sure that we have the good type

data_sentimental = pd.DatetimeIndex(data_sentimental['Date']).normalize()
data_sentimental = data_sentimental.groupby(data_sentimental['Date'].dt.normalize())

But this gives me the following error:

Traceback (most recent call last):
  File "C:/Users/Alex/PycharmProjects/master_thesis/result.py", line 19, in <module>
    data_sentimental = data_sentimental.groupby(data_sentimental['Date'].dt.normalize())
  File "C:\Users\Alex\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\indexes\datetimelike.py", line 267, in __getitem__
    raise ValueError

Thank you for your help.

1 answer:

Answer 0: (score: 0)

I found a solution:

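df = data_sentimental
# Index by date and resample to one row per calendar day.
# Note: .mean() gives the daily average; use .sum() instead for a daily total.
df = df.reset_index().set_index('Date').resample('1D').mean()
# Drop the leftover column created by reset_index()
df = df.drop(columns='index')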
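
For reference, the attempt in the question fails because the `pd.DatetimeIndex(...).normalize()` line overwrites `data_sentimental` with a `DatetimeIndex`, so the following `data_sentimental['Date']` lookup is no longer indexing a DataFrame. Below is a minimal sketch of two ways to get a daily sum rather than a mean. The in-memory sample data and the names `raw`, `cols`, `value_cols`, `daily_resample` and `daily_groupby` are assumptions made for illustration; they stand in for the CSV file from the question.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the semicolon-separated CSV in the question
raw = io.StringIO(
    "2015.03.22.13.00;1.0;3.0;54.0;3.5\n"
    "2015.03.22.13.01;2.0;4.0;55.0;2.5\n"
    "2015.03.23.09.00;1.5;3.5;54.5;3.0\n"
)
cols = ['Date', 'negative_sentiment', 'positive_sentiment',
        'neutral_sentiment', 'compount_sentiment']
value_cols = cols[1:]

df = pd.read_csv(raw, sep=';', header=None, names=cols)
df['Date'] = pd.to_datetime(df['Date'], format='%Y.%m.%d.%H.%M')

# Option 1: put the timestamps in the index and resample to calendar days
daily_resample = df.set_index('Date').resample('D').sum()

# Option 2: group by the timestamp truncated to midnight,
# keeping the original DataFrame intact
daily_groupby = df.groupby(df['Date'].dt.normalize())[value_cols].sum()

print(daily_resample)
print(daily_groupby)
```

One difference to keep in mind: `resample('D')` produces a row for every calendar day in the range, including days with no data, while the `groupby` variant only returns days that actually appear in the input.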

Thanks for your help.