I have a dataframe named df1:
df1 = pd.read_csv('C:/Users/Demonstrator/Desktop/equipement3.csv',delimiter=';', usecols = ['TIMESTAMP','ACT_TIME_AERATEUR_1_F1'])
TIMESTAMP; ACT_TIME_AERATEUR_1_F1
2015-07-31 23:00:00; 90
2015-07-31 23:10:00; 0
2015-07-31 23:20:00; 0
2015-07-31 23:30:00; 0
2015-07-31 23:40:00; 0
2015-07-31 23:50:00; 0
2015-08-01 00:00:00; 0
2015-08-01 00:10:00; 50
2015-08-01 00:20:00; 0
2015-08-01 00:30:00; 0
2015-08-01 00:40:00; 0
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from matplotlib import style
import pandas as pd
style.use('ggplot')
df1.index = pd.to_datetime(df1['TIMESTAMP'], format='%Y-%m-%d %H:%M:%S.%f')
df1 = df1.drop('TIMESTAMP', axis=1)
df1 = df1.resample('resamplestring', how='mean')
I get this error:
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
Can you help me?
Thanks
Answer 0 (score: 0)
You can add the parameters parse_dates and index_col to read_csv, and then use resample:
import pandas as pd
import io
temp=u"""TIMESTAMP;ACT_TIME_AERATEUR_1_F1
2015-07-31 23:00:00;90
2015-07-31 23:10:00;0
2015-07-31 23:20:00;0
2015-07-31 23:30:00;0
2015-07-31 23:40:00;0
2015-07-31 23:50:00;0
2015-08-01 00:00:00;0
2015-08-01 00:10:00;50
2015-08-01 00:20:00;0
2015-08-01 00:30:00;0
2015-08-01 00:40:00;0"""
# after testing, replace io.StringIO(temp) with the filename
df1 = pd.read_csv(io.StringIO(temp),
                  sep=";",
                  usecols=['TIMESTAMP', 'ACT_TIME_AERATEUR_1_F1'],
                  parse_dates=['TIMESTAMP'],
                  index_col=['TIMESTAMP'])
print (df1)
ACT_TIME_AERATEUR_1_F1
TIMESTAMP
2015-07-31 23:00:00 90
2015-07-31 23:10:00 0
2015-07-31 23:20:00 0
2015-07-31 23:30:00 0
2015-07-31 23:40:00 0
2015-07-31 23:50:00 0
2015-08-01 00:00:00 0
2015-08-01 00:10:00 50
2015-08-01 00:20:00 0
2015-08-01 00:30:00 0
2015-08-01 00:40:00 0
print (df1.index)
DatetimeIndex(['2015-07-31 23:00:00', '2015-07-31 23:10:00',
'2015-07-31 23:20:00', '2015-07-31 23:30:00',
'2015-07-31 23:40:00', '2015-07-31 23:50:00',
'2015-08-01 00:00:00', '2015-08-01 00:10:00',
'2015-08-01 00:20:00', '2015-08-01 00:30:00',
'2015-08-01 00:40:00'],
dtype='datetime64[ns]', name='TIMESTAMP', freq=None)
# pandas 0.18.0 and later
print (df1.resample('30Min').mean())
ACT_TIME_AERATEUR_1_F1
TIMESTAMP
2015-07-31 23:00:00 30.000000
2015-07-31 23:30:00 0.000000
2015-08-01 00:00:00 16.666667
2015-08-01 00:30:00 0.000000
# pandas below 0.18.0
print (df1.resample('30Min', how='mean'))
ACT_TIME_AERATEUR_1_F1
TIMESTAMP
2015-07-31 23:00:00 30.000000
2015-07-31 23:30:00 0.000000
2015-08-01 00:00:00 16.666667
2015-08-01 00:30:00 0.000000
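The approach from the question (convert the column with pd.to_datetime, then resample) can probably be made to work as well. The following is only a minimal sketch under a few assumptions: it uses the sample rows from the question, a 30-minute rule as in the answer above, and a format string without %f because the sample timestamps carry no fractional seconds. The names raw and resampled are illustrative and do not come from the original post.

```python
import io

import pandas as pd

# Sample rows copied from the question; in practice, pass the CSV path instead
# of io.StringIO(raw).
raw = """TIMESTAMP;ACT_TIME_AERATEUR_1_F1
2015-07-31 23:00:00;90
2015-07-31 23:10:00;0
2015-07-31 23:20:00;0
2015-07-31 23:30:00;0
2015-07-31 23:40:00;0
2015-07-31 23:50:00;0
2015-08-01 00:00:00;0
2015-08-01 00:10:00;50
2015-08-01 00:20:00;0
2015-08-01 00:30:00;0
2015-08-01 00:40:00;0
"""

df1 = pd.read_csv(io.StringIO(raw), sep=";")

# Convert the text column to datetimes. The sample data has no fractional
# seconds, so the format string omits %f (an assumption about the real file).
df1["TIMESTAMP"] = pd.to_datetime(df1["TIMESTAMP"], format="%Y-%m-%d %H:%M:%S")

# resample needs a DatetimeIndex, so move the column into the index.
df1 = df1.set_index("TIMESTAMP")

# The rule has to be a real frequency alias such as "30min",
# not a placeholder string.
resampled = df1.resample("30min").mean()
print(resampled)
```

Each 30-minute bin averages the rows that fall inside it, so the first bin is (90 + 0 + 0) / 3 = 30 and the third is (0 + 50 + 0) / 3, matching the output shown in the answer.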