Question

I have a DataFrame named df1:

df1 = pd.read_csv('C:/Users/Demonstrator/Desktop/equipement3.csv',delimiter=';', usecols = ['TIMESTAMP','ACT_TIME_AERATEUR_1_F1'])

TIMESTAMP;ACT_TIME_AERATEUR_1_F1
2015-07-31 23:00:00;90
2015-07-31 23:10:00;0
2015-07-31 23:20:00;0
2015-07-31 23:30:00;0
2015-07-31 23:40:00;0
2015-07-31 23:50:00;0
2015-08-01 00:00:00;0
2015-08-01 00:10:00;50
2015-08-01 00:20:00;0
2015-08-01 00:30:00;0
2015-08-01 00:40:00;0

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from matplotlib import style
import pandas as pd
style.use('ggplot')


df1.index = pd.to_datetime(df1['TIMESTAMP'], format='%Y-%m-%d %H:%M:%S.%f')
df1 = df1.drop('TIMESTAMP', axis=1)
df1 = df1.resample('resamplestring', how='mean')

I get this error:

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

Can you help me?

Thanks

Answer 1

You can add the parameters parse_dates and index_col to read_csv, so that the TIMESTAMP column is parsed into a DatetimeIndex, and then use resample:

import pandas as pd
import io

temp=u"""TIMESTAMP;ACT_TIME_AERATEUR_1_F1
2015-07-31 23:00:00;90
2015-07-31 23:10:00;0
2015-07-31 23:20:00;0
2015-07-31 23:30:00;0
2015-07-31 23:40:00;0
2015-07-31 23:50:00;0
2015-08-01 00:00:00;0
2015-08-01 00:10:00;50
2015-08-01 00:20:00;0
2015-08-01 00:30:00;0
2015-08-01 00:40:00;0"""
# after testing, replace io.StringIO(temp) with the filename
df1 = pd.read_csv(io.StringIO(temp), 
                  sep=";", 
                  usecols = ['TIMESTAMP','ACT_TIME_AERATEUR_1_F1'], 
                  parse_dates=['TIMESTAMP'],
                  index_col=['TIMESTAMP'] )

print (df1)
                     ACT_TIME_AERATEUR_1_F1
TIMESTAMP                                  
2015-07-31 23:00:00                      90
2015-07-31 23:10:00                       0
2015-07-31 23:20:00                       0
2015-07-31 23:30:00                       0
2015-07-31 23:40:00                       0
2015-07-31 23:50:00                       0
2015-08-01 00:00:00                       0
2015-08-01 00:10:00                      50
2015-08-01 00:20:00                       0
2015-08-01 00:30:00                       0
2015-08-01 00:40:00                       0

print (df1.index)
DatetimeIndex(['2015-07-31 23:00:00', '2015-07-31 23:10:00',
               '2015-07-31 23:20:00', '2015-07-31 23:30:00',
               '2015-07-31 23:40:00', '2015-07-31 23:50:00',
               '2015-08-01 00:00:00', '2015-08-01 00:10:00',
               '2015-08-01 00:20:00', '2015-08-01 00:30:00',
               '2015-08-01 00:40:00'],
              dtype='datetime64[ns]', name='TIMESTAMP', freq=None)

# pandas 0.18.0 and later
print (df1.resample('30Min').mean())
                     ACT_TIME_AERATEUR_1_F1
TIMESTAMP                                  
2015-07-31 23:00:00               30.000000
2015-07-31 23:30:00                0.000000
2015-08-01 00:00:00               16.666667
2015-08-01 00:30:00                0.000000

# pandas below 0.18.0 (how= was deprecated in 0.18.0 and is not accepted by current pandas)
print (df1.resample('30Min', how='mean'))
                     ACT_TIME_AERATEUR_1_F1
TIMESTAMP                                  
2015-07-31 23:00:00               30.000000
2015-07-31 23:30:00                0.000000
2015-08-01 00:00:00               16.666667
2015-08-01 00:30:00                0.000000
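
If you would rather keep the approach from the question (read the file first, then convert the column), something like the following should also work. This is only a sketch under a few assumptions: the timestamps have no fractional seconds, so the format string drops the `.%f` part, the variable names `raw`, `df` and `resampled` are made up for illustration, and `'30min'` stands in for whatever frequency you actually need.

```python
import io
import pandas as pd

# Sample data copied from the question; swap io.StringIO(raw) for the real file path.
raw = """TIMESTAMP;ACT_TIME_AERATEUR_1_F1
2015-07-31 23:00:00;90
2015-07-31 23:10:00;0
2015-07-31 23:20:00;0
2015-07-31 23:30:00;0
2015-07-31 23:40:00;0
2015-07-31 23:50:00;0
2015-08-01 00:00:00;0
2015-08-01 00:10:00;50
2015-08-01 00:20:00;0
2015-08-01 00:30:00;0
2015-08-01 00:40:00;0"""

df = pd.read_csv(io.StringIO(raw), sep=";")

# Convert the text column to datetimes (assumed format: no fractional seconds).
df["TIMESTAMP"] = pd.to_datetime(df["TIMESTAMP"], format="%Y-%m-%d %H:%M:%S")

# resample needs a datetime-like index, so move the column into the index.
df = df.set_index("TIMESTAMP")

# Average the values in 30-minute bins.
resampled = df.resample("30min").mean()
print(resampled)
```

The important part is that resample is called with a real frequency string on a frame whose index is a DatetimeIndex; whether you build that index in read_csv or afterwards is a matter of taste.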
