I want to select rows from a raw file by date and time and write them to a CSV file:
data = pd.read_csv(r'dataset.csv', low_memory=False, header = None, sep = ',')
s = pd.Series(data.loc['4/1/2019 7:57':'4/1/2019 12:27' , data.index[1,8,15,22,29,36,43]])
data = pd.DataFrame(s)
data.to_csv('summary.csv', index = False, header = None)
The error is "too many indices for array":
<ipython-input-430-ca5724310254> in <module>
1 # Load the dataset using Pandas
2 data = pd.read_csv(r'Mill Operation U1.csv', low_memory=False, header = None, sep = ',')
----> 3 s = pd.Series(data.loc['4/1/2019 7:57':'4/1/2019 12:27' , data.index[1,8,15,22,29,36,43]])
4
5
~\Anaconda3\lib\site-packages\pandas\core\indexes\range.py in __getitem__(self, key)
588
589 # fall back to Int64Index
--> 590 return super_getitem(key)
591
592 def __floordiv__(self, other):
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in __getitem__(self, key)
3967
3968 key = com.values_from_object(key)
-> 3969 result = getitem(key)
3970 if not is_scalar(result):
3971 return promote(result)
IndexError: too many indices for array
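Any ideas?
Answer 0 (score: 1)
I believe you need to select the columns with data.columns and pass the positions as a list (note the double brackets), not with data.index and a bare tuple: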
df = data.loc['4/1/2019 7:57':'4/1/2019 12:27', data.columns[[1,8,15,22,29,36,43]]]
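To see why the original line fails, here is a minimal sketch on a small, made-up frame (the column names are only for illustration). Writing `[0, 2]` directly inside the brackets passes a tuple, which the underlying one-dimensional array treats as a two-dimensional index; wrapping the positions in a list selects several entries instead.

```python
import pandas as pd

# Hypothetical three-column frame, used only to illustrate the indexing rule.
data = pd.DataFrame({'A': list('abc'), 'B': [4, 5, 4], 'C': [7, 8, 9]})

# data.columns[0, 2] passes the tuple (0, 2): two indices for a 1-D Index.
try:
    data.columns[0, 2]
    tuple_error = None
except IndexError as exc:
    tuple_error = str(exc)

# data.columns[[0, 2]] passes a list: pick positions 0 and 2.
selected = data.columns[[0, 2]]

print(tuple_error)
print(list(selected))
```

The same rule applies to `data.index`, which is why `data.index[1,8,15,22,29,36,43]` raised the IndexError in the question.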
Example:
import pandas as pd

# Row labels are the timestamp strings
idx = ['4/1/2019 6:57', '4/1/2019 7:57', '4/1/2019 8:57', '4/1/2019 9:57',
       '4/1/2019 12:27', '4/1/2019 15:57']
data = pd.DataFrame({
    'A': list('abcdef'),
    'B': [4, 5, 4, 5, 5, 4],
    'C': [7, 8, 9, 4, 2, 3],
    'D': [1, 3, 5, 7, 1, 0],
    'E': [5, 3, 6, 9, 2, 4],
    'F': list('aaabbb')
}, index=idx)
# Slice rows by label (both ends inclusive) and pick columns by position
df = data.loc['4/1/2019 7:57':'4/1/2019 12:27', data.columns[[1, 2, 4]]]
print(df)
                B  C  E
4/1/2019 7:57   5  8  3
4/1/2019 8:57   4  9  6
4/1/2019 9:57   5  4  9
4/1/2019 12:27  5  2  2
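The example above builds the frame with the timestamps already in the index. In the question the file is read with header=None and no index column, so the rows are labeled 0, 1, 2, ... (the traceback goes through range.py) and the date strings are not available to .loc. A possible end-to-end sketch follows, assuming the timestamp is the first column of the file; the in-memory CSV and its values are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real file: timestamp in column 0, no header row.
raw = io.StringIO(
    "4/1/2019 6:57,4,7,1\n"
    "4/1/2019 7:57,5,8,3\n"
    "4/1/2019 8:57,4,9,5\n"
    "4/1/2019 12:27,5,2,1\n"
    "4/1/2019 15:57,4,3,0\n"
)

# index_col=0 turns the timestamp strings into row labels for .loc slicing.
data = pd.read_csv(raw, header=None, index_col=0)

# The remaining columns keep the integer labels 1, 2, 3, so positions 0 and 2
# in data.columns refer to the file's columns 1 and 3.
df = data.loc['4/1/2019 7:57':'4/1/2019 12:27', data.columns[[0, 2]]]

# Write to a buffer here; a file path such as 'summary.csv' works the same way.
out = io.StringIO()
df.to_csv(out, header=False)
result = out.getvalue()
print(result)
```

Note that the positions shift by one once the first column becomes the index, and that both slice endpoints must match a label in the file exactly, because the labels are plain strings rather than parsed datetimes.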