Question

I want to select data from the original file by date and time and write it to a CSV file.

data = pd.read_csv(r'dataset.csv', low_memory=False, header = None, sep = ',')
s = pd.Series(data.loc['4/1/2019 7:57':'4/1/2019 12:27' , data.index[1,8,15,22,29,36,43]])

data = pd.DataFrame(s)
data.to_csv('summary.csv', index = False, header = None)

The error is "too many indices for array":

<ipython-input-430-ca5724310254> in <module>
      1 # Load the dataset using Pandas
      2 data = pd.read_csv(r'Mill Operation U1.csv', low_memory=False, header = None, sep = ',')
----> 3 s = pd.Series(data.loc['4/1/2019 7:57':'4/1/2019 12:27' , data.index[1,8,15,22,29,36,43]])
      4 
      5 

~\Anaconda3\lib\site-packages\pandas\core\indexes\range.py in __getitem__(self, key)
    588 
    589         # fall back to Int64Index
--> 590         return super_getitem(key)
    591 
    592     def __floordiv__(self, other):

~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in __getitem__(self, key)
   3967 
   3968         key = com.values_from_object(key)
-> 3969         result = getitem(key)
   3970         if not is_scalar(result):
   3971             return promote(result)

IndexError: too many indices for array

Any ideas?

Answer 1

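I believe you need data.columns instead of data.index, with the positions passed as a list (note the double brackets). This assumes the date/time values are the row labels of data: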

df = data.loc['4/1/2019 7:57':'4/1/2019 12:27', data.columns[[1,8,15,22,29,36,43]]]
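To make the difference between the two bracket forms concrete, here is a minimal sketch. The small frame is a made-up stand-in for the real file, and the variable names (tuple_indexing_failed, cols, subset) are only for illustration.

```python
import pandas as pd

# Stand-in for a frame loaded with header=None: integer column labels 0..3
data = pd.DataFrame([[10, 20, 30, 40],
                     [11, 21, 31, 41],
                     [12, 22, 32, 42]])

# A comma inside single brackets builds the tuple (1, 2), which is treated
# as an index into two dimensions of a one-dimensional Index
try:
    data.index[1, 2]
    tuple_indexing_failed = False
except IndexError:
    tuple_indexing_failed = True

# A list inside the brackets picks several positions from the column labels
cols = data.columns[[1, 2]]

# All rows, only the columns at positions 1 and 2
subset = data.loc[:, cols]
print(tuple_indexing_failed)
print(subset)
```

The same double-bracket form works with any list of positions, such as [1, 8, 15, 22, 29, 36, 43] from the question, as long as the frame has that many columns.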

Example:

import pandas as pd

# Date/time strings used as the row labels
idx = ['4/1/2019 6:57', '4/1/2019 7:57', '4/1/2019 8:57', '4/1/2019 9:57',
       '4/1/2019 12:27', '4/1/2019 15:57']
data = pd.DataFrame({
    'A': list('abcdef'),
    'B': [4, 5, 4, 5, 5, 4],
    'C': [7, 8, 9, 4, 2, 3],
    'D': [1, 3, 5, 7, 1, 0],
    'E': [5, 3, 6, 9, 2, 4],
    'F': list('aaabbb')
}, index=idx)

# Rows between the two labels (both included), columns at positions 1, 2 and 4
df = data.loc['4/1/2019 7:57':'4/1/2019 12:27', data.columns[[1, 2, 4]]]
print(df)
                B  C  E
4/1/2019 7:57   5  8  3
4/1/2019 8:57   4  9  6
4/1/2019 9:57   5  4  9
4/1/2019 12:27  5  2  2
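In the question the frame comes from read_csv with header=None, so the row labels would be 0, 1, 2, ... rather than timestamps, and the .loc slice above would not find the date strings. A possible end-to-end sketch follows. It assumes the timestamp sits in the first column and is written as month/day/year hour:minute. The inline sample data, the raw and out buffers, and the chosen column positions are all hypothetical.

```python
import io
import pandas as pd

# Hypothetical file contents: timestamp in column 0, readings in the rest
raw = io.StringIO(
    "4/1/2019 6:57,1,2,3\n"
    "4/1/2019 7:57,4,5,6\n"
    "4/1/2019 9:57,7,8,9\n"
    "4/1/2019 12:27,10,11,12\n"
    "4/1/2019 15:57,13,14,15\n"
)

# index_col=0 makes the timestamp column the row index
data = pd.read_csv(raw, header=None, index_col=0)

# Parse the labels so the slice compares real timestamps, not strings
# (format is an assumption about the file)
data.index = pd.to_datetime(data.index, format="%m/%d/%Y %H:%M")

# Label slices with .loc include both endpoints;
# data.columns[[0, 2]] picks the first and third remaining columns
window = data.loc["2019-04-01 07:57":"2019-04-01 12:27", data.columns[[0, 2]]]

# Write the selection without a header row; an in-memory buffer is used here,
# a path such as 'summary.csv' can be passed instead
out = io.StringIO()
window.to_csv(out, header=False)
print(out.getvalue())
```

Unlike the question's to_csv call, this keeps the index so the timestamps end up in the output; pass index=False to drop them.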
