TypeError while iterating

Asked: 2019-10-30 13:50:04

Tags: python pandas numpy for-loop

I am working with a huge DataFrame:

reader = pd.read_csv("D:/...path.../test.csv", names=["id_easy","ordinal", "latitude", "longitude","epoch",'weekday'], 
                 parse_dates=['epoch'], chunksize=n_rows, error_bad_lines=False)

day_names = (('0:00', '1:00'),('1:00', '2:00'),('2:00', '3:00'),('3:00', '4:00'),('4:00', '5:00'),('5:00', '6:00'),
             ('6:00', '7:00'),('7:00', '8:00'),('8:00', '9:00'),('9:00', '10:00'),('10:00', '11:00'),('11:00', '12:00'),
             ('12:00', '13:00'),('13:00', '14:00'),('14:00', '15:00'),('15:00', '16:00'),('16:00', '17:00'),('17:00', '18:00'),
             ('18:00', '19:00'),('19:00', '20:00'),('20:00', '21:00'),('21:00', '22:00'),('22:00', '23:00'),('23:00', '00:00'))

for df in reader: 
    if not df.empty: 
        df['epoch'] = pd.to_datetime(df.epoch,unit = 's')
        df.index = pd.to_datetime(df.epoch)
        for day in day_names: 
            day_df = df.between_time[day] # ERROR IS HERE
            if not day_df.empty:
                day_df.to_csv(f'{day}.csv', index=False, header=False, mode='a')

  

TypeError: 'method' object is not subscriptable


The desired output is 24 .csv files, e.g. final1, final2, ..., final24.


Sample data:

e35f652a    68  11.9125 3.7432  1465084811  Sunday
e35f652a    69  11.8992 3.7412  1465084870  Sunday
e35f652a    70  11.8866 3.7342  1465084930  Sunday
e35f652a    71  11.8755 3.7321  1465084990  Sunday
e35f652a    72  11.8675 3.7247  1465085050  Sunday

This question is more or less similar.

1 Answer:

Answer 0: (score: 3)

DataFrame.between_time() is a method, so call it with () instead of indexing it with [], and select the first and second values of each tuple by index:

for day in day_names: 
    day_df = df.between_time(day[0], day[1])

Or change the loop to unpack the tuples:

for s, e in day_names: 
    day_df = df.between_time(s, e)
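
For a complete picture, here is a minimal, self-contained sketch of the unpacking variant run against the five sample rows from the question. The helper name split_by_hour and the slot numbering via enumerate are illustrative assumptions, not part of the original code; the numbering is only there so the pieces can later be mapped to names like final1 ... final24.

```python
import pandas as pd

# Sample rows from the question; epoch is a Unix timestamp in seconds.
df = pd.DataFrame(
    {
        "id_easy": ["e35f652a"] * 5,
        "ordinal": [68, 69, 70, 71, 72],
        "latitude": [11.9125, 11.8992, 11.8866, 11.8755, 11.8675],
        "longitude": [3.7432, 3.7412, 3.7342, 3.7321, 3.7247],
        "epoch": [1465084811, 1465084870, 1465084930, 1465084990, 1465085050],
        "weekday": ["Sunday"] * 5,
    }
)
# between_time() needs a DatetimeIndex, so index the frame by the parsed epoch.
df.index = pd.to_datetime(df["epoch"], unit="s")

# Hourly (start, end) pairs, same shape as day_names in the question.
day_names = [(f"{h}:00", f"{(h + 1) % 24}:00") for h in range(24)]


def split_by_hour(frame, slots):
    """Hypothetical helper: return {slot_number: sub-frame} for non-empty slots."""
    parts = {}
    for number, (start, end) in enumerate(slots, start=1):
        part = frame.between_time(start, end)  # called with (), not indexed with []
        if not part.empty:
            parts[number] = part
    return parts


parts = split_by_hour(df, day_names)
```

Inside the chunk loop, each piece could then be appended to its own file with something like part.to_csv(f'final{number}.csv', index=False, header=False, mode='a'). The final{number} naming is an assumption based on the desired output in the question; it also avoids the colons that f'{day}.csv' would put into the file name. Note that between_time() includes both end points by default, so a row stamped exactly on the hour would land in two adjacent slots; depending on the pandas version, the inclusive argument (or the older include_end) can be used to exclude one end.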