Grouping in chunks in pandas

Time: 2017-09-06 08:18:19

Tags: python pandas

Here is my code:

path = 'C:\\Users\\Daniil\\Desktop\\dw_payments'
#list of all df:
all_files = glob.glob(path + '/*.csv')
all_payments_data = pd.DataFrame()
dfs = []
for file in all_files:
    df = pd.read_csv(file,index_col = None,chunksize = 200000)
    df_f = df[df['CUSTOMER_NO'] == 20069675]
    df_f = pd.concat(df_f,ignore_index = True)
    dfs.append(df_f)

all_payments_data = pd.concat(dfs)

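As you can see in the line df_f = df[df['CUSTOMER_NO'] == 20069675], I want to select a specific customer within each chunk and then merge the result into an empty DataFrame. I want to repeat this process many times (there are a lot of files).

But it gives me an error: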

TypeError: 'TextFileReader' object is not subscriptable 

How can I fix this?

1 Answer:

Answer 0: (score: 2)

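I think you need to iterate over the TextFileReader, filter each chunk, and append the result to df_s, then call concat once at the end. When chunksize is set, read_csv returns a TextFileReader (an iterator that yields DataFrames) rather than a DataFrame, which is why it cannot be indexed with df[...] directly.

Note: all files must have the same structure (the same column names).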

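df_s = []
for file in all_files:
    # With chunksize set, read_csv returns a TextFileReader (an iterator of DataFrames)
    txt = pd.read_csv(file, index_col=None, chunksize=200000)
    for df in txt:
        # Keep only this customer's rows from the current chunk
        df_s.append(df[df['CUSTOMER_NO'] == 20069675])

# Concatenate all filtered chunks once, at the end
df_f = pd.concat(df_s, ignore_index=True)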
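
If you want to try the same pattern without the real payment files, here is a minimal self-contained sketch. The helper name filter_customer, the temporary files a.csv and b.csv, the AMOUNT column and the toy values are all made up for illustration; only CUSTOMER_NO, the customer number and the chunked read_csv call come from the question. A tiny chunksize is used so that the small files really are split into several chunks.

```python
import glob
import os
import tempfile

import pandas as pd


def filter_customer(paths, customer_no, chunksize=200000):
    """Collect one customer's rows from several CSV files, reading chunk by chunk.

    Hypothetical helper, not part of pandas.
    """
    matches = []
    for path in paths:
        # With chunksize set, read_csv yields DataFrames one chunk at a time.
        for chunk in pd.read_csv(path, chunksize=chunksize):
            matches.append(chunk[chunk['CUSTOMER_NO'] == customer_no])
    if not matches:
        # No input files at all: return an empty frame instead of failing in concat.
        return pd.DataFrame()
    return pd.concat(matches, ignore_index=True)


# Toy data standing in for the real payment files (assumed layout).
tmp_dir = tempfile.mkdtemp()
pd.DataFrame({'CUSTOMER_NO': [20069675, 1, 20069675],
              'AMOUNT': [10, 20, 30]}).to_csv(os.path.join(tmp_dir, 'a.csv'), index=False)
pd.DataFrame({'CUSTOMER_NO': [2, 20069675],
              'AMOUNT': [40, 50]}).to_csv(os.path.join(tmp_dir, 'b.csv'), index=False)

# Sort the paths so the files are processed in a predictable order.
files = sorted(glob.glob(os.path.join(tmp_dir, '*.csv')))
result = filter_customer(files, 20069675, chunksize=2)
print(result)
```

Filtering inside the loop means that only the matching rows of each chunk are kept in memory, which is the point of reading with chunksize in the first place.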