Question

Here is my code:

path = 'C:\\Users\\Daniil\\Desktop\\dw_payments'
#list of all df:
all_files = glob.glob(path + '/*.csv')
all_payments_data = pd.DataFrame()
dfs = []
for file in all_files:
    df = pd.read_csv(file,index_col = None,chunksize = 200000)
    df_f = df[df['CUSTOMER_NO'] == 20069675]
    df_f = pd.concat(df_f,ignore_index = True)
    dfs.append(df_f)

all_payments_data = pd.concat(dfs)

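As you can see in the line df_f = df[df['CUSTOMER_NO'] == 20069675], I want to select a specific customer within each chunk and then merge the results into the empty DataFrame. I want to repeat this process many times (there are a lot of files).

But it gives me an error: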

TypeError: 'TextFileReader' object is not subscriptable

How can I fix this?

Answer 1

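When chunksize is set, pd.read_csv returns a TextFileReader, which is an iterator over DataFrame chunks rather than a DataFrame, so it cannot be indexed with []. I think you need to iterate over the TextFileReader, filter each chunk, and append the result to df_s. Then call concat once at the end.

Note: all files must have the same structure (the same column names).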

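df_s = []
for file in all_files:
    # with chunksize set, read_csv returns a TextFileReader (an iterator of DataFrames)
    txt = pd.read_csv(file,index_col = None,chunksize = 200000)
    for df in txt:
        # each df is an ordinary DataFrame, so boolean indexing works here
        df_s.append(df[df['CUSTOMER_NO'] == 20069675])

# concatenate all filtered chunks once, after both loops have finished
df_f = pd.concat(df_s,ignore_index = True)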
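
The same approach can be wrapped in a self-contained function, which may be easier to reuse and test. This is only a sketch: the function name filter_csv_files and its parameters are made up for illustration, and it assumes every file contains the column being filtered on.

```python
import glob
import os

import pandas as pd


def filter_csv_files(pattern, column, value, chunksize=200_000):
    """Collect rows where `column == value` from every CSV matching `pattern`.

    Hypothetical helper for illustration; assumes all files share the same columns.
    """
    matches = []
    for file in sorted(glob.glob(pattern)):
        # With chunksize set, read_csv yields one DataFrame per chunk
        for chunk in pd.read_csv(file, chunksize=chunksize):
            matches.append(chunk[chunk[column] == value])
    if not matches:
        # pd.concat raises ValueError on an empty list, so return early
        return pd.DataFrame()
    # Single concat at the end instead of growing a DataFrame inside the loop
    return pd.concat(matches, ignore_index=True)
```

With the names from the question, a call might look like all_payments_data = filter_csv_files(os.path.join(path, '*.csv'), 'CUSTOMER_NO', 20069675). Only one chunk per file is held in memory at a time, plus the rows that matched.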
