Question

I have a DataFrame that looks like this:

   Product   Order   Sales   Variable1  Variable2
0   AB12      500     47000    sdf         345
1   AC19      812     89300    sdf         4235
2   AD55      987     23280    wef         sdf
3   ID92      854     96821    sdf2        2342
4   OP23      851     98600    ewt         342

 .....
95  IU84      789     537850
96  OD93      785     218651

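I want to get all rows from index 3 (where Product is ID92) through index 95 (where Product is IU84), inclusive. I also only want the columns up to and including Sales, leaving out Variable1 and Variable2.

Here is my current code: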

import os
import pandas as pd

df_total = pd.DataFrame()

for subdir, dirs, files in os.walk(data_location):
    for file in files:
        if file.endswith((".xls", ".xlsx")):
            df_file = pd.read_excel(os.path.join(subdir, file))
            # Index of the first row whose Product matches each marker
            a = int(df_file[df_file['Product']=="ID92"].index[0])
            b = int(df_file[df_file['Product']=="IU84"].index[0])
            selected = df_file.loc[a:b, :'Sales']

            df_total = pd.concat([selected, df_total], ignore_index=True)

But I keep getting the error "index 0 is out of bounds for axis 0 with size 0" on these two lines:

a = int(df_file[df_file['Product']=="ID92"].index[0])
b = int(df_file[df_file['Product']=="IU84"].index[0])

I'm wondering whether it happens because I'm looping over multiple files. Can anyone help me resolve this error?

Answer 1

If those are the specific rows you need, then:

df.loc[3:95,:]
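
As for the error itself: "index 0 is out of bounds for axis 0 with size 0" means the filter matched no rows, so `.index[0]` has nothing to return. One likely cause, though I can't confirm it without seeing your files, is that at least one of the Excel files in the loop doesn't contain ID92 or IU84, or contains them with stray whitespace. Below is a minimal sketch of a guard for that case. The helper name `select_between`, its parameters, and the demo data are made up for illustration, and the whitespace stripping is an assumption about your data.

```python
import pandas as pd

def select_between(df, start="ID92", end="IU84", last_col="Sales"):
    """Return rows from the first `start` product through the first `end`
    product (inclusive), with columns up to `last_col`; None if a marker is missing."""
    # Assumption: cell values may carry stray whitespace, so normalise first.
    products = df["Product"].astype(str).str.strip()
    start_idx = df.index[(products == start).to_numpy()]
    end_idx = df.index[(products == end).to_numpy()]
    # An empty match is exactly the situation that makes `.index[0]` fail.
    if start_idx.empty or end_idx.empty:
        return None
    # .loc slices by label and includes both endpoints, for rows and columns.
    return df.loc[start_idx[0]:end_idx[0], :last_col]

# Small made-up frame standing in for one Excel file.
demo = pd.DataFrame({
    "Product": ["AB12", "ID92", "OP23", "IU84", "OD93"],
    "Order": [500, 854, 851, 789, 785],
    "Sales": [47000, 96821, 98600, 537850, 218651],
    "Variable1": ["sdf", "sdf2", "ewt", None, None],
})

result = select_between(demo)
# A "file" without ID92 is skipped instead of raising IndexError.
missing = select_between(demo[demo["Product"] != "ID92"])
```

Inside your loop you would call the helper on `df_file` and only concatenate when the result is not None. Printing the file name when it returns None should show which file is missing a marker.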
