Question

I have a list of DataFrame names, and I want to assign different DataFrame data to each of them.

filenames =[]

for i in np.arange(1,7):
    a = "C:\Users\...........\Python code\Cp error for MPE MR%s.csv" %(i)
    filenames.append(a)

dfs =[df1,df2,df3,df4,df5,df6]

for i, j in enumerate(filenames):
    dfs[j]= pd.DataFrame.from_csv(i,header=0, index_col=None)

However, this raises the following error:

NameError: name 'df1' is not defined

Is there something wrong with the way I define the list? Why can't the values in the list be used as variables to assign to?

How can I put the following code into a loop?

df1 = pd.DataFrame.from_csv(filenames[0],header=0, index_col=None)
df2 = pd.DataFrame.from_csv(filenames[1],header=0, index_col=None)
df3 = pd.DataFrame.from_csv(filenames[2],header=0, index_col=None)
df4 = pd.DataFrame.from_csv(filenames[3],header=0, index_col=None)
df5 = pd.DataFrame.from_csv(filenames[4],header=0, index_col=None)
df6 = pd.DataFrame.from_csv(filenames[5],header=0, index_col=None)

Answer 1

It looks like you need a dict comprehension. One possible way to list the files is with glob:

Sample files:

a.csv, b.csv, c.csv.

import glob
import os
import pandas as pd

files = glob.glob('files/*.csv')
# key = file name without directory or extension (os.path also handles Windows paths)
dfs = {os.path.splitext(os.path.split(fp)[1])[0]: pd.read_csv(fp) for fp in files}
print (dfs)
{'b':    a  b  c  d
0  0  9  6  5
1  1  6  4  2, 'a':    a  b  c  d
0  0  1  2  5
1  1  5  8  3, 'c':    a  b  c  d
0  0  7  1  7
1  1  3  2  6}

print (dfs['a'])
   a  b  c  d
0  0  1  2  5
1  1  5  8  3
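
The same dictionary can also be keyed with pathlib, where Path.stem drops both the directory and the extension. This is only a sketch: the temporary folder, the file names and the file contents below are made up for the demonstration and are not the asker's data.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data: write three small CSV files to a temporary folder.
folder = Path(tempfile.mkdtemp())
for name in ("a", "b", "c"):
    pd.DataFrame({"a": [0, 1], "b": [1, 5]}).to_csv(folder / f"{name}.csv", index=False)

# Path.stem is the file name without directory or extension, e.g. 'a' for 'a.csv'.
# sorted() makes the key order deterministic; glob order is not guaranteed.
dfs_by_name = {fp.stem: pd.read_csv(fp) for fp in sorted(folder.glob("*.csv"))}

print(list(dfs_by_name))
print(dfs_by_name["a"])
```

Sorting the paths is optional, but it keeps the dictionary order stable between runs, which the unsorted glob output above does not.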

If every file has the same columns, you can use concat to build one big DataFrame:

df = pd.concat(dfs)
print (df)
     a  b  c  d
a 0  0  1  2  5
  1  1  5  8  3
b 0  0  9  6  5
  1  1  6  4  2
c 0  0  7  1  7
  1  1  3  2  6

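Edit:
It is better to use read_csv than pd.DataFrame.from_csv (from_csv was deprecated in pandas 0.21 and removed in 1.0):

A solution using global variables: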

#for df0, df1, df2...
for i, fp in enumerate(files):
    print (fp)
    df = pd.read_csv(fp, header=0, index_col=None)
    globals()['df' + str(i)] = df

print (df1)
   a  b  c  d
0  0  9  6  5
1  1  6  4  2

A better solution is a list of DataFrames, selecting each one by position:

#for dfs[0], dfs[1], dfs[2]...
dfs = [pd.read_csv(fp, header=0, index_col=None) for fp in files]

print (dfs[1])
   a  b  c  d
0  0  9  6  5
1  1  6  4  2
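
Applied to numbered files like the ones in the question, the list approach might look like the sketch below. The real directory is elided in the question, so a temporary folder and the short pattern MR1.csv to MR6.csv (with made-up contents) stand in for it here.

```python
import os
import tempfile

import pandas as pd

# Hypothetical setup: six small numbered CSV files in a temporary folder,
# standing in for the question's "... MR1.csv" to "... MR6.csv".
folder = tempfile.mkdtemp()
for i in range(1, 7):
    pd.DataFrame({"x": [i, i * 10]}).to_csv(os.path.join(folder, f"MR{i}.csv"), index=False)

# Build the file names first, then read them in the same order.
filenames = [os.path.join(folder, f"MR{i}.csv") for i in range(1, 7)]
dfs = [pd.read_csv(fp, header=0, index_col=None) for fp in filenames]

# dfs[0] plays the role of df1, dfs[1] of df2, and so on.
print(dfs[0])
```

Because the list is built from filenames, the positions line up: dfs[k] always holds the data from filenames[k].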

Answer 2

dfs =[df1,df2,df3,df4,df5,df6]?

Why this line? Why not just:

dfs =[]

Also, I think you swapped i and j. It should be:

dfs.append(pd.DataFrame.from_csv(j,header=0, index_col=None))

The enumerate is redundant anyway:

for f in filenames:
    dfs.append(pd.DataFrame.from_csv(f,header=0, index_col=None))
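
Both problems can be reproduced without reading any CSV files. In this sketch the file names are made up and nothing is loaded from disk; it only shows that a list literal looks up df1 before it exists, and that enumerate yields the index first and the item second. In the question the NameError is raised first, so the second problem is never reached.

```python
filenames = ["one.csv", "two.csv"]  # made-up names, nothing is read from disk

# A list literal evaluates its elements immediately, so undefined names fail here.
try:
    dfs = [df1, df2]
except NameError as exc:
    name_error = str(exc)

# enumerate yields (index, item): the integer comes first, the file name second.
pairs = list(enumerate(filenames))

# Indexing a list with a file name, as dfs[j] does in the question, is a TypeError.
dfs = []
try:
    dfs[filenames[0]] = None
except TypeError as exc:
    type_error = str(exc)

print(name_error)
print(pairs)
```

Starting from an empty list and appending, as above, avoids both issues because no name has to exist before the loop runs.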

Assigning DataFrame data to DataFrame labels using a loop