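I'm trying to open a list of CSV files in a Python loop. My first idea was to load the files into a dictionary, because I was told not to try to create variable names dynamically. I tried the following code: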
import pandas as pd

filenames = ["broaderRelationsSkillPillar.csv", "ISCOGroups_en.csv"]
dataframes = {}  ## create a dictionary
for i in filenames:
    dataframes[i] = pd.read_csv(i)
for k, v in dataframes.items():
    [k] = pd.DataFrame.from_dict(dataframes[k])
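Note: my problem occurs in the last loop; my result is only one of the 2 DataFrames.
Can I open these CSV files directly in the loop and name them on the fly? I have about 20 CSVs and am trying to automate the code. Thanks.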
d = {'col1': [1, 2], 'col2': [3, 4]}
a = {'col3': [1, 2], 'col4': [3, 4]}
c = {'col3': [1, 2], 'col4': [3, 4]}
d= pd.DataFrame(data=d)
a= pd.DataFrame(data=a)
c= pd.DataFrame(data=c)
filenames = [a, d, c]
dataframes = {}  ## create a dictionary
for i in filenames:
    dataframes[i] = i
del a, c, d
for k, v in dataframes.items():
    k = pd.from_dict(dataframes[k])
Answer 0 (score: 1)
I believe you need a dict comprehension to store the DataFrames in a dictionary keyed by file name:
dataframes = {i:pd.read_csv(i) for i in filenames}
print (dataframes['broaderRelationsSkillPillar.csv'])
print (dataframes['ISCOGroups_en.csv'])
Alternatively, the trailing .csv can be removed from each key by slicing:
dataframes = {i[:-4]: pd.read_csv(i) for i in filenames}
print (dataframes['broaderRelationsSkillPillar'])
print (dataframes['ISCOGroups_en'])
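With around 20 files, the file name list itself could also be built automatically. The following is only a minimal sketch, assuming all CSV files sit in one folder; the `load_csv_folder` helper and its `folder` argument are hypothetical names, not part of the original answer. `Path.stem` drops the extension without relying on the `[:-4]` slice:

```python
from pathlib import Path

import pandas as pd


def load_csv_folder(folder):
    """Read every *.csv file in `folder` into a dict keyed by file name without extension."""
    # sorted() only makes the key order deterministic
    return {path.stem: pd.read_csv(path) for path in sorted(Path(folder).glob("*.csv"))}
```

It might then be used as `dataframes = load_csv_folder(".")`, followed by `dataframes['ISCOGroups_en']` to pick one file.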
Sample DataFrames:
df1 = pd.DataFrame({'A': ['a','a'],'B': list(range(2))})
df2 = pd.DataFrame({'C': ['b','f','s'],'D': list(range(3))})
df3 = pd.DataFrame({'E': ['f','g','h'],'F': list(range(3))})
print (df1)
   A  B
0  a  0
1  a  1
print (df2)
   C  D
0  b  0
1  f  1
2  s  2
print (df3)
   E  F
0  f  0
1  g  1
2  h  2
Create a dictionary of DataFrames:
dataframes = {'file1':df1, 'file2':df2, 'file3':df3}
print (dataframes)
{'file1':    A  B
0  a  0
1  a  1, 'file2':    C  D
0  b  0
1  f  1
2  s  2, 'file3':    E  F
0  f  0
1  g  1
2  h  2}
To get a single DataFrame, select it by key, e.g. file1:
print (dataframes['file1'])
   A  B
0  a  0
1  a  1
When looping over the dictionary, k is the key and v is the DataFrame:
for k, v in dataframes.items():
    print (k)
    print (v)
    print (type(v))
file1
   A  B
0  a  0
1  a  1
<class 'pandas.core.frame.DataFrame'>
file2
   C  D
0  b  0
1  f  1
2  s  2
<class 'pandas.core.frame.DataFrame'>
file3
   E  F
0  f  0
1  g  1
2  h  2
<class 'pandas.core.frame.DataFrame'>
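The same kind of loop can serve as a quick sanity check once many files are loaded. A small sketch, assuming sample frames like the ones above; the `shapes` name is only illustrative:

```python
import pandas as pd

dataframes = {
    'file1': pd.DataFrame({'A': ['a', 'a'], 'B': list(range(2))}),
    'file2': pd.DataFrame({'C': ['b', 'f', 's'], 'D': list(range(3))}),
    'file3': pd.DataFrame({'E': ['f', 'g', 'h'], 'F': list(range(3))}),
}

# Collect (rows, columns) per key to see at a glance what was loaded
shapes = {name: df.shape for name, df in dataframes.items()}
print(shapes)
```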
If you want to modify the DataFrames in a loop, you need to assign the result back to the original dictionary through its key:
for k, v in dataframes.items():
    # modify df - e.g. add `a` to first column
    v.iloc[:, 0] = v.iloc[:, 0] + 'a'
    print (v)
    dataframes[k] = v
    A  B
0  aa  0
1  aa  1
    C  D
0  ba  0
1  fa  1
2  sa  2
    E  F
0  fa  0
1  ga  1
2  ha  2
The dictionary of DataFrames after the loop:
print (dataframes)
{'file1':     A  B
0  aa  0
1  aa  1, 'file2':     C  D
0  ba  0
1  fa  1
2  sa  2, 'file3':     E  F
0  fa  0
1  ga  1
2  ha  2}
Select a single DataFrame:
print (dataframes['file1'])
    A  B
0  aa  0
1  aa  1
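Why assigning through the key matters can be seen with an operation that returns a new DataFrame. This is only a sketch; `rename` is chosen purely for illustration and is not part of the original answer. Rebinding the loop variable `v` leaves the dictionary untouched, while `dataframes[k] = ...` replaces the stored object:

```python
import pandas as pd

dataframes = {
    'file1': pd.DataFrame({'A': ['a', 'a'], 'B': list(range(2))}),
    'file2': pd.DataFrame({'C': ['b', 'f', 's'], 'D': list(range(3))}),
}

# Rebinding v only changes the local name; the dict still holds the old objects
for k, v in dataframes.items():
    v = v.rename(columns=str.lower)

columns_after_rebinding = [list(df.columns) for df in dataframes.values()]

# Assigning through the key stores the new DataFrame in the dict
for k in dataframes:
    dataframes[k] = dataframes[k].rename(columns=str.lower)

columns_after_assignment = [list(df.columns) for df in dataframes.values()]
```

The same reasoning applies to the question's loops: assigning to `k` inside the loop only rebinds a local name and never creates a separately named DataFrame.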