I have tried:
>>> df = [pd.read_csv(x,header=None,names=["L1","L2","cache","cached","result"]) for x in iglob(os.path.join("test","**","*.csv"), recursive=True)]
>>> df
[   L1  L2  cache  cached  result
0   0   0      0       0       0
1   1   2      3       4       5
2   1   1      1       1       1
3   2   2      2       2       2
4   4   4      4       4       4,    L1  L2  cache  cached  result
0   1   2      3       4       5
1   1   2      3       4       5
2   3   4      5       6       7
3   2   1      3       2       4]
The folder structure is:
test
|
|_______ wait
|
|______ 0.2322.csv
|______ 1.234.csv
These two files contain:
0.2322.csv
0,0,0,0,0
1,2,3,4,5
1,1,1,1,1
2,2,2,2,2
4,4,4,4,4
1.234.csv
1,2,3,4,5
1,2,3,4,5
3,4,5,6,7
2,1,3,2,4
When I try to access the dataframes from the df list, I have to use the index values 0 and 1, i.e. df[0] and df[1], to call them.
But I want to use the file names as the index, i.e. df["0.2322"] and df["1.234"], to call the dataframe of each file. I don't know how this is possible. Please let me know what I can do to achieve this.
Answer 0 (score: 1)
I think you need a dictionary comprehension that parses each file name and drops its extension:
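import os
from glob import iglob
import pandas as pd
#https://stackoverflow.com/a/678242
# Key each DataFrame by its file path with the extension stripped.
df = {os.path.splitext(x)[0]: pd.read_csv(x, header=None, names=["L1","L2","cache","cached","result"])
      for x in iglob(os.path.join("test","**","*.csv"), recursive=True)}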
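EDIT:
The keys above still contain the directory part of each path, because os.path.splitext only removes the extension. To keep only the file name, also apply os.path.basename:
#https://stackoverflow.com/a/37760212
df = {os.path.splitext(os.path.basename(x))[0]: pd.read_csv(x, header=None, names=["L1","L2","cache","cached","result"])
      for x in iglob(os.path.join("test","**","*.csv"), recursive=True)}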
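To check that the lookup by file name works, here is a minimal, self-contained sketch of the same dictionary comprehension. It is only an illustration: the helper name load_frames, the COLUMNS constant, the frames variable and the temporary folder are made up for this example, and the assignment of sample rows to each file is assumed from the question.

```python
import os
import tempfile
from glob import iglob

import pandas as pd

COLUMNS = ["L1", "L2", "cache", "cached", "result"]


def load_frames(root):
    """Map each CSV's base name (no directory, no extension) to its DataFrame."""
    pattern = os.path.join(root, "**", "*.csv")
    return {
        os.path.splitext(os.path.basename(path))[0]: pd.read_csv(
            path, header=None, names=COLUMNS
        )
        for path in iglob(pattern, recursive=True)
    }


# Hypothetical demo data: recreate the two files in a throwaway folder.
# Which rows belong to which file is an assumption made for this sketch.
with tempfile.TemporaryDirectory() as tmp:
    folder = os.path.join(tmp, "test", "wait")
    os.makedirs(folder)
    samples = {
        "0.2322.csv": "0,0,0,0,0\n1,2,3,4,5\n1,1,1,1,1\n2,2,2,2,2\n4,4,4,4,4\n",
        "1.234.csv": "1,2,3,4,5\n1,2,3,4,5\n3,4,5,6,7\n2,1,3,2,4\n",
    }
    for name, text in samples.items():
        with open(os.path.join(folder, name), "w") as handle:
            handle.write(text)

    # The DataFrames are fully loaded, so they outlive the temp folder.
    frames = load_frames(os.path.join(tmp, "test"))

print(sorted(frames))       # the keys are the bare file names
print(frames["0.2322"])     # look a DataFrame up by file name
```

One thing to keep in mind with this approach: if two CSV files in different subfolders share the same name, the later one overwrites the earlier one in the dictionary, so the keys are only safe when the file names are unique.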