Question

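I have a folder containing many text files (e.g. 164400). Each file has several lines of floating-point values (e.g. x, y, z). My code reads a group of 3000 files at a time and stores the values in a dictionary (see example).

The code is slow when it opens the 3000 files.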

[[points_dict[os.path.split(x)[1]].append(p) for p in open(x,"r")] for x in lf]

I would like to know whether anyone has a more efficient and faster way to read the files.

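import os
from collections import defaultdict
from itertools import count, groupby

file_folder = "C:\\junk"  # where I stored my files
points_dict = defaultdict(list)
# list the full path of every file in the folder, then split them into groups of 3000
files = [os.path.join(file_folder, f) for f in os.listdir(file_folder)]
groups = groupby(files, key=lambda k, line=count(): next(line) // 3000)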
for k, group in groups:
    lf = [p for p in group]
    [[points_dict[os.path.split(x)[1]].append(p) for p in open(x,"r")] for x in lf]
# do other

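where **os.path.split(x)[1]** returns the file name (id) used as the dictionary key, so that lines from the same file are stored together, and **lf** is the list of files to open.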
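For readers unfamiliar with the grouping idiom used above, here is a minimal sketch of how `groupby` combined with `count` splits a sequence into consecutive chunks. The helper name `chunked` and the sample file names are hypothetical, and the chunk size is 3 rather than 3000 to keep the illustration small.

```python
from itertools import count, groupby


def chunked(items, size):
    """Yield consecutive lists holding at most `size` items each."""
    counter = count()
    # The key function ignores the item and returns a running index
    # divided by `size`, so the key changes every `size` items.
    for _, group in groupby(items, key=lambda _item: next(counter) // size):
        yield list(group)


# Hypothetical file names, grouped 3 at a time instead of 3000.
names = ["f%d.txt" % i for i in range(7)]
chunks = list(chunked(names, 3))
```

Each chunk plays the role of **lf** in the question's loop; the last chunk simply holds whatever files remain.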

Answer 1

How about using numpy? Something along these lines (edited answer, tested code):

import numpy

[points_dict[os.path.split(x)[1]].append(numpy.loadtxt(x, delimiter=",")) for x in lf]
# stack the arrays collected under each file name into a single array
for x, np_arrays in points_dict.items():
    points_dict[x] = numpy.vstack(np_arrays)

In the end, you get the points in a nice numpy array.
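As a self-contained illustration of the same idea, the sketch below writes two small sample files to a temporary folder and loads them with `numpy.loadtxt`. The file names and contents are made up for the example, the files are assumed to be comma-separated, and `ndmin=2` is an addition so that a file with a single line still produces a two-dimensional array.

```python
import os
import tempfile
from collections import defaultdict

import numpy

# Two small hypothetical files with comma-separated x,y,z rows.
folder = tempfile.mkdtemp()
samples = {
    "a.txt": "1.0,2.0,3.0\n4.0,5.0,6.0\n",
    "b.txt": "7.0,8.0,9.0\n10.0,11.0,12.0\n",
}
lf = []
for name, text in samples.items():
    path = os.path.join(folder, name)
    with open(path, "w") as handle:
        handle.write(text)
    lf.append(path)

# Parse each file in one call instead of appending line by line.
points_dict = defaultdict(list)
for path in lf:
    array = numpy.loadtxt(path, delimiter=",", ndmin=2)
    points_dict[os.path.basename(path)].append(array)

# One (n_points, 3) array per file name.
points = {name: numpy.vstack(arrays) for name, arrays in points_dict.items()}
```

A plain `for` loop is used here instead of a list comprehension because the comprehension is only run for its side effects; the result is the same.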
