Question

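I'm having trouble with an iteration in Python. I've tried things and searched for solutions, but I think this is beyond my current ability (FYI, I have been coding for 1 month).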

Case:
Suppose I have 3 CSV files (350 in reality): file_1.csv, file_2.csv and file_3.csv. I have already written the loop that collects all the file names into a list.

Each CSV contains a single column with many rows,
i.e.

# the actual CSV looks like this:
# for file_1.csv:
value_1
value_2
value_3

The following is not the actual CSV content (I mean that I have already converted it to arrays/Series):

file_1.csv -> [['value_1'], ['value_2'], ['value_3']]
file_2.csv -> [['value_4'], ['value_5']]
file_3.csv -> [['value_6']]

# The first step is done: the CSV file names are stored in a list, so they can be read and used by the csv functions.

filename = ['file_1.csv', 'file_2.csv', 'file_3.csv']

I want the result as a list:

# assigning an empty list
result = []

Desired result

print(result)

out:
[{'keys': 'file_1', 'values': 'value_1, value_2, value_3'},
{'keys': 'file_2', 'values': 'value_4, value_5'},
{'keys': 'file_3', 'values': 'value_6'}]

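As you can see above, the 'keys' entries no longer end in '.csv'; the extension has been removed from every file name. Also note that the CSV values (previously a list of lists, or a Series) have become a single string, separated by commas.

Thanks for your help, much appreciated.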

Answer 1

I'll try to answer this as best I can (I'm a beginner too).

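Step 1: Read the 350 file names

(In case you haven't worked this out yet, you can use the glob module for this step.)

Define the directory where the files are located, say 'C:\Test':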

import glob

directory = "C:/Test"
# Collect every path that ends in .csv, sorted by name
files = sorted(glob.glob(directory + "/*.csv"))

This collects the paths of all the CSV files in the directory.
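One thing to be aware of: a plain sorted() compares names as text, so with 350 files file_10.csv comes before file_2.csv. If numeric order matters, a sort key along the following lines may help. This is only a sketch; natural_key is a made-up helper name, not something from the glob module.

```python
import re

def natural_key(path):
    """Split a path into text and number chunks so that numbers compare numerically."""
    # re.split with a capturing group keeps the digit runs, e.g.
    # "file_10.csv" -> ["file_", "10", ".csv"]
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r"(\d+)", path)]

# Hypothetical file names, for illustration only
names = ["file_10.csv", "file_2.csv", "file_1.csv"]

plain_sorted = sorted(names)                     # text order
natural_sorted = sorted(names, key=natural_key)  # numeric order

print(plain_sorted)
print(natural_sorted)
```

The same key could be passed to the sorted() call around glob.glob(...).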

Step 2: Read the CSV files and map them to dictionaries

import os

result = []
for file in files:
    # Drop the directory and the .csv extension from the path
    filename = os.path.basename(file).split('.')[0]
    with open(file, 'r') as infile:
        print(filename)
        # Strip each line and collect the values in a temporary list
        tempvalue = [line.strip() for line in infile]
        value = ",".join(tempvalue)    # converts the temp list to a string
        tempdict = {filename: value}   # file name as key, file contents as value
        result.append(tempdict)        # adds the dictionary for this file to the result list
print(result)

This code should work (although others may post a shorter, more Pythonic version).
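Note that the loop above builds dictionaries of the form {'file_1': 'value_1,value_2,value_3'}, which is not quite the {'keys': ..., 'values': ...} shape asked for in the question. A minimal sketch of that exact shape could look like this. csv_to_record is a hypothetical helper name, and the temporary files are only there so the example runs on its own; skipping blank lines is an assumption about what is wanted.

```python
import os
import tempfile

def csv_to_record(path):
    """Return {'keys': name without extension, 'values': lines joined by ', '}."""
    key = os.path.splitext(os.path.basename(path))[0]
    with open(path) as infile:
        # Strip line endings and skip blank lines (assumed to be unwanted)
        values = [line.strip() for line in infile if line.strip()]
    return {"keys": key, "values": ", ".join(values)}

# Throwaway sample files that mimic the single-column CSVs from the question
demo_dir = tempfile.mkdtemp()
sample = {
    "file_1.csv": "value_1\nvalue_2\nvalue_3\n",
    "file_2.csv": "value_4\nvalue_5\n",
}
for name, text in sample.items():
    with open(os.path.join(demo_dir, name), "w") as out:
        out.write(text)

result = [csv_to_record(os.path.join(demo_dir, name)) for name in sorted(sample)]
print(result)
```

With the real data, the list comprehension would iterate over the paths returned by glob instead of the sample names.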

Answer 2

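Since the contents of the files already appear to be pretty much in the format you need (apart from the line endings), and you already have the names of the 350 files in a list, there isn't much processing left to do. It is mainly a matter of reading the contents of each file and stripping the newline characters.

For example: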

import os

result = []

filenames = ['file_1.csv', 'file_2.csv', 'file_3.csv']

for name in filenames:
    # Set the filename minus extension as 'keys'
    file_data = {'keys': os.path.basename(name).split('.')[0]}
    with open(name) as f:
        # Read the entire file
        contents = f.read()
        # splitlines() drops the line endings; join the values with ', '
        file_data['values'] = ', '.join(contents.splitlines())
    result.append(file_data)

print(result)
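One caveat with split('.')[0]: it cuts the name at the first dot, so a file name containing extra dots loses more than its extension. os.path.splitext removes only the last extension. A small comparison, using a made-up file name:

```python
import os

name = "report.2020.csv"  # hypothetical file name with an extra dot

by_split = os.path.basename(name).split('.')[0]            # cuts at the first dot
by_splitext = os.path.splitext(os.path.basename(name))[0]  # removes only '.csv'

print(by_split)
print(by_splitext)
```

For names like file_1.csv from the question, both approaches give the same key.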

Assigning CSV files to a collection (list) of dictionaries, with the file names as keys and the file contents as values
