Question

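I'm new to Python and know next to nothing. I have a folder containing 20,000 CSVs; the files that matter to me start with 'IC' and end with prev_date + '.csv'. I'm trying to find a way to merge all the files that pass those start and end filters into a single file.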

import os

prev_date = str('20190624')
csv_header = 'Index no,date,thesis,quantity'
csv_out = 'R:/Sam/simulator/consolidated_positions.csv'

csv_dir = 'R:/Sam/simulator/'

dir_tree = csv_dir
for dirpath, dirnames, filenames in dir_tree:
    pass

csv_list = []
for file in filenames:
    if file.endswith(prev_date + '.csv') and file.startswith('IC'):
        csv_list.append(file)

csv_merge = open(csv_out, 'w')
csv_merge.write(csv_header)
csv_merge.write('\n')

for file in csv_list:
    csv_in = open(file)
    for line in csv_in:
        if line.startswith(csv_header):
            continue
        csv_merge.write(line)
    csv_in.close()
    csv_merge.close()
print('Verify consolidated CSV file : ' + csv_out)

The problem is that the traceback keeps pointing back at my project folder:

Traceback (most recent call last):
  File "R:/Sam/Project/Branch/concatanator.py", line 10, in <module>
    for dirpath, dirnames, filenames in dir_tree:
ValueError: not enough values to unpack (expected 3, got 1)

I think it has to do with a relative file path being expected, rather than the actual file path I provided.

Also, if possible, how can I quickly exclude any files that have the word EXTRA anywhere in the filename?

Answer 1

Try changing this line of your code:

for dirpath, dirnames, filenames in dir_tree:

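to the following, which uses os.walk(). Looping over the string dir_tree itself yields one character at a time, hence the unpacking error, whereas os.walk() yields a (dirpath, dirnames, filenames) tuple for each directory: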

for dirpath, dirnames, filenames in os.walk(dir_tree):
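
To make the difference concrete, here is a minimal sketch that reproduces the error and then walks a directory. It uses a throwaway temporary directory instead of your R:/ path, and the file name inside it is made up purely for the demonstration:

```python
import os
import tempfile

# Iterating over a plain string yields single characters, so unpacking
# each item into three names raises ValueError.
try:
    for dirpath, dirnames, filenames in 'R:/Sam/simulator/':
        pass
except ValueError as exc:
    unpack_error = str(exc)
    print('string loop failed:', unpack_error)

# os.walk() yields one (dirpath, dirnames, filenames) tuple per directory.
with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name, created only for this demonstration.
    open(os.path.join(tmp, 'IC_demo_20190624.csv'), 'w').close()
    walked = [(dirpath, dirnames, filenames)
              for dirpath, dirnames, filenames in os.walk(tmp)]
    print('files found by os.walk:', walked[0][2])
```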

Regarding the second question:

If possible, how can I quickly exclude any files that have the word EXTRA anywhere in the filename?

You can try the following:

for file in filenames:
    if 'EXTRA' not in file:
        if file.endswith(prev_date + '.csv') and file.startswith('IC'):
            csv_list.append(file)
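
If you prefer, the same three conditions can be folded into one small predicate, which is easier to check in isolation. This is only a sketch: the function name is_wanted and the sample file names are made up for illustration, and the EXTRA check is case-sensitive, just like the version above.

```python
def is_wanted(filename, prev_date):
    """Return True for IC...<prev_date>.csv names that do not contain EXTRA."""
    return (filename.startswith('IC')
            and filename.endswith(prev_date + '.csv')
            and 'EXTRA' not in filename)

prev_date = '20190624'

# Hypothetical file names, used only to exercise the three conditions.
sample_names = [
    'IC_book1_20190624.csv',        # matches all three conditions
    'IC_EXTRA_book1_20190624.csv',  # rejected: contains EXTRA
    'IC_book1_20190621.csv',        # rejected: wrong date
    'XX_book1_20190624.csv',        # rejected: does not start with IC
]

csv_list = [name for name in sample_names if is_wanted(name, prev_date)]
print(csv_list)
```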

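Final code (you should use os.path.join(dirpath, file) to get the full path of each file, because a bare file name would be opened relative to the current working directory):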

csv_list = []
for dirpath, dirnames, filenames in os.walk(csv_dir):
    for file in filenames:
        if 'EXTRA' not in file:
            if file.endswith(prev_date + '.csv') and file.startswith('IC'):
                csv_list.append(os.path.join(dirpath, file))
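
For completeness, here is one way the whole script could look once the list is built from full paths. Treat it as a sketch rather than a drop-in replacement: the function name merge_csvs and the demo files are assumptions of mine, and the demo runs in a temporary directory instead of R:/Sam/simulator/. Note also that in the question's code csv_merge.close() is indented inside the for loop, so the output file would be closed after the first input file; the with blocks below avoid that.

```python
import os
import tempfile

def merge_csvs(csv_dir, csv_out, prev_date, csv_header):
    """Merge matching CSVs under csv_dir into csv_out; return the paths used."""
    csv_list = []
    for dirpath, dirnames, filenames in os.walk(csv_dir):
        for file in filenames:
            if 'EXTRA' in file:
                continue
            if file.startswith('IC') and file.endswith(prev_date + '.csv'):
                csv_list.append(os.path.join(dirpath, file))

    # The with blocks close every file, even if an error occurs midway.
    with open(csv_out, 'w') as csv_merge:
        csv_merge.write(csv_header + '\n')
        for path in csv_list:
            with open(path) as csv_in:
                for line in csv_in:
                    if line.startswith(csv_header):
                        continue  # skip each input file's own header row
                    if not line.endswith('\n'):
                        line += '\n'  # keep rows apart if a file lacks a final newline
                    csv_merge.write(line)
    return csv_list

# Demo on a throwaway directory; the file names and rows are made up.
with tempfile.TemporaryDirectory() as tmp:
    header = 'Index no,date,thesis,quantity'
    samples = {
        'IC_a_20190624.csv': header + '\n1,20190624,x,10\n',
        'IC_EXTRA_b_20190624.csv': header + '\n2,20190624,y,20\n',
        'IC_c_20190621.csv': header + '\n3,20190621,z,30\n',
    }
    for name, text in samples.items():
        with open(os.path.join(tmp, name), 'w') as f:
            f.write(text)
    out_path = os.path.join(tmp, 'consolidated_positions.csv')
    used = merge_csvs(tmp, out_path, '20190624', header)
    with open(out_path) as f:
        merged_lines = f.read().splitlines()
    print(merged_lines)
```

With your own data you would call merge_csvs(csv_dir, csv_out, prev_date, csv_header) using the variables already defined at the top of your script.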
