How to read multiple CSV files in Python and get a single CSV as output

Time: 2016-12-16 09:46:37

Tags: python csv dataset

I have already asked how to solve this with pandas, but now I need a non-pandas version.

My code:

import glob
import os

## path
path = r'C:/x/x/Desktop/xxx/'
all_files = glob.glob(os.path.join(path, '*.csv'))

## column
column_headers = ['Date', 'Time', 'Duration', 'IP', 'Request']

## open only one csv. -- I want to read here not only 1 file --
## my approach:
## with open(all_files) as log, ....
with open('log.csv') as log, open('out355.csv', 'w') as out:
    out.write(';'.join(column_headers)+'\n') 
    while True:
        try:
            lines = [next(log).strip('\n').split(' ',4) for i in range(6)][3:]
            out.write(';'.join(lines[1][:2]+[l[4] for l in lines])+'\n')
        except StopIteration:
            break

Since I am new to Python, I can't just modify my running code myself. I would be very glad to get the complete code.

Thanks!

3 answers:

Answer 0 (score: 0)

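You are close: you need to read each *.csv file and concatenate them. Open a new output file, then use glob to read each csv file in turn. Make sure every csv file ends with a trailing newline; otherwise the last line of file_x and the first data line of file_x+1 end up on the same line.

from glob import glob

with open('combined.csv', 'w') as combinedFile:  # 'w', so a re-run does not append a second copy
    combinedFile.write('a,b,c,d,e\n')  # Headers
    for eachFile in glob('*.csv'):
        if eachFile == 'combined.csv':
            continue  # don't read the output file itself
        with open(eachFile, 'r') as inFile:
            next(inFile, None)  # skip the 1st line of every file, assuming it contains the headers
            for line in inFile:
                combinedFile.write(line)

When run with these two input files:

a.csv

a,b,c,d,e
1,2,3,4,5
6,7,8,9,10

b.csv

a,b,c,d,e
11,12,13,14,15
16,17,18,19,20

combined.csv then contains the header line followed by the data rows of a.csv and b.csv.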
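The trailing-newline caveat can also be handled in code rather than by fixing the input files. The following is only a sketch under assumptions: `concat_csv` is a hypothetical helper name, every input is assumed to start with one header line, and the two temporary files simply mirror the a.csv and b.csv examples (with a.csv deliberately missing its final newline).

```python
import os
import tempfile


def concat_csv(paths, out_path, header):
    """Concatenate CSV files, skipping each file's first (header) line.

    A newline is appended to any line that lacks one, so the last row of
    one file never runs into the first row of the next.
    """
    with open(out_path, 'w') as out:
        out.write(header + '\n')
        for path in paths:
            with open(path) as src:
                next(src, None)  # skip this file's header line, if present
                for line in src:
                    out.write(line if line.endswith('\n') else line + '\n')


# Usage sketch with throwaway files; a.csv has no trailing newline on purpose.
tmp = tempfile.mkdtemp()
a = os.path.join(tmp, 'a.csv')
b = os.path.join(tmp, 'b.csv')
with open(a, 'w') as f:
    f.write('a,b,c,d,e\n1,2,3,4,5\n6,7,8,9,10')
with open(b, 'w') as f:
    f.write('a,b,c,d,e\n11,12,13,14,15\n16,17,18,19,20\n')

combined = os.path.join(tmp, 'combined.csv')
concat_csv([a, b], combined, 'a,b,c,d,e')
with open(combined) as f:
    result = f.read()
```

Passing the input paths explicitly, instead of globbing inside the function, also keeps the output file out of the input list.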

Answer 1 (score: -1)

Something along these lines should work:

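with open('out355.csv', 'w') as out:
    out.write(';'.join(column_headers) + '\n')  # header once; column_headers and all_files as in the question
    for csvfile in all_files:
        with open(csvfile) as log:
            while True:  # the rest of your script, unchanged
                try:
                    lines = [next(log).strip('\n').split(' ', 4) for i in range(6)][3:]
                    out.write(';'.join(lines[1][:2] + [l[4] for l in lines]) + '\n')
                except StopIteration:
                    break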
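One detail worth checking when looping over `all_files`: `glob.glob` returns matches in arbitrary order, and if the output file is written into the same folder it can show up among the inputs on a later run. A minimal sketch of one way to guard against both follows; `list_inputs` is a hypothetical helper, and the file names are placeholders created in a temporary folder.

```python
import glob
import os
import tempfile


def list_inputs(folder, out_name):
    """Return the folder's .csv paths in sorted order, excluding the output file."""
    paths = glob.glob(os.path.join(folder, '*.csv'))
    return sorted(p for p in paths if os.path.basename(p) != out_name)


# Usage sketch: create a few empty placeholder files and list the inputs.
tmp = tempfile.mkdtemp()
for name in ('b.csv', 'a.csv', 'out355.csv', 'notes.txt'):
    open(os.path.join(tmp, name), 'w').close()

names = [os.path.basename(p) for p in list_inputs(tmp, 'out355.csv')]
```

Sorting makes the row order of the combined file reproducible between runs.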

Answer 2 (score: -1)

This should work:

import glob
import os

## path
path = r'C:/x/x/Desktop/xxx/'
all_files = glob.glob(os.path.join(path, '*.csv'))

## column
column_headers = ['Date', 'Time', 'Duration', 'IP', 'Request']

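## 'with' closes the output file and every input file, even if an error occurs
with open('out355.csv', 'w') as out:
    out.write(';'.join(column_headers) + '\n')
    for file_ in all_files:
        with open(file_) as log:
            while True:
                try:
                    ## read 6 lines at a time and keep the last 3 of them
                    lines = [next(log).strip('\n').split(' ', 4) for i in range(6)][3:]
                    out.write(';'.join(lines[1][:2] + [l[4] for l in lines]) + '\n')
                except StopIteration:
                    break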
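The `while True` / `StopIteration` loop can be hard to follow for a newcomer, so here is a sketch of the same grouping logic as a generator built on `itertools.islice`. The names `iter_records`, `group_size`, and `skip` are hypothetical, and the sample log lines are invented for illustration, since the question does not show the real log format.

```python
from itertools import islice


def iter_records(lines, group_size=6, skip=3):
    """Yield one output row per complete group of `group_size` input lines."""
    it = iter(lines)
    while True:
        group = list(islice(it, group_size))
        if len(group) < group_size:
            return  # incomplete trailing group is dropped, as in the original loop
        # Split each line into at most 5 fields and ignore the first `skip` lines.
        parts = [line.strip('\n').split(' ', 4) for line in group][skip:]
        # Date and time come from the middle kept line, the 5th field from each kept line.
        yield parts[1][:2] + [p[4] for p in parts]


# Invented sample: 3 ignored lines, 3 data lines, then 1 leftover line.
sample = [
    'x\n', 'x\n', 'x\n',
    '2016-12-16 09:46:37 a b 0.5s\n',
    '2016-12-16 09:46:38 a b 10.0.0.1\n',
    '2016-12-16 09:46:39 a b GET /index.html\n',
    'leftover\n',
]
rows = list(iter_records(sample))
```

Because an open file is itself an iterable of lines, `iter_records(log)` could stand in for the inner loop, with each row written via `out.write(';'.join(row) + '\n')`.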