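Processing and printing data from CSV files

Time: 2016-01-18 16:17:38

Tags: python python-2.7 loops csv

I have two CSV files. The first column in both files is a timestamp, but all the other columns contain different data. Some of the timestamps overlap but appear on different rows. I want to create a new file that contains every overlapping timestamp together with the associated data from both files.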

For example:

File 1:

['1', 'John', 'Doe'] 
['2', 'Jane', 'Deer']
['3', 'Horror', 'Movie']

File 2:

['2', 'Mac']
['3', 'bro']
['4', 'come']
['1', '@mebro']

Output file:

['1', 'John', 'Doe', '@mebro']
['2', 'Jane', 'Deer', 'Mac']
['3', 'Horror', 'Movie', 'bro']

Here is my code so far:

Outfile = []

for row in file2:
    Outfile.append(tuple(row))

if len(file1) >= len(file2):
    for n in xrange(1,len(file2)):
        if file1[0][n] == file2[0][:]:
            Outfile.append(file1[1:8][n])

if len(file2) >= len(file1):
    for n in xrange(1,len(file1)):
        if file1[0][n] == file2[0][:]:
            Outfile.append(file1[1:8][n])

resultFile = open("resultFile.csv","wb")
wr = csv.writer(Outfile, dialect= "excel")
wr.writerows(Outfile)

2 answers:

Answer 0: (score: 0)

Use the pandas library.

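import pandas as pd

# Replace the placeholder paths with the real file locations.
df1 = pd.read_csv("path to file 1")
df2 = pd.read_csv("path to file 2")

# 'First column' stands for the name of the shared timestamp column.
result = pd.merge(df1, df2, on='First column', sort=True)
result.to_csv("path to result file")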

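merge joins the two DataFrames on the specified column. By default it performs an inner join, so only timestamps that appear in both files are kept.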

More Information
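
The sample files in the question have no header row, so read_csv with its default settings would treat the first data row as column names. Below is a minimal sketch of the same merge for headerless input. The column names (timestamp, first, last, handle) are made up for illustration, the data is read from in-memory strings instead of real paths, and the code assumes Python 3 with pandas installed.

```python
import io

import pandas as pd

# In-memory stand-ins for the two CSV files from the question (assumption).
file1_text = "1,John,Doe\n2,Jane,Deer\n3,Horror,Movie\n"
file2_text = "2,Mac\n3,bro\n4,come\n1,@mebro\n"

# header=None tells pandas there is no header row; names= supplies
# hypothetical column labels; dtype=str keeps timestamps as text.
df1 = pd.read_csv(io.StringIO(file1_text), header=None,
                  names=["timestamp", "first", "last"], dtype=str)
df2 = pd.read_csv(io.StringIO(file2_text), header=None,
                  names=["timestamp", "handle"], dtype=str)

# Inner join: keep only timestamps that occur in both frames.
merged = df1.merge(df2, on="timestamp", how="inner", sort=True)

# Convert back to plain lists to compare with the desired output.
rows = merged.values.tolist()
print(rows)
```

When writing the result to disk, passing index=False and header=False to to_csv should keep the row index and the made-up column names out of the output file.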

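Answer 1: (score: 0)

The answer from mds is more efficient; I am only offering this as supplementary information, because there are a number of fundamental problems with the way you are trying to use list indexing. This code produces the output list you are looking for and may better illustrate how list indexes work ('example' was added to file2 to show how additional elements are appended).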

list1 = [['1', 'John', 'Doe'],
         ['2', 'Jane', 'Deer'],
         ['3', 'Horror', 'Movie']]

list2 = [['2', 'Mac', 'example'],
         ['3', 'bro'],
         ['4', 'come'],
         ['1', '@mebro']]

# Compare every row of list1 with every row of list2 by timestamp (index 0).
for x in range(len(list1)):
    print("List1 timestamp for consideration: " + str(list1[x][0]))
    for y in range(len(list2)):
        print("Compared to list2 timestamp: " + str(list2[y][0]))
        if list1[x][0] == list2[y][0]:
            print("Match")
            # Append every field of the list2 row except its timestamp.
            for z in range(1, len(list2[y])):
                list1[x].append(list2[y][z])

The printed output is:

List1 timestamp for consideration: 1
Compared to list2 timestamp: 2
Compared to list2 timestamp: 3
Compared to list2 timestamp: 4
Compared to list2 timestamp: 1
Match
List1 timestamp for consideration: 2
Compared to list2 timestamp: 2
Match
Compared to list2 timestamp: 3
Compared to list2 timestamp: 4
Compared to list2 timestamp: 1
List1 timestamp for consideration: 3
Compared to list2 timestamp: 2
Compared to list2 timestamp: 3
Match
Compared to list2 timestamp: 4
Compared to list2 timestamp: 1

list1 then looks like:

list1 = [['1', 'John', 'Doe', '@mebro'],
         ['2', 'Jane', 'Deer', 'Mac', 'example'],
         ['3', 'Horror', 'Movie', 'bro']]
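
The nested loops above compare every row of one list with every row of the other. One possible alternative, sketched here with only the standard library, is to index the second file by timestamp in a dictionary first and then look each timestamp up once. The helper name join_on_first_column is hypothetical, the files are simulated with in-memory strings, and the code assumes Python 3 (on Python 2.7 the in-memory text would need to be handled as bytes). It also shows csv.writer being given a file-like object, not the list of rows.

```python
import csv
import io


def join_on_first_column(rows1, rows2):
    """Keep rows of rows1 whose first field also occurs in rows2,
    extended with the remaining fields of the matching rows2 row."""
    lookup = {}
    for row in rows2:
        if row:  # skip blank lines
            # If a timestamp repeats in rows2, the last row wins.
            lookup[row[0]] = row[1:]

    joined = []
    for row in rows1:
        if row and row[0] in lookup:
            joined.append(row + lookup[row[0]])
    return joined


# In-memory stand-ins for the two CSV files from the question (assumption).
file1 = io.StringIO("1,John,Doe\n2,Jane,Deer\n3,Horror,Movie\n")
file2 = io.StringIO("2,Mac\n3,bro\n4,come\n1,@mebro\n")

joined_rows = join_on_first_column(list(csv.reader(file1)),
                                   list(csv.reader(file2)))

# csv.writer takes the output file object; the rows go to writerows().
out = io.StringIO()
csv.writer(out, dialect="excel").writerows(joined_rows)
print(out.getvalue())
```

With real files on Python 3, the StringIO objects would be replaced by open(path, newline="") for reading and open(path, "w", newline="") for writing.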