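I have two CSV files, and I want to create a third CSV by merging the two. This is what my files look like:
Num | Status
1213 | closed
4223 | open
2311 | open
And the other file has this:
Num | Code
1002 | 9822
1213 | 1891
4223 | 0011
So, this is the small piece of code I tried, looping over both files, but it does not print the output with the correctly matched value added as a third column.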
def links():
    first = open('closed.csv')
    csv_file = csv.reader(first)
    second = open('links.csv')
    csv_file2 = csv.reader(second)
    for row in csv_file:
        for secrow in csv_file2:
            if row[0] == secrow[0]:
                print row[0]+"," +row[1]+","+ secrow[0]
                time.sleep(1)
So what I want is:
Num | Status | Code
1213 | closed | 1891
4223 | open | 0011
2311 | open | blank no match
Answer 0: (score: 4)
Answer 1: (score: 3)
If you decide to use pandas, you can do this in just five lines.
import pandas as pd
first = pd.read_csv('closed.csv')
second = pd.read_csv('links.csv')
merged = pd.merge(first, second, how='left', on='Num')
merged.to_csv('merged.csv', index=False)
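One caveat with the snippet above, sketched under the assumption that both files are comma-separated with a header row containing `Num`: by default `read_csv` parses a code such as `0011` as the integer 11, and rows without a match end up as NaN rather than text. Reading everything as strings with `dtype=str` and filling the gaps with `fillna` is one way to get the output the question asks for. The `merge_codes` helper name and the inline sample data are illustrative and not part of the original answer.

```python
import io

import pandas as pd


def merge_codes(first_src, second_src):
    """Left-join two CSV sources on Num, keeping every value as text."""
    # dtype=str keeps codes such as "0011" from being parsed as the integer 11.
    first = pd.read_csv(first_src, dtype=str)
    second = pd.read_csv(second_src, dtype=str)
    merged = first.merge(second, how="left", on="Num")
    # Rows of the first file without a partner get NaN in Code; replace it.
    merged["Code"] = merged["Code"].fillna("blank no match")
    return merged


# Sample data standing in for closed.csv and links.csv (assumed comma-separated).
closed = io.StringIO("Num,Status\n1213,closed\n4223,open\n2311,open\n")
links = io.StringIO("Num,Code\n1002,9822\n1213,1891\n4223,0011\n")

merged = merge_codes(closed, links)
print(merged.to_csv(index=False))
```

With real files, the same function can be called with the paths `'closed.csv'` and `'links.csv'` instead of the in-memory buffers.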
Answer 2: (score: 1)
You can read the values of the second file into a dictionary and then add them to the rows of the first file.
# csv_file1 and csv_file2 are assumed to be csv.reader objects for the first and second file
Code = {}
for row in csv_file2:
    Code[row[0]] = row[1]
for row in csv_file1:
    row.append(Code.get(row[0], "blank no match"))
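For completeness, here is a self-contained sketch of the same idea that also keeps the extended rows instead of discarding them at the end of the loop. It assumes comma-separated input with a header row; the `add_codes` name and the inline sample data are made up for illustration.

```python
import csv
import io


def add_codes(first_file, second_file, missing="blank no match"):
    """Return the rows of first_file with the matching code appended."""
    # Build the lookup table once: Num -> Code.
    # The header row is included, so "Num" maps to "Code".
    codes = {row[0]: row[1] for row in csv.reader(second_file) if row}
    merged = []
    for row in csv.reader(first_file):
        if row:  # skip empty lines
            merged.append(row + [codes.get(row[0], missing)])
    return merged


# Sample data standing in for closed.csv and links.csv.
closed = io.StringIO("Num,Status\n1213,closed\n4223,open\n2311,open\n")
links = io.StringIO("Num,Code\n1002,9822\n1213,1891\n4223,0011\n")

for row in add_codes(closed, links):
    print(",".join(row))
```

Because the lookup is a dictionary, each row of the first file is matched directly by its key, so no nested loop over the second file is needed.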
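Answer 3: (score: 1)
The problem is that you can only iterate over a csv reader once, so csv_file2 yields nothing after the first pass of the outer loop. To fix this, save the rows from csv_file2 in a list and iterate over that saved list instead. It could look like this: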
import time, csv
def links():
    first = open('closed.csv')
    csv_file = csv.reader(first, delimiter="|")
    second = open('links.csv')
    csv_file2 = csv.reader(second, delimiter="|")
    # Save the rows of the second file so they can be scanned more than once.
    saved_rows = []
    for row in csv_file2:
        saved_rows.append(row)
    for row in csv_file:
        match = False
        for secrow in saved_rows:
            # Compare the keys with the padding spaces removed.
            if row[0].replace(" ", "") == secrow[0].replace(" ", ""):
                print(row[0] + "," + row[1] + "," + secrow[1])
                match = True
        if not match:
            print(row[0] + "," + row[1] + ", blank no match")
        time.sleep(1)
Output:
Num , status, code
1213 , closed, 1891
4223 , open, 0011
2311 , open, blank no match
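The one-pass behaviour described in this answer is easy to reproduce in isolation. The following sketch uses an in-memory file with made-up rows; it shows that a second loop over the same `csv.reader` sees nothing, and that copying the rows into a list first avoids the problem.

```python
import csv
import io

data = "1002|9822\n1213|1891\n4223|0011\n"

# A csv.reader is an iterator: once consumed, it stays empty.
reader = csv.reader(io.StringIO(data), delimiter="|")
first_pass = [row for row in reader]
second_pass = [row for row in reader]
print(len(first_pass), len(second_pass))

# Saving the rows in a list allows as many passes as needed.
saved_rows = list(csv.reader(io.StringIO(data), delimiter="|"))
for _ in range(2):
    print([row[0] for row in saved_rows])
```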
Answer 4: (score: 1)
This code will do it for you:
import csv
def links():
    # open both files
    with open('closed.csv') as closed, open('links.csv') as links:
        # use DictReader so that values can be accessed by column name
        csv_closed = csv.DictReader(closed)
        csv_links = csv.DictReader(links)
        # create dictionaries out of the two CSV files using dictionary comprehensions
        num_dict = {row['num']:row['status'] for row in csv_closed}
        link_dict = {row['num']:row['code'] for row in csv_links}
    # print header, each column has a width of 8 characters
    print("{0:8} | {1:8} | {2:8}".format("Num", "Status", "Code"))
    # print the information
    for num, status in num_dict.items():
        # note this call to link_dict.get() - we are getting values out of the link dictionary,
        # but specifying a default return value of an empty string if num is not found in it
        # to avoid an exception
        print("{0:8} | {1:8} | {2:8}".format(num, status, link_dict.get(num, '')))
links()
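In it, I am making use of dictionaries, which let you access information by key. I am also using implicit loops (dictionary comprehensions), which tend to be faster and need less code.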
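To make the comparison concrete, the sketch below builds the same dictionary twice, once with an explicit loop and once with a dictionary comprehension, and then shows how `dict.get` supplies a default for a missing key. The sample rows are made up for illustration.

```python
# Rows shaped like the ones csv.DictReader would produce for links.csv.
rows = [
    {"num": "1002", "code": "9822"},
    {"num": "1213", "code": "1891"},
    {"num": "4223", "code": "0011"},
]

# Explicit loop.
loop_dict = {}
for row in rows:
    loop_dict[row["num"]] = row["code"]

# Equivalent dictionary comprehension.
comp_dict = {row["num"]: row["code"] for row in rows}

# Both approaches build the same mapping.
print(loop_dict == comp_dict)

# get() returns the default instead of raising KeyError for a missing key.
print(repr(comp_dict.get("2311", "")))
```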
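You should note that this code has two quirks, which your example suggests are fine:
A final note: since you call your input files "CSV" files, I made some assumptions about their format. Here are the contents of my input files: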
closed.csv
num,status
1213,closed
4223,open
2311,open
links.csv
num,code
1002,9822
1213,1891
4223,0011
Given these input files, the result looks like this:
Num | Status | Code
1213 | closed | 1891
2311 | open |
4223 | open | 0011
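The question asks for a third CSV file rather than printed output. As a rough sketch of how the approach in this answer could be extended, the following writes the merged rows with `csv.DictWriter`. It assumes the lowercase `num`, `status`, and `code` headers used by the code above; the `write_merged` name and the in-memory files are placeholders for illustration.

```python
import csv
import io


def write_merged(closed, links, out, missing=""):
    """Write num,status,code rows to out, one per row of closed."""
    link_dict = {row["num"]: row["code"] for row in csv.DictReader(links)}
    writer = csv.DictWriter(out, fieldnames=["num", "status", "code"],
                            lineterminator="\n")
    writer.writeheader()
    # Iterating over the reader directly keeps the input order of closed.
    for row in csv.DictReader(closed):
        writer.writerow({
            "num": row["num"],
            "status": row["status"],
            "code": link_dict.get(row["num"], missing),
        })


# Sample data standing in for closed.csv and links.csv.
closed = io.StringIO("num,status\n1213,closed\n4223,open\n2311,open\n")
links = io.StringIO("num,code\n1002,9822\n1213,1891\n4223,0011\n")
out = io.StringIO()  # stands in for open('merged.csv', 'w', newline='')

write_merged(closed, links, out)
print(out.getvalue())
```

Passing `missing="blank no match"` would reproduce the placeholder text from the question instead of an empty field.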